Advanced Database Topics
Transactions
Concurrency, Isolation & Locking

Copyright © Ellis Cohen 2002-2005
These slides are licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License. For more information on how you may use them, please see http://www.openlineconsult.com/db

Topics
• Isolation, Schedules & Serializability
• Locking
• Conservative & Strict 2PL
• Deadlock & Livelock
• Deadlock Prevention
• Shared & Exclusive Locks
• Lock Granularity
• Lock Conversions
• Multi-Granular Locking
• Phantom Reads & Index Locks
• Long Transactions and Sagas

ACID Properties of Transactions
Atomicity
– All of the updates of a transaction are done or none are done
Consistency
– Each transaction leaves the database in a consistent state (preferably via consistency predicates)
Isolation *
– Each transaction, when executed concurrently with other transactions, should have the same effect as if executed by itself
Durability
– Once a transaction has successfully committed, its changes to the database should be permanent

Schedules & Serializability

Attaining Isolation
Isolation requires that, until the transaction commits, changes made by a transaction
– do not persist
– are not visible to any other transaction
Isolation is achieved in two ways
– Cache-Based Mechanisms
– Non Cache-Based Mechanisms

Cache vs Non-Cache Mechanisms
Cache-Based Mechanisms
– A separate cache is maintained by/for each client
– All data updates are made to the cache, which is how isolation is achieved
– When the transaction commits, all data updated by the transaction is written back from the client's cache to the server database state (often checking first whether this is allowed)
Non-Cache-Based Mechanisms
– All data updates are made directly to the server database state
– Isolation is achieved by preventing operations from proceeding which cause problems (e.g.
by using Locking)

If we assume that operations can interleave in any order, we can see what kinds of problems can arise!

Schedules
• A transaction may have multiple operations
  – These execute one after another
• Multiple transactions may execute concurrently
  – Assume that only one transaction is executing an operation at a time
  – The operations from different transactions may interleave

Transaction A
  1) update XT set x = x + 100
  2) update YT set y = y + 100
Transaction B
  1) update XT set x = x * 2
  2) update YT set y = y * 2

• Schedule:
  – The overall sequence of the operations from these transactions, e.g. A1 B1 A2 B2
Note: XT and YT both have one row and one column

Serial & Interleaved Schedules
A: 1) update XT set x = x + 100
   2) update YT set y = y + 100
B: 1) update XT set x = x * 2
   2) update YT set y = y * 2
Consistency assertion: (select x from XT) = (select y from YT)
Assume initially x = 30 and y = 30
The first two schedules are serial schedules. These transactions execute serially. They do not interleave.
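The effect of a schedule can be checked mechanically by applying the operations of A and B in the given order. The following is a minimal sketch, not part of the original slides: the names `OPS`, `run_schedule` and `is_consistent` are hypothetical, and XT and YT are modelled as two plain integers. It is shown here only for the two serial orders; interleaved schedules can be passed in the same way.

```python
# Operations of transactions A and B, modelled on a tiny in-memory "database".
# All names here are illustrative assumptions, not part of the slides.
OPS = {
    "A1": lambda s: s.update(x=s["x"] + 100),
    "A2": lambda s: s.update(y=s["y"] + 100),
    "B1": lambda s: s.update(x=s["x"] * 2),
    "B2": lambda s: s.update(y=s["y"] * 2),
}


def run_schedule(schedule, x=30, y=30):
    """Apply the named operations in order and return the final (x, y)."""
    state = {"x": x, "y": y}
    for name in schedule.split():
        OPS[name](state)
    return state["x"], state["y"]


def is_consistent(result):
    """Consistency assertion from the slide: x in XT equals y in YT."""
    x, y = result
    return x == y


for schedule in ("A1 A2 B1 B2", "B1 B2 A1 A2"):
    result = run_schedule(schedule)
    print(schedule, "->", result, "consistent:", is_consistent(result))
```

The two serial orders give different final values, yet both satisfy the consistency assertion; any serial order counts as a correct outcome.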
What's the result of
  A1 A2 B1 B2
  B1 B2 A1 A2
  A1 B1 A2 B2
  A1 B1 B2 A2

Serializability
Equivalence [also called view equivalence]
Two schedules (involving the same committed transactions) are equivalent if
• They result in the same database state
• The results of the corresponding queries are identical
Serializability [also called view serializability]
A schedule is serializable if it is equivalent to a serial schedule
Executing non-serializable schedules causes a variety of problems
Serializability is the strongest form of isolation

Reads & Writes
An operation on a data item is a READ-only operation if it only reads a value
  select empno from Emps where sal > 1000;
An operation on a data item is a WRITE-only operation (also called a blind write) if
– it writes a value but doesn't read it
– the resulting value of the data item is completely independent of its original value
  update XT set x = 40

Updates
An operation on a data item is a READ+WRITE operation (also called an UPDATE operation) if
– the data item is both read & written
– the resulting value depends on the initial value
  update XT set x = x + 100
The difference between a READ+WRITE and a WRITE may depend upon the perspective
  update Emps set comm = 0
– From the perspective of a single comm cell, this is a blind write
– From the perspective of a row, this is a read+write operation
  Before: 6924 SMITH 4500 3000 30
  After:  6924 SMITH 4500 0 30
Our perspective will always be at the row or table level, so almost all updates are treated as READ+WRITE operations

Lost Update Problem
Suppose transactions A & B simultaneously deposit $1000 into a checking account
  1) select balance into curbal from checking where acctid = 30792;
  2) curbal = curbal + 1000;
  3) update checking set balance = curbal where acctid = 30792;
Each transaction has its own local copy of curbal
Serial Schedule A1, A2, A3, B1, B2, B3 is ok
Schedule
A1, B1, A2, B2, A3, B3 is not
What are balance & curbal values at each step if the balance is initially $2300?

Lost Update Values
Step      A's curbal   balance   B's curbal
(start)                2300
A1        2300         2300
B1        2300         2300      2300
A2        3300         2300      2300
B2        3300         2300      3300
A3        3300         3300      3300
B3        3300         3300      3300
Arghh, the final balance should be 4300, not 3300

Lost Update Pattern (WW)
Steps in schedule order:
A1: Read data
B1: Read data
A2: Write data
B2: Write data
B writes over the same data written by A, without taking into account the changes made by A

Dirty Read Problem
Transaction A
  1) update checking set balance = balance + 1000000 where acctid = 30792
  2) ROLLBACK
Transaction B
  for aRec in (select * from checking where balance > 500000) loop
    insert into IRS_Report( ssno, acctid, balance)
      values ( aRec.ssno, aRec.acctid, aRec.balance );
  end loop;
Serial Schedule A1, A2, B is ok
Schedule A1, B, A2 is not

Dirty Read Pattern (WR)
Steps in schedule order:
A1: Write data
B1: Read data
A Dirty Read (also called an Uncommitted Read) happens when B reads uncommitted data modified by A

Non Repeatable Read Problem
Suppose transactions A & B simultaneously withdraw $1000 from a checking account
  1) select balance into curbal from checking where acctid = 30792;
  2) if curbal < 1000 then raise error
  3) update checking set balance = balance - 1000 where acctid = 30792;
  4) emit $1000 from ATM;
Serial Schedule A1 A2 A3 A4 B1 B2 B3 B4 is ok
Schedule A1 A2 B1 B2 B3 B4 A3 A4 is not
What are balance & curbal values at each step if the balance is initially $1200?

Non-Repeatable Read Values
Step      A's curbal        balance   B's curbal
(start)                     1200
A1/A2     1200 (no error)   1200
B1/B2                       1200      1200 (no error)
A3/A4                       200
B3/B4                       -800
Arghh, when B1 read balance, it was 1200
But when B3 read balance, it was 200
That's weird

Non-Repeatable Read Pattern (RW)
Steps in schedule order:
A1: Read data
B1: Write data
B2: COMMIT
A2: Re-Read data
A Non-Repeatable Read occurs when B can modify and commit data read by A (which has not yet completed), and when A then re-reads the same data

Serialization Problem
A: 1) update XT set x = x + 100
   2) update YT set y = y + 100
B: 1) update XT set x = x * 2
   2) update YT set y = y * 2
Assume initially x = 30 and y = 30
Serial schedules:
  A1 A2 B1 B2
  B1 B2 A1 A2
Which of these problems is caused by A1 B1 B2 A2?

Isolation Levels
DB systems can maintain serializability automatically
• Maintaining serializability is expensive and not always necessary
• DB systems allow users to specify weaker levels of isolation to be maintained automatically
Weaker levels of isolation will allow problems (e.g. Non Repeatable Read) to show up.
• OK as long as they're "not a problem"
• Transactions can selectively serialize explicitly (via locking) when necessary

SQL Isolation Levels
Read Uncommitted
– Lost Update Problem is prevented, but other problems are possible
Read Committed
– Lost Update Problem is prevented
– Dirty (Uncommitted) Read Problem is prevented
Repeatable Read
– Lost Update Problem is prevented
– Dirty (Uncommitted) Read Problem is prevented
– Non Repeatable Read Problem is prevented
Serializable
– All problems (yes, there are more) are prevented
– Complete isolation

Setting Isolation Level In Oracle
SET TRANSACTION ISOLATION LEVEL READ COMMITTED
– This transaction can see other transactions' committed writes.
– Prevents dirty reads and lost updates; permits non-repeatable reads.
– THIS IS ORACLE's DEFAULT, because it has very low overhead.
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
– Prevents dirty reads, non-repeatable reads and lost updates
– However, not truly serializable. Permits constraint violation problems.

Attaining Serializability
Optimistic Concurrency Mechanisms
Allows transaction to proceed to completion.
When it attempts to commit, abort it if it does not serialize with transactions which already completed.
Pessimistic Concurrency Mechanisms
Each operation in a transaction is only allowed to proceed if it serializes with overlapping transactions. Otherwise it must wait (or possibly cause its own or another transaction to abort)
Lock-based concurrency is pessimistic

Concurrency Control Mechanisms
• Locking [Pessimistic]
  – Transactions lock data before they use it
• Optimistic [Optimistic]
  – Uses cache & pretends entire transaction occurs when it commits
• Timestamp [Pessimistic]
  – Pretends entire transaction occurs when it starts
• Read-Consistent [~Pessimistic]
  – Oracle's Concurrency Model
  – Uses cache which holds virtual snapshot of the DB state when the transaction starts. Not fully serializable.

Locking

Locking & Waiting
Databases can implement serializability using locks
– When A accesses the checking table, it is locked for use by A
  • More accurately, the DB locks checking for A
– If B tries to execute while A is still executing the transaction
  • B will try to lock checking, but won't succeed because A already has it locked
  • B waits until checking is unlocked
– When A commits, it unlocks checking
  • B now proceeds to lock checking and continue with its transaction

Database Locking
If a transaction needs to access a data item in the database
– It must request a suitable lock for it first (unless it already has been granted the lock)
If NO other transaction has a conflicting lock
– The DB scheduler may grant the lock to the transaction.
– The lock is then acquired or held by the transaction
If some other transaction holds a conflicting lock
– The database scheduler will NOT grant the lock to the requesting transaction.
Generally either
  • the requesting transaction waits, or
  • the requesting/conflicting transaction is aborted
When a lock is released by a transaction
– the DB scheduler checks to see whether it can now grant a lock to a waiting transaction which may then be able to continue.

Lock Acquisition
In commercial databases that use locking, locks are acquired automatically.
When a transaction needs to access a resource, the DB scheduler automatically tries to acquire a lock on that resource for the transaction (unless the transaction is already holding the required lock on the resource).
Most databases also have a way to allow a transaction to explicitly request locks in advance (e.g. LOCK TABLE)

Locking Serializes Execution
Suppose transactions A & B simultaneously deposit $1000 into a checking account
  1) select balance into curbal from checking where acctid = 30792;
  2) curbal = curbal + 1000;
  3) update checking set balance = curbal where acctid = 30792;
Each transaction has its own copy of curbal
Consider the schedule: A1 (B1) A2 A3 B1 B2 B3
(B1) was attempted, but had to wait because A1 had already locked checking

Releasing & Reacquiring Locks Doesn't Work
Suppose transactions A & B simultaneously deposit $1000 into a checking account
  1) select balance into curbal from checking where acctid = 30792;
  2) curbal = curbal + 1000;
  3) update checking set balance = curbal where acctid = 30792;
What happens if the transaction acquires all locks it needs at the beginning of every request and then releases all locks at the end of every request?
Consider the schedule: A1 B1 A2 A3 B1 B2 B3

No Database Pre-Knowledge
In commercial DB systems, the DB does not "see" an entire transaction at once, and cannot base decisions on "pre-knowing" the complete sequence of operations in a transaction
The database sees requests (i.e. SQL commands) one at a time, as a client makes them.
The decisions a DB makes about whether to allow a transaction to proceed, to make it wait, or to abort a transaction are made solely on the requests the DB has already seen.

2PL: Two Phase Locking
Each transaction is divided into 2 phases:
• Growing phase: locks can be acquired but not released
• Shrinking phase: locks can be released but not acquired.
This is REQUIRED for locking to guarantee serializability!

Lock Release
Databases automatically release all acquired locks when a transaction completes (commits or aborts)
Some DBs provide a command which allows a lock to be explicitly released earlier
– But, if a transaction releases a lock, and then needs to acquire any lock at all, it will be aborted (violates 2PL)
– Early release of locks causes other problems, and requires additional rules to ensure serializability

Early Release Problem
Initially x = 30, y = 30
A1: update XT set x = x + 100
A2: update YT set y = y + 100
A3: RELEASE LOCKS               (now x = 130, y = 130)
B1: update XT set x = x * 2
B2: update YT set y = y * 2     (now x = 260, y = 260)
A4: ABORT
Suppose A releases locks (A3) immediately after doing its updates (A1 A2)
B acquires those locks and does its updates (B1 B2)
Then A is aborted (A4)
Is A1 A2 A3 B1 B2 A4 serializable?
Is there anything the database scheduler can do to correct this problem?

Cascading Aborts
A database concurrency mechanism is susceptible to cascading aborts if an abort of one transaction forces abort of another (which could force abort of another …)
A1 A2 A3 B1 B2 A4 is not serializable! The only thing the database can do is force B to abort
If a transaction releases a lock for data it has written, allowing another transaction to read the same data, then aborts can cascade.
Initially x = 30, y = 30
A1: update XT set x = x + 100
A2: update YT set y = y + 100
A3: RELEASE LOCKS               (now x = 130, y = 130)
B1: update XT set x = x * 2
B2: update YT set y = y * 2
After B's updates: x = 260, y = 260
A4: ABORT of A, which forces B to ABORT as well
Can we use recovery log information instead?

Unrecoverable Schedules
Steps in schedule order:
A: update XT set x = x + 100
A: update YT set y = y + 100
A: RELEASE LOCKS
B: update XT set x = x * 2
B: update YT set y = y * 2
B: COMMIT
A: ABORT
More serious version of previous problem because B commits before A aborts.
But we can't undo B (without violating durability) because B already committed.
So this schedule is unrecoverable!
If we want to allow early release of locks without causing this problem, what rule must we add to the DB scheduler?

Avoiding Unrecoverability & Cascading Aborts
If locks can be released early
• To avoid unrecoverable schedules:
  A transaction must wait to commit if it
  – has read data written by a still active transaction
  – that released its lock on that data
• To avoid cascading aborts:
  A transaction must wait to
  – read data written by a still active transaction
  – that released its lock on that data

2PL & Serializability
Database systems actually use
– S (Shared) locks, acquired for reading
– X (Exclusive) locks, acquired for writing
For now, we won't differentiate
Using 2PL with S & X locks, and with X locks released only on commit, only recoverable, serializable schedules are produced, with no cascading aborts

Conservative & Strict 2PL

Conservative 2PL (Reservations)
Each data item (we'll use tables) has a lock
Growing phase
– When a transaction is about to start, it reserves locks for every table it might possibly use, by EXPLICITLY requesting locks for them (typically as part of a START TRANSACTION command)
– If it can't acquire all of them at once, it blocks (i.e. waits) until it can
– If a transaction accesses a resource without having initially acquired the required lock for it, it is aborted!
Shrinking phase
– When the transaction completes, it releases all its locks

Scheduling Conservative 2PL
When a transaction starts
– Grant the locks if they are available, else make the transaction wait
When a transaction completes and releases its locks
– See if any blocked transactions can now acquire all their locks
– If so, grant the locks to the one that has been waiting the longest
Is this scheduler fair? If so, why? If not, how can it be made fair?

Scheduling & Starvation
Multiple transactions can compete by
– Having conflicting requests for locks on the same data items
– Waiting until they are granted
Starvation
– A transaction waits forever because of the way the DB schedules competing transactions
Fair DB Scheduler
– Uses a policy that prevents starvation
Aging
– A mechanism to prevent starvation
– Transactions that are older (i.e. have waited for a long time) and still waiting for all their locks can block competing transactions that could otherwise acquire all needed locks.
Schedulers may schedule the transaction with the highest priority. What can determine priority?

Determining Scheduling Priority
Wait age
– how long it has been waiting
Transaction age
– how long the transaction has been running
Resource quantity
– how many DB resources are locked
Resource priority
– how important are the DB resources which are locked
Process urgency
– how important is the transaction's process
  – Inherent (how important is it)
  – Deadline (how soon must it be done)
What happens when an urgent process transaction is waiting for a lock that has been granted to a low priority transaction? How do you solve this?
What if the OS doesn't schedule a low priority transaction's process that has locked a frequently used resource?
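The "longest waiter that can run" rule and the aging rule described above can be contrasted in a few lines. This is a minimal sketch under stated assumptions, not the scheduler of any particular DBMS: `Waiter`, `pick_next` and the `aging_limit` parameter are hypothetical names, wait age is the only priority input, and time is counted in abstract scheduler ticks.

```python
from dataclasses import dataclass


@dataclass
class Waiter:
    name: str
    needs: frozenset   # tables the transaction has asked to reserve
    waited: int        # wait age, in scheduler ticks


def pick_next(waiters, free_tables, aging_limit=None):
    """Return the waiter to grant next, or None if nobody should be granted.

    Without aging: grant to the longest-waiting transaction whose tables
    are all free. With aging: once the oldest waiter has waited at least
    aging_limit ticks, nobody else is granted ahead of it.
    """
    if not waiters:
        return None
    oldest = max(waiters, key=lambda w: w.waited)
    if aging_limit is not None and oldest.waited >= aging_limit:
        return oldest if oldest.needs <= free_tables else None
    runnable = [w for w in waiters if w.needs <= free_tables]
    return max(runnable, key=lambda w: w.waited) if runnable else None


big = Waiter("big", frozenset({"XT", "YT"}), waited=50)
small = Waiter("small", frozenset({"XT"}), waited=1)

# Only XT is free: without aging, "small" keeps overtaking "big".
print(pick_next([big, small], {"XT"}))
# With aging, "small" is held back so XT stays free until YT is released too.
print(pick_next([big, small], {"XT"}, aging_limit=10))
```

Without aging, a transaction that needs many tables can starve because smaller transactions keep taking one of them; holding the smaller ones back costs some concurrency but bounds the wait.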
Unnecessary Reservations
We need to reserve settings, as well as both localDevices and remoteDevices, even though one of them will not be used.
  begin
    START TRANSACTION LOCKING settings, localDevices, remoteDevices;
    select typ, val into aTyp, aVal from settings
      where (id = 30472 and nam = 'printer');
    if (aTyp = 'local') then
      update localDevices set printer = aVal where id = 30472;
    elsif (aTyp = 'remote') then
      update remoteDevices set printer = aVal where id = 30472;
    end if;
    COMMIT;
  end;
With Conservative 2PL we unnecessarily reserve resources

Strict 2PL
Growing phase
• First time a transaction needs a lock on a data item (e.g. a table), request it
• Block if necessary until it is available
Shrinking phase
• When the transaction is complete, release all the locks
Allows more concurrency than the Conservative approach, but can cause Deadlock

Strict 2 Phase Locking Example
  begin
    -- the following statement locks settings
    select typ, val into aTyp, aVal from settings
      where (id = 30472 and nam = 'printer');
    -- assume aTyp is 'local'
    -- then the following will lock localDevices
    -- but remoteDevices will not be locked
    if (aTyp = 'local') then
      update localDevices set printer = aVal where id = 30472;
    elsif (aTyp = 'remote') then
      update remoteDevices set printer = aVal where id = 30472;
    end if;
    commit;
  end;

Problems of Strict 2PL
Deadlock
– Transactions A & B may each have acquired a lock for a data item that the other one needs. Neither can proceed.
Convoy Phenomenon
– Transaction T1 has acquired a lock for D1 and is taking a long time to execute.
– Transaction T2 acquired a lock for D2, and is waiting to acquire a lock for D1
– Transaction T3 acquired a lock for D3, and is waiting to acquire a lock for D2
– …
– All are waiting behind T1, and in the meantime have locked data items so that other transactions cannot use them.
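The convoy described above can be reproduced with a very small model of Strict 2PL. This is a sketch under simplifying assumptions rather than a real lock manager: `LockTable` and its methods are hypothetical names, only exclusive locks are modelled, everything is single-threaded, and a blocked transaction must retry its request after the holder commits.

```python
class LockTable:
    """Exclusive locks only; a hypothetical, single-threaded model of Strict 2PL."""

    def __init__(self):
        self.holder = {}        # data item -> transaction holding its lock
        self.waiting_for = {}   # transaction -> data item it is blocked on

    def request(self, txn, item):
        """Acquire on first use; return True if granted, False if txn must wait."""
        owner = self.holder.get(item)
        if owner is None or owner == txn:
            self.holder[item] = txn
            self.waiting_for.pop(txn, None)
            return True
        self.waiting_for[txn] = item
        return False

    def commit(self, txn):
        """Shrinking phase: release every lock at once when txn completes."""
        for item in [i for i, t in self.holder.items() if t == txn]:
            del self.holder[item]

    def blocked_behind(self, txn):
        """Transactions waiting, directly or transitively, for locks txn holds."""
        convoy, frontier = [], [txn]
        while frontier:
            current = frontier.pop()
            for waiter, item in self.waiting_for.items():
                if self.holder.get(item) == current and waiter not in convoy:
                    convoy.append(waiter)
                    frontier.append(waiter)
        return convoy


locks = LockTable()
for txn, item in (("T1", "D1"), ("T2", "D2"), ("T3", "D3")):
    locks.request(txn, item)     # each transaction locks its own item
locks.request("T2", "D1")        # T2 now waits for T1
locks.request("T3", "D2")        # T3 now waits for T2
print(locks.blocked_behind("T1"))   # ['T2', 'T3']
```

While T1 runs, D2 and D3 stay locked by transactions that are themselves doing no work, which is exactly the cost the convoy phenomenon describes.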
Deadlock & Livelock

Deadlock Example
Transaction A
  1) update checking set amt = amt - 1000 where id = 30792;
  2) update savings set amt = amt + 1000 where id = 30792;
Transaction B
  1) update savings set amt = amt - 1000 where id = 30792;
  2) update checking set amt = amt + 1000 where id = 30792;
Consider schedule A1, B1, A2, B2 using Strict 2PL

Resource Allocation Graph
Legend:
  transaction → lock : transaction waiting for lock
  lock → transaction : lock granted to transaction
Nodes: transactions A and B; locks checking and savings
Deadlock occurs when there is a deadly embrace: a cycle in the Resource Allocation Graph
Waits For Graph
  A → B : A waits for B
  B → A : B waits for A

Deadlock Detection Approaches
• On each lock request, check if it causes a deadly embrace
• Regularly, check if there is a deadly embrace
• If a transaction has waited too long, just assume it has deadlocked (especially useful for distributed transactions)

Deadlock Recovery
Once a deadlock is detected, one of the deadlocked transactions must be aborted.
Auto Restart
– When a database aborts a transaction, it automatically restarts it, possibly after a short/randomized time
Abort Causes Exception
– When a database aborts a transaction, it raises an exception, which the process can handle, and
  • explicitly restart the transaction, or
  • do something else
Which of the deadlocked transactions should be aborted?
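Checking for a deadly embrace amounts to looking for a cycle in the Waits For Graph. The sketch below uses a depth-first search; `find_deadlock` and `pick_victim` are hypothetical names, and the victim policy shown (abort the youngest transaction in the cycle) is only one possible answer to the question above, chosen here as an assumption for illustration.

```python
def find_deadlock(waits_for):
    """Return a list of transactions forming a cycle, or None.

    waits_for maps each waiting transaction to the set of transactions
    that hold locks it needs.
    """
    visiting, done = [], set()

    def visit(txn):
        if txn in done:
            return None
        if txn in visiting:
            # txn is already on the current path: the path from it is a cycle.
            return visiting[visiting.index(txn):]
        visiting.append(txn)
        for holder in sorted(waits_for.get(txn, ())):
            cycle = visit(holder)
            if cycle:
                return cycle
        visiting.pop()
        done.add(txn)
        return None

    for txn in sorted(waits_for):
        cycle = visit(txn)
        if cycle:
            return cycle
    return None


def pick_victim(cycle, start_time):
    """One possible policy (an assumption): abort the youngest transaction."""
    return max(cycle, key=lambda t: start_time[t])


# Schedule A1, B1, A2, B2 from the Deadlock Example: A waits for B, B waits for A.
cycle = find_deadlock({"A": {"B"}, "B": {"A"}})
print(cycle)                                   # ['A', 'B']
print(pick_victim(cycle, {"A": 1, "B": 2}))    # B
```

Running the check on every lock request finds deadlocks immediately at a cost per request; running it periodically is cheaper but leaves deadlocked transactions waiting until the next check.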
Deadlock Avoidance
For deadlock to arise
– A transaction must hold one lock while requesting another
To avoid deadlock
Simulate Conservative 2PL
– Explicitly request multiple needed table locks in advance using LOCK TABLE (Oracle) (but this can lock unneeded rows, unnecessarily locking out other transactions)
– Lock selected rows using SELECT … FOR UPDATE

Approaches to Deadlock
Deadlock Avoidance
– Request all locks at once before needed (Simulate Conservative 2PL)
Deadlock Detection & Recovery
– Abort some deadlocked transaction once a deadlock has been detected
Deadlock Prevention
– Aggressively abort transactions to prevent the possibility of a deadlock
– Avoids need for deadlock detection

Livelock (Cyclic Restart)
Suppose a transaction
– is forced to abort
– is rescheduled
– finds itself again in a situation (e.g. deadlock recovery or prevention, or cascading abort) where it is again aborted
and this happens repeatedly!
How can livelock be avoided?

Deadlock Prevention

Deadlock Prevention Protocols
When a request cannot be granted
– Wait only if it cannot cause a deadlock, or
– Take an action that prevents the deadlock
  • DIE: Abort own transaction
  • WOUND: Steal lock from current holder, and abort its transaction
A transaction that dies or is wounded
– Will generally start again from the beginning (possibly automatically)
– May also be susceptible to livelock: an endless cycle of aborting and restarting
Protocols
– WAIT/DIE
– WOUND/WAIT

WAIT / DIE
Suppose B requests a lock which conflicts with a lock A has
If B is OLDER (B's transaction started earlier)
– B waits
If B is YOUNGER (B's transaction started later)
– B dies
Why can't this lead to deadlock?
How does this cause unnecessary aborts?
How is livelock prevented?
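The WAIT/DIE rule reduces to a comparison of start times. The sketch below is illustrative only: `wait_die` is a hypothetical name, start times are plain integers where a smaller value means an older transaction, and the restart scenario with its specific times is an assumed example of how the choice of timestamp after a restart affects the outcome.

```python
def wait_die(requester_start, holder_start):
    """WAIT/DIE: an older requester waits, a younger requester dies.

    A smaller start time means the transaction started earlier (is older).
    """
    return "WAIT" if requester_start < holder_start else "DIE"


# B (started at 7) requests a lock held by A (started at 3): B is younger.
print(wait_die(requester_start=7, holder_start=3))   # DIE
# B (started at 2) requests a lock held by A (started at 3): B is older.
print(wait_die(requester_start=2, holder_start=3))   # WAIT

# Assumed restart scenario: a transaction that first started at time 3 died,
# and when it restarts at time 5 the lock is held by one that started at 4.
print(wait_die(requester_start=5, holder_start=4))   # DIE  (new start time)
print(wait_die(requester_start=3, holder_start=4))   # WAIT (original start time)
```

Because every wait goes from an older transaction to a younger one, a cycle of waits cannot form; keeping the original start time across restarts means a restarted transaction eventually becomes the oldest and stops dying.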
No Wait/Die Deadlocks
OLDER: T1      Locks: L1, L2      YOUNGER: T2
Young Transactions Die instead of Waiting for Older ones

Unnecessary Abort
T1: UPDATE X SET … … … … (very long transaction that only uses X)
T2: UPDATE X SET …
T2 is younger so it just dies right away!

Preventing Livelock
T1:  UPDATE X SET … (later) COMMIT
T3:  UPDATE X SET … DIE, then restart as T3'
T2:  UPDATE X SET … (later) COMMIT
T3': UPDATE X SET …
If T3' uses its new start time, it will die again, with danger of cyclic restart.
If T3' uses its original start time (that of T3), it will be older than T2 and will wait.

WOUND/WAIT
Suppose B requests a lock which conflicts with a lock A has
If B is OLDER (B's transaction started earlier)
– B wounds A (preemption)
If B is YOUNGER (B's transaction started later)
– B waits
Why can't this lead to deadlock?
How does this cause unnecessary aborts?
How is livelock prevented?

No Wound/Wait Deadlocks
Older Transactions Wound Younger ones instead of waiting for them
OLDER: T1      Locks: L1, L2      YOUNGER: T2

Unnecessary Abort
T1: UPDATE X SET … … … … … … UPDATE Y SET …   (Wounds T2 Immediately!)
T2: UPDATE Y SET … … … … …

ACQUIRE-WOUND/DIE
Consider the following protocol:
Suppose B requests a lock which conflicts with a lock A has
If B has not yet acquired any lock
– B waits
If B's transaction started earlier
– B wounds A
If B's transaction started later
– B dies
Does this prevent deadlock and livelock? How or why not?
Compare it to WAIT/DIE and WOUND/WAIT

Shared & Exclusive Locks

Shared vs Exclusive Locks
When 2 transactions read the same table, there is no problem
Problems only arise when one or both write the table
• Both write => Inconsistent Updates
• One reads, One writes => Dirty/Non-Repeatable Reads
Provide Different Locking Modes
• S (shared) locks (for reading)
• X (exclusive) locks (for reading/writing)

Grant Matrix
If one transaction already has this lock on a resource (rows), will the system grant the lock to another transaction (columns)?
            S      X
S           YES    NO
X           NO     NO
Using Strict 2PL with S & X locks, and with X locks released only on commit, only recoverable, serializable schedules are produced, with no cascading aborts

Lost Update Problem
Suppose transactions A & B simultaneously deposit $1000 into a checking account, whose initial balance is $5000
  1) select balance into curbal from checking where acctid = 30792;
  2) update checking set balance = curbal + 1000 where acctid = 30792;
  3) COMMIT
Shared/Exclusive Locks Prevent Lost Updates

Preventing Lost Updates (1)
Time   Transaction A                       A curbal   A's lock   balance   B's lock      Transaction B
t0     curbal ← balance                    2300       S          2300
t1     balance ← curbal + 1000             2300       X          3300
t2     perhaps other changes to balance,   2300       X          3300      S (waiting)   curbal ← balance
       then COMMIT
At t0: A acquires an S lock on the checking table in order to read balance
At t1: A upgrades its S lock to an X lock so it can write balance
At t2: B tries to acquire an S lock to read balance. It must wait until A commits. This is good, because A might continue to change balance!

Preventing Lost Updates (2)
Time   Transaction A                 A curbal   A's lock        balance   B's lock        B curbal
t0     curbal ← balance              2300       S               2300
t1                                   2300       S               2300      S               2300
t2     (balance ← curbal + 1000)     2300       X (requested)   2300      S               2300
t3                                   2300       S               2300      X (requested)   2300
At t2: A tries to upgrade its S lock to an X lock so it can write balance.
This conflicts with B's S lock. A waits for B with just an S lock!
Transaction B's steps: curbal ← balance, then (balance ← curbal + 1000)
At t3: B tries to upgrade its S lock to an X lock so it can write balance. This conflicts with A's S lock. B waits for A with just an S lock!
DEADLOCK!

Lost Updates and S/X Conflicts
Lost Updates involve UPDATEs (R+W) of the data.
They are prevented because S and X locks have grant matrix conflicts.
So why do X locks need to conflict with X locks?

The Inconsistent Update Problem
Transaction A
  1) UPDATE XT SET x = 10
  2) UPDATE YT SET y = 10
Transaction B
  1) UPDATE XT SET x = 80
  2) UPDATE YT SET y = 80
Serial Schedule A1 A2 B1 B2                   x:80 y:80
Serial Schedule B1 B2 A1 A2                   x:10 y:10
NON-SERIALIZABLE Schedule A1 B1 B2 A2         x:80 y:10
This schedule is prevented by the X/X grant matrix conflict

Lock Granularity

Lock Granularity
Granularity: size of data items to lock, e.g., table, page, row, field
Coarse (e.g. table) granularity implies
– very few locks, so little locking overhead
– must lock large chunks of data => increased chance of conflict => less concurrency
Medium (e.g. page) granularity implies
– medium # of locks, so moderate locking overhead
– overhead minimized because locking is integrated with caching
– locking conflict occurs only when two transactions try to access the exact same page concurrently
Fine (e.g.
row) granularity implies
– many locks, so high locking overhead
– locking conflict occurs only when two transactions try to access the exact same tuples concurrently

Indexing Example
Emps table:
  empno   ename   job        …
  3899    Lobo    ANALYST    …
  7301    Soni    CLERK      …
  2119    Smith   CLERK      …
  4023    Wesin   CLERK      …
  4699    Bobo    ENGINEER   …
Index on the job field: ANALYST, CLERK, ENGINEER (each entry points to the rows with that job)
An index makes it easy to find all tuples in a table with a specific value for one (or more) fields

Indexing & Share Locks
Consider
  SELECT * from Emps WHERE deptno = 10
If row locking is not supported
– Acquire S lock on the entire table
If only row locking is supported and Emps does NOT have an index on deptno
– An S lock must be acquired on every row to evaluate (deptno = 10)
If Emps has an index defined on deptno
– The index will immediately identify all rows where (deptno = 10)
– An S lock needs to be acquired on just those rows
What about
  UPDATE Emps SET … WHERE deptno = 10

Indexing & Exclusive Locks
Consider
  UPDATE Emps SET … WHERE deptno = 10
If row locking is not supported
– Acquire X lock on the entire table
If only row locking is supported and Emps does NOT have an index on deptno
– An S lock must be acquired on every row to evaluate (deptno = 10)
– An X lock must be acquired only on rows where (deptno = 10)
If Emps has an index defined on deptno
– The index will immediately identify all rows where (deptno = 10)
– An X lock needs to be acquired on just those rows

Granularity & Concurrency
Suppose Emps has an index on deptno
A: … UPDATE Emps SET sal = sal + 100 WHERE deptno = 10 …
B: … UPDATE Emps SET sal = sal + 200 WHERE deptno = 20 …
Using Table Locks
– A & B both require X locks on Emps
– They conflict and cannot execute concurrently
Using Row Locks
– A requires X locks ONLY on rows where deptno = 10
– B requires X locks ONLY on rows where deptno = 20
– No conflicts; they can both proceed concurrently!
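The difference between table and row granularity for transactions A and B can be made concrete by computing the set of items each one must X-lock. This is a minimal sketch with made-up data: the `EMPS` rows, the rowids, and the helper names `x_locks` and `conflicts` are all assumptions for illustration, and an index on deptno is assumed so that only matching rows are touched.

```python
EMPS = [  # hypothetical rows: (rowid, deptno)
    (1, 10), (2, 10), (3, 20), (4, 20), (5, 30),
]


def x_locks(deptno, granularity):
    """Items an UPDATE Emps ... WHERE deptno = <deptno> must X-lock."""
    if granularity == "table":
        return {"Emps"}
    # Row locks, assuming an index on deptno identifies the matching rows.
    return {("Emps", rowid) for rowid, d in EMPS if d == deptno}


def conflicts(locks_a, locks_b):
    """Two X-lock sets conflict when they share any lockable item."""
    return bool(locks_a & locks_b)


for granularity in ("table", "row"):
    a = x_locks(10, granularity)   # transaction A
    b = x_locks(20, granularity)   # transaction B
    print(granularity, "locks conflict:", conflicts(a, b))
```

With table locks both updates need the single item Emps, so they serialize; with row locks their lock sets are disjoint and both can run at once.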
Indexing & Mixed Queries
Rows to be locked depend on access path
  SELECT * from Emps WHERE deptno = 10 AND sal > 1000
If row locking is not supported
– Acquire S lock on the entire table
If only row locking is supported and Emps does NOT have an index on deptno
– An S lock must be acquired on every row to evaluate (deptno = 10 AND sal > 1000)
If Emps has an index defined on deptno
– The index will immediately identify all rows where (deptno = 10)
– An S lock needs to be acquired on just those rows to evaluate (sal > 1000)
What about
  UPDATE Emps SET … WHERE deptno = 10 AND sal > 1000

Indexing & Mixed Updates
Rows to be locked depend on access path
  UPDATE Emps SET … WHERE deptno = 10 AND sal > 1000
If row locking is not supported
– Acquire X lock on the entire table
If only row locking is supported and Emps does NOT have an index on deptno
– An S lock must be acquired on every row to evaluate (deptno = 10 AND sal > 1000)
– Acquire X lock on just rows where (deptno = 10 AND sal > 1000)
If Emps has an index defined on deptno
– The index will immediately identify all rows where (deptno = 10)
– An S lock needs to be acquired on just those rows to evaluate (sal > 1000)
– Only acquire X locks on the subset of those rows where (sal > 1000), i.e. the rows where (deptno = 10 AND sal > 1000)

Using Multiple Indices
Multiple indices can be used to determine the group of rows to be locked.
Suppose Emps has indices on deptno and on job, and we execute
  SELECT * from Emps WHERE deptno = 10 AND job = 'CLERK'
– Use the deptno index to get the rowids of rows where deptno = 10
– Use the job index to get the rowids of rows where job = 'CLERK'
– Intersect the two sets of rowids to get the rowids of rows where deptno = 10 AND job = 'CLERK'
– Acquire an S lock on just those rows.
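The rowid intersection described above maps directly onto set intersection. The following sketch uses invented data: the `EMPS` rows and rowids are hypothetical, each index is modelled as a dictionary from column value to a set of rowids, and `build_index` and `rows_to_s_lock` are illustrative names rather than anything a real DBMS exposes.

```python
from collections import defaultdict

EMPS = {  # hypothetical rowid -> (deptno, job)
    101: (10, "CLERK"),
    102: (10, "ANALYST"),
    103: (20, "CLERK"),
    104: (10, "CLERK"),
    105: (30, "ENGINEER"),
}


def build_index(rows, column):
    """Map each value of the chosen column to the set of rowids holding it."""
    index = defaultdict(set)
    for rowid, values in rows.items():
        index[values[column]].add(rowid)
    return index


DEPTNO_INDEX = build_index(EMPS, 0)
JOB_INDEX = build_index(EMPS, 1)


def rows_to_s_lock(deptno, job):
    """Intersect the two index lookups; only these rows need S locks."""
    return DEPTNO_INDEX.get(deptno, set()) & JOB_INDEX.get(job, set())


print(sorted(rows_to_s_lock(10, "CLERK")))   # [101, 104]
```

Rows 102 and 103 each match only one of the two conditions, so they are never read and never locked.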
Lock Conversions

Upgrades and Deadlock

A transaction may first acquire an S lock for a data item (or set of data items) and then upgrade it to an X lock.
This is the primary cause of deadlock in RDBs.

Two transactions each execute
  SELECT … FROM Emps WHERE deptno = 10;      (acquires S locks)
  …
  UPDATE Emps SET … WHERE deptno = 10        (must upgrade to X locks)

Both hold S locks on the same rows, and each waits for the other to release its S locks before it can upgrade, so neither can proceed.

Exclusive SELECT FOR UPDATE Locking

Prevent upgrade deadlocks by locking FOR UPDATE on SELECT, which immediately acquires X locks

Two transactions each execute
  SELECT … FROM Emps WHERE deptno = 10 FOR UPDATE;      (acquires X locks)
  …
  UPDATE Emps SET … WHERE deptno = 10

This avoids deadlock, but can limit concurrency

Unnecessary Loss of Concurrency

Assume Emps has an index on deptno

First transaction:
  SELECT … FROM Emps WHERE deptno = 10 FOR UPDATE;      (X lock on rows where deptno = 10)
  …
  UPDATE Emps SET … WHERE deptno = 10 AND sal < 1000

Second transaction:
  SELECT … FROM Emps WHERE deptno = 10 AND sal > 1200   (S lock on rows where deptno = 10)
  COMMIT

Because FOR UPDATE obtained an X lock immediately, the second transaction will unnecessarily wait
If FOR UPDATE was not used, both transactions could proceed concurrently

Best of Both?

Is there a way to use FOR UPDATE to prevent upgrade deadlocks
And at the same time avoid limiting concurrency due to nonconflicting queries

Answer: Use U (Upgrade) locks

U locks are Mutually Exclusive

Two transactions each execute
  SELECT … FROM Emps WHERE deptno = 10 FOR UPDATE;      (acquires U locks)
  …
  UPDATE Emps SET … WHERE deptno = 10                   (upgrades to X locks)

When the actual update occurs, the rows updated will still need to upgrade to X locks
However, U locks are mutually exclusive, so both transactions CANNOT have U locks simultaneously: no upgrade deadlocks!
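The difference between S-then-X upgrading and U locks can be sketched with a toy lock table for one group of rows. This is a minimal illustration, not a real lock manager: the `RowGroupLock` class is hypothetical, a refused request simply returns False where a real system would make the transaction wait, and the compatibility table assumes (as the slides state) that U is exclusive with U and X but compatible with S.

```python
# (held mode, requested mode) -> may another transaction be granted the request?
# Assumption: U is compatible with S, but not with U or X.
COMPATIBLE = {
    ("S", "S"): True,  ("S", "U"): True,  ("S", "X"): False,
    ("U", "S"): True,  ("U", "U"): False, ("U", "X"): False,
    ("X", "S"): False, ("X", "U"): False, ("X", "X"): False,
}


class RowGroupLock:
    """Locks held on one group of rows (e.g. the rows where deptno = 10)."""

    def __init__(self):
        self.held = {}  # transaction name -> lock mode

    def request(self, txn, mode):
        """Grant (or convert to) `mode` if every OTHER holder is compatible."""
        ok = all(COMPATIBLE[(other_mode, mode)]
                 for other, other_mode in self.held.items() if other != txn)
        if ok:
            self.held[txn] = mode
        return ok  # False stands for "the transaction would have to wait"


# Plain SELECT then UPDATE: both get S locks, then neither can upgrade.
plain = RowGroupLock()
plain.request("A", "S")
plain.request("B", "S")
a_upgrade = plain.request("A", "X")  # blocked by B's S lock
b_upgrade = plain.request("B", "X")  # blocked by A's S lock
deadlock = not a_upgrade and not b_upgrade

# SELECT ... FOR UPDATE with U locks: B is refused before it holds anything.
with_u = RowGroupLock()
a_gets_u = with_u.request("A", "U")
b_gets_u = with_u.request("B", "U")  # B waits here, holding no lock
a_gets_x = with_u.request("A", "X")  # no other holder, so the upgrade succeeds
print(deadlock, a_gets_u, b_gets_u, a_gets_x)
```

In the second scenario B never holds a lock that A's upgrade has to wait for, which is why the cycle cannot form.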
U Locks Allow S Locks

First transaction:
  SELECT … FROM Emps WHERE deptno = 10 FOR UPDATE;      (U lock on rows where deptno = 10)
  …
  UPDATE Emps SET … WHERE deptno = 10 AND sal < 1000

Second transaction:
  SELECT … FROM Emps WHERE deptno = 10 AND sal > 1200   (S lock on rows where deptno = 10)
  COMMIT

U locks do not block S locks
Both transactions can proceed concurrently

U (Update) Locks

Will the system grant the lock (column) to another transaction, if one transaction already has this lock (row) on a resource?

          S      U      X
   S     YES    YES    NO
   U     YES    NO     NO
   X     NO     NO     NO

Acquire U lock when a SELECT statement uses FOR UPDATE.
Upgrade U lock to X lock during write stage, with less risk of deadlock.
U locks are used in SQL Server and DB2

Automatic Upgrading

Assume Emps has an index on deptno
  UPDATE Emps SET comm = 0 WHERE deptno = 10 AND sal > 1000

UPGRADING HAPPENS AUTOMATICALLY!
1) Acquire S or U locks on all rows where deptno = 10 to evaluate (sal > 1000)
2) Upgrade to X locks only those rows where (sal > 1000) to set comm = 0

Automatic Downgrading During Update

  UPDATE Emps SET comm = 0 WHERE deptno = 10 AND sal > 1000

AUTOMATIC DOWNGRADING
Lock-Based Systems that do not use U locks generally use downgrading to avoid upgrade deadlocks:
1) Acquire X locks initially on rows with deptno = 10
2) Downgrade to S locks the subset of those rows WHERE sal <= 1000 or sal IS NULL
Lock-Based Systems with U locks can also downgrade
1) Acquire U locks initially on rows with deptno = 10
2) Upgrade to X locks on the subset of those rows WHERE sal > 1000
3) Downgrade to S locks rows WHERE sal <= 1000 or sal IS NULL
Not 2PL, but can be proven to be safe

Multi-Granular Locking

Row vs Table Locking

If the DB uses row locks
  SELECT * FROM Emps WHERE sal > 1000
will need to get an S lock on every row (assuming there's no index on sal).
This is very expensive!
DBs which use row locks also support table locks.
The query can just get an S lock on the table Emps which is much cheaper!

Determining Lock Conflicts

Suppose T1 gets an X lock on the table Emps due to
  UPDATE Emps SET comm = 0
Then, before T1 completes, T2 requests
  SELECT * FROM Emps WHERE deptno = 10
Which, if there's a deptno index, would first need to acquire S locks on just those rows where deptno = 10

What does the DB scheduler check to determine whether the locks T2 wants to acquire conflict with locks held by other transactions (e.g. T1)?

Look for Row-Row conflicts:
  For every S row lock T2 wants, check whether another transaction holds an X or U lock on the same row. NO.
Look for Row-Table conflicts:
  See if any transaction holds an X lock on the entire table. YES! T1 does, so T2 must wait

Now, suppose T2 started first, then T1 started. What lock conflicts arise, how must they be checked?

Table-Row Conflicts

Suppose T2 has acquired S locks on rows where deptno = 10 due to executing
  SELECT * FROM Emps WHERE deptno = 10
Then, before T2 completes, T1 requests
  UPDATE Emps SET comm = 0
Which would just need to acquire an X table lock on Emps

Look for Table-Table conflicts:
  See if there is an existing S or X lock on Emps. NO.
Look for Table-Row conflicts:
  See if any transaction holds an S, X or U lock on any row of Emps. YES!
  But this is an expensive operation!

It would be very useful, to have a way of checking, at the table level, whether a transaction holds row-level locks.
Intention Locks

An intention lock is a lock
  – requested by a transaction at the table level
  – to indicate that it intends to immediately request some locks at the row level

Table-level intention locks
  – IS: requesting S locks on some rows
  – IX: requesting X or U locks on some rows (and possibly S locks on some rows as well)

Other table-level locks
  – S: S lock on entire table (instead of requesting S locks on every row)
  – X: X lock on entire table (instead of requesting X locks on every row)
  – SIX: Combination of S and IX lock (S lock on whole table, X or U locks on some rows)

Row-Table Conflicts with Intention

Suppose T1 gets an X lock on the table Emps due to
  UPDATE Emps SET comm = 0
Then, before T1 completes, T2 requests
  SELECT * FROM Emps WHERE deptno = 10
T2 only needs S locks on rows where (deptno = 10)
So, T2 first requests an IS lock on Emps.

The IS request on Emps (I need to read some rows) conflicts with the lock already held by T1, the X lock on Emps (I need to write all rows)

Now, suppose T2 started first, then T1 started. What lock conflicts arise, and how must they be checked?

Table-Row Conflicts with Intention

T2 has acquired
  an IS lock on Emps
  S locks on rows where deptno = 10
due to executing
  SELECT * FROM Emps WHERE deptno = 10
Then, before T2 completes, T1 requests
  UPDATE Emps SET comm = 0
This requests an X lock on Emps

But this conflicts with the IS lock already held by T2! So T1 must wait.
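The point of intention locks is that a conflict can be detected by looking only at table-level lock modes, without scanning row locks. A minimal sketch, assuming the standard multi-granularity compatibility rules; the `GRANTABLE` table and `can_grant` helper are illustrative names, not part of any DB API.

```python
# For each table-level mode a transaction holds, the set of modes
# that ANOTHER transaction may still be granted on the same table.
GRANTABLE = {
    "IS":  {"IS", "IX", "S", "SIX"},
    "IX":  {"IS", "IX"},
    "S":   {"IS", "S"},
    "SIX": {"IS"},
    "X":   set(),
}


def can_grant(requested, held_by_others):
    """True if `requested` is compatible with every table-level lock held by others."""
    return all(requested in GRANTABLE[held] for held in held_by_others)


# T2 holds IS on Emps (it has S locks on the deptno = 10 rows); T1 wants X on Emps.
t1_x_granted = can_grant("X", ["IS"])

# T1 holds X on Emps; T2 must first request IS on Emps.
t2_is_granted = can_grant("IS", ["X"])

# Two transactions that each intend to write some rows.
two_writers_ok = can_grant("IX", ["IX"])
print(t1_x_granted, t2_is_granted, two_writers_ok)
```

Both of the conflicting cases are caught at the table level, so no row-by-row scan is needed.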
Locking Examples

Consider Emps table, indexed only by empno and deptno

  LOCK TABLE Emps IN EXCLUSIVE MODE
    X lock on Emps
  SELECT * from Emps WHERE empno = 5479
    IS lock on Emps, S lock on selected row
  SELECT * FROM Emps WHERE job = 'CLERK'
    S lock on Emps (need to look at entire table)
  UPDATE Emps SET salary = 40000 WHERE empno = 5479
    IX lock on Emps, X lock on selected row

What about:
  UPDATE Emps SET salary = 40000 WHERE job = 'CLERK'

Multi-Granular Update Locking

  UPDATE Emps SET salary = 40000 WHERE job = 'CLERK'
    SIX lock on Emps (need to look at entire table)
    X lock on rows where job = 'CLERK'

More complicated problem: What locks are acquired for
  UPDATE Emps SET salary = 40000 WHERE deptno = 10 AND job = 'CLERK'
Consider all combinations of systems that do/don't
  (a) Use U locks
  (b) Do downgrading

Multi-Granular Locking w Conversion

  UPDATE Emps SET salary = 40000 WHERE deptno = 10 AND job = 'CLERK'

First, get an IX lock on Emps

In a DB which neither uses U locks nor downgrades
  Get S locks on rows with deptno = 10
  Upgrade to X locks just those rows where job = 'CLERK' as well
In a DB which uses U locks, and doesn't downgrade
  Get U locks on rows with deptno = 10
  Upgrade to X locks just those rows where job = 'CLERK' as well
In a DB which doesn't use U locks, but downgrades
  Get X locks on rows with deptno = 10
  Downgrade to S locks just those rows where job <> 'CLERK' or job IS NULL
In a DB which uses U locks, and downgrades
  Get U locks on rows with deptno = 10
  Upgrade to X locks just those rows where job = 'CLERK' as well
  Downgrade to S locks just those rows where job <> 'CLERK' or job IS NULL

Concurrent Intention Locks

Suppose T1 executes
  UPDATE Emps SET comm = 100 WHERE deptno = 10
And, before T1 completes, T2 executes
  UPDATE Emps SET comm = 0 WHERE deptno = 20
Will T2 wait for T1, or will it be allowed to continue?
No Intention Conflicts

T1 requests
  UPDATE Emps SET comm = 100 WHERE deptno = 10
It gets IX on Emps, X on rows with deptno = 10
T2 requests
  UPDATE Emps SET comm = 0 WHERE deptno = 20
It needs IX on Emps, X on rows with deptno = 20

These will conflict only if the two IX on Emps ("I need to write some rows") conflict. THEY DO NOT!
Intention locks NEVER conflict with each other.
If there are actual conflicts, they will show up when attempting to lock the actual conflicting rows!

Grant Matrix with Intentions

Will the system grant the lock (column) to another transaction, if one transaction already has this lock (row) on a resource?

          IS     IX     S      SIX    X
   IS     YES    YES    YES    YES    NO
   IX     YES    YES    NO     NO     NO
   S      YES    NO     YES    NO     NO
   SIX    YES    NO     NO     NO     NO
   X      NO     NO     NO     NO     NO

Explicit Table Locking

Most databases implicitly acquire locks as needed
  – depending upon isolation level
  – depending upon other concurrency controls
Many databases also allow you to request (groups of) locks explicitly
  – Specify table(s)
  – Specify locking mode (SHARE, EXCLUSIVE, ROW SHARE [IS], ROW EXCLUSIVE [IX], SHARE ROW EXCLUSIVE [SIX])
  – Specify whether to wait if necessary
Oracle:
  LOCK TABLE Emps, Depts IN EXCLUSIVE MODE NOWAIT
Explicit locking
  – Can ensure serializability (if necessary)
  – Can prevent deadlock

Multi-Level Locking

Some DBs have a multi-level lock hierarchy, with locks at
  – the table level
  – the page level (to avoid locking all tuples on the page), and
  – the row level
Intention locks are then used at every level above the row level

  SELECT * FROM Emps WHERE deptno = 10
Requests IS on Emps, then
  IS on the pages containing the rows where deptno = 10, then
  S on the rows where deptno = 10

Suppose a page is known to only hold tuples where deptno = 10?
Multi-Level Locking w Clustering

Using clustering, a DBA can tell the DB to store tables organized based on values of some field

If Emps is organized by deptno
  SELECT * FROM Emps WHERE deptno = 10
Requests IS on Emps, then
  S on the pages containing the rows where deptno = 10

Phantom Reads & Index Locks

Phantom Reads

Phantom reads are a special case of the non-repeatable read problem
Phantoms result when an INSERT or an UPDATE adds a row to a group which is already locked!
Row-level locking does not prevent phantom reads and can result in non-serializable schedules

Phantom Read Example

A:
  (1) SELECT avg(sal) FROM Emps WHERE deptno = 10
  (2) SELECT avg(sal) FROM Emps WHERE deptno = 10
B:
  (1) UPDATE Emps SET deptno = 10 WHERE empno = 3142
      (this employee was previously in dept #30)
  (2) COMMIT

Suppose
  • The DB locks rows
  • Emps has indices on empno and on deptno
Is A1 B1 B2 A2 legal? Show locks at each step

Row Locks for Phantom Problem

A (1) SELECT avg(sal) FROM Emps WHERE deptno = 10
      IS lock on Emps
      S locks on rows in dept 10
B (1) UPDATE Emps SET deptno = 10 WHERE empno = 3142
      IX lock on Emps
      X lock on employee 3142 (who is in dept 30)
      NO CONFLICT!
B (2) COMMIT
A (2) SELECT avg(sal) FROM Emps WHERE deptno = 10
      DB may lock new rows now in deptno 10
      No CONFLICT, since B already committed!

The result of A2 is different from A1, because Emps now contains a new employee in dept 10. This is a phantom read!
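The schedule A1 B1 B2 A2 can be replayed in a small simulation to see why row locks alone do not help. Everything here is a hypothetical sketch: the salaries, the extra employees and the helper names are made up for illustration, and waiting is modelled by raising an error.

```python
# Hypothetical Emps rows: empno -> {"deptno", "sal"}.
emps = {
    7301: {"deptno": 10, "sal": 1000},
    2119: {"deptno": 10, "sal": 2000},
    3142: {"deptno": 30, "sal": 3000},
}
locks = {}  # empno -> list of (transaction, mode) row locks


def conflicts(empno, txn, mode):
    """A row lock conflicts if another transaction holds it and either mode is X."""
    return any(t != txn and "X" in (m, mode) for t, m in locks.get(empno, []))


def acquire(empno, txn, mode):
    if conflicts(empno, txn, mode):
        raise RuntimeError(f"{txn} would have to wait for row {empno}")
    locks.setdefault(empno, []).append((txn, mode))


def release_all(txn):
    for empno in locks:
        locks[empno] = [(t, m) for t, m in locks[empno] if t != txn]


def avg_sal_dept10(txn):
    """SELECT avg(sal) FROM Emps WHERE deptno = 10, S-locking the rows it reads."""
    rows = [e for e, r in emps.items() if r["deptno"] == 10]
    for e in rows:
        acquire(e, txn, "S")
    return sum(emps[e]["sal"] for e in rows) / len(rows)


a1 = avg_sal_dept10("A")       # A1: S locks on the two dept 10 rows
acquire(3142, "B", "X")        # B1: X lock on employee 3142 (dept 30), no conflict
emps[3142]["deptno"] = 10
release_all("B")               # B2: COMMIT
a2 = avg_sal_dept10("A")       # A2: now sees three rows
print(a1, a2)
```

No lock request ever fails, yet A reads two different averages inside one transaction.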
Phantom Reads & Index Locks

Databases can only lock at the row level when an index can be used to identify the group of rows which need to be locked

Phantom Reads are a special case of the non-repeatable read problem: they occur when an INSERT or an UPDATE adds a row to a group of rows which is already locked

Databases prevent phantoms by associating locks with indices
  Obtain S locks on all indices used by a SQL command
  For an UPDATE: Acquire X locks on all indices on fields set by the UPDATE
  For an INSERT: Acquire X locks on all indices

Using Index Locks

A (1) SELECT avg(sal) FROM Emps WHERE deptno = 10
      S lock on deptno index
      IS lock on Emps
      S locks on rows in dept 10
  An S lock on the deptno index must be acquired before using the index to find the rows it references

B (1) UPDATE Emps SET deptno = 10 WHERE empno = 3142
      S lock on empno index
      IX lock on Emps
      X lock on employee 3142 (who is in dept 30)
      X lock on deptno index
  An S lock on the empno index must be acquired to find out which row has empno 3142
  An X lock on the deptno index must be acquired because deptno is being changed by the UPDATE command, which will change the rows referenced by the index.

B1 will not be able to acquire the X lock on the deptno index until A completes, since A already has an S lock on the deptno index

Index Value Locks

Locking entire indices can be overly restrictive: if A's query is based on deptno = 10, but B's update SETs deptno = 20, the index locks conflict unnecessarily.
This can be solved by index value locks (e.g. separate locks on deptno:10 and deptno:20).
However, this can lead to index phantoms, which are prevented by index gap locks: locks on ranges of index values for which there are no tuples.
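The difference between whole-index locks and index value locks can be sketched by comparing lock sets. The lock-target names (`deptno_index`, `deptno:10`, and so on) are hypothetical labels, and the sketch assumes that an UPDATE changing deptno takes X value locks on both the old and the new index value.

```python
def conflict(held, requested):
    """held / requested: dict mapping a lock target to a mode ("S" or "X").
    Two lock sets conflict if they share a target and either mode is X."""
    return any(target in held and "X" in (held[target], mode)
               for target, mode in requested.items())


# Whole-index locks: A reads via the deptno index, B changes some deptno value.
a_index = {"deptno_index": "S"}
b_to_10_index = {"empno_index": "S", "deptno_index": "X"}
b_to_20_index = {"empno_index": "S", "deptno_index": "X"}

# Index value locks: A reads only the deptno = 10 entry.
a_value = {"deptno:10": "S"}
# B moves employee 3142 out of dept 30 (assumed: old and new values are X-locked).
b_to_10_value = {"empno:3142": "S", "deptno:30": "X", "deptno:10": "X"}
b_to_20_value = {"empno:3142": "S", "deptno:30": "X", "deptno:20": "X"}

whole_index_blocks_20 = conflict(a_index, b_to_20_index)   # unnecessary wait
value_lock_blocks_20 = conflict(a_value, b_to_20_value)    # no wait needed
value_lock_blocks_10 = conflict(a_value, b_to_10_value)    # phantom still prevented
print(whole_index_blocks_20, value_lock_blocks_20, value_lock_blocks_10)
```

Value locks keep the conflict that prevents the phantom (a move into dept 10) while dropping the conflict that was never needed (a move into dept 20).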
Predicate Locks

A (1) SELECT avg(sal) FROM Emps WHERE deptno IN (10, 20)
      S lock on (deptno IN (10,20))
  The S lock is associated with a predicate describing the tuples which need to be read

B (1) UPDATE Emps SET deptno = 10 WHERE empno = 3142
      X lock on (empno = 3142)
      X lock on (deptno = 10)
  The X lock is associated with predicates
    • describing the tuples which are to be affected
    • characterizing the new values of the field to be updated or inserted

Two locks conflict if one is an X lock, and the intersection of their predicates is not empty
  (deptno IN (10,20)) AND (deptno = 10) reduces to (deptno = 10), which is not empty, so A's S lock and B's X lock conflict

Cool, but not used in commercial DBs

Locking & SQL Isolation Levels

These levels go from strongest to weakest
All levels automatically acquire X locks as needed and hold them until the end of the transaction

Serializable
  Implement by acquiring S locks and holding them until the end of the transaction AND by using some mechanism (e.g. index locks) to eliminate phantoms
Repeatable Read
  Phantom Reads are possible
  Implement by acquiring S locks as needed, and holding them until the end of the transaction
Read Committed
  Non-Repeatable Reads are also possible
  Implement by acquiring S locks as needed, but releasing them at the end of each SQL statement
Read Uncommitted
  Dirty Reads are also possible
  Implement by eliminating S locks

Long Transactions and Sagas

Long Transactions

Problem if significant time elapses during transaction
  – Leaves resources locked for too long
  – Run out of logging space
Often caused by user interaction, with all of the following in a single transaction:
  Reserve flight for user
  Wait for user confirmation
  Finalize reservation
Solve by implementing transactions as sagas

Sagas

• Break transaction up into sequence of atomic sub-transactions
• Add data (e.g.
additional columns and/or tables) to indicate that the intermediate state is not permanent
• Each sub-transaction uses and updates intermediate state data as appropriate
• Abort by executing a compensating sub-transaction that undoes intermediate state

Long Transaction Example

user looks for flights
start transaction
  reserve flight for user (X lock on flight)
  get user info
  get user confirmation
  set user information
commit (or rollback)

Flight locked for the entire time

Using Sagas

user looks for flights
start transaction                                  (T1)
  temporarily reserve flight for user (remember current time)
commit
get user info
get user confirmation
if user confirmed
  start transaction                                (T2)
    check if reservation is still there and, if expired, try to reserve again
    make reservation permanent and set user information
  commit
else
  start transaction                                (T1x: compensating transaction to undo T1)
    cancel reservation (if still there)
  commit
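The saga outline above can be sketched in Python with an in-memory dictionary standing in for the flights table. This is a minimal illustration under stated assumptions: the function names, the `HOLD_SECONDS` expiry, the flight ids and the explicit `now` parameter are all hypothetical, and each function stands for one short sub-transaction that a real system would run inside its own database transaction.

```python
HOLD_SECONDS = 600  # hypothetical lifetime of a temporary reservation

# Reservation state per flight: None, or a dict describing the current hold.
flights = {"F100": None, "F200": None}


def t1_reserve_temporarily(flight, user, now):
    """T1: mark the flight as temporarily reserved and remember the time."""
    current = flights[flight]
    if current and (current["permanent"] or now - current["at"] < HOLD_SECONDS):
        return False  # an unexpired or permanent reservation already exists
    flights[flight] = {"user": user, "at": now, "permanent": False, "info": None}
    return True


def t2_confirm(flight, user, info, now):
    """T2: make the reservation permanent, re-reserving first if the hold expired."""
    current = flights[flight]
    still_ours = current is not None and current["user"] == user
    if not still_ours or now - current["at"] >= HOLD_SECONDS:
        if not t1_reserve_temporarily(flight, user, now):
            return False
    flights[flight].update(permanent=True, info=info)
    return True


def t1x_cancel(flight, user):
    """T1x: compensating sub-transaction that undoes T1 if the hold is still there."""
    current = flights[flight]
    if current and current["user"] == user and not current["permanent"]:
        flights[flight] = None


# User ann reserves and confirms; bob cannot take the held flight meanwhile.
ann_held = t1_reserve_temporarily("F100", "ann", now=0)
bob_blocked = not t1_reserve_temporarily("F100", "bob", now=100)
ann_confirmed = t2_confirm("F100", "ann", {"name": "Ann"}, now=200)

# User bob reserves another flight, then declines: the compensating step runs.
t1_reserve_temporarily("F200", "bob", now=0)
t1x_cancel("F200", "bob")
print(ann_held, bob_blocked, ann_confirmed, flights["F200"])
```

No database lock is held while waiting for the user; the temporary-reservation data plays that role, and the expiry time bounds how long an abandoned hold can block others.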