Transaction Management and Concurre cy Overview R & G Chapter 16-17 There are three side effects of acid. Enhanced long term memory, decreased short term memory, and I forget the third. - Timothy Leary Administrivia • Homework 3 due today by end of class period • Homework 4 available on class website – Due date: April 10 (after Spring Break) • Midterm 2 is next class! Thu 3/22 – In class, covers lectures 10-17 – Review will be held Wed 3/21 7-9 pm 306 Soda Hall Components of a DBMS transaction query Query Compiler Execution Engine Data Definition We are here Transaction Manager Logging/Recovery Schema Manager Concurrency Control Buffer Manager LOCK TABLE Storage Manager BUFFERS BUFFER POOL DBMS: a set of cooperating software modules Correctness criteria: The ACID properties • • A tomicity: All actions in the Xact happen, or none happen. C onsistency: If each Xact is consistent, and the DB starts consistent, it ends up consistent. • I solation: Execution of one Xact is isolated from that of other Xacts. • D urability: If a Xact commits, its effects persist. We are here Review - Definitions • Transaction - a sequence of read and write operations (read(A), write(B), …) – DBMS’s abstract view of a user program • Serial schedule: Schedule that does not interleave the actions of different transactions. • Equivalent schedules: For any database state, the effect of executing the first schedule is identical to the effect of executing the second schedule. • Serializable schedule: A schedule that is equivalent to some serial execution of the transactions. (Note: If each transaction preserves consistency, every serializable schedule preserves consistency. ) Anomalies with interleaved execution • Reading Uncommitted Data (“WR”, dirty reads): – T2 reads a value A that T1 wrote but didn’t commit – e.g: T1 moves $100 from account B to account A T2 adds 6% interest to account A $1100 + $66 + 900 = $2066 total after T1 & T2 A=1000 A=A+100=1100 T1: T2: R(A), W(A), B=1000 R(A), W(A), C A=1100 ->A=1166 B=1000-100 A=? =900 B=1000 R(B), W(B), Abort A=1100*1.06 A=1166 =1166 B=1000 1166+1000 = 2166 -> Bank is out $106 because T2 read T1’s A value before it was done! Anomalies with interleaved execution • Unrepeatable Reads (“RW” Conflicts): – T1 reads a value A that is then written by T2 – e.g. T1 and T2 place orders for books, and A represents the quantity in stock (1) A=1 T1: T2: R(A), A=0 R(A), W(A), C A=1 A=0 A=0 A=-1 R(A), W(A) Hopefully T1 gets an error if there is an integrity constraint! Anomalies with interleaved execution • Overwriting Uncommitted Data (“WW”, lost update): – T2 overwrites a write by T1 – e.g. T1 assigns seat A to passenger 1 and seat B to passenger 2 T2 assigns seat A to passenger 3 and seat B to passenger 4 B=2 A=1 T1: T2: W(A), W(A), W(B), C A=3 B=4 A=3 B=4 A=3 B=2 W(B), C -> Passenger 1’s partner is sitting with passenger 3! How to prevent anomalies? Locks! • Database allows “objects” to be locked – “object” might be entire database, file, page, tuple • Two kinds of locks: – “Shared” or “Read” Lock • No one else is allowed to write the object if you have this – “Exclusive” or “Write” Lock • No one else is allowed to read or write the object Locks are not enough • If lock/unlock objects right away, anomalies are still possible – e.g. The unrepeatable read example T1 obtain ShareLock(A) T1: T2: T1 release ShareLock(A) R(A), T1 obtain ShareLock(A) R(A), W(A), C T2 obtain ExclusiveLockL(A) R(A), W(A) T2 release ExclusiveLock(A) Locks are not enough • Idea: Two Phase Locking – In a transaction, #locks • only acquire locks in one phase • only release locks in a second phase Time • once one lock has been released, can never acquire another lock during transaction T1 obtain ExclusiveLock(A) T1: T2: T1 release lock(A) R(A), R(A), W(A), …. T2 waits to obtain ExclusiveLock(A) R(A),W(A), C T2 obtains ExclusiveLock(A) Lock-Based Concurrency Control Two-phase Locking (2PL) Protocol: – Each Xact must obtain: • • – – – a S (shared) lock on object before reading, and an X (exclusive) lock on object before writing. If an Xact holds an X lock on an object, no other Xact can get a lock (S or X) on that object. System can obtain these locks automatically Two phases: acquiring locks, and releasing them • • No lock is ever acquired after one has been released “Growing phase” followed by “shrinking phase”. • Lock Manager keeps track of request for locks and grants locks on database objects when they become available. Strict 2PL • 2PL allows only serializable schedules but is subjected to cascading aborts. • Example: rollback of T1 requires rollback of T2! T1 obtain ExclusiveLock(A) T1: T2: T1 release ExclusiveLock(A) R(A), W(A), R(A), W(A), R(B), W(B) T2 obtains ExclusiveLock(A) • Abort T2 read T1’s value of A, so it must also be aborted. To avoid Cascading aborts, use Strict 2PL Strict 2PL (cont) • Strict Two-phase Locking (Strict 2PL) Protocol: – Same as 2PL, except: – All locks held by a transaction are released only when the transaction completes #locks vs • Advantage: no other transaction even reads anything you write until you commit. – e.g a transaction will only read committed data. • Disadvantage: transactions end up waiting. – e.g. inserts may lock a whole table • Why? Because of Phantom problem The Phantom Problem • Even Strict 2PL (on individual rows) will not assure serializability: • Consider T1 – “Find oldest sailor with Rating = 1” • T1 sees 2 different answers, even though it did no updates itself. Time T1 1 Obtain Share Lock on all existing Sailors with rating = 1 2 Compute oldest sailor (age = 71) T2 3 Obtain Exclusive Lock on new Sailor tuple 4 Insert Sailor age 96 rating 1 5 Commit 6 Compute oldest sailor (age = 96) Transactions in SQL • A new transaction is started with first SQL statement issued by a user SELECT S.SID, R.BID FROM SAILORS S, RESERVES R WHERE S.SID = R.SID • All statements from that point on appear in the same transaction UPDATE SAILORS SET RATING = RATING - 1 UPDATE RESERVES SET DATE = DATE+1 • A user can use commands COMMIT or ROLLBACK to explicitly end a transaction and start a new one COMMIT • Any new statements that follow will occur in a new transaction Transactions in SQL • Because of performance, some anomalies might be acceptable for some applications – Internet apps in particular! • SQL 92 supports different “Isolation Levels” for a transaction (Lost Update not allowed at any level) Isolation Level Dirty Read Unrepeatable Read Phantom Problem Read Uncommitted Maybe Maybe Maybe Read Committed No Maybe Maybe Repeatable Reads No No Maybe Serializable No No No Aborting a Transaction (i.e., Rollback) • If an xact Ti aborted, all actions must be undone. • To undo actions of an aborted transaction, DBMS maintains log which records every write. • Log is also used to recover from system crashes (which can occur as either hardware or software failure) – All active Xacts at time of crash are aborted when system comes back up. The Database Log XID 1: REDO TID 1 <data> REDO TID 2 <data> COMMIT XID 2: REDO TID 3 <data> … Buffer Data Page Data Page • Updates are logged in the database log. • Log consists of “records” that are written sequentially. – Typically chained together by Xact id – Log is often archived on stable storage. – And backed up to even more stable storage (like tape!) Log Data Pages Log The Database Log • Write-Ahead Logging protocol – Log record must go to disk before the changed page! – All log records for a transaction (including its commit record) must be written to disk before the transaction is considered “Committed”. Update TID 2 <data> Update TID 1 <data> Buffer Data Page XID 1: REDO TID 1 <data> REDO TID 2 <data> COMMIT XID 2: REDO TID 3 <data> … Data Page 3 SQL: COMMIT XID1 2 Log 1 Data Pages Log The Log • Log records are either UNDO or REDO records – Depends on the Buffer manager • UNDO required if: Buffer mgr allows uncommitted data to overwrite stable version of committed data (STEAL buffer management). • REDO required if: xact can commit before all its updates are on disk (NO FORCE buffer management). • The following actions are recorded in the log: – if Ti writes an object, write a log record with: • If UNDO required need “before image” • IF REDO required need “after image”. – Ti commits/aborts: a log record indicating this action. How Professor Roth made some traders very unhappy at market open… • Log on disk can fill up quickly for active DBMS • DBMS duplexes between two logs – Writes DB log to one – While archiving the other – When DB log is full, reverse • Must STOP all DBMS activity while you reverse XID 1: REDO TID 1 <data> REDO TID 2 <data> COMMIT XID 2: REDO TID 3 <data> … Log Log1 LogLog2 How Professor Roth made some traders very unhappy at market open… • Stock DB at market open is VERY busy • We wanted to make sure there was an empty log just before market open • I wrote a script to ‘switch’ logs 5 mins before market open If (not already writing a log) Then switch logs Write the old log to tape • Except I forgot a comma, so it was more like this: Switch logs; write the old log to tape ...and because the DB was writing out its log, the DB hung until it was done writing to tape! XID 1: REDO TID 1 <data> REDO TID 2 <data> COMMIT XID 2: REDO TID 3 <data> … Log I’m waiting… Log1 Log2 (Review) Goal: The ACID properties • • A tomicity: All actions in the Xact happen, or none happen. C onsistency: If each Xact is consistent, and the DB starts consistent, it ends up consistent. • I solation: Execution of one Xact is isolated from that of other Xacts. • D urability: If a Xact commits, its effects persist. What happens if system crashes between commit and flushing modified data to disk ? D Durability - Recovering From a Crash • Three phases: – Analysis: Scan the log (forward from the most recent checkpoint) to identify all Xacts that were active at the time of the crash. – Redo: Redo updates as needed to ensure that all logged updates are in fact carried out and written to disk. – Undo: Undo writes of all Xacts that were active at the crash, working backwards in the log. • At the end – all committed updates and only those updates are reflected in the database. • Some care must be taken to handle the case of a crash occurring during the recovery process! Summary • Concurrency control and recovery are among the most important functions provided by a DBMS. • Concurrency control is automatic – System automatically inserts lock/unlock requests and schedules actions of different Xacts – Property ensured: resulting execution is equivalent to executing the Xacts one after the other in some order. • Write-ahead logging (WAL) and the recovery protocol are used to: 1. undo the actions of aborted transactions, and 2. restore the system to a consistent state after a crash.