Multiversion Concurrency Control Version 2.0 Adrian Hudnott UK Ingres Users Association Conference 2010 Outline 1. 2. 3. 4. 5. 6. 7. Objectives Locking Oracle MVCC (& Ingres MVCC) Anomalies Allowable Behaviours OCC-SG Conclusions Objectives • • • • • • • Mutual Exclusion Serialized Recoverable Concurrent Current Minimum Wasted Work Starvation Free Two Phase Locking 2PL Mutual Exclusion Serialized Recoverable Concurrent Current Minimum Wasted Work Starvation Free Somewhat Deadlocks Implementation Dependent Two Phase Locking Locks Transaction 1 + Account S Read John’s balance + C/D X Write debit from John Transaction 2 Write credit to Jane + Account X Write John’s new balance - Release all . Try to read Jane’s C/D Read Jane’s balance Wait Write Jane’s new balance Wait Commit Wait + C/D S Read Jane’s C/D + Account S Read Jane’s balance - Release all Commit MVCC Transaction 1 Transaction 2 Read John’s balance Write debit from John Write credit to Jane Write John’s new balance . Read Jane’s old C/D Read Jane’s balance . . Read Jane’s balance Write Jane’s new balance . . Commit Commit Notation R(v,1); W(x,1); W(z,1); W(v,1); R(z,2); R(y,1); R(y,2); C(2); W(y,1); C(1) R(v,1) W(x,1) C(2) Transaction 1 reads v Transaction 1 writes to x Transaction 2 commits 2 1 T(2) < T(1) Transaction 2 comes “before” transaction 1 MVCC Version 1.0 R(x,1); W(x,1); R(x,2); W(x,2); C(1) Rollback 2 – first updater wins in Oracle & Ingres 10 MVCC Version 1.0: Anomalies R(x,1); R(y,1); R(x,2); R(y,2); W(x,1); W(y,2); C(1); C(2) • T(1) < T(2) • T(2) < T(1) • Contradiction! • Not prevented by first updater wins • Called “Write Skew” MVCC Version 1.0: Anomalies R(x,2); R(y,2); R(y,1); W(y,1); C(1); R(x,3); R(y,3); W(x,2); C(2) 1 2 • T(2) < T(1) • T(3) < T(2) • T(3) > T(1) 3 • Transaction 3 reads y as committed by transaction 1 • Called “Read-Only Anomaly” New Stuff MVCC Version 2.0: Allowable R(x,2); R(y,2); R(y,1); W(y,1); C(1); R(x,3); R(y,3); W(x,2); C(2) 1 2 • T(2) < T(1) • T(3) < T(2) • T(3) < T(1) 3 • Transaction 3 reads a version of y before transaction 1’s write • Transaction 3 has “time travelled” before transaction 1 • N.B. Cannot time travel before transactions in the same session MVCC Version 2.0: Allowable • Integrity constraint checking • Over limit rule: SELECT DISTINCT ‘Failed’ FROM CUSTOMER NATURAL JOIN ACCOUNT WHERE CREDIT_RATING < 100 AND BALANCE < -OVERDRAFT_LIMIT AND STATUS <> ‘Frozen’ MVCC Version 2.0: Allowable Transaction 1 Deduct money Check rule – OK Transaction 2 . . . Lower credit rating Check rule – OK Commit Pay money to… Commit R(account, 1); W(account,1); R(customer,1); R(customer,2); W(customer,2); R(account,2); C(2); W(C/D,1); C(1) MVCC Version 2.0: Allowable • Blind writes: R(x,1); W(x,1); W(x,2); C(1); C(2) – If T(1) < T(2) then write transaction 2’s value for x – If T(1) > T(2) then ignore transaction 2’s write request • Write-Write conflicts are a red herring under MVCC • Known as Thomas Write Rule OCC-SG: Basics • Record the serialization graph explicitly • For each element read or written, record: – Read list – Write list 0 1 2 3 4 5 6 7 0 0 1 1 1 T(2) 1 1 <1T(3) 1 1 0 0 0 1 1 0 1 1 2 0 0 0 1 0 1 1 1 3 0 0 0 0 0 0 0 0 graph[2][3] 4 0 0 0 0 0 0 1 1 5 0 0 0 0 0 0 1 1 6 0 0 0 0 0 0 0 1 7 0 0 0 0 0 0 0 0 OCC-SG: Writing 1. When a transaction i wants to write to x, inspect the read list. For each uncommitted transaction j: If closure[ i ][ j ] is set then choose which transaction to rollback Otherwise set graph[ j ][ i ] and update the closure Write a new version of x 2. 3. 4. 1 2 3 1 2 3 Transitive closure OCC-SG: Reading 1. 2. 3. 4. 5. When a transaction i wants to read from x, inspect the write list. For each transaction j: If i = j then skip to step 5 If closure[ j ][ i ] is set then wait for transaction j to commit or rollback Otherwise set graph[ i ][ j ] and update the closure Scan the headers of the versions of x 1. 2. 6. Classify into before versions and after versions Arrange the before versions in order. Ensure it is a total order. Add any new constraints to the graph. Read the before versions in order, progressively applying the changes from the base revision OCC-SG: Serialization • Graph should not grow too large • Serialization procedure – If a transaction i: • has rolled back, OR • has committed and there are no transactions time travelled before it – then put transaction i into the serialized list • Serialization list is emptied at each checkpoint 3 2 5 Transaction ID 0 serialized 3rd Simulation Results Execution Time 60 1559 1500 1250 1000 673 750 500 250 Number of Restarts Execution Time /s 1750 Restarts 59 50 40 34 30 20 10 0 0 OCC-DATI OCC-SG OC-DATI OCC-SG Comparison 2PL Oracle OCC-DATI OCC-SG Somewhat Better Always Really Good Much Less Bit Less Less Restarts (inc. unnec.) More Restarts Fewer Restarts Mutual Exclusion Serialized Recoverable Concurrent Current Minimum Wasted Work Starvation Free Deadlocks Implementation Dependent Conclusions • • • • • • • • Locking is safe but not fast enough Oracle style MVCC performs better but is not safe Oracle style MVCC rolls back some legitimate schedules Protocols proposed in academia (e.g. OCC-DATI) are too optimistic (don’t rollback until a transaction is ready to commit) Protocols proposed in academia lack industrial strength features such as statement level rollback OCC-SG puts the serialization graph, previously a theoretical device, at the heart of the algorithm OCC-SG combines the best features from Oracle / Ingres 10 and from academia Still at the research stage, but a working simulation has been created in Java