ICOM 6005 – Database Management Systems Design
Dr. Manuel Rodríguez-Martínez
Electrical and Computer Engineering Department
Lecture 16 – Intro. to Transaction Processing and Concurrency Control

Transaction Processing
• Read:
  – Chapter 16, sec. 16.1-16.6
  – Chapter 17
  – ARIES papers
• Purpose:
  – Study different algorithms to support transactions and concurrency control in a DBMS

Introduction
• DBMS software and its supporting server machine are a big investment
• The enterprise wishes to maximize their use
• If each user gets the DBMS to themselves for a short period of time, running all the tasks takes a long time
• Multiple users must be allowed to access the DBMS at the same time
  – Concurrent access
• DBMS might crash
  – Power fails, software bugs appear, hardware fails, soda is spilled …
  – Need a recovery mechanism to recover lost data

Multiple Users Using a DBMS
[Figure: tasks T1, T2, T3 and T4 in a waiting queue in front of the DBMS]
Users wait to get hold of the DBMS to run their tasks. Context switches make this inefficient.

Multiple Users Using a DBMS (2)
[Figure: the DBMS running T1, T2, T3 and T4 together]
The DBMS executes different tasks at the same time. This maximizes system throughput.

System Crash
[Figure: T1 and T2 working on the data; the updates are lost and the disk is gone]

System Crash (2)
[Figure: same crash scenario as the previous slide, with the question "How to recover?"]

Concurrency and Recovery
• DBMS must support
  – Concurrency
    • Allow different users to access the DBMS at the same time
    • Control access to data to prevent inconsistencies in the DBMS
  – Recovery
    • Track the progress of the operations done by each user
      – Use a log for this
    • If a crash occurs, use this log to recover the operations that were completed
    • The log must be stored independently of the data to prevent losing both
• Transactions – unit of work used by the DBMS to support concurrency and recovery
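
The idea of recovering completed operations from a log can be sketched in a few lines. This is a toy redo-only log, not ARIES and not how any particular DBMS does it. The names `LogRecord` and `recover` are hypothetical, and the sketch assumes recovery starts from an empty store and that only work with a logged commit is kept.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogRecord:
    txn: str        # transaction id, e.g. "T1"
    kind: str       # "update" or "commit"
    key: str = ""   # object written (updates only)
    value: int = 0  # new value (updates only)


def recover(log):
    """Rebuild the data from the log, keeping only committed work."""
    committed = {r.txn for r in log if r.kind == "commit"}
    data = {}
    for r in log:  # replay in the order the records were logged
        if r.kind == "update" and r.txn in committed:
            data[r.key] = r.value
    return data


# T1 updated A and committed; T2 updated B, but the crash came before its commit.
log = [
    LogRecord("T1", "update", "A", 1500),
    LogRecord("T2", "update", "B", 40),
    LogRecord("T1", "commit"),
]
print(recover(log))  # {'A': 1500}
```

T2's update to B is dropped because no commit record for T2 reached the log before the crash.
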
Relational DBMS Architecture
[Figure: layered architecture with the components Client API, Client, Query Parser, Query Optimizer, Relational Operators, Execution Engine, File and Access Methods, Buffer Management, Concurrency and Recovery, Disk Space Management, and DB]

The Need for Concurrency
• Jil and Apu are married and share bank account A.
• Jil and Apu go to the bank at the same time and use two different ATMs
  – Jil asks to withdraw $300 from the $500 in A
  – Apu asks to withdraw $400 from the $500 in A
• The following might happen:
  – At ATM 1: System reads $500 in A
  – At ATM 2: System reads $500 in A
  – At ATM 1: System deducts $300 from A
  – At ATM 2: System deducts $400 from A
  – At ATM 1: System stores $200 as the balance in A
  – At ATM 2: System stores $100 as the balance in A
• Jil and Apu got $700 out of the $500 in account A!
• DBMS must prevent such events via concurrency control

The Need for Recovery
• Tom goes to the bank with a $1,000 deposit for his account A, which currently has $500
• Tom talks with teller X.
• The following might happen:
  – Teller X reads A and finds $500
  – Tom gives $1,000 to teller X in an envelope
  – Teller X changes the balance in A to $1,500
  – Teller X sends a request to the DBMS to update A to $1,500
  – Power fails at this time
• What is the balance of A?
  – $500 or $1,500? How do we make sure it is $1,500?
• DBMS must support recovering the correct balance via crash recovery

Transactions and ACID Properties
• Transactions are the unit of work used to submit tasks to the DBMS
  – Selects, inserts, deletes, updates, create table, etc.
• Transactions must support the ACID properties
  – Atomicity – all operations included in a transaction are either completed as a whole or aborted as a whole
  – Consistency – each transaction reads a consistent DB and, upon completion, leaves the DB in another consistent state
  – Isolation – transactions running concurrently have the same effect on the DB as if they had been run in serial fashion
    • One after the other
  – Durability – changes made by committed transactions survive crashes and can be recovered. Changes made by aborted transactions are undone

Supporting Transactions at DBMS
• Transaction Manager
  – Module in charge of supporting transactions in the DBMS
• Sub-components
  – Lock Manager
    • Grants locks to transactions so they can access DB objects such as records, data pages, tables or whole databases
  – Log/Recovery Manager
    • Tracks the operations done by transactions and determines which ones commit and which ones abort. After a crash, it recovers the work done by committed transactions.
• Implementing Transaction Manager
  – Modules integrated with the DBMS
  – Separate process from the DBMS
    • TP Monitor

Schedules
• We can model the operations done by a transaction with a schedule
  – List of operations done: read, write, plus logical operations
  – Often, we just care about
    • Reads
    • Writes
    • Abort requests
    • Commit requests
    • Changes to individual objects (optional, just for clarity)
  – Assumptions:
    • Transactions interact only through reads and writes of shared objects

Example Schedules
[Figure: Schedule 1 and Schedule 2, each with one column for T1 and one for T2. Both interleave the actions R(A), R(B), W(A), R(C), W(B) and W(C); one schedule contains an Abort where the other contains a Commit, and both end with a Commit.]
Each row represents an action taken at some point in time. The DBMS performs one action at a time.
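
A schedule like the ones above can be written down as an ordered list of actions, one per row. The sketch below is only illustrative: the assignment of actions to T1 and T2 is an assumption made for the example, and `actions_by_transaction` and `is_interleaved` are hypothetical helper names.

```python
from collections import defaultdict

# One action per row, in execution order: (transaction, operation, object).
# Commit has no object. The T1/T2 assignment is assumed for illustration.
schedule = [
    ("T1", "R", "A"),
    ("T2", "R", "B"),
    ("T1", "W", "A"),
    ("T1", "R", "C"),
    ("T2", "W", "B"),
    ("T2", "Commit", None),
    ("T1", "W", "C"),
    ("T1", "Commit", None),
]


def actions_by_transaction(schedule):
    """Group the schedule's actions per transaction, preserving order."""
    grouped = defaultdict(list)
    for txn, op, obj in schedule:
        grouped[txn].append(op if obj is None else f"{op}({obj})")
    return dict(grouped)


def is_interleaved(schedule):
    """True if some transaction resumes after another one has run."""
    paused = set()   # transactions that ran and then gave way to another
    current = None   # transaction that performed the previous action
    for txn, _, _ in schedule:
        if txn != current:
            if txn in paused:
                return True
            if current is not None:
                paused.add(current)
            current = txn
    return False


print(actions_by_transaction(schedule))
print(is_interleaved(schedule))  # True
```

Sorting the same rows so that all of T1's actions come before T2's gives a list for which `is_interleaved` returns False.
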
Serialization of Schedules
• Serial schedule:
  – A schedule in which each transaction T1, T2, …, Tk is executed one after the other without interleaving
• Key idea:
  – Transactions that interleave operations are OK as long as their schedule is equivalent to a serial schedule
• Serializable schedule on transactions T1, T2, …, Tk
  – Its effects are equivalent to those of a serial schedule
  – Performance is better
    • Interleaving of operations
• Not all schedules are serializable
• System throughput – number of transactions completed per unit of time
  – Increases when transactions are interleaved in a serializable schedule

Example of Serializability
[Figure: Schedule 1, an interleaved schedule of T1 and T2 over R(A), R(B), W(A), R(C), W(C) and W(B) with both transactions committing, shown next to its serial equivalent]

Anomalies Due to Interleaving
• You want your schedules to be serializable
• Otherwise, the following things (considered bad) could happen
  – Write-read (WR) conflicts
  – Read-write (RW) conflicts
  – Write-write (WW) conflicts
• SQL lets you choose the isolation level you need
  – SERIALIZABLE is the default in the SQL standard; individual DBMS products may default to a weaker level

Write-Read Conflicts
• Transaction T2 reads uncommitted data written by transaction T1.
  – Called a dirty read
  – Now, if T1 aborts, the work done by T2 is inconsistent
• Example: T1 and T2 access bank account A
  – T1 reads A with balance $1000
  – T1 subtracts $100 from A and writes it (uncommitted)
  – T2 reads A with balance $900
  – T1 aborts
  – T2 subtracts $200 from A
  – T2 stores A with balance $700
  – T2 commits
• Problem: the balance should be $800, not $700

Read-Write Conflicts
• Transaction T1 reads some object A, which is also read and modified by T2.
• When T1 reads A again, the value has changed!
  – Called an unrepeatable read
• Example: T1 and T2 access bank account A
  – T1 reads A with balance $1000
  – T1 checks balance > $500, goes to do other checks
  – T2 reads A with balance $1000
  – T2 subtracts $700
  – T2 writes A
  – T2 commits
  – T1 reads A again
  – T1 subtracts $500 from A
  – T1 writes A
  – T1 commits
• Balance is -$200 ($1000 - $700 - $500), even though T1's check passed.

Write-Write Conflicts
• Transaction T1 reads object A, and T2 writes a new value to object A.
  – T2's write is called a blind write: it sets A without reading it first
• T1 then writes A to the DB, based on the old value it read
• Example: T1 and T2 access bank account A
  – T1 reads A with balance $1000
  – T2 sets A to $2000
  – T2 writes A
  – T2 commits
  – T1 subtracts $500 from A
  – T1 writes A
  – T1 commits
• Balance is $500, but the update from T2 is lost
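
The write-write example above can be replayed step by step to see why the interleaving is a problem. This is a minimal sketch with no locking at all: `run` is a hypothetical helper, each transaction keeps a private copy of A, and commits are omitted because nothing here depends on them.

```python
def run(steps, balance):
    """Execute the steps one at a time against a single shared account A."""
    db = {"A": balance}
    local = {}  # each transaction's private copy of A
    for txn, op, amount in steps:
        if op == "read":
            local[txn] = db["A"]
        elif op == "set":
            local[txn] = amount
        elif op == "subtract":
            local[txn] -= amount
        elif op == "write":
            db["A"] = local[txn]
    return db["A"]


t1 = [("T1", "read", None), ("T1", "subtract", 500), ("T1", "write", None)]
t2 = [("T2", "set", 2000), ("T2", "write", None)]

# Interleaving from the example: T1 reads, T2 sets and writes, T1 finishes.
interleaved = [t1[0]] + t2 + t1[1:]

print(run(interleaved, 1000))  # 500
print(run(t1 + t2, 1000))      # 2000 (serial: T1 then T2)
print(run(t2 + t1, 1000))      # 1500 (serial: T2 then T1)
```

The interleaved run ends at $500, which matches neither serial order ($2000 or $1500), so this schedule is not equivalent to any serial schedule.
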