Topics for this lecture
• Transactions and Concurrency control
  – Transactions
  – Concurrency control
  – Nested Transactions
  – Locks
• Distributed Transactions
  – Flat and nested distributed transactions
  – Atomic commit protocols
  – Concurrency control
  – Distributed Deadlocks
  – Transaction recovery

Transactions and Concurrency

Transactions
• The goal of transactions
  – the objects managed by a server must remain in a consistent state
    • when they are accessed by multiple transactions and
    • in the presence of server crashes
• Recoverable objects
  – can be recovered after their server crashes
  – objects are stored in permanent storage

Transaction Atomicity
1. All or nothing:
  – Either every operation in the transaction completes OR
  – No operations complete at all
  – Failure atomicity
    • effects are atomic even when the server crashes
  – Durability
    • after a transaction has completed successfully
    • all its effects are saved in permanent storage

Transaction Atomicity
2. Isolation
  – Each transaction must be performed without interference from other transactions
  – Other transactions cannot see a transaction's intermediate effects
  – Only once ALL of the operations are complete can other transactions see the results

ACID properties
• Atomicity - All or nothing
• Consistent - A transaction moves from one consistent state to another consistent state
• Isolated - Intermediate effects are isolated until the transaction has completed
• Durable - Once completed a transaction's effects are permanent

Concurrency
• Concurrency occurs when two or more execution flows are able to run simultaneously
• Concurrency can cause problems if it is not managed
• We shall examine two problems
  – Lost update
  – Inconsistent retrieval

Serial execution of T and U
Opening balances A=100, B = 200 and C = 300
Transaction T
  balance = b.getBalance();     // balance = 200
  b.setBalance(balance*1.1);    // balance*1.1 = 220
  a.withdraw(balance/10);       // balance/10 = 20
Transaction U
  balance = b.getBalance();     // balance = 220
  b.setBalance(balance*1.1);    // balance*1.1 = 242
  c.withdraw(balance/10);       // balance/10 = 22
Resulting balances A=80, B = 242 and C = 278

Interleaved execution and a lost update
Transaction T                       Transaction U
balance = b.getBalance();    $200
                                    balance = b.getBalance();    $200
                                    b.setBalance(balance*1.1);   $220
b.setBalance(balance*1.1);   $220
a.withdraw(balance/10);      $80
                                    c.withdraw(balance/10);      $280
Resulting balances from this interleaving A=80, B = 220 and C = 280
Different to serial execution!! Both transactions read B = 200, so one of the two updates to B is lost (B should be 242).

Serial execution of V and W
Opening balances A=100, B = 200
Transaction V
  a.withdraw(100);                  // a.balance = 0
  b.deposit(100);                   // b.balance = 300
Transaction W (aBranch.branchTotal())
  total = a.getBalance();           // balance = 0
  total = total + b.getBalance();   // balance = 300, total = 300
Resulting balances A=0, B = 300 and total = 300

Interleaved execution and an inconsistent retrieval
Transaction V                       Transaction W (aBranch.branchTotal())
a.withdraw(100);                    // a.balance = 0
                                    total = a.getBalance();           // total = 0
                                    total = total + b.getBalance();   // total = 200
b.deposit(100);                     // b.balance = 300
Resulting balances from this interleaving A=0, B = 300 and total=200 <<< WRONG!!

Interleaved execution that is serially equivalent to T and U
Transaction T                       Transaction U
balance = b.getBalance();    $200
b.setBalance(balance*1.1);   $220
                                    balance = b.getBalance();    $220
                                    b.setBalance(balance*1.1);   $242
a.withdraw(balance/10);      $80
                                    c.withdraw(balance/10);      $278
Resulting balances from this interleaving A=80, B = 242 and C = 278

Conflicts in transactions
Operations within different transactions   Conflict?   Reason
read    read                               No          No dependency between read operations
read    write                              Yes         The effect of a read and a write operation depends on their order
write   write                              Yes         The effect of two write operations depends on their order

Commit / Abort
• Commit
  – Makes permanent (durable) the isolated effects of the transaction
• Abort
  – If any part fails (atomicity)
  – Does not make permanent the isolated effects of the transaction
  – Intermediate effects are undone

Nested transactions
• Transactions that are themselves composed of sub-transactions
• Form a hierarchy with parents and sub-transactions
• Allow for concurrency within the transaction
• Failure with a flat transaction implies that the whole transaction should be repeated
• With a nested transaction only the sub-transaction that aborted needs to be repeated

Commit rules for nested transactions
• A transaction may commit only once its children have completed
• Sub-transactions make independent and final commit / abort decisions
• Parent abort implies sub-transaction aborts
• Parents can still decide to commit in the presence of sub-transaction aborts.
• Top-level transaction abort implies all sub-transactions abort

Nested transactions
[Figure: top-level transaction T1 opens sub-transactions T11 and T12 via openSubTransaction; labels: provisional commit, abort]

Locks
• Locks provide a means for ensuring serial equivalence
• Exclusive locking is where a transaction locks objects exclusively until it commits

Serial equivalence with exclusive locking
Transaction T                  Locks          Transaction U                  Locks
openTransaction
balance = b.getBalance();      Lock B
b.setBalance(balance*1.1);                    openTransaction
a.withdraw(balance/10);        Lock A         balance = b.getBalance();      Waits for T to release Lock B
closeTransaction               Unlock A, B    ...
                                                                             Lock B
                                              b.setBalance(balance*1.1);
                                              c.withdraw(balance/10);        Lock C
                                              closeTransaction               Unlock B, C

Strict two phase locking
• Strict two phase locking
  – Transactions are not allowed to apply more locks once they have released locks
  – Growing phase = applying locks
  – Shrinking phase = releasing locks
  – Locks held until commit

Locks
• What do we lock?
  – Lockable unit should be as small as possible
• Lock types
  – Many transactions can read without conflict
  – One writer implies possible conflict
  – We use two different locks: read locks and write locks

Lock compatibility
Lock present    Lock requested
                Read     Write
None            OK       OK
Read            OK       Wait
Write           Wait     Wait

Nested transaction locking
• Parents acquire locks from sub-transactions once they commit
  – The top level eventually holds all the locks
  – Aim is to prevent sub-transaction inconsistencies
• Parents do not run concurrently with sub-transactions
  – Sub-transactions acquire locks during their execution from parents if they need them
  – Thus, parent and sub-transaction can lock the same data

Nested transaction locking
• Sub-transactions can acquire read locks provided that no other sub-transactions hold current write locks
  – We could have many read locks
  – Once completed the parent acquires the read locks
  – Requests for write locks in the presence of read locks would require the sub-transactions to wait for the read locks to be released
• Once committed, locks are passed to parents
• Parents may hold write locks and may give them to sub-transactions as required and in accordance with lock compatibility rules

Dead locks
• Two transactions waiting for each other to release locks
[Figure: wait-for cycles among transactions T, U and V]
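The lock compatibility rules and the two-transaction deadlock described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the names `COMPATIBLE`, `LockManager`, `request`, `release_all` and `mutual_wait` are hypothetical and do not come from the lecture, and a request that cannot be granted simply returns False instead of blocking.

```python
# Illustrative sketch only: all names here are hypothetical.

# (lock already set, lock requested) -> can the request be granted at once?
COMPATIBLE = {
    ("none", "read"): True,   ("none", "write"): True,
    ("read", "read"): True,   ("read", "write"): False,
    ("write", "read"): False, ("write", "write"): False,
}


class LockManager:
    def __init__(self):
        self.locks = {}      # object name -> (mode, set of holding transactions)
        self.waits_for = {}  # blocked transaction -> transactions it waits for

    def request(self, txn, obj, mode):
        """Return True if the lock is granted, False if txn has to wait."""
        held_mode, holders = self.locks.get(obj, ("none", set()))
        others = holders - {txn}
        if not others:
            # Free, or held only by txn itself (which may promote read to write).
            new_mode = "write" if "write" in (mode, held_mode) else "read"
            self.locks[obj] = (new_mode, {txn})
            return True
        if COMPATIBLE[(held_mode, mode)]:
            self.locks[obj] = (held_mode, holders | {txn})
            return True
        # Incompatible: record who txn is waiting for.
        self.waits_for.setdefault(txn, set()).update(others)
        return False

    def release_all(self, txn):
        """Strict two-phase locking: drop every lock when txn commits or aborts."""
        for obj in list(self.locks):
            mode, holders = self.locks[obj]
            holders = holders - {txn}
            if holders:
                self.locks[obj] = (mode, holders)
            else:
                del self.locks[obj]
        self.waits_for.pop(txn, None)
        for waiting in self.waits_for.values():
            waiting.discard(txn)

    def mutual_wait(self, t1, t2):
        """True if t1 and t2 are each waiting for the other (a deadlock)."""
        return (t2 in self.waits_for.get(t1, set())
                and t1 in self.waits_for.get(t2, set()))


lm = LockManager()
lm.request("T", "A", "write")                  # T locks A
lm.request("U", "B", "write")                  # U locks B
t_waits = not lm.request("T", "B", "write")    # T must wait for U
u_waits = not lm.request("U", "A", "write")    # U must wait for T
deadlock = lm.mutual_wait("T", "U")
print(t_waits, u_waits, deadlock)              # True True True
```

Calling `lm.release_all("U")` (for example after aborting U) removes U's lock on B, after which T's request for B would be granted.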
Dead lock detection
• A deadlock manager has responsibility for detecting deadlocks
• It stores details of which transactions each transaction is waiting for
• Uses the details to detect deadlocks
• Can abort transactions if deadlock is detected
[Figure: wait-for graph for transactions T, U, V and W, with a table listing which transactions each one waits for]

Dead lock timeouts
• Invulnerable locks
  – Locks have an initial period of invulnerability
  – During this time the holder has exclusive access to the object
  – Lock requests from other transactions are declined
• Vulnerable locks
  – Once a time limit has been exceeded locks become vulnerable
  – Subsequent requests from other transactions cause the vulnerable lock to be broken

Other locking schemes
• two-version locking
  – allows writing of tentative versions with reading of committed versions
• hierarchic locks
  – Uses different lockable units
  – e.g. the branchTotal operation locks all the accounts with one lock whereas the other operations lock individual accounts (reduces the number of locks needed)

Concurrency
• Disadvantages of locks
  – Lock maintenance is an overhead
  – Locks can result in deadlock
  – Locks do not fully use concurrency
• Optimistic concurrency control
  – Presume that transactions do not interfere with each other
  – If a conflict does occur then abort and restart

Concurrency
• Working phase
  – Transactions work on tentative objects (copies of the most recently committed versions)
  – Stores a list of which objects are used and how (read and write lists)
• Validation phase
  – Occurs once the working phase is complete
  – The read / write lists are used to determine if conflicts exist
• Update phase
  – tentative objects committed

Transaction validation
• Transactions are assigned an incremental transaction number when they start validation
• The following rules then ensure serialisability of transactions when they overlap
Tv      Ti      Rule
write   read    1. Ti must not read objects written by Tv
read    write   2. Tv must not read objects written by Ti
write   write   3. Ti must not write objects written by Tv and
                   Tv must not write objects written by Ti
Where: Tv is the transaction being validated; Tv and Ti are overlapping transactions

Transaction validation
[Figure: timeline of the working, validation and commit phases of earlier transactions T1 and T2, the transaction being validated Tv, and active transactions Tactive1 and Tactive2]
Backward validation:
  – Tv work phase overlaps with T2 work phase
  – Rule 2: compares write set of T2 against read set of Tv
  – Tv aborts if a conflict is found
Forward validation:
  – Tv work phase overlaps with Tactive1 and Tactive2 work phases
  – Rule 1: compares write set of Tv against read set of Ti (Tactive1 and Tactive2)
  – choice of actions if a conflict is found (defer, abort Tv, abort Ti)

Distributed Transactions

Distributed transactions
• A distributed transaction is a transaction that invokes operations in several different servers
• Can be either:
  – Flat or
  – Nested

• Distributed Transactions
  – Flat and nested distributed transactions
  – Atomic commit protocols
  – Concurrency control
  – Distributed Deadlocks
  – Transaction recovery

Flat distributed transactions
  – Makes invocations to various remote servers
  – The flat transaction waits for each request to be serviced before continuing
  – The flat transaction is sequential
[Figure: flat transaction. A client's transaction T invokes servers X, Y and Z]

Nested distributed transactions
  – Makes invocations to various remote servers
  – Sub-transactions at the same level can process independently and concurrently
[Figure: nested transaction. Client transaction T has sub-transactions T1 (with T11, T12) and T2 (with T21, T22), running at servers M, N, X, Y and P]

Coordination
• Coordination is required when committing
  – A coordinator could exist in any server
  – The coordinator is responsible for aborting or committing the distributed transaction
  – It requires a list of participants in the transaction
• The client then begins its transaction with a request to the coordinator
  – The coordinator passes back a DIS-unique transaction ID (TID)
  – When the client makes requests of other servers in the transaction it passes the TID with the request
• Servers that are accessed become participants
  – They register their participation using the TID when they receive invocations

Atomic commit protocols
• All nodes in a DIS agree to commit or all nodes abort
• One phase commit protocol
  – Coordinator keeps sending messages to participants until they acknowledge
  – Concurrency management can cause problems
  – Coordinator cannot abort once a client has requested commit

Atomic commit protocols
• Two phase commit protocol
  – Allows any participant to abort its part
  – Any abort implies all abort (atomicity)
• Two phases
  – Phase 1: Participant votes (commit or abort)
    • Once a participant has voted to commit it must be able to meet that obligation
    • To ensure this participants store their intermediate objects
  – Phase 2: vote execution
    • Coordinator collects decisions
    • If no failures and all votes were yes then the coordinator tells participants to commit
    • Otherwise it tells them all to abort

Atomic commit protocols
• Distributed transactions are message based and messages can get lost
[Figure: two-phase commit message exchange between Coordinator and Participant: canCommit? -> Yes -> doCommit -> haveCommitted]

Atomic commit protocols
• Time-outs to detect failures
  – No coordinator doCommit
    • Participants can request the decision from the coordinator if they have been waiting for a long time (getDecision)
  – No coordinator canCommit
    • A participant can abort if excessive time has elapsed since it completed its portion of the transaction and the coordinator has not requested a vote
  – Coordinator waiting on a participant vote
    • Decides to abort
• Coordinator failure
  – A big problem
    • Only the coordinator knew the participants
    • Retry with new coordinator or use a cooperative protocol

Atomic commit protocols
• Two phase commit protocols and nested transactions
[Figure: nested transaction T across servers M, N, X, Y and P. T11 and T2 abort; T1, T12, T21 and T22 provisionally commit]
• Top level transaction is the coordinator
• Sub-transactions become coordinators of other sub-transactions
• Results propagate up the hierarchy
• When a sub-transaction aborts it merely passes the abort decision to its parent
• When a sub-transaction provisionally commits it passes its details to the parent

Atomic commit protocols
Coordinator information
Coordinator of   Child           Participant            Provisional    Abort list
transaction      transactions                           commit list
T                T1, T2          Yes                    T1, T12        T11, T2
T1               T11, T12        Yes                    T1, T12        T11
T2               T21, T22        No (aborted)                          T2
T11                              No (aborted)                          T11
T12, T21                         T12, not T21           T21, T12
T22                              No (parent aborted)    T22

Concurrency Control for Distributed Transactions
• “Each server manages a set of objects and is responsible for ensuring that they remain consistent when accessed by concurrent transactions
  – therefore, each server is responsible for applying concurrency control to its own objects.
  – the members of a collection of servers of distributed transactions are jointly responsible for ensuring that they are performed in a serially equivalent manner
  – therefore if transaction T is before transaction U in their conflicting access to objects at one of the servers then they must be in that order at all of the servers whose objects are accessed in a conflicting manner by both T and U” Coulouris et al 2005

Locks
• Locks are held locally by a local lock manager
• This can grant locks and tell other transactions to wait
• Releasing locks is less simple
  – A server must wait for all other servers to commit their portions of a distributed transaction before releasing its locks
• Locks remain held throughout the A2PC (Atomic Two Phase Commit)

Optimistic concurrency Control
• Recall that with optimistic concurrency control transactions are validated before commit
• With distributed transactions
  – transaction numbers assigned at start of validation
  – transactions serialized according to transaction numbers
  – validation takes place in phase 1 of 2PC protocol
• Transaction is validated at many servers
• Each validates its objects against other ongoing transactions

Commitment deadlock
Transaction T              Transaction U
Read(A)    at X            Read(B)    at Y
Write(A)                   Write(B)
Read(B)    at Y            Read(A)    at X
Write(B)                   Write(A)
• T is validated at X, U is validated at Y
• U cannot validate as T has not committed
• T cannot validate as U has not committed
• Thus, we have commitment deadlock

Optimistic concurrency Control
• Distributed validation can be slow as the A2PC protocol waits for validation
• Parallel validation can help
  – Rule 2 + Rule 3 are used for backward validation
• This involves forward and backward validation
• Parallel validation can lead to different servers using different serialised orders (U->T and T->U)
• This can be avoided by checking and the use of globally unique transaction IDs

Distributed deadlock
• Single server transactions can suffer deadlocks
  – We can prevent, detect and resolve
  – Timeouts can be used for this but are not ideal
  – Detection is preferable, possibly using wait-for graphs
• Distributed transactions lead to distributed deadlocks
  – We can build global wait-for graphs from local graphs
  – However, with distributed transactions we can obtain deadlocks that do not show in any local wait-for graph
  – These are what are referred to as distributed deadlocks.

Distributed deadlock
Transaction U                  Transaction V                  Transaction W
d.Deposit(10)   Lock D         b.Deposit(10)   Lock B at Y    c.Deposit(30)   Lock C at Z
a.Deposit(20)   Lock A at X    c.Withdraw(20)  Wait at Z      a.Withdraw(20)  Wait at X
b.Withdraw(30)  Wait at Y
[Figure: wait-for graph across servers X, Y and Z. A is held by U, B is held by V, C is held by W, D is held by U; U waits for B, V waits for C, W waits for A]

Distributed deadlock
• Centralised deadlock detector
  – Servers periodically send their local wait-for graphs to the detector server
  – The detector joins them together to check for distributed deadlock
  – It then decides how best to resolve the problem
    • Which transaction to abort
• Phantom deadlock
  – Deadlocks that do not really exist
  – Occur due to time delays in determining deadlocks

Distributed deadlock detection
• Edge chasing
  – Does not construct a wait-for graph
  – Instead uses probes to detect deadlocks
• Algorithm consists of three steps
  – Initiation
  – Detection
  – Resolution

Distributed deadlock detection
• Initiation
  – If a server notices a transaction (TA) is waiting for another transaction TB
    • It sends a probe to the server where TB is blocked (server Sx)
    • The probe contains the edge TA -> TB (A waiting for B)
• Detection
  – Receivers of probes (Sx) check whether TB is waiting for another transaction
    • If it is (let's say TB is waiting on TC) then Sx sends a probe to the server where TC is blocked (Sy)
    • The new probe contains the edge TA->TB->TC
  – Eventually a probe might contain a cycle
    • TA->TB->TC->TA is a cycle as it loops back to itself
• Resolution
  – This simply involves aborting one of the transactions and thus breaking the deadlock

Distributed transaction recovery
• Atomicity requirements
  – All of the effects of completed transactions are permanent
  – None of the effects of partially completed or aborted transactions are permanent
• Durability
  – Objects are saved in permanent storage
• Failure atomicity
  – Effects of transactions are atomic even when servers crash
• Both aspects are governed by the recovery manager

Distributed transaction recovery
• Recovery manager responsibilities
  – Saving objects in permanent storage
  – Restoring server objects subsequent to crashes
  – Managing the performance through reorganisation of recovery files
  – Reclaiming storage space from the recovery file
• For security we need to consider back-up recovery facilities
  – Preferably off-site and isolated

Distributed transaction recovery
• Intention lists
  – Held at each server for each transaction
  – Is a list of references to objects that were used within a transaction
  – Contains the values of the objects too
  – The commit process uses the intention list to determine what needs to be committed
  – Changed objects are then written to a recovery file
  – Intentions are also written to the recovery file
  – Abort uses intention list to delete tentative objects

Distributed transaction recovery
• Logging technique
  – A recovery file contains history of all the transactions
  – It contains
    • Intention lists, values of objects and status entries
  – Status
    • Prepared: ready to commit changes
    • Committed: changes have been committed
    • Aborted: aborted transaction

Distributed transaction recovery
Position   Entry                                         Meaning
P0         Object: A 100, Object: B 200, Object: C 300   The state of objects A, B and C prior to transaction T
P1         Object: A 80                                  The tentative object A
P2         Object: B 220                                 The tentative object B
P3         Trans: T, Prepared, <A,P1> <B,P2>             Transaction T is prepared
P4         Trans: T, Committed                           T was committed
<A,P1> intention list contains reference to tentative object A at P1 and <B,P2> intention list contains reference to tentative object B at P2
Rollback position prior to T is at P0

Distributed transaction recovery
• Recovery is performed by the recovery manager
• Objects are created and then their details filled in by the recovery manager
• It applies atomicity
  – Transactions fully or not at all
  – All in the same order
• Recovery could work forwards from the beginning of the recovery file
• Or backwards from the end (this can be more efficient)

Summary
• Transactions and Concurrency control
  – Transactions
  – Concurrency control
  – Nested Transactions
  – Locks
• Distributed Transactions
  – Flat and nested distributed transactions
  – Atomic commit protocols
  – Concurrency control
  – Distributed Deadlocks
  – Transaction recovery
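As a closing illustration, the recovery file from the "Distributed transaction recovery" slides (positions P0 to P4) can be replayed with a short Python sketch that reads the file backwards from the end. The entry layout, the function name `recover` and the use of a "checkpoint" entry to stand for the object state at P0 are assumptions made for this sketch, not part of the lecture. For simplicity a transaction that is only prepared is treated as not committed, whereas a real participant would ask the coordinator for the decision.

```python
# Illustrative sketch only: the entry layout and the "checkpoint" entry are assumptions.

def recover(log):
    """Rebuild committed object values by reading the recovery file backwards."""
    committed = set()   # transactions whose "committed" status has been seen
    restored = {}       # object name -> recovered value
    for entry in reversed(log):
        if entry["kind"] == "status":
            if entry["status"] == "committed":
                committed.add(entry["txn"])
            elif entry["status"] == "prepared" and entry["txn"] in committed:
                # Intention list: (object name, position of its tentative value).
                for name, position in entry["intentions"]:
                    # setdefault keeps the newest committed version of each object.
                    restored.setdefault(name, log[position]["value"])
        elif entry["kind"] == "checkpoint":
            for name, value in entry["values"].items():
                restored.setdefault(name, value)
            break  # older entries are already reflected in the checkpoint
    return restored


log = [
    {"kind": "checkpoint", "values": {"A": 100, "B": 200, "C": 300}},   # P0
    {"kind": "object", "name": "A", "value": 80},                        # P1
    {"kind": "object", "name": "B", "value": 220},                       # P2
    {"kind": "status", "txn": "T", "status": "prepared",
     "intentions": [("A", 1), ("B", 2)]},                                # P3
    {"kind": "status", "txn": "T", "status": "committed"},               # P4
]

print(recover(log))      # {'A': 80, 'B': 220, 'C': 300}
print(recover(log[:4]))  # crash before P4: {'A': 100, 'B': 200, 'C': 300}
```

Reading backwards means the committed status at P4 is seen before the prepared entry at P3, so the tentative objects at P1 and P2 are only restored once T is known to have committed.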