1DT066 DISTRIBUTED INFORMATION SYSTEM Transactions and Concurrency Control 1 OUTLINE Motivation Transaction Two Concepts Phase Commit Distributed Transactions and Deadlocks Summary 2 1 MOTIVATION What happens if a failure occurs during modification of resources? Which operations have been completed? Which operations have not (and have to be done again)? In which states will the resources be? 3 1 REVISIT OF FUNDS TRANSFER EXAMPLE Balances at t0 Acc1: 7500, Acc2: 0 Funds transfer from Acc1 to Acc2: t0 t1 t2 t3 t4 t5 t6 t7 Time Acc1->debit(7500): Acc1->lock(write); Acc1.balance=0; Acc1->unlock(write); Acc2->credit(7500): Acc2->lock(write); Acc2.balance=7500; Acc2->unlock(write); 4 1 FUNDS TRANSFER IN CONCURRENCY Balances at t0 Acc1: 7500, Acc2: 0 Funds transfer from Acc1 t0 t1 t2 Funds transfer to Acc2 Acc1->debit(7500): t3 t4 t5 t6 t7 Time Acc1->lock(write); Acc2->credit(7500): Acc1.balance=0; Acc2->lock(write); Acc1->unlock(write); Acc2.balance=7500; Acc2->unlock(write); 5 2 TRANSACTION CONCEPTS 1 ACID Properties Atomicity Consistency Isolation Durability 2 Transaction Commit vs. Abort 3 Roles of Distributed Components 4 Flat vs. Nested Transactions 6 2.1.1 ATOMICITY Transactions are either performed completely or no modification is done. Start of a transaction is a continuation point to which it can roll back. End of transaction is next continuation point. 7 2.1.2 CONSISTENCY Shared resources should always be consistent. Inconsistent states occur during transactions: hidden for concurrent transactions to be resolved before end of transaction. Application defines consistency and is responsible for ensuring it is maintained. Transactions can be aborted if they cannot resolve inconsistencies. 8 2.1.3 ISOLATION Each transaction accesses resources as if there were no other concurrent transactions. Modifications of the transaction are not visible to other resources before it finishes. Modifications of other transactions are not visible during the transaction at all. Implemented through: two-phase locking or optimistic concurrency control. 9 2.1.4 DURABILITY A completed transaction is always persistent (though values may be changed by later transactions). Modified resources must be held on persistent storage before transaction can complete. Wide use of hard disks. 10 2.2 TRANSACTION COMMANDS Begin: Start a new transaction. Commit: End a transaction. Store changes made during transaction. Make changes accessible to other transactions. Abort: End a transaction. Undo all changes made during the transaction. 11 2.3 ROLES OF COMPONENTS Distributed system components involved in transactions can take role of: Transactional Client Transactional Server Coordinator 12 2.3.1 COORDINATOR Coordinator plays key role in managing transaction. Coordinator is the component that handles begin / commit / abort transaction calls. Coordinator allocates system-wide unique transaction identifier. Different transactions may have different coordinators. 13 2.3.2 TRANSACTIONAL SERVER Every component with a resource accessed or modified under transaction control. Transactional server has to know coordinator. Transactional server registers its participation in a transaction with the coordinator. Transactional server has to implement a transaction protocol (two-phase commit). 14 2.3.3 TRANSACTIONAL CLIENT Only sees transactions through the transaction coordinator. Invokes services from the coordinator to begin, commit and abort transactions. Implementation of transactions are transparent for the client. Cannot tell difference between server and transactional server. 15 2.4 DISTRIBUTED TRANSACTIONS (a) Flat transaction (b) Nested transactions M X T11 X Client T Y T T1 N T 12 T T 21 T2 Client Y P Z T 22 16 2.4 FLAT TRANSACTIONS Begin Trans. Commit Flat Transaction Begin Trans. Crash Flat Transaction Rollbac k Begin Trans. Abort Flat Transaction Rollbac k 17 2.4 NESTED TRANSACTIONS Begin Trans. Commit Main Transaction Call Begin Trans. Call Commit Begin Trans. Commit Call Begin Trans. Commit 18 3 TWO-PHASE COMMIT Multiple autonomous distributed servers: For a commit, all transactional servers have to be able to commit. If a single transactional server cannot commit its changes every server has to abort. Single phase protocol is insufficient. Two phases are needed: Phase one: Voting Phase two: Completion. 19 3.1 PHASE ONE Called the voting phase. Coordinator asks all servers if they are able (and willing) to commit. Servers reply: Yes: it will commit if asked, but does not yet know if it is actually going to commit. No: it immediately aborts its operations. Hence, servers can unilaterally abort but not unilaterally commit a transaction. 20 3.1 PHASE TWO Called the completion phase. Co-ordinator collates all votes, including its own, and decides to commit if everyone voted ‘Yes’. abort if anyone voted ‘No’. All voters that voted ‘Yes’ are sent ‘DoCommit’ if transaction is to be committed. Otherwise ‘Abort'. Servers acknowledge DoCommit once they have committed. 21 3.1 SERVER UNCERTAINTY Period when a server must be able to commit, but does not yet know if has to. This period is known as server uncertainty. Usually short (time needed for coordinator to receive and process votes). However, failures can lengthen this process, which may cause problems. 22 3.2 RECOVERY IN TWO-PHASE COMMIT Failures prior to start of 2PC results in abort. Coordinator failure prior to transmitting commit messages results in abort. After this point, coordinator will retransmit all commit messages on restart. If server fails prior to voting, it aborts. If it fails after voting, it sends GetDecision. If it fails after committing it (re)sends HaveCommitted message. 23 3.2 COMPLEXITY Assuming N participating servers: (N-1) Voting requests from coordinator to servers. (N-1) Votes from servers to coordinator. At most (N-1) Completion requests from coordinator to servers. (When commit) (N-1) acknowledgement from servers to coordinator. Hence, complexity of requests is linear in the number of participating servers. 24 3.3 COMMITTING NESTED TRANSACTIONS Cannot use same mechanism to commit nested transactions as: subtransactions can abort independent of parent. subtransactions must have made decision to commit or abort before parent transaction. Top level transaction needs to be able to communicate its decision down to all subtransactions so they may react accordingly. 25 3.3 PROVISIONAL COMMIT Subtransactions vote either: aborted or provisionally committed. Abort is handled as normal. Provisional commit means that coordinator and transactional servers are willing to commit subtransaction but have not yet done so. 26 3.3 EXAMPLE FOR A NESTED TRANSACTION T 11 T1 provisional commit (at X) T T provisional commit (at N) T21 provisional commit (at N) T provisional commit (at P) 12 T 2 abort (at M) aborted (at Y) 22 27 3.3 INFORMATION HELD BY COORDINATORS Coordinator of transaction T T1 T2 T 11 T 12 , T 21 T 22 Child transactions T 1, T 2 T 11 , T 12 T 21 , T 22 Participant Provisional commit list T 1, T 12 T 1, T 12 yes yes no (aborted) no (aborted) T 12 but not T 21 T 21 , T 12 no (parent aborted)T 22 Abort list T 11 , T 2 T 11 T2 T 11 28 3.3 TWO-PHASE COMMIT FOR NESTED TRANSACTIONS For nested transactions, the top-level transaction plays as coordinator, while participants are all the provisionally committed subtransaction coordinators without aborted ancestors. Hierarchic two-phase commit: a multi-level nested protocol where the coordinator communicates to the immediate child transaction coordinator in a hierarchic fashion. Flat two-phase commit: the coordinator contact all participants with provisional commit directly. 29 3.3 LOCKING AND PROVISIONAL COMMITS Locks cannot be released after provisional commit. Data items remain ‘protected’ until top-level transaction commits. This may reduce concurrency. Interactions between sibling subtransactions: should they be prevented as they are different? allowed as they are part of the same transaction? Generally they are prevented. 30 4 DISTRIBUTED TRANSACTIONS AND DEADLOCKS In distributed transactions, each server is responsible for applying concurrency control to its own objects, and all the servers jointly ensure the concurrent transactions are performed in a serially equivalent manner. This means interleavings of two transactions have to be serially equivalent both locally at each server and globally. 31 4.1 INTERLEAVINGS OF TWO TRANSACTIONS T Write (A) U at X Write (B) Read (B) at Y Read (A) at Y at X Transaction T before Transaction U on server X Transaction U before Transaction T on server Y This is not serially equivalent globally since T before U in one server and U before T in another. 32 4.1 INTERLEAVINGS OF TRANSACTIONS U, V AND W U d.deposit(10) a.deposit(20) b.withdraw(30) V lock D at Z b.deposit(10) lock A at X W lock B at Y c.deposit(30) lock C at Z a.withdraw(20) wait at X wait at Y c.withdraw(20) wait at Z 33 4.3 DISTRIBUTED DEADLOCK (a) (b) W Held by D C A X Z Waits for W Waits for V Held by Held by V U U Waits for B Held by Y 34 4.2 LOCAL AND GLOBAL WAIT-FOR GRAPHS local wait-for graph T U X local wait-for graph V global deadlock detector T T Y U V • Phantom deadlock: A deadlock that is “detected” but is not really a deadlock is called a phantom deadlock. • E.g.: Transaction U releases an object at server X and requests the one held by V at server Y. Assuming the 35 latter is first received. 4.3 PROBES TRANSMITTED TO DETECT DEADLOCK W W U V W Held by Waits for Deadlock detected C A Z W U V Waits for Initiation X W U V U Held by Y B Waits for 36 4.3 TWO PROBES INITIATED (a) initial situation Waits for V Waits for T U W (c) detection initiated at object requested by W (b) detection initiated at object requested by T Waits for T V T U T W U TUWV TUW V W V VTU U W W T V W Waits for 37 6 SUMMARY Transaction concepts: ACID Transaction commands Roles of distributed components in transactions Two-phase commit phase one: voting phase two: completion Distributed Transactions and Distributed Deadlocks Read Textbook Chapter 16. 38