Distributed Systems Session 9: Transactions Christos Kloukinas Dept. of Computing City University London © City University London, Dept. of Computing Distributed Systems / 9 - 1 Last Session: Summary Lost updates and inconsistent analysis. Pessimistic vs. optimistic concurrency control » Pessimistic control: – higher overhead for locking. + works efficiently in cases where conflicts are likely » Optimistic control: + small overhead when conflicts are unlikely. – distributed computation of conflict sets expensive. – requires global clock. CORBA uses pessimistic two-phase locking. © City University London, Dept. of Computing Distributed Systems / 9 - 2 Session 9 - Outline 1 Motivation 2 Transaction Concepts 3 Two phase Commit 4 CORBA Transaction Service 5 Summary © City University London, Dept. of Computing Distributed Systems / 9 - 3 1 Motivation What happens if a failure occurs during modification of resources? » e.g. system failure, disk crash Which operations have been completed? » e.g in cash transfer scenario; was debit successful? Which operations have not (and have to be done again)? » was credit unsuccessful?? In which states will the resources be? » e.g. Has the money being lost on its way and does it need to be recovered? © City University London, Dept. of Computing Distributed Systems / 9 - 4 What is Required? Transactions Clusters a sequence of object requests together such that they are performed with ACID properties » i.e transaction is either performed completely or not at all » leads from one consistent state to another » is executed in isolation from other transactions » once completed it is durable Used in Databases and Distributed Systems For example consider the Bank account scenario from last session © City University London, Dept. of Computing Distributed Systems / 9 - 5 Scenario Class Diagram DirectBanking +funds_transfer(from:Account, to:Account,amount:float) Account -balance:float =0 +credit(amount:float) InlandRevenue +debit(amount:float) +get_balance():float +sum_of_accounts( set:Account[]):float •A funds transfer involving a debit operation from one account and a credit operation from another account would be regarded as a transaction •Both operations will have to be executed or not at all. •They leave the system in a consistent state. •They should be isolated from other transactions. •They should be durable, once transaction is completed. © City University London, Dept. of Computing Distributed Systems / 9 - 6 2 Transaction Concepts 1 ACID Properties » Atomicity » Consistency » Isolation » Durability 2 Transaction Commands: Commit vs. Abort 3 Identify Roles of Distributed Components 4 Flat vs. Nested Transactions © City University London, Dept. of Computing Distributed Systems / 9 - 7 2.1.1 Atomicity Transactions are either performed completely or no modification is done. » I.e perform successfully every operation in cluster or none is performed » e.g. both debit and credit in the scenario Start of a transaction is a continuation point to which it can roll back. End of transaction is next continuation point. © City University London, Dept. of Computing Distributed Systems / 9 - 8 2.1.2 Consistency Shared resources should always be consistent. Inconsistent states occur during transactions: » hidden for concurrent transactions » to be resolved before end of transaction. Application defines consistency and is responsible for ensuring it is maintained. » e.g for our scenario, consistency means “no money is lost”: at the end this is true, but in between operations it may be not Transactions can be aborted if they cannot resolve inconsistencies. © City University London, Dept. of Computing Distributed Systems / 9 - 9 2.1.3 Isolation Each transaction accesses resources as if there were no other concurrent transactions. Modifications of the transaction are not visible to other resources before it finishes. Modifications of other transactions are not visible during the transaction at all. Implemented through: » two-phase locking or » optimistic concurrency control. © City University London, Dept. of Computing Distributed Systems / 9 - 10 2.1.4 Durability A completed transaction is always persistent (though values may be changed by later transactions). Modified resources must be held on persistent storage before transaction can complete. May not just be disk but can include properly battery-backed RAM or the like of EEPROMs. © City University London, Dept. of Computing Distributed Systems / 9 - 11 2.2 Transaction Commands Begin: » Start a new transaction. Commit: » End a transaction. » Store changes made during transaction. » Make changes accessible to other transactions. Abort: » End a transaction. » Undo all changes made during the transaction. © City University London, Dept. of Computing Distributed Systems / 9 - 12 © City University London, Dept. of Computing Distributed Systems / 9 - 13 2.3 Roles of Components Distributed system components involved in transactions can take role of: Transactional Client Transactional Server Coordinator © City University London, Dept. of Computing Distributed Systems / 9 - 14 2.3.1 Coordinator Coordinator plays key role in managing transaction. Coordinator is the component that handles begin / commit / abort transaction calls. Coordinator allocates system-wide unique transaction identifier. Different transactions may have different coordinators. © City University London, Dept. of Computing Distributed Systems / 9 - 15 2.3.2 Transactional Server Every component with a resource accessed or modified under transaction control. Transactional server has to know coordinator. Transactional server registers its participation in a transaction with the coordinator. Transactional server has to implement a transaction protocol (two-phase commit). © City University London, Dept. of Computing Distributed Systems / 9 - 16 2.3.3 Transactional Client Only sees transactions through the transaction coordinator. Invokes services from the coordinator to begin, commit and abort transactions. Implementation of transactions are transparent for the client. Cannot tell difference between server and transactional server. © City University London, Dept. of Computing Distributed Systems / 9 - 17 2.4 Flat Transactions Begin Trans. Commit Flat Transaction Begin Trans. Crash Flat Transaction Rollbac k © City University London, Dept. of Computing Begin Trans. Abort Flat Transaction Rollbac k Distributed Systems / 9 - 18 3 Two-Phase Commit Committing a distributed transaction involves distributed decision making.Communication defines commit protocol Multiple autonomous distributed servers: » For a commit, all transactional servers have to be able to commit. » If a single transactional server cannot commit its changes, then every server has to abort. Single phase protocol is insufficient. Two phases are needed: » Phase one: Voting » Phase two: Completion. © City University London, Dept. of Computing Distributed Systems / 9 - 19 3 Phase One Called the voting phase. Coordinator asks all servers if they are able (and willing) to commit. Servers reply: » Yes: it will commit if asked, but does not know yet if it is actually going to commit. » No: it immediately aborts its operations. Hence, servers can unilaterally abort but not unilaterally commit a transaction. © City University London, Dept. of Computing Distributed Systems / 9 - 20 3 Phase Two Called the completion phase. Co-ordinator collates all votes, including its own, and decides to » commit if everyone voted ‘Yes’. » abort if anyone voted ‘No’. All voters that voted ‘Yes’ are sent » ‘DoCommit’ if transaction is to be committed. » Otherwise ‘Abort'. Servers acknowledge DoCommit once they have committed. © City University London, Dept. of Computing Distributed Systems / 9 - 21 Object Request for 2 Phase Commit begin() debit() Acc1@BankA :Resource Acc2@BankB :Resource :Coordinator register_resource() credit() register_resource() commit() vote() vote() doCommit() doCommit() © City University London, Dept. of Computing Distributed Systems / 9 - 22 3 Server Uncertainty (1) Period when a server is able to commit, but does not yet know if it has to. This period is known as server uncertainty. Usually short (time needed for co-ordinator to receive and process votes). However, failures can lengthen this process, which may cause problems. Solution is to store changes of transaction in temporary persistent storage e.g. log file, and use for recovery on restarting. © City University London, Dept. of Computing Distributed Systems / 9 - 23 3 Recovery in Two-Phase Commit Failures prior to the start of 2PC result in abort. If server fails prior to voting, it aborts. If it fails after voting, it sends GetDecision. If it fails after committing it (re)sends HaveCommitted message. © City University London, Dept. of Computing Distributed Systems / 9 - 24 3 Recovery in Two-Phase Commit Coordinator failure prior to transmitting DoCommit messages results in abort (since no server has already committed). After this point, co-ordinator will retransmit all DoCommit messages on restart. » This is why servers have to store even their provisional changes in a persistent way. » The coordinator itself needs to store the set of participating servers in a persistent way too. © City University London, Dept. of Computing Distributed Systems / 9 - 25 4 Transaction Example: Funds Transfer Acc1@bankA Acc2@bankB (Resource) begin() (Resource) Current Coordinator debit() get_control() credit() register_resource() get_control() commit() register_resource() prepare() prepare() commit() commit() © City University London, Dept. of Computing Distributed Systems / 9 - 26 5 Summary Transaction concepts: » ACID » Transaction commands (begin, (vote), commit, abort) » Roles of distributed components in transactions Two-phase commit » phase one: voting » phase two: completion CORBA Transaction Service » implements two-phase commit » needs resources that are transaction aware. © City University London, Dept. of Computing Distributed Systems / 9 - 27 4 CORBA Transaction Service Application Objects CORBA facilities Object Request Broker Transaction © City University London, Dept. of Computing CORBA services Distributed Systems / 9 - 28 4 IDL Interfaces Object Transaction Service defined through three IDL interfaces: Current (transaction interface) Coordinator Resource (transactional servers) © City University London, Dept. of Computing Distributed Systems / 9 - 29 4 Current – Implicit Current Tx interface Current { void begin() raises (...); void commit (in boolean report_heuristics) raises (NoTransaction, HeuristicMixed, HeuristicHazard); void rollback() raises(NoTransaction); Status get_status(); string get_transaction_name(); Coordinator get_control(); Coordinator suspend(); void resume(in Coordinator which) raises(InvalidControl); }; (every CORBA object has an implicit transaction associated with it) © City University London, Dept. of Computing Distributed Systems / 9 - 30 4 Coordinator – Explicit Tx Coordinator interface Coordinator { Status get_status(); Status get_parent_status(); Status get_top_level_status(); boolean is_same_transaction(in Coordinator tr); boolean is_related_transaction(in Coordinator tr); RecoveryCoordinator register_resource( in Resource r) raises(Inactive); void register_subtran_aware( in subtransactionAwareResource r) raises(Inactive, NotSubtransaction); ... }; © City University London, Dept. of Computing Distributed Systems / 9 - 31 4 Resource interface Resource { Vote prepare(); // ask the resource/server to vote void rollback() raises(...); void commit() raises(...); void commit_one_phase raises(...); void forget(); }; interface SubtransactionAwareResource:Resource { void commit_subtransaction(in Coordinator p); void rollback_subtransaction(); }; © City University London, Dept. of Computing Distributed Systems / 9 - 32