Chapter 9
Transaction Management and Concurrency Control

Database Systems: Design, Implementation and Management, 6th Edition
Peter Rob & Carlos Coronel

What Is a Transaction?

- A transaction is a logical unit of work that must be either entirely completed or aborted; no intermediate states are acceptable.
- Most real-world database transactions are formed by two or more database requests.
- A database request is the equivalent of a single SQL statement in an application program or transaction.
- A database request involving an update actually involves at least one Read and at least one Write operation.
- A transaction that changes the contents of the database must alter the database from one consistent database state to another.
- To ensure consistency of the database, every transaction must begin with the database in a known consistent state.

Transaction Examples

A transaction includes read and write operations to access the database.

T1 (“Transfer”):
    read_item(X);
    X = X - N;
    write_item(X);
    read_item(Y);
    Y = Y + N;
    write_item(Y);

T2 (“Deposit”):
    read_item(X);
    X = X + M;
    write_item(X);

A bit of terminology:
- Read Set = set of all items a transaction reads: {X} for T2; {X, Y} for T1
- Write Set = set of all items a transaction writes: {X} for T2; {X, Y} for T1

Evaluating Transaction Results

An accountant wishes to register the credit sale of 100 units of product X to customer Y in the amount of $500.00:
- Reducing product X’s quantity on hand by 100.
- Adding $500.00 to customer Y’s accounts receivable.

    UPDATE PRODUCT SET PROD_QOH = PROD_QOH - 100 WHERE PROD_CODE = 'X';
    UPDATE ACCREC SET AR_BALANCE = AR_BALANCE + 500 WHERE AR_NUM = 'Y';

- If the two statements above are not both executed, the transaction leaves the database in an inconsistent state.
- Garbage is worse than no data.
- The DBMS does not guarantee that the semantic meaning of the transaction truly represents the real-world event.
- If we define the transaction to be just this one statement, the DBMS has no way of knowing that we have made a mistake:

    UPDATE PRODUCT SET PROD_QOH = PROD_QOH - 100 WHERE PROD_CODE = 'X';

Transaction Properties (ACID plus)

- Atomicity: all operations of a transaction must be completed; if not, the transaction is aborted.
- Consistency preserving: complete execution of a transaction takes the database from one consistent state to another.
- Isolation: a transaction should appear as though it is running in isolation; transactions shouldn’t interfere with each other.
- Durability (or permanency): changes made by committed transactions must persist and cannot be lost.
- Serializability: concurrent transactions are treated as if they were executed in serial order (one after another).

Transaction Management with SQL

- ANSI has defined standards that govern SQL database transactions.
- Transaction support is provided by two SQL statements: COMMIT and ROLLBACK.
- When a transaction sequence is initiated, it must continue through all succeeding SQL statements until one of the following four events occurs:
    - A COMMIT statement is reached.
    - A ROLLBACK statement is reached.
    - The end of a program is successfully reached (equivalent to COMMIT).
    - The program is abnormally terminated (equivalent to ROLLBACK).

The Transaction Log

- A transaction log keeps track of all transactions that update the database.
- The information stored in the log is used by the DBMS for a recovery triggered by a ROLLBACK statement, a program crash, or a system failure.
- The transaction log stores before-and-after data about the database and any of the tables, rows, and attribute values that participated in the transaction. The start of each transaction, its reads and writes, and its commit or abort are all noted.
- The transaction log is itself a database, and it is managed by the DBMS like any other database.
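The COMMIT and ROLLBACK behavior described above can be tried out with a small sketch. The version below uses Python's built-in sqlite3 module and the PRODUCT and ACCREC tables from the credit-sale example. The starting values (250 units on hand, a zero balance), the helper names, and the simulated failure are all assumptions made for illustration; they are not part of the original slides.

```python
import sqlite3


def make_db():
    """Create an in-memory database holding the two tables from the example."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE PRODUCT (PROD_CODE TEXT PRIMARY KEY, PROD_QOH INTEGER)")
    conn.execute("CREATE TABLE ACCREC (AR_NUM TEXT PRIMARY KEY, AR_BALANCE REAL)")
    # Assumed starting values; the slides do not give any.
    conn.execute("INSERT INTO PRODUCT VALUES ('X', 250)")
    conn.execute("INSERT INTO ACCREC VALUES ('Y', 0)")
    conn.commit()
    return conn


def register_credit_sale(conn, fail_midway=False):
    """Run both UPDATEs as one transaction; return True if it committed."""
    try:
        conn.execute(
            "UPDATE PRODUCT SET PROD_QOH = PROD_QOH - 100 WHERE PROD_CODE = 'X'"
        )
        if fail_midway:
            # Hypothetical failure between the two statements.
            raise RuntimeError("simulated failure between the two updates")
        conn.execute(
            "UPDATE ACCREC SET AR_BALANCE = AR_BALANCE + 500 WHERE AR_NUM = 'Y'"
        )
        conn.commit()      # both changes become permanent together
        return True
    except RuntimeError:
        conn.rollback()    # undo the partial work, restoring the prior state
        return False


def snapshot(conn):
    """Return (quantity on hand of X, receivable balance of Y)."""
    qoh = conn.execute(
        "SELECT PROD_QOH FROM PRODUCT WHERE PROD_CODE = 'X'"
    ).fetchone()[0]
    balance = conn.execute(
        "SELECT AR_BALANCE FROM ACCREC WHERE AR_NUM = 'Y'"
    ).fetchone()[0]
    return qoh, balance


conn = make_db()
register_credit_sale(conn, fail_midway=True)
print(snapshot(conn))   # the failed attempt leaves both rows unchanged
register_credit_sale(conn)
print(snapshot(conn))   # the successful attempt changes both rows
```

After the failed attempt, the quantity on hand is back at its starting value even though the first UPDATE had already run, which is the all-or-nothing behavior that atomicity requires.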
The Transaction Log Stores:

- A record for the beginning of the transaction
- For each transaction component (SQL statement):
    - Type of operation being performed (update, delete, insert)
    - Names of objects affected by the transaction (the name of the table)
    - “Before” and “after” values for updated fields
    - Pointers to previous and next transaction log entries for the same transaction
- The ending (COMMIT) of the transaction

[Figure: A Snippet of a Transaction Log]

Concurrency Control

- Concurrency control coordinates simultaneous execution of transactions in a multiprocessing database.
- The objective of concurrency control is to preserve the Isolation of transactions, generally by ensuring the serializability of transactions in a multi-user database environment.
- Important: simultaneous execution of transactions over a shared database can create several data integrity and consistency problems:
    - Lost updates
    - Uncommitted data (dirty read)
    - Inconsistent retrievals (incorrect summary)

Normal Sequential Execution of Two Transactions

T1 (“Transfer”):
    read_item(X);
    X = X - N;
    write_item(X);
    read_item(Y);
    Y = Y + N;
    write_item(Y);

T2 (“Deposit”):
    read_item(X);
    X = X + M;
    write_item(X);

The Lost Update Problem

    T1 (“Transfer”)        T2 (“Deposit”)
    ------------------     ------------------
    read_item(X);
    X = X - N;
                           read_item(X);
                           X = X + M;
    write_item(X);
    read_item(Y);
                           write_item(X);
    Y = Y + N;
    write_item(Y);

T2 computes its new value of X from the value it read before T1 wrote, so T2’s write overwrites T1’s and the subtraction of N is lost.

The Dirty Read (Uncommitted Data) Problem

    T1 (“Transfer”)        T2 (“Deposit”)
    ------------------     ------------------
    read_item(X);
    X = X - N;
    write_item(X);
                           read_item(X);
                           X = X + M;
    read_item(Y);
    <CRASH>
    Recovery sets X back
    to original value
                           write_item(X);

T2 has read a value of X that T1 wrote but never committed, so the value T2 writes is based on data that recovery has undone.

Inconsistent Retrievals (or Incorrect Summary)

- Inconsistent retrievals occur when a transaction calculates some summary (aggregate) functions over a set of data while other transactions are updating the data.
- Example: T1 calculates the total quantity on hand of the products stored in the PRODUCT table. At the same time, T2 updates the quantity on hand (PROD_QOH) for two of the PRODUCT table’s products.
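To make the lost update problem above concrete, the following sketch replays the T1 (“Transfer”) and T2 (“Deposit”) operations against a shared dictionary, first serially and then in the interleaved order from the lost update schedule. The values of N and M, the starting balances, and the `run` helper are assumptions chosen for illustration; the slides leave them unspecified.

```python
# Hypothetical values; the slides do not give N, M, or starting balances.
N, M = 50, 30
INITIAL = {"X": 100, "Y": 20}


def run(schedule, db):
    """Replay (txn, op, item[, delta]) steps and return the final database.

    Each transaction works on a private copy of the items it has read,
    mirroring read_item / local arithmetic / write_item in the pseudocode.
    """
    db = dict(db)   # leave the caller's starting state untouched
    local = {}      # per-transaction workspace: local[txn][item]
    for txn, op, item, *delta in schedule:
        workspace = local.setdefault(txn, {})
        if op == "read":
            workspace[item] = db[item]
        elif op == "add":
            workspace[item] += delta[0]
        elif op == "write":
            db[item] = workspace[item]
        else:
            raise ValueError(f"unknown operation: {op}")
    return db


T1 = [
    ("T1", "read", "X"), ("T1", "add", "X", -N), ("T1", "write", "X"),
    ("T1", "read", "Y"), ("T1", "add", "Y", N), ("T1", "write", "Y"),
]
T2 = [("T2", "read", "X"), ("T2", "add", "X", M), ("T2", "write", "X")]

# Serial execution: T1 runs to completion, then T2.
serial = run(T1 + T2, INITIAL)

# Interleaving from the lost update schedule: T2 reads X before T1 writes
# it, and T2 writes X after T1 does.
interleaved = [T1[0], T1[1], T2[0], T2[1], T1[2], T1[3], T2[2], T1[4], T1[5]]
lost_update = run(interleaved, INITIAL)

print(serial)       # {'X': 80, 'Y': 70}
print(lost_update)  # {'X': 130, 'Y': 70}
```

In the interleaved run, X ends at 130 rather than 80: the N units that T1 moved into Y were never removed from X, so the two balances no longer add up.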
[Figure: Retrieval During Update]

[Figure: Transaction Results: Data Entry Correction]

[Figure: Inconsistent Retrievals]

The Scheduler

- The scheduler (part of the DBMS) establishes the order in which the operations within concurrent transactions are executed.
- The scheduler interleaves the execution of database operations to make sure that the computer’s CPU is used efficiently, while also ensuring serializability and isolation.
- To determine the appropriate order, the scheduler bases its actions on concurrency control algorithms, such as locking, time stamping, or optimistic methods.

[Figure: Read/Write Conflict Scenarios: Conflicting Database Operations Matrix]

Concurrency Control with Locking Methods

- Concurrency can be controlled using locks.
- A lock guarantees exclusive use of a data item to a current transaction.
- A transaction acquires a lock prior to data access; the lock is released (unlocked) when the transaction is completed.
- All locking of information is managed by a lock manager.

Lock Granularity

Lock granularity indicates the level of lock use:
- Database level (see Figure 9.3)
- Table level (see Figure 9.4)
- Page level (see Figure 9.5)
- Row level (see Figure 9.6)
- Field (attribute) level

[Figure: A Database-Level Locking Sequence]

[Figure: An Example of a Table-Level Lock]

[Figure: An Example of a Page-Level Lock]

[Figure: An Example of a Row-Level Lock]

Binary Locks

- A binary lock has only two states: locked (1) or unlocked (0).
- If an object is locked by a transaction, no other transaction can use that object.
- If an object is unlocked, any transaction can lock the object for its use.
- A transaction must unlock the object after its termination.
- Every transaction requires a lock and unlock operation for each data item that is accessed (which could be at any level of granularity: table, page, row, and so on).
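The binary lock rules above can be sketched as a tiny lock manager. This is a simplified illustration, not how any particular DBMS implements locking: the class name, method names, and the choice to return False instead of making the requester wait are all assumptions.

```python
class BinaryLockManager:
    """Minimal binary lock table: each item is either locked or unlocked."""

    def __init__(self):
        # item -> id of the transaction holding the lock; absent means unlocked
        self._owner = {}

    def lock(self, txn, item):
        """Try to lock item for txn; return True if txn now holds the lock.

        A real lock manager would usually queue the request and make the
        transaction wait; here a refused request simply returns False.
        """
        holder = self._owner.get(item)
        if holder is None:
            self._owner[item] = txn
            return True
        return holder == txn

    def unlock(self, txn, item):
        """Release the lock; only the transaction holding it may do so."""
        if self._owner.get(item) != txn:
            raise ValueError(f"{txn} does not hold the lock on {item}")
        del self._owner[item]

    def is_locked(self, item):
        """Return True if some transaction currently holds the lock."""
        return item in self._owner


manager = BinaryLockManager()
print(manager.lock("T1", "PRODUCT"))  # True: the item was unlocked
print(manager.lock("T2", "PRODUCT"))  # False: T1 holds the lock
manager.unlock("T1", "PRODUCT")
print(manager.lock("T2", "PRODUCT"))  # True: the lock is free again
```

The item name here is a table, but the same structure works at any granularity, since the manager only needs an identifier for whatever unit is being locked.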
[Figure: An Example of a Binary Lock]

Exclusive Locks

- An exclusive lock exists when access is specially reserved for the transaction that locked the object.
- The exclusive lock must be used when the potential for conflict exists.
- An exclusive lock is issued when a transaction wants to write (update) a data item and no locks are currently held on that data item.

Shared Locks

- A shared lock exists when concurrent transactions are granted READ access on the basis of a common lock.
- A shared lock produces no conflict as long as the concurrent transactions are read only.
- A shared lock is issued when a transaction wants to read data from the database and no exclusive lock is held on that data item.

Potential Problems with Locks

- The resulting transaction schedule may not be serializable.
- The schedule may create deadlocks.

Solutions:
- Two-phase locking for the serializability problem.
- Deadlock detection and prevention techniques for the deadlock problem.

Two-Phase Locking

- The two-phase locking protocol defines how transactions acquire and relinquish locks. It guarantees serializability, but it does not prevent deadlocks.
- In a growing phase, a transaction acquires all the required locks without unlocking any data. Once all locks have been acquired, the transaction is in its locked point.
- In a shrinking phase, a transaction releases all locks and cannot obtain any new locks.

[Figure: How a Deadlock Condition Is Created]

Three Techniques to Control Deadlocks:

- Deadlock Prevention: a transaction requesting a new lock is aborted if there is a possibility that a deadlock can occur. The aborted transaction is restarted later.
- Deadlock Detection: the DBMS periodically tests the database for deadlocks. If a deadlock is found, one of the transactions (the “victim”) is aborted, and the other transaction continues.
- Deadlock Avoidance: the transaction must obtain all the locks it needs before it can be executed.

Database Recovery Management

- Recovery restores a database from a given state, usually inconsistent, to a previously consistent state.
- Recovery techniques are based on the atomic transaction property:
    - All portions of the transaction must be treated as a single logical unit of work.
    - All operations must be applied and completed to produce a consistent database.
- If, for some reason, any transaction operation cannot be completed, the transaction must be aborted, and any changes to the database must be rolled back.

Levels of Backup

- Full backup of the database: backs up or dumps the whole database.
- Differential backup of the database: only the modifications made since the previous backup are copied.
- Backup of the transaction log only: backs up all the transaction log operations that are not reflected in a previous backup copy of the database.

Database Failures

- Software: operating system, DBMS, application programs, viruses
- Hardware: memory chip errors, disk crashes, bad disk sectors, disk full errors
- Programming Exemption: application programs, end users
- Transaction: deadlocks
- External: fire, earthquake, flood

Transaction Recovery

Transaction recovery makes use of deferred-write and write-through techniques.

Deferred write (or deferred update):
- Transaction operations do not immediately update the physical database.
- Only the transaction log is updated.
- The database is physically updated only after the transaction reaches its commit point, using the transaction log information.

Write-through:
- The database is immediately updated by transaction operations during the transaction’s execution, even before the transaction reaches its commit point.
- The transaction log is also updated.
- If a transaction fails, the DBMS uses the log information to roll back the database.

[Figure: A Transaction Log for Transaction Recovery Examples]

End Chapter 9