Chapters 21 & 22 6e - 19 & 20 5e: Trans Processing & Concurrency Control
CSE 4701

Prof. Steven A. Demurjian, Sr.
Computer Science & Engineering Department
The University of Connecticut
191 Auditorium Road, Box U-155
Storrs, CT 06269-3155
steve@engr.uconn.edu
http://www.engr.uconn.edu/~steve
(860) 486 – 4818

A portion of these slides are being used with the permission of Dr. Ling Lui, Associate Professor, College of Computing, Georgia Tech. Other slides have been included/modified from the PPTs provided by the Fundamentals of Database Systems 6e course materials. Remaining slides represent new material.
Chaps21&22-1

Overview of Material
Key Background Topics:
  Synchronization in Operating Systems
  Transaction and Deadlock Concepts
  Prevention, Avoidance, Detection
Chapter 21 6e; 19 5e - Transaction Processing
  Concurrency Control
  Data Consistency Problems
  Schedules and Serializability
Chapter 22 6e; 20 5e - Concurrency Control
  Different Locking-Based Algorithms
  2 Phase Protocol
  Deadlock and Livelock
  Optimistic Concurrency Control

What is Synchronization?
Ability of Two or More Serial Processes to Interact During Their Execution to Achieve Common Goal
Recognition that “Today’s” Applications Require Multiple Interacting Processes
  Client/Server and Multi-Tiered Architectures
  Inter-Process Communication via TCP/IP
  Mobile Applications Interacting with Cloud, Web, or REST Services
Fundamental Concern: Address Concurrency
  Control Access to Shared Information
  Historically Supported in Database Systems
  Currently Available in Many Programming Languages

Thread Synchronization
Suppose X and Y are Concurrently Executing in Same Address Space
What are Possibilities?
[Figure: threads X and Y executing parts 1, 2, and 3 one after another]
What Does Behavior at Left Represent?
Synchronous Execution!
  X Does First Part of Task
  Y Next Part Depends on X
  X Third Part Depends on Y
Threads Must Coordinate Execution of Their Effort

Thread Synchronization
Now, What Does Behavior at Left Represent?
[Figure: threads X and Y; part 1 runs first, then parts 2 and 3 overlap]
Asynchronous Execution!
  X Does First Part of Task
  Y Does Second Part Concurrent with X Doing Third Part
What are Issues?
  Will Second Part Still Finish After Third Part?
  Will Second Part Now Finish Before Third Part?
  What Happens if Variables are Shared?
This is the Database Concern - Concurrent Transactions Against Shared Tables!

Databases have Transactions
A Transaction is a Logical Unit of Database Processing
Represents the Collection of Actions that Make Consistent Transformations of System States while Preserving System Consistency
[Figure: Interleaved (A and B) and Concurrent (C and D) processes]

Two Sample Transactions
Transaction T1 Reads/Writes X/Y, Modifying X by Subtracting N and Y by Adding N
Transaction T2 Reads X and Modifies X by Adding M
Transactions can Execute
  Serially: T1 followed by T2 (or reverse)
  Interleaved: Operation by Operation
What Could Happen in Serial Case?
What can go Wrong in Interleaved Case?

What’s Possible?
X=10, N=5, Y=15, M=7
What is the Result of Each?
Serial, T1 then T2:
  T1: Read(X); X:=X-N; Write(X); Read(Y); Y = Y + 20; Write(Y); commit;
  T2: Read(X); X:=X+M; Write(X); commit;
  Result: X=12, Y=35
Serial, T2 then T1:
  T2: Read(X); X:=X+M; Write(X); commit;
  T1: Read(X); X:=X-N; Write(X); Read(Y); Y = Y + 20; Write(Y); commit;
  Result: X=12, Y=35
Interleaved:
  T1: Read(X); X:=X-N;
  T2: Read(X); X:=X+M;
  T1: Write(X); Read(Y); Y = Y + 20; Write(Y); commit;
  T2: Write(X); commit;
  Result: X=17, Y=35

Why Do we Need to Synchronize?
Promote Sharing of Resources, Data, etc.
Cooperating Processes Norm Not Exception
Difficult to Program Solutions
Handle Concurrent Behavior of Processes
  Multiple Processes Interacting via OS Resources
  Under Control of Process Manager/Scheduler
Performance, Parallel Algorithms & Computations
  Multi-Processor Architectures
  Different OS to Handle Multiple Processors
Client/Server and Multi-Tier Architectures
  Underlying Database Support
  Concurrent Transactions
  Shared Databases

Potential Problems without Synchronization?
Data Inconsistency
  Lost-Update Problem
  Impact on Correctness of Executing Software
Deadlock
  Two Processes (Transactions) Each Hold Unique Resource (Data Item) and Want Resource (Data Item) of Other Process (Trans.)
  Processes Wait Forever
Non-Determinacy of Computations
  Behavior of Computation Different for Different Executions
  Two Processes (Transactions) Produce Different Results When Executed More than Once on Same Data

Classic Synchronization Techniques
Goal: Shared Variables and Resources
Two Approaches:
  Critical Sections
    Define a Segment of Code as Critical Section
    Once Execution Enters Code Segment it Cannot be Interrupted by Scheduler
    Release Point for Critical Section for Interrupts
    We’ll Briefly Review
  Semaphores
    Proposed in 1960s by E. Dijkstra
    Utilizes ADTs to Design and Implement Behavior that Guarantees Consistency and Non-Deadlock
    Covered in OS Course (CSE4300)

Critical Sections
Two Processes Share “balance” Data Item
Remember, Assignment is Not Atomic, but May Correspond to Multiple Assembly Instructions
  shared float balance;
  Code for p1:
    . . .
    balance = balance + amount;
    . . .
  Code for p2:
    . . .
    balance = balance - amount;
    . . .
[Figure: p1 and p2 both accessing balance]

Critical Sections
Recall the Code Below
  shared double balance;
  Code for p1:
    . . .
    balance = balance + amount;
    . . .
  Code for p2:
    . . .
    balance = balance - amount;
    . . .
What Happens if Time Slice Expires at Arrow and an Interrupt is Generated?
  Code for p1:
    load  R1, balance
    load  R2, amount
    add   R1, R2
    store R1, balance
  Code for p2:
    load  R1, balance
    load  R2, amount
    sub   R1, R2
    store R1, balance

Critical Sections (continued)
There is a Race to Execute Critical Sections
Sections May Be Different Code in Different Processes: Cannot Detect With Static Analysis
Results of Multiple Execution Are Not Determinate
Need an OS Mechanism to Resolve Races
If p1 Wins, R1 and R2 Added to Balance - Okay
If p2 Wins, its Changed Balance Different from One Held by p1 which Adds/Writes Wrong Value
[Code for p1 and p2 repeated from the previous slide]

Deadlock in Databases
Databases Must Control Access to Information by Multiple Concurrent Transactions (Processes)
How do we Prevent Simultaneous Updates of Database by Concurrent Transactions (Processes)?
Data is the Resource in Database System
[Figure: Trans. 1, Trans. 2, and Trans. 3 accessing a Shared Database]

General Problem: A Deadly Embrace
T1 has A Wants B - Won’t Release A Until it Gets B
T2 has B Wants A - Won’t Release B Until it Gets A
T3 has C Wants A - Won’t Release C Until it Gets A
What is the End Result? Deadlock!
[Figure: Trans. 1, Trans. 2, Trans. 3 and Data Item A, Data Item B, Data Item C]

Addressing Deadlock
Deadlock is Global Condition!
  Need to Analyze All Processes that Need All Resources
  Can’t Make Local Decision Based on Needs of One Process
Four Deadlock Approaches:
  Prevention: Never Allow Deadlock to Occur
  Avoidance: System Makes Decision to Head Off Future Deadlock State
  Detection & Recovery: Check for Deadlock (Periodically or Sporadically), Then Recover
  Manual Intervention: Operator Reboot if System Seems Too Slow

Prevention: A First Look
Design the System So that Deadlock is Impossible
Deadlock Only Occurs If All Following TRUE!!
  Mutual Exclusion: Allocated Data Items are Exclusive Property of Transaction
  Hold and Wait: Transaction Can Hold Data Items While Waiting for Another Resource
  Circular Waiting: T1 has A needs B, T2 has B needs C, … Tn has Z needs A
  No Preemption: Only the Transaction Can Release its Data Items or Withdraw its Data Item Requests
All Four Necessary for Deadlock to Exist
Prevention Requires Concurrency Control Manager to Violate at Least One Condition at All Times!

Avoidance: A First Look
Construct a Formal Model of System States
Via Model, Choose a Strategy that Will Not Allow the System to Go to a Deadlock State
Predictive Approach: Requires Transaction to Declare Intent re. Data Items in Advance
  Transaction X Needs A, B, and D
  Represents “Maximum Claim” on Data Items
  Transaction X Won’t Proceed Until all Data Items Available
  May Require “Long” Waits
Amenable to Formal Solution/Algorithm

Detection and Recovery: A First Look
When Deadlock Occurs, Can we Detect and Recover?
Two Phases to Algorithm
  Detection: Is there Deadlock?
  Recovery: Preempt Data Items from Transactions
Detection Algorithm
  When is it Executed?
  What is its Overhead?
  Too Often - Wastes Data Items
  Too Infrequent - Blocked Transactions Don’t Do Enough Work
Dominant Commercial Solution

Deadlock in Databases
Concurrent Access to Database Information
Optimistic Concurrency Control
  Assume Problems Infrequent (ATM Example)
  Maintain Transaction Log
  Detect and Correct Errors in System via Log “Long-After” Their Occurrence
Pessimistic Concurrency Control
  Assume Problems will Occur (Airline Example)
  Require Transactions to Lock Portions of Data for Read and Write Requests

Prevention
Necessary Conditions for Deadlock
  Mutual Exclusion
  Hold and Wait
  Circular Waiting
  No Preemption
Ensure that at Least One of the Necessary Conditions is False at All Times
Why Must Mutual Exclusion Hold at All Times?
  Some Data Items (System Catalog/Meta Data) Must be Exclusively Held by a Transaction
How Can a Prevention Strategy be Designed to Guarantee Failure of One of the Other Conditions?

Hold and Wait
Invalidate Hold and Wait: Transaction Can Hold One Data Item While Waiting for Another Data Item
Approach 1: Targeted to Batch Systems
  Transaction Must Request All Data Items it Needs
  Transaction Competes for All Data Items Even if Needs Only One Data Item at Time
  Holds Data Items it is “Done” With
Approach 2: Targeted to Timesharing
  For Transaction to Acquire a Data Item
    Must Release All Held Data Items
    Reacquire All (Released & New) Data Items Needed
  Overhead to Reacquire Held Data Items
  Could Encourage Starvation

Avoidance
Requires a Multi-Phase Approach
  Construct a Model of System States
  Choose a Strategy that Guarantees that the System Will Not Go to a Deadlock State
  Service Transactions in Some Order, Not Necessarily Order Received
Requires Extra Information for Each Transaction
  Maximum Claim - Every Data Item Each Transaction Will Ever Request
  Concurrency Controller Sees the Worst Case and can Allow Transitions Based on that Knowledge
Goal: To Maintain “Safe” State

Synopsis of Techniques
Resource Allocation Policy:
  Detection - Very Liberal - Requested Resources (Data Items) are Granted Where Possible
  Prevention - Conservative - Undercommits Resources (Data Items)
  Avoidance - Moderate - Between Detection/Prev.
Different Invocation Schemes
  Detection - Periodically to Test for Deadlock
  Prevention - Request All Resources at Once, Preempt, and Order Resources
  Avoidance - Manipulate Transactions to Find at Least One Safe Execution Path

Advantages
  Detection
    Never Delays Transaction Initiation
    Facilitates On-Line Processing
  Prevention
    Works Well for “Short” Transactions
    Enforceable Via Compile Time Checks
    Run-Time Computation Reduced
  Avoidance
    No Preemption Needed
Disadvantages
  Detection
    Inherent Preemption Losses
  Prevention
    Inefficient
    Preempts Too Often
    Disallows Incremental Transaction Requests
  Avoidance
    Future Transaction Requirements Must be Known in Advance
    Transactions can be Blocked for Long Periods

Transaction Processing Concepts
Basic Transaction Processing Concepts
  What are Processing Types?
  What is a Transaction?
  Why do we need Concurrency Control in a Multi-User Environment?
  Atomic Transactions and ACID Properties
  Serial Execution and Serializability
Concurrency Control Techniques
  Locking, Timestamps, and Multiversion
  Optimistic Concurrency Control
Transactions Provide
  Atomic/Reliable Execution in the Presence of Failures
  Correct Execution of Multiple User Accesses

What are Processing Types?
Single-User System: At most one user at a time can use the system.
Multiuser System: Many users can access the system concurrently.
Concurrency
  Interleaved processing: Concurrent execution of processes is interleaved in a single CPU.
  Parallel processing: Processes are concurrently executed in multiple CPUs.
Database Systems Seek Concurrent Behavior Against Shared Data Repository

What is a Transaction?
A Transaction: Logical unit of database processing that includes one or more access operations
  Read - always means retrieval
  Write - could be an insert, update, or delete.
A transaction (set of operations) may be stand-alone, specified in a high-level language like SQL and submitted interactively, or may be embedded within a program.
Transaction boundaries: Begin and End transaction
An application program may contain several transactions separated by the Begin and End transaction boundaries

What is a Transaction?
A Transaction is a Logical Unit of Database Processing
Represents the Collection of Actions that Make Consistent Transformations of System States while Preserving System Consistency
[Figure: Begin Transaction (database in a consistent state), Execution of Transaction (database may be temporarily in an inconsistent state during execution), End Transaction (database in a consistent state)]
Objectives:
  Concurrency Transparency
  Failure Transparency

Simple Model of a Database
A database is a collection of named data items
Granularity of data - a field, a record, or a whole disk block
  Concepts are independent of granularity
  Access Can Vary Based on DB System
Basic operations are read and write
  read_item(X): Reads a database item named X into a program variable. To simplify our notation, we assume that the program variable is also named X.
  write_item(X): Writes the value of program variable X into the database item named X.

Read Operation
Basic unit of data transfer from the disk to the computer main memory is one block.
In general, a data item (what is read or written) will be the field of some record in the database, although it may be a larger unit such as a record or even a whole block.
read_item(X) command includes the following steps:
  Find the address of the disk block that contains item X.
  Copy that disk block into a buffer in main memory (if that disk block is not already in some main memory buffer).
  Copy item X from the buffer to the program variable named X.

Write Operation
write_item(X) command includes the following steps:
  Find the address of the disk block that contains item X.
  Copy that disk block into a buffer in main memory (if that disk block is not already in some main memory buffer).
  Copy item X from the program variable named X into its correct location in the buffer.
  Store the updated block from the buffer back to disk (either immediately or at some later point in time).

Two Sample Transactions
Transaction T1 Reads/Writes X/Y, Modifying X by Subtracting N and Y by Adding N
Transaction T2 Reads X and Modifies X by Adding M
Their Interleaved Execution can Yield Dramatically Different Results!
What are the Possibilities?
How Many are Possible?

What are Underlying Concepts?
This Demonstrates two SERIAL Schedules
Two Ways to Execute with No Overlap
Both Result in X = 12 and Y = 35
Init Values: X=10, N=5, Y=15, M=7
Serial, T1 then T2:
  T1: Read(X); X:=X-N; Write(X); Read(Y); Y = Y + 20; Write(Y); commit;
  T2: Read(X); X:=X+M; Write(X); commit;
  Result: X=12, Y=35
Serial, T2 then T1:
  T2: Read(X); X:=X+M; Write(X); commit;
  T1: Read(X); X:=X-N; Write(X); Read(Y); Y = Y + 20; Write(Y); commit;
  Result: X=12, Y=35

What about the Following Two Schedules?
Recall Earlier Interleaved Schedule and a New Schedule
What Can we Say About Results?
Init Values: X=10, N=5, Y=15, M=7
Earlier Interleaved Schedule:
  T1: Read(X); X:=X-N;
  T2: Read(X); X:=X+M;
  T1: Write(X); Read(Y); Y = Y + 20; Write(Y); commit;
  T2: Write(X); commit;
  Result: X=17, Y=35
New Schedule:
  T1: Read(X); X:=X-N; Write(X); Read(Y);
  T2: Read(X); X:=X+M; Write(X); commit;
  T1: Y = Y + 20; Write(Y); commit;
  Result: X=12, Y=35

What are Options for 3 Transactions?
Six Different Serial (Non-Overlapped) Schedules
  T1 T2 T3
  T1 T3 T2
  T2 T1 T3
  T2 T3 T1
  T3 T1 T2
  T3 T2 T1

Example of Transaction for Query
Consider an Airline Reservation System:
  FLIGHT(FNO, DATE, SRC, DEST, STSOLD, CAP)
  CUST(CNAME, ADDR, BAL)
  FC(FNO, DATE, CNAME, SPECIAL)
Query: User Steve Reserves Seat on Flight 123
Transaction Comprised of Three Steps:
  Update the Seats Available (CAP) for Flight 123
  If Steve is a New Customer, Insert Information for Customer “Steve” into the CUST Table
  Insert Information for the Reservation into the Flight-Customer Table FC To Record the Flight
What Order Must you do this in?
What Happens if Something Fails?

Termination of Transactions
Note: The Check for Whether Steve is an Existing Customer is Omitted
  Begin_Transaction Reservation
  {
    input(flight_no, date, customer_name);
    EXEC SQL SELECT STSOLD, CAP INTO temp1, temp2
             FROM FLIGHT
             WHERE FNO = flight_no AND DATE = date;
    if temp1 = temp2 then {
      output("no free seats");
      Abort
    } else {
      EXEC SQL UPDATE FLIGHT SET STSOLD = STSOLD + 1
               WHERE FNO = flight_no AND DATE = date;
      EXEC SQL INSERT INTO FC(FNO, DATE, CNAME, SPECIAL)
               VALUES (flight_no, date, customer_name, null);
      Commit
      output("Reservation Completed")
    }
  }
  end_Transaction Reservation;

What is Concurrency Control?
Concurrency Control
  Concurrent Execution of Transactions May Interfere with Each Other
  May Produce an Incorrect Overall Result Even If Each Transaction is Correct When Executed in Isolation
Why is Concurrency Control Needed?
  The Lost Update Problem
  The Dirty Read Problem
  The Incorrect Summary Problem
  The Unrepeatable Read Problem

Why is Concurrency Control Needed?
The Lost Update Problem
  Two transactions that access the same database items have their operations interleaved in a way that makes the value of some database item incorrect
The Temporary Update (or Dirty Read) Problem
  One transaction updates a database item and then the transaction fails for some reason
  Updated item is accessed by another transaction before it is changed back to its original value.
The Incorrect Summary Problem
  One transaction is calculating an aggregate summary function on a number of records
  Other transaction(s) are updating some of the records
  Aggregate Calculation on Inconsistent Data (some before and some after updates)

The Lost Update Problem
Problem: Item X has an incorrect value Since its Update by T1 is “lost” (Overwritten by T2)
Schedule (time runs downward):
  T1: Read(X); X:=X-N;
  T2: Read(X); X:=X+M;
  T1: Write(X); Read(Y);
  T2: Write(X); commit;
  T1: Y = Y + 20; Write(Y); commit;
S1: R1(X), R2(X), W1(X), R1(Y), W2(X), c2, W1(Y), c1;
OSs use Mutual Exclusion.
  Short Operations on Simple Data
  Low Cost Synchronization
In Databases, we Need to Do Better
  Long Operations on Large Databases
  Data Contention in Important Long-Term Transactions

The Dirty Read Problem
Problem: Item X read by T2 is “dirty” (incorrect)
  Since T1 Fails before Completion (Commit), the System Must Undo the Update and Change X Back to its Original Value
  It is Created by a Trans. That Has Not Been Completed/Committed
  Unfortunately T2 has Read the “temporary” value of X
Schedule (time runs downward):
  T1: Read(X); X:=X-N; Write(X);
  T2: Read(X); X:=X+M; Write(X);
  T1: Read(Y); ... Accidental abort
S2: R1(X), W1(X), R2(X), W2(X), c2, R1(Y), a1;

The Incorrect Summary Problem
Problem: Inconsistent Values w.r.t.
Time
  - X has been Changed but Y has Not
  - X and Y are “Correct”

Summary of the Problems
Lost Update Problem
  Two processes execute programs that intend to update the same data item X concurrently
  X may end up with just one update
Dirty Data Read Problem
  A process may write intermediate values into the database
  Further writes invalidate that particular value
  Process rollback also invalidates that value

Summary of the Problems
Incorrect Summary Problem
  Query: Calculate total checking deposits
  Update: Transfer $1M from Acct 1 to Acct 2
  If query reads account 1 before the update and account 2 after the update, the result is off by $1M
The Unrepeatable Read Problem
  Consider a Transaction T
  T Reads Data Item X at Time t
  Another Transaction Y Modifies X at Time t+1
  T then Reads X again at Time t+2
  T has Read Two Different values of X!

A Brief Look at Role of Recovery
What causes a Transaction to fail?
1. A computer failure (system crash): A hardware or software error occurs in the computer system during transaction execution. If the hardware crashes, the contents of the computer’s internal memory may be lost.
2. A transaction or system error: Some operation in the transaction may cause it to fail, such as integer overflow or division by zero. Transaction failure may also occur because of erroneous parameter values or because of a logical programming error. In addition, the user may interrupt the transaction during its execution.

A Brief Look at Role of Recovery
What causes a Transaction to fail?
3. Local errors or exception conditions detected by the transaction: Certain conditions necessitate cancellation of the transaction. For example, data for the transaction may not be found. A condition, such as insufficient account balance in a banking database, may cause a transaction, such as a fund withdrawal from that account, to be canceled.
A programmed abort in the transaction causes it to fail.
4. Concurrency control enforcement: The concurrency control method may decide to abort the transaction, to be restarted later, because it violates serializability or because several transactions are in a state of deadlock.

A Brief Look at Role of Recovery
What causes a Transaction to fail?
5. Disk failure: Some disk blocks may lose their data because of a read or write malfunction or because of a disk read/write head crash. This may happen during a read or a write operation of the transaction.
6. Physical problems and catastrophes: This refers to an endless list of problems that includes power or air-conditioning failure, fire, theft, sabotage, overwriting disks or tapes by mistake, and mounting of a wrong tape by the operator.

Further Transaction and System Concepts
A transaction is an atomic unit of work that is either completed in its entirety or not done at all.
For recovery purposes, the system needs to keep track of
  When the transaction starts
  When the transaction terminates
  When the transaction commits or aborts.
Transaction states:
  Active state
  Partially committed state
  Committed state
  Failed state
  Terminated State

Further Transaction and System Concepts
Recovery manager keeps track of the following operations:
  begin_transaction: This marks the beginning of transaction execution.
  read or write: These specify read or write operations on the database items that are executed as part of a transaction.
  end_transaction: This specifies that read and write transaction operations have ended and marks the end limit of transaction execution.
    At this point it must be determined whether the changes are permanently applied to the database or whether the transaction has to be aborted because it violates concurrency control or for some other reason

Further Transaction and System Concepts
Recovery manager keeps track of the following operations (cont):
  commit_transaction: Signals a successful end of the transaction so that any changes (updates) executed by the transaction can be safely committed to the database and will not be undone
  rollback (or abort): Signals that the transaction has ended unsuccessfully, so that any changes or effects that the transaction may have applied to the database must be undone.

Further Transaction and System Concepts
Recovery techniques use the following operators:
  undo: Similar to rollback except that it applies to a single operation rather than to a whole transaction.
  redo: Specifies that certain transaction operations must be redone to ensure that all the operations of a committed transaction have been applied successfully to the database.

What can Go Wrong?
Transaction Moves Through Many States from Begin to End
From the System Perspective, the Key Concern is a Potential Abort
  When Can Aborts Occur?
  What are Issues?

What can Go Wrong?
Aborting Active Transaction
  Recovery Likely Not Needed
  Reads/Writes on “Local” Copies
  Permanent Copy Not Updated

What can Go Wrong?
Aborting Partially Committed Transaction
  Transaction Commits by Writing Values to DB
  Suppose Write A, Write B, Write C
  If Failure After Write A/B and Before Write C, Transaction Aborts and Corrective Action Needed
  Must “Undo” Effect of All Completed Writes

What is Tracked for Transaction Processing?
System Log or Journal keeps track of all transaction operations that affect the values of database items. This information may be needed to permit recovery from transaction failures.
The log is kept on disk, so it is not affected by any type of failure except for disk or catastrophic failure. In addition, the log is periodically backed up to archival storage to guard against such catastrophic failures.

What is Tracked for Transaction Processing?
The System Log: T is a unique transaction-id that is generated automatically by the system.
Types of log record:
  [start_transaction,T]: Records that transaction T has started execution.
  [write_item,T,X,old_value,new_value]: Records that transaction T has changed the value of database item X from old_value to new_value.
  [read_item,T,X]: Records that transaction T has read the value of database item X.
  [commit,T]: Records that transaction T has completed successfully, and affirms that its effect can be committed (recorded permanently) to the database.
  [abort,T]: Records that transaction T has been aborted.

What is Tracked for Transaction Processing?
Recovery: If the system crashes, we can recover to a consistent database state by examining the log
  Log contains a record of every write operation that changes the value of some database item
  It is possible to undo the write operations of a transaction T by tracing backward through the log and resetting all items changed by a write operation of T to their old_values.
  Or, redo the effect of the write operations of a transaction T by tracing forward through the log and setting all items changed by a write operation of T to their new_values.

What is Tracked for Transaction Processing?
Commit Point of a Transaction: A transaction T reaches its commit point when
  All its operations to access the database have been executed successfully AND
  Effect of all the transaction operations on the database has been recorded in the log
Beyond the commit point, the transaction is said to be committed, and its effect is assumed to be permanently recorded in the database.
Transaction writes [commit,T] entry into the log.
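
The backward undo and forward redo passes over the log can be sketched as follows. This is a minimal Python illustration, not a recovery manager: the tuple layout mirrors the log record types listed on the slides, while the function names, the sample log, and the sample database state are assumptions made up for the example.

```python
def undo(log, db, t):
    """Trace backward through the log, resetting items written by t to their old values."""
    for record in reversed(log):
        if record[0] == "write_item" and record[1] == t:
            _, _, item, old_value, _ = record
            db[item] = old_value


def redo(log, db, t):
    """Trace forward through the log, setting items written by t to their new values."""
    for record in log:
        if record[0] == "write_item" and record[1] == t:
            _, _, item, _, new_value = record
            db[item] = new_value


def reached_commit_point(log, t):
    """A transaction has reached its commit point once [commit,T] is in the log."""
    return ("commit", t) in log


# Hypothetical log: T2 committed, T1 was still active when the system crashed.
log = [
    ("start_transaction", "T1"),
    ("write_item", "T1", "X", 10, 5),
    ("start_transaction", "T2"),
    ("write_item", "T2", "Y", 15, 35),
    ("commit", "T2"),
    ("write_item", "T1", "X", 5, 3),
]
db = {"X": 3, "Y": 35}           # assumed on-disk state at the moment of the crash

if not reached_commit_point(log, "T1"):
    undo(log, db, "T1")          # X is reset 3 -> 5 -> 10
```

Walking the log backward matters when a transaction wrote the same item more than once: the last old_value applied is the one from the earliest write, which is the value the item had before the transaction started.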
Roll Back of transactions: Needed for transactions that have a [start_transaction,T] entry in the log but no commit entry [commit,T] in the log.

ACID: Desirable Properties of Transactions
Atomicity: A transaction is an atomic unit of processing; it is either performed in its entirety or not performed at all.
Consistency preservation: A correct execution of the transaction must take the database from one consistent state to another.
Isolation: A transaction should not make its updates visible to other transactions until it is committed; this property, when enforced strictly, solves the temporary update problem and makes cascading rollbacks of transactions unnecessary.
Durability or permanency: Once a transaction changes the database and the changes are committed, these changes must never be lost because of subsequent failure.

ACID in Terms of Operations
Database Consists of Set of Data Items
  Read(x) Gets Last Stored Value in X
  Write(x) Stores a New Value Into X
Atomicity: A Set of R/W Operations that Either Completes Entirely or Not at All
Consistency: R/W Operations take the Database from One Consistent State to Another Consistent State
Isolation: No Intermediate Values Produced by the R/W Operations will be Visible to Other Transactions
Durability: Once the Transaction is Completed, and All the Updates are Committed, then these Changes Must Never be Lost because of Subsequent Failure

What is a Schedule?
Transaction schedule or history:
  Transactions executing concurrently in an interleaved fashion
  Order of execution of operations is known as a transaction schedule
A schedule S of n transactions T1, T2, …, Tn is:
  Ordering of operations of transactions where
  For each transaction Ti that participates in S, the operations of Ti in S must appear in the same order in which they occur in Ti.
  Operations from other transactions Tj can be interleaved with the operations of Ti in S.
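
The defining condition of a schedule (each transaction's operations keep their own order, while operations of different transactions may interleave freely) can be checked mechanically. The sketch below is an assumption-laden illustration in Python: the function name is_schedule, the (transaction, operation) pair encoding, and the two sample transactions are all hypothetical.

```python
def is_schedule(schedule, transactions):
    """Check that `schedule` contains exactly the operations of each transaction,
    in the same order in which they occur inside that transaction."""
    per_txn = {t: [] for t in transactions}
    for t, op in schedule:
        if t not in per_txn:
            return False              # operation from an unknown transaction
        per_txn[t].append(op)         # project the schedule onto transaction t
    return per_txn == transactions


# Hypothetical transactions, each given as its operations in program order.
T = {
    "T1": ["R(X)", "W(X)", "R(Y)", "W(Y)", "c"],
    "T2": ["R(X)", "W(X)", "c"],
}

# An interleaving that respects both program orders.
interleaved = [
    ("T1", "R(X)"), ("T1", "W(X)"),
    ("T2", "R(X)"), ("T2", "W(X)"), ("T2", "c"),
    ("T1", "R(Y)"), ("T1", "W(Y)"), ("T1", "c"),
]

# The same operations, but with T1's first two operations swapped.
reordered = [("T1", "W(X)"), ("T1", "R(X)")] + interleaved[2:]

valid = is_schedule(interleaved, T)
invalid = is_schedule(reordered, T)
```

Projecting the schedule onto each transaction and comparing against that transaction's own operation list is enough here, because the definition constrains only the relative order within a transaction, not across transactions.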

What is a Schedule?
A Schedule S is a Sequence of R/W Operations, Which Ends with Commit or Abort
  Different Transactions Executing Concurrently in an Interleaved Fashion with One Another
  Each Transaction is a Sequence of R/W Operations
Two Schedules S1 and S2 are Equivalent, Denoted as S1 ≡ S2, If and Only If S1 and S2
  Execute the Same Set of Transactions
  Produce the Same Results (i.e., Both Take the DB to the Same Final State)

What are Possible Different Schedules?

Transactions and a Schedule
Below are Transactions T1 and T2
Note that Their Interleaved Execution Shown Below is an Example of One Possible Schedule
There are Many Different Interleaves of T1 and T2
Init Values: X=10, Y=15
  T1: Read(X); X:=X-2; Write(X);
  T2: Read(X); X:=X-1; Write(X); commit;
  T1: Read(Y); Y = Y + 20; Write(Y); commit;
Final Values: X=7, Y=35
Schedule S: R1(X), W1(X), R2(X), W2(X), c2, R1(Y), W1(Y), c1;

Equivalent Schedules
Are the Two Schedules below Equivalent?
S1 and S4 are Equivalent, since They have the Same Set of Transactions and Produce the Same Results
Schedule S1 (Init Values X=10, Y=15):
  T1: Read(X); X:=X-2; Write(X); Read(Y); Y = Y + 20; Write(Y); commit;
  T2: Read(X); X:=X-1; Write(X); commit;
  Final Values: X=7, Y=35
Schedule S4 (Init Values X=10, Y=15):
  T1: Read(X); X:=X-2; Write(X);
  T2: Read(X); X:=X-1; Write(X); commit;
  T1: Read(Y); Y = Y + 20; Write(Y); commit;
  Final Values: X=7, Y=35
S1: R1(X), W1(X), R1(Y), W1(Y), c1, R2(X), W2(X), c2;
S4: R1(X), W1(X), R2(X), W2(X), c2, R1(Y), W1(Y), c1;

Transactions and a Schedule
What Happens if the Schedule Changes to:
[Figure: two schedules of T1 and T2, each with Init Values X=10, Y=15. T1: Read(X); X:=X-2; Write(X); Read(Y); Y = Y + 20; Write(Y); commit; T2: Read(X); X:=X-1; Write(X); commit; One schedule ends with Final Values X=7, Y=35, the other with Final Values X=9, Y=35]

What are Different Types of Schedules?
Recoverable schedule: One where no committed transaction needs to be rolled back.
  No transaction T in S commits until all transactions T’ that write an item that T reads have committed.
Cascadeless schedule: One where every transaction reads only the items that are written by committed transactions.
Cascaded rollback: A schedule in which uncommitted transactions that read an item from a failed transaction must be rolled back - Read value written by Failed Trans
Strict Schedules: A schedule in which a transaction can neither read nor write an item X until the last transaction that wrote X has committed.

Serial and Serializable Schedules
Serial schedule: A schedule S is serial if, for every transaction T participating in the schedule, all the operations of T are executed consecutively in the schedule.
  Otherwise, the schedule is called a nonserial schedule.
Serializable schedule: A schedule S is serializable if it is equivalent to some serial schedule of the same n transactions.
Being serializable implies that the schedule is a correct schedule:
  It leaves the database in a consistent state.
  The interleaving of operations results in the same state as if the transactions were executed serially, while gaining efficiency from concurrent execution.

Serializability of Schedules
A Serial Execution of Transactions Runs One Transaction at a Time (e.g., T1 then T2, or T2 then T1)
  All R/W Operations in Each Transaction Occur Consecutively in S, No Interleaving
  Consistency: a Serial Schedule Takes a Consistent Initial DB State to a Consistent Final State
A Schedule S is Called Serializable If there Exists an Equivalent Serial Schedule
  A Serializable Schedule also Takes a Consistent Initial DB State to Another Consistent DB State
An Interleaved Execution of a Set of Transactions is Considered Correct if it Produces the Same Final Result as Some Serial Execution of the Same Set of Transactions
  We Call such an Execution Serializable

Example of Serializability
Consider S1 and S2 for Transactions T1 and T2
If X = 10 and Y = 20, then After S1 or S2, X = 7 and Y = 40
These are the Two Possible Serial Schedules
  T1: Read(X); X := X - 2; Write(X); Read(Y); Y = Y + 20; Write(Y); commit;
  T2: Read(X); X := X - 1; Write(X); commit;
  Schedule S1: T1 Runs to Commit, then T2
  Schedule S2: T2 Runs to Commit, then T1

Example of Serializability
Consider S1 and S2 for Transactions T1 and T2
If X = 10 and Y = 20, then After S1 or S2, X = 7 and Y = 40
Is S3 a Serializable Schedule?
  Schedule S1: T1 Runs to Commit, then T2
  Schedule S2: T2 Runs to Commit, then T1
  Schedule S3:
  T1                      T2
  Read(X); X := X - 2;
                          Read(X); X := X - 1;
  Write(X);
  Read(Y);
                          Write(X); commit;
  Y = Y + 20;
  Write(Y); commit;

Example of Serializability
Consider S1 and S2 for Transactions T1 and T2
If X = 10 and Y = 20, then After S1 or S2, X = 7 and Y = 40
Is S4 a Serializable Schedule?
  Schedule S4:
  T1                      T2
  Read(X); X := X - 2;
  Write(X);
                          Read(X); X := X - 1;
                          Write(X); commit;
  Read(Y); Y = Y + 20;
  Write(Y); commit;

Two Serial Schedules with Different Results
Consider S1 and S2 for Transactions T1 and T2, where T1 Now Computes Y = X + 20
  T1: Read(X); X := X - 2; Write(X); Read(Y); Y = X + 20; Write(Y); commit;
  T2: Read(X); X := X - 1; Write(X); commit;
If X = 10 and Y = 20
  After S1 (T1 then T2): X = 7 and Y = 28
  After S2 (T2 then T1): X = 7 and Y = 27
A Schedule is Serializable if it Matches Either S1 or S2, Even if S1 and S2 Produce Different Results!

Thoughts on Serializability
Serializability is Hard to Check
  Interleaving of operations occurs in an operating system through some scheduler
  Difficult to determine beforehand how the operations in a schedule will be interleaved
Need to Adopt a Practical Approach
  Come up with methods (protocols) to ensure serializability.
  However, it is not possible to determine when a schedule begins and when it ends.
  Hence, we reduce the problem of checking the whole schedule to checking only a committed projection of the schedule

How do we Check for Conflicts?
Testing for conflict serializability:
  Look at only read_Item(X) and write_Item(X) operations
  Construct a precedence graph (serialization graph) with directed edges
  An edge is created from Ti to Tj if one of the operations in Ti appears before a conflicting operation in Tj
  The schedule is serializable if and only if the precedence graph has no cycles.

The Serializability Theorem
A Dependency Exists Between Two Transactions If:
  They Access the Same Data Item Consecutively in the Schedule and One of the Accesses is a Write
Three Cases: T2 Depends on T1, Denoted by T1 → T2
  T2 Executes a Read(x) after a Write(x) by T1
  T2 Executes a Write(x) after a Read(x) by T1
  T2 Executes a Write(x) after a Write(x) by T1
  Don't Care about Read(x) after Read(x)
Transaction T1 Precedes Transaction T2 If:
  There is a Dependency Between T1 and T2, and
  The R/W Operation in T1 Precedes the Dependent T2 Operation in the Schedule

The Serializability Theorem
A Precedence Graph of a Schedule is a Graph G = <TN, DE>, where
  Each Node is a Single Transaction; i.e., TN = {T1, ..., Tn} (n > 1), and
  Each Arc (Edge) Represents a Dependency Going from the Preceding Transaction to the Other; i.e., DE = {eij | eij = (Ti, Tj), Ti, Tj ∈ TN}
  Use Dependency Cases on Prior Slide
The Serializability Theorem: A Schedule is Serializable if and only if its Precedence Graph is Acyclic

Serializability Theorem Example
Consider S1 and S2 for Transactions T1 and T2
Consider the Two Precedence Graphs for S1 and S2
  Schedule S1 (T1 then T2): T1 → T2 on X
  Schedule S2 (T2 then T1): T2 → T1 on X
No Cycles in Either Graph!

What are Precedence Graphs for S3 and S4?
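The precedence-graph test above can be sketched in a few lines of Python. This is an illustrative sketch, not code from the course: the function names and the `(transaction, operation, item)` encoding are assumptions, and the interleaving used for S3 is one reading of the slide (T2 reads X before T1 writes it, and T2 writes X after T1 does). An arc is added for every pair of conflicting operations, following the edge rule under "Testing for conflict serializability".

```python
from itertools import combinations

def precedence_graph(schedule):
    """Return the set of arcs (Ti, Tj) for a list of (txn, op, item) steps."""
    arcs = set()
    # combinations() keeps list order, so the first step of each pair is the earlier one.
    for (t1, op1, x1), (t2, op2, x2) in combinations(schedule, 2):
        # Conflict: different transactions, same item, at least one write.
        if t1 != t2 and x1 == x2 and "W" in (op1, op2):
            arcs.add((t1, t2))
    return arcs

def has_cycle(arcs):
    """Depth-first search for a cycle in the directed graph given by arcs."""
    graph = {}
    for a, b in arcs:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set())
    state = {}  # node -> "visiting" or "done"

    def visit(node):
        if state.get(node) == "visiting":
            return True          # back edge: cycle found
        if state.get(node) == "done":
            return False
        state[node] = "visiting"
        if any(visit(nxt) for nxt in graph[node]):
            return True
        state[node] = "done"
        return False

    return any(visit(node) for node in graph)

# S4 as written on the slides: R1(X), W1(X), R2(X), W2(X), R1(Y), W1(Y)
S4 = [(1, "R", "X"), (1, "W", "X"), (2, "R", "X"), (2, "W", "X"),
      (1, "R", "Y"), (1, "W", "Y")]
# S3 under the assumed interleaving: R1(X), R2(X), W1(X), R1(Y), W2(X), W1(Y)
S3 = [(1, "R", "X"), (2, "R", "X"), (1, "W", "X"), (1, "R", "Y"),
      (2, "W", "X"), (1, "W", "Y")]

for name, sched in (("S3", S3), ("S4", S4)):
    arcs = precedence_graph(sched)
    print(name, sorted(arcs), "cyclic" if has_cycle(arcs) else "acyclic")
```

Commits are left out of the encoding because they do not conflict with anything; only reads and writes contribute arcs.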
For S3
  T1 → T2 (T2 Write(X) After T1 Write(X))
  T2 → T1 (T1 Write(X) After T2 Read(X))
For S4
  T1 → T2 (T2 Read/Write(X) After T1 Write(X))
  Schedule S3: R1(X), R2(X), W1(X), R1(Y), W2(X), c2, W1(Y), c1
  Schedule S4: R1(X), W1(X), R2(X), W2(X), c2, R1(Y), W1(Y), c1

Four Schedules and their Precedence Graphs
[Figure: schedules S1 to S4 and their precedence graphs]

Serializability Facts
Serializability Emphasizes Throughput
  Serializable Executions Allow us to Enjoy the Benefits of Concurrency without Giving up Any Correctness
  However, Different Serializable Executions May NOT Produce the Same Result
Testing for Serializability is Difficult in Practice:
  Finding a Serializable Schedule for an Arbitrary Set of Transactions is NP-hard
  Interleaving of Operations From Concurrent Transactions is Determined Dynamically at Run-time
  Practically Almost Impossible to Determine Ordering of Operations Beforehand to Ensure Serializability

Transaction Processing Issues
Transaction Structure (Usually Called Transaction Model)
  Flat (Simple), Nested
Internal Database Consistency
  Semantic Data Control (Integrity Enforcement) Algorithms
Reliability Protocols
  Atomicity & Durability
  Local Recovery Protocols
  Global Commit Protocols
Concurrency Control Algorithms
  How to Synchronize Concurrent Transaction Executions (Correctness Criterion)
  Intra-Transaction Consistency, Isolation

Transaction Execution
Who Participates in Transaction Execution?
  User Applications Send Begin_Transaction, Read, Write, Abort, Commit, End_Transaction to the Transaction Manager (TM) and Receive Results
  TM Sends Read, Write, Abort, EOT to the Scheduler (SC) and Receives Results
  SC Sends Scheduled Operations to the Recovery Manager (RM) and Receives Results

Transaction Support in SQL2
A single SQL statement is always considered to be atomic.
Either the statement completes execution without error, or it fails and leaves the database unchanged.
With SQL, there is no explicit Begin Transaction statement.
  Transaction initiation is done implicitly when particular SQL statements are encountered.
Every transaction must have an explicit end statement, which is either a COMMIT or a ROLLBACK.

Transaction Support in SQL2
Characteristics specified by a SET TRANSACTION statement in SQL2:
  Access mode: READ ONLY or READ WRITE.
    The default is READ WRITE unless the isolation level READ UNCOMMITTED is specified, in which case READ ONLY is assumed.
  Diagnostic size n: an integer value n indicating the number of conditions that can be held simultaneously in the diagnostic area.
  Isolation level <isolation>, where <isolation> can be READ UNCOMMITTED, READ COMMITTED, REPEATABLE READ or SERIALIZABLE.
    The default is SERIALIZABLE.
    With SERIALIZABLE: the interleaved execution of transactions will adhere to our notion of serializability.
    However, if any transaction executes at a lower level, then serializability may be violated.

Transaction Support in SQL2
Potential problems with lower isolation levels:
  Dirty Read: Reading a value that was written by a transaction which failed.
  Nonrepeatable Read: Allowing another transaction to write a new value between multiple reads of one transaction.
    A transaction T1 may read a given value from a table. If another transaction T2 later updates that value and T1 reads that value again, T1 will see a different value.
    Consider that T1 reads the employee salary for Smith. Next, T2 updates the salary for Smith. If T1 reads Smith's salary again, then it will see a different value for Smith's salary.

Transaction Support in SQL2
Potential problems with lower isolation levels (cont.):
What are Phantom Rows?
New rows that appear when the same conditional read is repeated
  A transaction T1 may read a set of rows from a table, perhaps based on some condition specified in the SQL WHERE clause.
  Now suppose that a transaction T2 inserts into the table used by T1 a new row that also satisfies the WHERE clause condition of T1.
  If T1 repeats the read, then T1 will see a row that previously did not exist, called a phantom.

Transaction Support in SQL2
Sample SQL transaction:
  EXEC SQL WHENEVER SQLERROR GO TO UNDO;
  EXEC SQL SET TRANSACTION
      READ WRITE
      DIAGNOSTICS SIZE 5
      ISOLATION LEVEL SERIALIZABLE;
  EXEC SQL INSERT INTO EMPLOYEE (FNAME, LNAME, SSN, DNO, SALARY)
      VALUES ('Robert','Smith','991004321',2,35000);
  EXEC SQL UPDATE EMPLOYEE
      SET SALARY = SALARY * 1.1
      WHERE DNO = 2;
  EXEC SQL COMMIT;
  GOTO THE_END;
  UNDO: EXEC SQL ROLLBACK;
  THE_END: ...

Summary of Transaction Processing
  Transaction and System Concepts
  Desirable Properties of Transactions
  Characterizing Schedules based on Recoverability
  Characterizing Schedules based on Serializability
  Transaction Support in SQL

Database Concurrency Control
Purpose of Concurrency Control
  To enforce isolation (through mutual exclusion) among conflicting transactions.
  To preserve database consistency through consistency-preserving execution of transactions.
  To resolve read-write and write-write conflicts.
Example: In a concurrent execution environment, if T1 conflicts with T2 over a data item A, then the concurrency control mechanism decides whether T1 or T2 gets A and whether the other transaction is rolled back or waits.

Concurrency Control
Different Locking-Based Algorithms
  Binary Locks (Lock and Unlock)
  Share Read Locks and Exclusive Write Locks
  Write Lock Does Not Imply Read
2 Phase Protocol
  All Locks Must Precede All Unlocks in Trans.
  True for All Transactions - Schedule Serializable
Concurrency Control Implementation Techniques
  Optimistic Concurrency Control
  Time-Based Access to Information
  Consider “When” Information is Read/Written to Identify Potential or Prior Conflicts
We’ll Deviate from Textbook Notation

Summary of CC Techniques
Two-Phase Locking
  Most Important in Practice
  Used by a Majority of DBMSs
  Serializes in the Middle of Transactions
  Low Overhead
  Relatively Low Concurrency
Timestamp-Based
  Based on Multiple Versions of Data Items
  Serializes at the Beginning of Transactions
  Mostly Used in Distributed DBMSs
Optimistic Concurrency Control Methods
  Serializes at the End of Transactions
  Relatively High Concurrency

Recalling Important Concepts
Transaction: Sequence of Database Commands that Must be Executed as a Single Unit (Program)
  Recall SQL Update Query Equivalent to Multiple Operations
  Read from DB, Modify (Local Copy), Write to DB
  Modify is Sometimes Delete and Insert
Granularity: Size of Data that is Locked for an Executing DB Transaction - Wide Range
  Database
  Relation (Tuple vs. Entire Table)
  Attribute (Column)
  Meta-Data (System Catalog)
Locking: Provides Means for Synchronization

Transaction Example
Two Possible Outcomes for T1 and T2 - Let A = 5
  If T1 First, then A = (5 + 10) * 10 = 150
  If T2 First, then A = 5 * 10 + 10 = 60
Is this a Problem?
  T1: LOCK A; READ A; A=A+10; WRITE A; UNLOCK A; commit;
  T2: LOCK A; READ A; A=A*10; WRITE A; UNLOCK A; commit;

Transaction Example
The Two Different Orderings of T1 and T2 Represent Alternate Serial Schedules (Non-Interleaved)
Key Concept: Concurrent (Interleaved) Execution of Several DB Transactions is Correct if and only if its Effect is the Same as that Obtained by Running the Same Transactions in a Serial Order
If Result is Either 150 or 60 - it is OK!
This is the Concept of Serializability!
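The key concept above can be checked by brute force for the two sample transactions. The sketch below is illustrative only: it deliberately drops the LOCK/UNLOCK steps (an assumption made here so that non-serial interleavings become possible), models each transaction as one read followed by one write of a locally computed value, and compares every interleaving against the two serial results. The names `run`, `interleavings`, and `TXNS` are hypothetical.

```python
from itertools import permutations

# T1 adds 10 to A, T2 multiplies A by 10 (as on the slide, with A = 5 initially).
TXNS = {"T1": lambda a: a + 10, "T2": lambda a: a * 10}

def run(order, a0):
    """Execute (txn, step) pairs in order; each txn works on its own local copy."""
    db, local = a0, {}
    for name, step in order:
        if step == "read":
            local[name] = db                    # READ A into the local copy
        else:
            db = TXNS[name](local[name])        # compute on the copy, then WRITE A
    return db

def interleavings(names):
    """Yield every ordering that keeps each transaction's read before its write."""
    steps = [(n, s) for n in names for s in ("read", "write")]
    for order in permutations(steps):
        if all(order.index((n, "read")) < order.index((n, "write")) for n in names):
            yield order

orders = list(interleavings(["T1", "T2"]))
# A serial order runs both steps of one transaction before the other starts.
serial_results = {run(o, 5) for o in orders if o[0][0] == o[1][0]}
all_results = {run(o, 5) for o in orders}

print("serial outcomes:", sorted(serial_results))
print("outcomes no serial order can produce:", sorted(all_results - serial_results))
```

Any interleaving whose result falls outside the serial outcomes is exactly what a concurrency control protocol has to rule out; with the locks in place as on the slide, only the two serial orders can occur.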
  T1: LOCK A; READ A; A=A+10; WRITE A; UNLOCK A; commit;
  T2: LOCK A; READ A; A=A*10; WRITE A; UNLOCK A; commit;

Recalling Key Definitions
A Schedule for a Set of Transactions is the Order in Which the Elementary Steps (Read, Lock, Assign, Commit, etc.) are Performed
A Schedule is Serial if All Steps of Each Transaction Occur Consecutively
A Schedule is Serializable if it is Equivalent to “Some” Serial Schedule
If T1, T2 and T3 are Transactions - What are the Possible Serial Schedules?
  T1 T2 T3    T1 T3 T2
  T2 T1 T3    T2 T3 T1
  T3 T1 T2    T3 T2 T1
Different Serial Schedules for 4 Transactions?

Another Example of Serializability
Two Serial Schedules - Let A = 15, B = 25, C = 5
What are Values of A, B, and C after Each? A = 5, B = 15, C = 25
  T1: Read(A); A := A - 10; Write(A); Read(B); B = B + 10; Write(B); commit;
  T2: Read(B); B := B - 20; Write(B); Read(C); C = C + 20; Write(C); commit;
  S1: T1 Runs to Commit, then T2
  S2: T2 Runs to Commit, then T1

Another Example of Serializability
Is S3 or S4 Serializable? Let A = 15, B = 25, C = 5
Serial Values: A = 5, B = 15, C = 25
What are Resulting Values of Each Schedule?
  [Figure: two interleaved schedules of T1 and T2 shown side by side]
  One Schedule Gives A = 5, B = 15, C = 25
  The Other Gives A = 5, B = 35, C = 25

Locks
Lock: Variable Associated with a Data Item in DB, Describing the Status of that Item w.r.t. Possible Ops.
A Means of Synchronizing the Access by Concurrent Transactions to the Database Item
Managed by Lock Manager
Binary Locks: Lock(x) and Unlock(x)
  A Transaction T Must Issue Lock(x) Before any Read(x) or Write(x)
  A Transaction T Must Issue Unlock(x) After all Read(x)/Write(x) Operations are Completed in T
  System Catalog Maintains a Lock Table for All Locked Items
  Lock(x) (or Unlock(x)) will not be Granted if there Already Exists a Lock(x) (or Unlock(x))

A Basic Lock/Unlock Model
Database Transaction is a Sequence of Locks/Unlocks
  Item Locked must Eventually be Unlocked
  A Transaction Holds a Lock between Lock and Unlock Statements
  Lock/Unlock Assumes that the Value of the Item Changes (Always Assumes a Write)
  If A has Value a0 at Lock A, it has Value f(a0) after Unlock A
For a Number of Transactions that Lock/Unlock A, we’d have: f1(f2(f3( … fn(a0))))

Example - Assessing Schedule
Consider Three Transactions Below:
  T1 has f1(a) and f2(b)
  T2 has f3(b) and f4(c) and f5(a)
  T3 has f6(a) and f7(c)
Functions Represent Actions that Modify Instances a, b, and c of Data Items A, B, and C, Respectively
  T1: Lock A; Lock B; Unlock A; Unlock B
  T2: Lock B; Lock C; Unlock B; Lock A; Unlock C; Unlock A
  T3: Lock A; Lock C; Unlock C; Unlock A

Example - Assessing Schedule
Consider the Schedule with Changes to a, b, and c
  Step            A                B             C
  T1 Lock A       a                b             c
  T2 Lock B       a                b             c
  T2 Lock C       a                b             c
  T2 Unlock B     a                f3(b)         c
  T1 Lock B       a                f3(b)         c
  T1 Unlock A     f1(a)            f3(b)         c
  T2 Lock A       f1(a)            f3(b)         c
  T2 Unlock C     f1(a)            f3(b)         f4(c)
  T2 Unlock A     f5(f1(a))        f3(b)         f4(c)
  T3 Lock A       f5(f1(a))        f3(b)         f4(c)
  T3 Lock C       f5(f1(a))        f3(b)         f4(c)
  T1 Unlock B     f5(f1(a))        f2(f3(b))     f4(c)
  T3 Unlock C     f5(f1(a))        f2(f3(b))     f7(f4(c))
  T3 Unlock A     f6(f5(f1(a)))    f2(f3(b))     f7(f4(c))
Is this Schedule Serializable?

Is this Schedule Serializable?
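One mechanical way to answer the question is sketched below, under the binary lock model: for every Unlock of an item, draw an arc to the next transaction that Locks the same item, then look for a cycle by repeatedly removing a node with no incoming arcs (a topological sort). The function names and the tuple encoding are assumptions made for this sketch, not part of the course material.

```python
def lock_precedence_arcs(schedule):
    """schedule: list of (txn, action, item) with action "Lock" or "Unlock"."""
    arcs = set()
    for i, (tj, action, item) in enumerate(schedule):
        if action != "Unlock":
            continue
        # Find the next Lock on the same item after this Unlock.
        for ts, later_action, later_item in schedule[i + 1:]:
            if later_action == "Lock" and later_item == item:
                if ts != tj:
                    arcs.add((tj, ts, item))
                break
    return arcs

def topological_order(nodes, arcs):
    """Return an equivalent serial order, or None if the graph has a cycle."""
    remaining = set(nodes)
    edges = {(a, b) for a, b, _ in arcs}
    order = []
    while remaining:
        # Nodes with no incoming arc from a node that is still in the graph.
        free = sorted(n for n in remaining
                      if not any(b == n and a in remaining for a, b in edges))
        if not free:
            return None          # every remaining node has a predecessor: cycle
        order.append(free[0])
        remaining.remove(free[0])
    return order

# The 14-step schedule from the slide.
S = [("T1", "Lock", "A"), ("T2", "Lock", "B"), ("T2", "Lock", "C"),
     ("T2", "Unlock", "B"), ("T1", "Lock", "B"), ("T1", "Unlock", "A"),
     ("T2", "Lock", "A"), ("T2", "Unlock", "C"), ("T2", "Unlock", "A"),
     ("T3", "Lock", "A"), ("T3", "Lock", "C"), ("T1", "Unlock", "B"),
     ("T3", "Unlock", "C"), ("T3", "Unlock", "A")]

arcs = lock_precedence_arcs(S)
print(sorted(arcs))
print(topological_order(["T1", "T2", "T3"], arcs))
```

The sort picks the alphabetically first free node only to make the output deterministic; any free node would give a valid serial order.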
Focus on the Final Line - It Indicates the Effective Order of Execution of Each Transaction for a, b, and c
  T1 has f1(a) and f2(b)
  T2 has f3(b) and f4(c) and f5(a)
  T3 has f6(a) and f7(c)
  Final Line (T3 Unlock A): A = f6(f5(f1(a))), B = f2(f3(b)), C = f7(f4(c))
For A - Order of Transactions is T1 → T2 → T3: f6(f5(f1(a)))
For B - T2 Must Precede T1: f2(f3(b))
For C - T2 Must Precede T3: f7(f4(c))
Can All Three Conditions be True w.r.t. Order?

Determining Serializability in this Model
Examine Schedule Based on Order in Which Various Transactions Obtain Locks
Order must be Equivalent to Some Hypothetical Serial Schedule of Transactions
If Orders for Different Data Items Force Two Transactions to Appear in a Different Order (T2 Must Precede T1 and T1 Must Precede T2), There is a Paradox!
This is Equivalent to Searching for Cycles in a Directed Graph

Recall Topological Sort
Graph is Acyclic
  Find a Node of Graph with ONLY Arrows Leaving (no Entering)
  Delete Node and Arrows
What are Possible Sorts of Each Graph?
  First Graph: 7-5-3-11-2-8-9-10; 5-3-7-11-8-2-9-10
  Second Graph: B-D-C-F; B-C-D-F

Algorithm 1: Binary Lock Model
Input: Schedule S for Transactions T1, T2, … Tk
Output: Determination if S is Serializable, and If so, an Equivalent Serial Schedule
Method: Create a Directed Precedence Graph G:
  Let S = a1; a2; … ; an where each ai is Tj: Lock Am or Tj: Unlock Am
  For each ai = Tj: Unlock Am, find the next ap = Ts: Lock Am (i < p ≤ n) (Ts is the next Trans. to lock Am); if it exists, draw an Arc in G from Tj to Ts
  Repeat Until All Unlock/Lock Pairs are Checked
Review the Resulting Precedence Graph
  If G has Cycles - Non-Serializable
  If G is Acyclic - Topological Sort to Find an Equivalent Serial Schedule

Precedence Graph for Prior Example
  Schedule: T1 Lock A; T2 Lock B; T2 Lock C; T2 Unlock B; T1 Lock B; T1 Unlock A; T2 Lock A; T2 Unlock C; T2 Unlock A; T3 Lock A; T3 Lock C; T1 Unlock B; T3 Unlock C; T3 Unlock A
Look for Unlock/Lock Combos on the Same Data Item
  T2 Unlock B and T1 Lock B
  T1 Unlock A and T2 Lock A
  T2 Unlock C and T3 Lock C
  T2 Unlock A and T3 Lock A
  Graph: T2 → T1 (B), T1 → T2 (A), T2 → T3 (A, C)
IS IT SERIALIZABLE?

Another Example
  Schedule: T2 Lock A; T2 Unlock A; T3 Lock A; T3 Unlock A; T1 Lock B; T1 Unlock B; T2 Lock B; T2 Unlock B
Look for Unlock/Lock Combos on the Same Data Item
  T2 Unlock A and T3 Lock A
  T1 Unlock B and T2 Lock B
  Graph: T1 → T2 (B), T2 → T3 (A)
IS IT SERIALIZABLE? IF SO WHAT IS THE SCHEDULE?

Two-Phase Protocol
Two-Phase Protocol - All Locks Must Precede All Unlocks in the Schedule for a Transaction
Which of the Transactions Below are Two-Phase? Why or Why Not?
  T1: Lock A; Lock B; Unlock A; Unlock B
  T2: Lock B; Lock C; Unlock B; Lock A; Unlock C; Unlock A
  T3: Lock A; Lock C; Unlock C; Unlock A

Theorems Regarding Serializability
Theorem 1: Algorithm 1 Correctly Determines if a Schedule S is Serializable (omit the proof).
Theorem 2: If S is any Schedule of 2 Phase Transactions (i.e., all of its Transactions are 2-Phase), then S is Serializable.
Proof by Contradiction.
Suppose Not - then by Theorem 1, S has a Precedence Graph G with a Cycle T1 → T2 → T3 → … → Tp → T1
  Each Arc Runs from an Unlock in its Source Transaction to a Later Lock in its Target Transaction
  In T1 → T2, T1 Issues an Unlock, so all Remaining Actions of T1 must also be Unlocks, since S is 2 Phase
  However, in Tp → T1, T1 Issues a Lock After that Unlock, which Contradicts the Fact that S is 2 Phase

Problems of Binary Locks
Only One Transaction Can Hold a Lock on a Given Item
No Shared Reading is Allowed - Too Restrictive
For Example
  T1 is Read Only on X - Yet Needs Full Lock
  T2 is Read Only on X and Y - Needs Full Locks
  T1: Read(X); Read(Y); Y = Y + 20; Write(Y); commit;
  T2: Read(X); Read(Y); commit;
  [Figure: the two transactions interleaved over times t1 to t5]

Algorithm 2: A Read/Write Lock Model
Refines the Granularity of Locking to Differentiate Between Read and Write Locks
  Improves Concurrent Access
  Rlock (Shared): If T has an Rlock A, then Any Other Transaction can Also Rlock A, but All Transactions are Forbidden from Wlock A until All Transactions with Rlock A Issue Ulock A (Multiple Reads)
  Wlock (Exclusive): If T has Wlock A, then All Other Transactions are Forbidden to Rlock or Wlock A Until T Ulocks A (Write Implies Reading, Single Write)
Two Schedules are Equivalent if They:
  Produce the Same Value for Each Data Item
  Have Each Rlock on an Item Occur in Both Schedules at a Time When the Locked Item has the Same Value

Motivating Algorithm 2
Rlock (Shared): Multiple Reads Allowed
Wlock (Exclusive): Write Implies Reading, Sole Write
Identify All Dependencies Among Transactions that Read and Write the Same Item
  If Ti: Rlock A and Tj: Wlock A is the Next Trans to Write A - put in an arc from Ti to Tj
    Ti must precede Tj in the Schedule w.r.t. A
  If Ti: Wlock A and Tj: Wlock A is the Next Trans to Write A - put in an arc from Ti to Tj
    Ti must precede Tj in the Schedule w.r.t. A
  If Tm: Rlock A Falls between Ti: Wlock A and Tj: Wlock A - put in an arc from Ti to Tm
    Tm must follow Ti in the Schedule w.r.t. A

Algorithm 2: Read/Write Lock Model
Input: Schedule S for Transactions T1, T2, … Tk
Output: Is S Serializable? If so, Serial Schedule
Method: Create a Directed Precedence Graph G:
  Suppose in S, Ti: Rlock A. If Tj: Wlock A is the Next Transaction to Wlock A (if it exists) then place an Arc from Ti to Tj. Repeat for all Ti’s, all Rlocks before Wlock on A!
  Suppose in S, Ti: Wlock A. If Tj: Wlock A is the Next Transaction to Wlock A (if it exists) then place an Arc from Ti to Tj. If there Also Exists Tm: Rlock A after Ti: Wlock A but before Tj: Wlock A, then Draw an Arc from Ti to Tm.
Review the Resulting Precedence Graph
  If G has Cycles - Non-Serializable
  If G is Acyclic - Topological Sort for Serial Schedule

Algorithm 2: Read/Write Lock Model
Look for Following Arcs:
  Add Arc: Ti: Rlock A to Tj: Wlock A where Tj is the NEXT Transaction to Write A
  Add Arc: Ti: Wlock A to Tj: Wlock A where Tj is the NEXT Transaction to Write A
  Add Arc: Ti: Wlock A to Tm: Rlock A where Tm: Rlock A is after Ti: Wlock A but before Tj: Wlock A

Consider the Following Schedule
What are the Dependencies Among Transactions?
  Steps, Spread over Columns T1, T2, T3, T4 in the Original Figure:
  (1) Wlock A (2) Rlock B (3) Unlock A (4) Rlock A (5) Unlock B (6) Wlock B (7) Rlock A (8) Unlock B
  (9) Wlock B (10) Unlock A (11) Unlock A (12) Wlock A (13) Unlock B (14) Rlock B (15) Unlock A (16) Unlock B

What are the Different Cases?
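The three arc rules of Algorithm 2 can be sketched as a single pass over the lock requests. This is a hedged illustration: the function names and encoding are assumptions, unlocks are omitted because only the order of Rlock/Wlock requests per item matters here, and the lock events below are reconstructed from the dependencies listed for this example (the owner of the final Rlock B at step 14 is not recoverable from the text, so that step is left out). The sketch also draws the Wlock-to-Rlock arc even when no later Wlock exists, which is a simplifying assumption.

```python
def rw_lock_arcs(schedule):
    """schedule: list of (txn, lock, item) with lock "R" (Rlock) or "W" (Wlock)."""
    arcs = set()
    last_writer = {}   # item -> transaction that issued the most recent Wlock
    readers = {}       # item -> transactions that Rlocked since that Wlock
    for txn, lock, item in schedule:
        writer = last_writer.get(item)
        if writer is not None and writer != txn:
            arcs.add((writer, txn))              # Wlock followed by Rlock or next Wlock
        if lock == "R":
            readers.setdefault(item, set()).add(txn)
        else:
            for reader in readers.get(item, set()):
                if reader != txn:
                    arcs.add((reader, txn))      # Rlock followed by the next Wlock
            last_writer[item] = txn
            readers[item] = set()
    return arcs

def has_cycle(arcs):
    """Repeatedly remove a node with no incoming arc; a leftover means a cycle."""
    nodes = {n for arc in arcs for n in arc}
    while nodes:
        free = [n for n in nodes if not any(b == n and a in nodes for a, b in arcs)]
        if not free:
            return True
        nodes.remove(free[0])
    return False

# Lock requests in schedule order (reconstructed; see the assumptions above).
S = [("T3", "W", "A"), ("T4", "R", "B"), ("T1", "R", "A"), ("T3", "W", "B"),
     ("T2", "R", "A"), ("T1", "W", "B"), ("T4", "W", "A")]

arcs = rw_lock_arcs(S)
print(sorted(arcs))
print("cycle" if has_cycle(arcs) else "acyclic")
```

Resetting the reader set at each Wlock is what keeps an arc from a reader pointing only at the next writer rather than at every later writer.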
For Each Rlock, Look for the Next T to Wlock the Same Item
  T1: Rlock A and T2: Rlock A - Next T to Wlock A: T1 before T4, T2 before T4
  T4: Rlock B - Next T to Wlock B: T4 before T3
For Each Wlock, Look for the Next T to Rlock or Wlock the Same Item
  T3: Wlock A - T3 before T1, T3 before T2, T3 before T4
  T3: Wlock B - Next T to Wlock B: T3 before T1
  Schedule Steps: (1) Wlock A (2) Rlock B (3) Unlock A (4) Rlock A (5) Unlock B (6) Wlock B (7) Rlock A (8) Unlock B (9) Wlock B (10) Unlock A (11) Unlock A (12) Wlock A (13) Unlock B (14) Rlock B (15) Unlock A (16) Unlock B

Consider the Following Schedule
What is the Precedence Graph G?
  [Figure: the same 16-step schedule over T1, T2, T3, T4]

Precedence Graph
What is the Resulting Precedence Graph? Is the Schedule Serializable? Why or Why Not?
  T1 before T4, T2 before T4
  T3 before T1, T3 before T2, T3 before T4
  T4 before T3, T3 before T1
  [Figure: graph over T1, T2, T3, T4 with arcs labeled A:RW, A:WR, A:WW, B:WW, B:RW]

A Read-Only/Write-Only Lock Model
Revision of the Read/Write Model for Algorithm 2
Refining Our Assumptions
  Assume that a Wlock on an Item Does not Mean that the Transaction First Reads the Item
  Contrary to First Two Models
Example: Read A; Read B; C=A+B; A=A-1; Write A; Write C
  Reads A, B and Writes A, C (No Read on C)
Reformulate Notion of Equivalent Schedules

How Does This Model Differ from Alg. 2?
Consider the Schedule Segment:
  T1: Wlock A
  T1: Ulock A
  T2: Wlock A
  T2: Ulock A
In Algorithm 2 - T2: Wlock A Assumes that T2 Reads the Value Written by T1
However, This Need Not be True in the New Model
If Between T1 and T2, No Transaction Rlocks A, then the Value Written by T1 is Lost, and T1 Does not Have to Precede T2 in a Schedule w.r.t. A

Motivating Algorithm 3
Rlock (Shared): Multiple Reads Allowed
Wlock (Exclusive): Write Does Not Mean Read, Sole Write
Successive Writes without an Intervening Read Mean the Effects of Earlier Writes Disappear
  For a Clean Start, All Items are Written Before the 1st Step of the Schedule
  For a Clean Finish, All Items are Read After the Last Step of the Schedule
Identify All Dependencies Among Transactions that Write (Ti) and Read (Tj) the Same Item (T0 through Tf)
  Add Arc from Ti to Tj (Ti is BEFORE Tj)
  For Next “Reads” after a “Write”, There Can’t be Intervening Writes

Intuitive View of Algorithm 3
If Tj Reads the Value of “A” Written by Ti, then Ti Must Precede Tj in any Serial Schedule
  For WR Combo - Draw an Arc from Ti to Tj
Now Consider a T that also Writes “A”
  T Must be either Before Ti or After Tj
  Add in a Pair of Arcs, T to Ti and Tj to T, of Which One Must be Chosen in the Final Precedence Graph
Serializability Occurs if, After Choices are Made for each “T” Pair, the Resulting Graph is Acyclic
G is Referred to as a “Polygraph” with Nodes, Arcs, and Alternate Arcs

Redefine Serializability
Conditions on Serializability Must be Redefined in Support of the Write-Does-Not-Assume-Read Model
  If in Schedule S, Tj Reads “A” Written by Ti, then Ti is Before Tj in any Serial Schedule Equivalent to S
  Further, if there is a T that Writes “A”, then in any Serial Schedule Equivalent to S, T is Before Ti or After Tj, but may not be Between Ti and Tj
  [Figure: Ti → Tj labeled A:WR, with T placed either before Ti or after Tj]

Algorithm 3 Example Schedule
  Steps, Spread over Columns T1, T2, T3, T4 in the Original Figure:
  (1) Rlock A (2) Rlock A (3) Wlock C (4) Unlock C (5) Rlock C (6) Wlock B (7) Unlock B (8) Rlock B
  (9) Unlock A (10) Unlock A (11) Wlock A (12) Rlock C (13) Wlock D (14) Unlock B (15) Unlock C (16) Rlock B
  (17) Unlock A (18) Wlock A (19) Unlock B (20) Wlock B (21) Unlock B (22) Unlock D (23) Unlock C (24) Unlock A

Augmentation of Precedence Graph
In
Support of the Write-Does-Not-Imply-Read Model, we must Augment the Precedence Graph:
  Add an Initial Transaction T0 that Writes Every Item, and a Final Transaction Tf that Reads Every Item
  When a Transaction T’s Output is Invisible to Tf (i.e., the Value is Lost), T is Referred to as a Useless Transaction
  Useless Transactions have no Path from the Transaction to Tf
Note: Maintain Same Set of Locks (Rlock, Wlock, Ulock) with a Different Interpretation of Wlock

Algorithm 3 - Augmented Graph
  T0 Writes A, B, C, D Prior to Step (1)
  Steps (1) to (24) of the Example Schedule are Unchanged
  Tf Reads A, B, C, D After Step (24)

Algorithm 3 - Steps 1 to 4
Input: Schedule S for Transactions T1, T2, … Tk
Output: Is S Serializable? If so, Serial Schedule
Method: Create a Directed Polygraph P:
  1. Augment S with Dummy T0 (Writes Every Item) and Dummy Tf (Reads Every Item)
  2. Create Initial Polygraph P by Adding Nodes for T0, Tf, and Each Transaction Ti in S
  3. Place an Arc from Ti to Tj Whenever Tj Reads A in Augmented S (with Dummy States) that was Last Written by Ti. Write to Read for Each Item. Repeat this Step for all Arcs. Don’t Forget to Consider Dummy States!
  4. Discover Useless Transactions - T is Useless if there is no Path from T to Tf
This is the “Initialization” Phase of Algorithm 3

Resulting Polygraph - Steps 1 to 2
Create the Polygraph by
  1. Add T0 and Tf to S
  2. Add T0, Tf, T1, T2, T3, T4 to Polygraph P
  Nodes: T0 T1 T2 T3 T4 Tf
  [Figure: schedule augmented with T0 (Writes A, B, C, D) and Tf (Reads A, B, C, D)]

Alg 3 Step 3 - Init=T0 & Fin=Tf
  [Figure: augmented schedule with callouts asking, for each write of A, B, C, or D, who reads the item after that write]
  No One Reads A after T3 Writes A

Step 3 - Write to Reads on A, B, C, and D
  [Figures: one slide per item]

Resulting Polygraph - Steps 1 to 3
  1. Add T0 and Tf to S
  2. Add T0, Tf, T1, T2, T3, T4 to Polygraph P
  3. Look for Ti Write X to Tj Read X for all Items X
  4. Look for Useless Transactions - No Paths from T to Tf
  [Figure: polygraph over T0, T1, T2, T3, T4, Tf with arcs labeled A:WR, B:WR, C:WR, D:WR]

Resulting Polygraph - Steps 1-4
  1. Add T0 and Tf to S
  2. Add T0, Tf, T1, T2, T3, T4 to Polygraph P
  3. Look for Ti Write X to Tj Read X for all Items X
  4. For Useless Transaction T3, Remove Arcs Into T3 - This Completes Step 4
  [Figure: polygraph after removing the arcs into T3]

Algorithm 3 - Steps 5 to 7
Method: Reassess the Initial Polygraph P:
5.
For Each Remaining Arc Ti W to Tj R (Meaning that Tj Reads Item A Written by Ti), Consider all T ≠ T0 and T ≠ Tf that also Write A:
  I. If Ti = T0 and Tj = Tf then Add No Arcs
  II. If Ti = T0 and Tj ≠ Tf then Add Arc from Tj to T
  III. If Ti ≠ T0 and Tj = Tf then Add Arc from T to Ti
  IV. If Ti ≠ T0 and Tj ≠ Tf then Add Arc Pair from T to Ti and Tj to T
6. Determine if P is Acyclic by “Choosing” One Arc for Each Pair - Make Choices Carefully
7. If Acyclic - Serializable - Perform Topological Sort without T0, Tf for Equivalent Serial Schedule. Else - Not Serializable

What are Four Cases of Step 5 Conceptually?
  General Case: Ti → Tj Labeled X:WR, and T also Writes X
  Case I: T0 → Tf Labeled X:WR - No New Arc
  Case II: T0 → Tj Labeled X:WR - Add Arc from Tj to T (X:RW) - T is After Tj
  Case III: Ti → Tf Labeled X:WR - Add Arc from T to Ti (X:RW) - T is Before Ti
  Case IV: Ti → Tj Labeled X:WR - Add in Two Alternate Arcs (X:RW) - T is After Tj or Before Ti

Step 5 - Go Thru Each Write/Read Arrow
  [Figure: augmented schedule with callouts]
  For the T0 to T1, T0 to T2, and T4 to Tf Arcs - Who Else Writes A?

Resulting Polygraph - Step 5 - A:WR
Check Item A (See New Arcs/Labels - Case II and III)
  New Arcs for Item A: Four Case II Arcs and One Case III Arc, Each Labeled A:RW
  [Figure: polygraph before and after adding the A:RW arcs]

Alg 3 Ex - Step 5 - Who Else Writes C/D?
  For the Three T1 Arcs - Does Anyone Else Write C?
  For the One T2 Arc - Does Anyone Else Write D?
No Writes - No New Arcs

Resulting Polygraph - Step 5 - C:WR & D:WR
5. For Each Arc Ti to Tj Consider All T’s that Write X (Cases I to IV as above)
Do any Other Transactions Write C or Write D for the arrows labeled C:WR and D:WR Respectively?
[Polygraph figure: unchanged, no arcs added for C or D]

Alg 3 Ex - Step 5 - Who Else Writes B?
[Same schedule as before, annotated: for each B:WR Arc, Who Else Writes B? For the T1 to T2 Arc, T4 also Writes B, so Case IV applies: T4 after T2 or T4 before T1]

Two Added Arcs for Case IV and B: T4 Follows T2 and T4 Before T1
[Polygraph figure: adds the pair of arcs labeled IV B:RW]

Resulting Polygraph - Step 5 and 6
5. For Each Arc Ti to Tj Consider All T’s that Write X (Cases I to IV as above)
Check Item B (see new arcs - including alternates - dashed)
For T1 to T2, T4 writes B - so add T2 to T4 and T4 to T1 - Case IV
Either T4 After T2 or Before T1 - no new arcs for other WRs.
[Polygraph figure: with the dashed alternate arcs labeled IV B:RW]

Resulting Polygraph - Step 5 and 6
6. Which Option of Pair of Arcs Should be Chosen? Why?
[Polygraph figure: same as previous slide]

Final Polygraph - Step 7
Final Graph with Arc Removed; Dummy States Deleted Below
[Polygraph figure: final graph with T0 and Tf, then the same graph over T1, T2, T3, T4 only]
Topological Sort Yields Order: T1, T2, T3, T4

Implementation Issues for CC
Return to Earlier Diagram…
Transaction Manager + Scheduler Implement CC
[Architecture figure:
  User Applications send Begin_Transaction, Read, Write, Abort, Commit, End_Transaction to the Transaction Manager (TM) and receive Results
  TM sends Read, Write, Abort, EOT to the Scheduler (SC) and receives Results
  SC sends Scheduled Operations to the Recovery Manager (RM) and receives Results]

Implementation Issues for CC
To Implement Algorithms 1 to 3, Focus on
  Software Infrastructure (TM and SC)
  Protocol (CC Model for Algorithms 1 to 3)
TM/SC: Arbitrates and Controls Transaction Execution
Protocol: Restrictions on the Elementary Steps of a Transaction in Order to Promote Serializability
TM/SC + Protocol Comprise the Requirements/Specification of the Concurrency Control Mechanism
Concurrency Control Mechanism Itself Can be Modeled as a Lock Manager with Lock Tables

Implementation Issues for CC
Locking Modes - In Support of Algorithms 1-3, there is a Requirement to Establish Locking Modes
For Binary, Read/Write, etc., a Table Lists the Compatibility
of the Locks w.r.t. Concurrent Behavior
For Example, the Tables Below Illustrate all Legal Concurrent Actions of Two Transactions
What are Increment Locks used for?

R/W Locks:
         R     W
   R     Yes   No
   W     No    No

R/W/Increment Locks:
         R     W     I
   R     Yes   No    No
   W     No    No    No
   I     No    No    Yes

Implementation Issues for CC
Locking Modes - Can be Extended and Refined Based on the Level of Granularity that is Desired
For Example: Retrieve-Delete-Update-Insert
Questions:
  How Can Two Deletes/Inserts be Compatible?
  Will Effect of a Delete/Insert be Lost?
[Compatibility table over Delete, Insert, Update, Retrieve; the Yes/No cell layout is not recoverable from the extracted slide]

Implementation Issues for CC
Answer: Focus on Buffer Management Capability
Smart Buffer Manager Tracks All Blocks at All Times
If T1 Loaded Block 123 at Time t, when T2 Goes to Access Block 123 at Time t+10, Buffer Manager Checks to See if Block is Already in Memory
Buffer Manager also has Concurrency Control!
[Same Delete/Insert/Update/Retrieve compatibility table as on the previous slide]

Implementation Techniques for CC
Algorithms 1 to 3 as Presented are Not Directly Implementable!
They Don’t Integrate the CC Requirements (Protocol) with the TM/SC
Typical Implementation Techniques Utilize Queueing Strategies to Impose an Ordering on Transactions:
  Queue for Each Transaction that Tracks the Data Items Needed by the Transaction for its Execution
  Queue for Each Data Item that Tracks the Locks Requested and Held by All Transactions
  The Two Kinds of Queue Contain Inverse Data of One Another

Examples of Queues
Queue per Transaction:
  T1: A Rlock, B Wlock, ...
  T2: B Rlock, C Wlock, ...
  T3: A Rlock, B Rlock, ...
  T4: C Rlock, A Wlock, ...
Queue per Data Item:
  A: T1 Rlock, T3 Rlock, T4 Wlock, ...
  B: T1 Wlock, T2 Rlock, T3 Rlock, ...
  C: T2 Wlock, T4 Rlock, ...
What is the State of Each Lock?
What is the State of Each Transaction?
What Happens when a Transaction, T1, Completes?

Examples of Queues
Algorithms that Manage Queues Implement the CC Strategy
Lock State is Often Maintained within the Queue
Queue per Transaction:
  T1: A Rlock Held, B Wlock Held, ...
  T2: B Rlock Wait, C Wlock Wait, ...
  T3: A Rlock Held, B Rlock Wait, ...
  T4: C Rlock Wait, A Wlock Wait, ...
Queue per Data Item:
  A: T1 Rlock Held, T3 Rlock Held, T4 Wlock Wait, ...
  B: T1 Wlock Held, T2 Rlock Wait, T3 Rlock Wait, ...
  C: T2 Wlock Wait, T4 Rlock Wait, ...

Why Optimistic Concurrency Control?
Motivated by Disadvantages of Locking Techniques
  Lock Maintenance
  Deadlock-Free Locking Protocols Limit Concurrency
  Secondary Memory Access Causes Locks to be Held for a Long Duration
  Locks Typically Held Until Transaction Completes, Which Reduces Concurrency
  Often Needed in “Worst” Case Only
  Overhead - Locking + Deadlock Detection
Key Concept
  Write Collisions in Large Databases for “Many” Applications are Rare
  OCC: “Don’t Worry be Happy” Approach

Basic Ideas of OCC
Interference Between Transactions is Rare and Locking Incurs too Much Overhead
Instead, Allow Each Transaction to Execute Freely, and Check Serializability at the End of the Transaction
Win (Allow to Commit) If No Interference Occurs or There have been No Conflicts
Pessimistic execution: Validate, Read, Write (and Compute)
Optimistic execution: Read, Validate, Write (and Compute)

How Does OCC Work?
Execute Transactions Ad-Hoc - Let them Go Uncontrolled
Maintain Information of “Relevant” Actions Against DB (Often in Conjunction with Recovery/Journal)
When Transactions Finish - Check to see if Everything Proceeded Satisfactorily
Assumes that Probability of Transaction Interference is Quite Small
Two Questions re. OCC:
  How Do We Know Everything Went OK?
  How Do We Recover if it Didn’t?

What is a Timestamp?
Timestamp
  A system generated clock “tick” to record an event
  Two events cannot occur at the same “tick”
  A monotonically increasing variable (integer) indicating the age of an operation or a transaction.
  A larger timestamp value indicates a more recent event or operation.
  A timestamp based algorithm uses timestamps to serialize the execution of concurrent transactions.
For DB Transactions, a timestamp could be:
  Time that the transaction is initiated
  Time of first read/write of the transaction
  Remains unchanged throughout all Transaction steps

How are Timestamps Utilized?
Each Transaction has a unique Timestamp (TS) when started
Associated with the Read time and Write time (when Stored) of Each Item in the DB
  t1 is the TS of a Transaction, B is an Item with TS t2
Avoid “impossible” situations
  A Transaction CANNOT read the value of an Item if it was not written until after the transaction executed
    Trans TS t1 can’t read Item B with write TS t2 if t2 > t1
  A Transaction CANNOT write an Item if that Item has an old value read at a later time (after)
    Trans TS t1 can’t write Item B with read TS t2 if t2 > t1
  If this happens - Trans TS t1 must abort

OCC Utilizes Timestamps
Timestamps are Clock Ticks used to Record the Major Milestones in the Execution of a Transaction
Examples Include:
  Start Time of Transaction
  Read/Write Times for DB Items
  Finish Time of Transaction
  Commit Time of Transaction
Two Important Definitions are:
  Read Time of an Item: Highest Time Stamp Possessed by Any Transaction that Reads the Item
  Write Time of an Item: Highest Time Stamp Possessed by Any Transaction that Wrote the Item
A Transaction has a Fixed Time when it Started that is Constant Throughout its Execution

How are Timestamps Used?
Focus on “When” Reads and Writes Occur
Transaction Cannot Read an Item if its Value was Not Written Until After the Transaction Finished its Execution
  Transaction T with Timestamp t1 Cannot Read an Item with a Write Time of t2 if t2 > t1
  If this is the Case, T Must Abort and be Restarted
  Can’t Read Item if it hasn’t been Written
Transaction Cannot Write an Item if that Item has its Old Value Read at a Later Time
  Transaction T with Timestamp t1 Cannot Write an Item with a Read Time of t2 if t2 > t1
  If this is the Case, T Must Abort and be Restarted
  Can’t Write Item Being Read at a Later Time

Algorithm 4: Optimistic CC
Let T be a Transaction with Timestamp t Attempting to Perform Operation X on a Data Item I with Readtime tR and Writetime tW
If (X = Read and t ≥ tW) Perform Oper
  If t > tR then set tR = t for Data Item I (read after write)
If (X = Write and t ≥ tR and t ≥ tW) Perform Oper
  If t > tW then set tW = t for Data Item I (write after read)
If (X = Write and tR ≤ t < tW) then Do Nothing since Later Write will Cancel out the Write of T
If (X = Read and t < tW) or (X = Write and t < tR) then Abort the Operation
  1st - T trying to Read Item Before it was Written
  2nd - T trying to Write an Item Before it was Read

Example of OCC
Timestamps: T1 = 200, T2 = 150, T3 = 175

Step  Operation      A                 B                 C
      (initial)      RT=0   WT=0       RT=0   WT=0       RT=0   WT=0
(1)   T1: Read B     RT=0   WT=0       RT=200 WT=0       RT=0   WT=0
(2)   T2: Read A     RT=150 WT=0       RT=200 WT=0       RT=0   WT=0
(3)   T3: Read C     RT=150 WT=0       RT=200 WT=0       RT=175 WT=0
(4)   T1: Write B    RT=150 WT=0       RT=200 WT=200     RT=175 WT=0
(5)   T1: Write A    RT=150 WT=200     RT=200 WT=200     RT=175 WT=0

What Happens at Each Step w.r.t. RT/WT?
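The rules of Algorithm 4 can be sketched in a few lines of Python and replayed against this example. This is a minimal illustration under stated assumptions, not the course's implementation: the names `Item`, `attempt`, and `replay` are hypothetical, read and write times are kept as the highest timestamp seen (following the Read Time and Write Time definitions given earlier), and steps (6) Write C by T2 and (7) Write A by T3 are taken from the later slides of the same example.

```python
from dataclasses import dataclass


@dataclass
class Item:
    rt: int = 0  # read time: highest timestamp of any transaction that read the item
    wt: int = 0  # write time: highest timestamp of any transaction that wrote the item


def attempt(op: str, t: int, item: Item) -> str:
    """Apply one read or write by a transaction with timestamp t (hypothetical helper).

    Returns "performed", "ignored" (the write is skipped), or "abort".
    """
    if op == "read":
        if t < item.wt:
            return "abort"  # the value was written by a later transaction
        item.rt = max(item.rt, t)
        return "performed"
    if op == "write":
        if t < item.rt:
            return "abort"  # a later transaction already read the old value
        if t < item.wt:
            return "ignored"  # tR <= t < tW: a later write cancels this one
        item.wt = max(item.wt, t)
        return "performed"
    raise ValueError(f"unknown operation: {op}")


def replay():
    """Replay the seven steps of the example and collect the outcome of each."""
    ts = {"T1": 200, "T2": 150, "T3": 175}
    db = {name: Item() for name in "ABC"}
    schedule = [
        ("T1", "read", "B"), ("T2", "read", "A"), ("T3", "read", "C"),
        ("T1", "write", "B"), ("T1", "write", "A"),
        ("T2", "write", "C"), ("T3", "write", "A"),
    ]
    outcomes = [attempt(op, ts[txn], db[name]) for txn, op, name in schedule]
    return outcomes, db


if __name__ == "__main__":
    outcomes, db = replay()
    for step, outcome in enumerate(outcomes, start=1):
        print(step, outcome)
    print(db)
```

A real scheduler would also restart an aborted transaction with a new timestamp and undo its earlier effects; the sketch only classifies each operation.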
(1) T1 TS 200 ≥ B.WT = 0 - set B.RT = 200
(2) T2 TS 150 ≥ A.WT = 0 - set A.RT = 150
(3) T3 TS 175 ≥ C.WT = 0 - set C.RT = 175
(4) T1 TS 200 ≥ B.RT = 200 - set B.WT = 200
(5) T1 TS 200 ≥ A.RT = 150 - set A.WT = 200

Example of OCC
Steps (1) to (5) as in the previous table, followed by:
(6)   T2: Write C    RT=150 WT=200     RT=200 WT=200     RT=175 WT=0
What Happens at Step 6?
T2 TS 150 < C.RT = 175
T2 is Trying to Write C after a Later Transaction has Read it - Consequence - Abort T2

Example of OCC
Steps (1) to (6) as before, followed by:
(7)   T3: Write A    RT=150 WT=200     RT=200 WT=200     RT=175 WT=0
Step (7): T3 TS 175 ≥ A.RT = 150 but 175 < A.WT = 200, so T3 can Finish, but the Write has No Effect

Summary of Example
T1 Completes Successfully; T2 Aborts; T3 Completes but Doesn’t Write A
Timestamps: T1 = 200, T2 = 150, T3 = 175

Step  Operation      A                 B                 C
      (initial)      RT=0   WT=0       RT=0   WT=0       RT=0   WT=0
(1)   T1: Read B     RT=0   WT=0       RT=200 WT=0       RT=0   WT=0
(2)   T2: Read A     RT=150 WT=0       RT=200 WT=0       RT=0   WT=0
(3)   T3: Read C     RT=150 WT=0       RT=200 WT=0       RT=175 WT=0
(4)   T1: Write B    RT=150 WT=0       RT=200 WT=200     RT=175 WT=0
(5)   T1: Write A    RT=150 WT=200     RT=200 WT=200     RT=175 WT=0
(6)   T2: Write C    RT=150 WT=200     RT=200 WT=200     RT=175 WT=0
(7)   T3: Write A    RT=150 WT=200     RT=200 WT=200     RT=175 WT=0

Viewing OCC vs.
Phases of Execution
Read Phase: Database Information is Read from Secondary Storage into Primary Memory
  All Writes are to a Local Workspace
Validate Phase: Check to see that the Integrity of the Data has not been Violated
Write Phase: Update the DB (Secondary Storage) from Local Copies
Optimistic execution: Read, Validate, Write (and Compute)

Contrasting PCC and OCC
Transaction Control
  PCC: Control by Having Transactions Wait
  OCC: Control by Having Transactions Backed up
Serializability
  PCC: Ordering of Data Items
  OCC: Ordering of Transactions
Biggest Potential Problem
  PCC: Deadlock, or rather Preventing it
  OCC: Starvation
Different Applications are Suited to Different Approaches
  Some DBMS Support Both
  DBA Can Configure on an Application-by-Application Basis

Recovery Consideration
Actual Write Operations of Previous Example are Phase 1 of Two-Phase Commit (Write to Journal)
Commit - Phase 2 - Writes to DB
Between Write to Log and Write to DB, No Other Transaction is Allowed to Read Items being Written
OCC Reduces Work as Follows:
  One Step for Read, Two for Writes (write/commit)
  In Locking, we had Four Steps for R or W: Lock, Read or Write, Unlock, Commit

Two Phase Commit Policy
All Actions for a Transaction are Performed in a Workspace (in Memory) Rather than Directly on the DB Copy of the Data
These Actions are Written in the Journal (Including the Commit Action)
Leads to Two-Phase Commit Policy
  Transaction Cannot Write to DB Until Committed
  Transaction Cannot Commit Until All Changes have been Recorded First in the Journal
Two Phases are:
  Phase 1: Write Data in Journal
  Phase 2: Write Data in DB
Failure Can Occur Anytime!

Why is Two Phase Commit Important?
Suppose DB Writes Occur Before Commit
Assume a Transaction Aborts in the Middle of Processing
  Undo DB Changes Made to Actual Database Prior to Failure
    Relatively Straightforward and Manageable
  Undo Actions of Other Transactions that Read Information Written by Aborted Transaction
    Impossible!
    Undo May Require you to Propagate to Many Other Transactions, Particularly if Aborted Transaction was Long-Duration (hours)
Basic Concepts of Recovery are Used in the Non-Locking Optimistic CC Approach!

Concurrency Control Locking Details
Two-Phase Locking Techniques
Locking is an operation which secures (a) permission to Read or (b) permission to Write a data item for a transaction.
  Example: Lock (X). Data item X is locked on behalf of the requesting transaction.
Unlocking is an operation which removes these permissions from the data item.
  Example: Unlock (X). Data item X is made available to all other transactions.
Lock and Unlock are Atomic operations.

Two-Phase Locking Techniques: Essential components
Two lock modes: (a) shared (read) (b) exclusive (write).
Shared mode: shared lock (X)
  More than one transaction can apply a shared lock on X for reading its value, but no write lock can be applied on X by any other transaction.
Exclusive mode: Write lock (X)
  Only one write lock on X can exist at any time and no shared lock can be applied by any other transaction on X.
Conflict matrix:
            Read   Write
   Read     Y      N
   Write    N      N

Two-Phase Locking Techniques: Essential components
Lock Manager: Managing locks on data items.
Lock table: The lock manager uses it to store the identity of the transaction locking a data item, the data item, the lock mode, and a pointer to the next data item locked.
One simple way to implement a lock table is through a linked list.
   Transaction ID   Data item id   Lock mode   Ptr to next data item
   T1               X1             Read        Next

Two-Phase Locking Techniques: Essential components
The database requires that all transactions be well-formed. A transaction is well-formed if:
  It locks a data item before it reads or writes to it.
  It does not lock an already locked data item, and it does not try to unlock a free data item.

Two-Phase Locking Techniques: Essential components
The following code performs the lock operation:
   B: if LOCK (X) = 0 (*item is unlocked*)
         then LOCK (X) ← 1 (*lock the item*)
      else begin
         wait (until LOCK (X) = 0 and the lock manager wakes up the transaction);
         goto B
      end;

Two-Phase Locking Techniques: Essential components
The following code performs the unlock operation:
   LOCK (X) ← 0 (*unlock the item*)
   if any transactions are waiting
      then wake up one of the waiting transactions;

Two-Phase Locking Techniques: Essential components
The following code performs the read-lock operation:
   B: if LOCK (X) = “unlocked” then
         begin
            LOCK (X) ← “read-locked”;
            no_of_reads (X) ← 1;
         end
      else if LOCK (X) = “read-locked” then
         no_of_reads (X) ← no_of_reads (X) + 1
      else begin
         wait (until LOCK (X) = “unlocked” and the lock manager wakes up the transaction);
         go to B
      end;

Two-Phase Locking Techniques: Essential components
The following code performs the unlock operation for read and write locks:
   if LOCK (X) = “write-locked” then
      begin
         LOCK (X) ← “unlocked”;
         wake up one of the waiting transactions, if any
      end
   else if LOCK (X) = “read-locked” then
      begin
         no_of_reads (X) ← no_of_reads (X) - 1;
         if no_of_reads (X) = 0 then
            begin
               LOCK (X) ← “unlocked”;
               wake up one of the waiting transactions, if any
            end
      end;

Two-Phase Locking Techniques: The algorithm
Two Phases: (a) Locking (Growing) (b) Unlocking (Shrinking).
Locking (Growing) Phase: A transaction applies locks (read or write) on desired data items one at a time.
Unlocking (Shrinking) Phase: A transaction unlocks its locked data items one at a time.
Requirement: For a transaction these two phases must be mutually exclusive, that is, during the locking phase the unlocking phase must not start, and during the unlocking phase the locking phase must not begin.

Two-Phase Locking Techniques: The algorithm
T1:
   read_lock (Y); read_item (Y); unlock (Y);
   write_lock (X); read_item (X); X:=X+Y; write_item (X); unlock (X);
T2:
   read_lock (X); read_item (X); unlock (X);
   write_lock (Y); read_item (Y); Y:=X+Y; write_item (Y); unlock (Y);
Result
   Initial values: X=20; Y=30
   Result of serial execution T1 followed by T2: X=50, Y=80.
   Result of serial execution T2 followed by T1: X=70, Y=50
Copyright © 2011 Ramez Elmasri and Shamkant Navathe

Two-Phase Locking Techniques: The algorithm
Interleaved schedule, in time order:
   T1: read_lock (Y); read_item (Y); unlock (Y);
   T2: read_lock (X); read_item (X); unlock (X); write_lock (Y); read_item (Y); Y:=X+Y; write_item (Y); unlock (Y);
   T1: write_lock (X); read_item (X); X:=X+Y; write_item (X); unlock (X);
Result
   X=50; Y=50
   Nonserializable because it violated the two-phase policy.

Two-Phase Locking Techniques: The algorithm
T’1:
   read_lock (Y); read_item (Y); write_lock (X); unlock (Y);
   read_item (X); X:=X+Y; write_item (X); unlock (X);
T’2:
   read_lock (X); read_item (X); write_lock (Y); unlock (X);
   read_item (Y); Y:=X+Y; write_item (Y); unlock (Y);
T’1 and T’2 follow the two-phase policy but they are subject to deadlock, which must be dealt with.

Two-Phase Locking Techniques: The algorithm
Two-phase policy generates two locking algorithms: (a) Basic (b) Conservative
Conservative: Prevents deadlock by locking all desired data items before the transaction begins execution.
Basic: Transaction locks data items incrementally. This may cause deadlock, which must then be dealt with.
Strict: A stricter version of the Basic algorithm where unlocking is performed after a transaction terminates (commits, or aborts and is rolled back). This is the most commonly used two-phase locking algorithm.

Dealing with Deadlock and Starvation
Deadlock prevention
  A transaction locks all data items it refers to before it begins execution.
  This way of locking prevents deadlock since a transaction never waits for a data item.
  The conservative two-phase locking uses this approach.

Dealing with Deadlock and Starvation
Deadlock detection and resolution
  In this approach, deadlocks are allowed to happen. The scheduler maintains a wait-for graph for detecting cycles. If a cycle exists, then one transaction involved in the cycle is selected (victim) and rolled back.
  A wait-for graph is created using the lock table. As soon as a transaction is blocked, it is added to the graph. When a chain like: Ti waits for Tj waits for Tk waits for Ti or Tj occurs, then this creates a cycle.

Dealing with Deadlock and Starvation
Deadlock avoidance
  There are many variations of the two-phase locking algorithm. Some avoid deadlock by not letting the cycle complete. That is, as soon as the algorithm discovers that blocking a transaction is likely to create a cycle, it rolls back the transaction.
  Algorithms use timestamps to avoid deadlocks by rolling back the victim.

Dealing with Deadlock and Starvation
Starvation
  Starvation occurs when a particular transaction consistently waits or is restarted and never gets a chance to proceed further.
  In deadlock resolution it is possible that the same transaction may consistently be selected as victim and rolled back.
  This limitation is inherent in all priority based scheduling mechanisms.
  A younger transaction may always be aborted by a long running older transaction, which may create starvation.
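Deadlock detection on a wait-for graph amounts to finding a cycle in a directed graph. The following Python sketch illustrates one way to do that with a depth-first search. It is an assumption-laden illustration rather than any particular DBMS's scheduler: the graph is a plain dictionary mapping each blocked transaction to the transactions it waits for, and `find_cycle` and `pick_victim` are hypothetical names. The victim policy shown (roll back the youngest transaction, i.e. the one with the largest timestamp) is just one possible choice, and it is exactly the kind of fixed priority rule that can starve a transaction, as noted above.

```python
def find_cycle(waits_for):
    """Return a list of transactions forming a cycle in the wait-for graph, or None."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, finished
    color = {}
    path = []

    def visit(t):
        color[t] = GREY
        path.append(t)
        for u in waits_for.get(t, ()):
            state = color.get(u, WHITE)
            if state == GREY:
                # u is already on the current path: the path from u onward is a cycle
                return path[path.index(u):]
            if state == WHITE:
                found = visit(u)
                if found:
                    return found
        path.pop()
        color[t] = BLACK
        return None

    for t in list(waits_for):
        if color.get(t, WHITE) == WHITE:
            cycle = visit(t)
            if cycle:
                return cycle
    return None


def pick_victim(cycle, timestamps):
    """Choose the youngest transaction in the cycle (largest timestamp); an assumed policy."""
    return max(cycle, key=lambda t: timestamps[t])


if __name__ == "__main__":
    # Ti waits for Tj waits for Tk waits for Ti
    graph = {"Ti": ["Tj"], "Tj": ["Tk"], "Tk": ["Ti"]}
    cycle = find_cycle(graph)
    print(cycle)
    print(pick_victim(cycle, {"Ti": 10, "Tj": 20, "Tk": 30}))
```

To reduce starvation, a scheduler could instead let a restarted transaction keep its original timestamp, so that it eventually becomes the oldest and is no longer chosen.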
Concluding Remarks
Background
  OS Concepts of Sharing and Synchronization
  Deadlock Detection, Prevention, Avoidance
Chapter 21
  Transaction Processing Concepts
  Different Problems re. Concurrency Control
    Deadlock, Livelock, Starvation
    Lost Update, Dirty Read, etc.
  Serial Schedule and Serializability
Chapter 22
  Deviated from Textbook Notation
  3 Pessimistic Locking Based CC Algorithms
  1 Optimistic Timestamp Based CC Algorithm
  Role of Recovery in CC