General Overview
- Relational model - SQL
- Formal & commercial query languages
- Functional Dependencies
- Normalization
- Physical Design
- Indexing
- Query Processing and Optimization
- Transaction Processing and CC

Review: AC[I]D
- Isolation: concurrent xctions are unaware of each other.
- How? We discussed locking protocols:
  - 2PL protocol and its variants
  - Graph-based locking protocols

Multiple-Granularity Locks
- Hard to decide what granularity to lock (tuples vs. pages vs. tables).
- Shouldn’t have to decide!
- Data “containers” are nested: Database contains Tables, which contain Pages, which contain Tuples.

Intention Lock Modes
In addition to S and X lock modes, there are three additional lock modes with multiple granularity:
- intention-shared (IS): indicates explicit locking at a lower level of the tree, but only with shared locks.
- intention-exclusive (IX): indicates explicit locking at a lower level with exclusive or shared locks.
- shared and intention-exclusive (SIX): the subtree rooted at that node is locked explicitly in shared mode, and explicit locking is being done at a lower level with exclusive-mode locks.

MGL: multiple granularity locking
- Before locking an item, a Xact must set “intention locks” on all its ancestors.
- For unlock, go from specific to general (i.e., bottom-up).
- SIX mode: like S & IX at the same time.
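The intention-lock idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `Mode`, `COMPATIBLE`, `compatible`, `intention_for`, `lock_plan`, and `unlock_order` are hypothetical, and the compatibility relation (including SIX) is the standard textbook one rather than something taken from these notes. The sketch computes which locks a Xact would request, root first, and does not model waiting or a real lock table.

```python
from enum import Enum


class Mode(Enum):
    IS = "IS"
    IX = "IX"
    S = "S"
    SIX = "SIX"
    X = "X"


# Assumption: standard textbook compatibility relation for the five modes.
# COMPATIBLE[held] is the set of modes another Xact may acquire on the same node.
COMPATIBLE = {
    Mode.IS: {Mode.IS, Mode.IX, Mode.S, Mode.SIX},
    Mode.IX: {Mode.IS, Mode.IX},
    Mode.S: {Mode.IS, Mode.S},
    Mode.SIX: {Mode.IS},
    Mode.X: set(),
}


def compatible(held, requested):
    """True if `requested` can be granted while another Xact holds `held`."""
    return requested in COMPATIBLE[held]


def intention_for(mode):
    """Weakest intention mode an ancestor needs before `mode` is taken below it."""
    return Mode.IS if mode in (Mode.IS, Mode.S) else Mode.IX


def lock_plan(path, mode):
    """Locks to request, in root-to-leaf order.

    `path` lists the nodes from the root down to the target,
    e.g. ["Database", "R", "t2"].
    """
    *ancestors, target = path
    intent = intention_for(mode)
    return [(node, intent) for node in ancestors] + [(target, mode)]


def unlock_order(plan):
    """Release order: specific to general (bottom-up)."""
    return [node for node, _ in reversed(plan)]
```

For instance, `lock_plan(["Database", "R", "t2"], Mode.X)` requests IX on the database and on R before X on the tuple, and `unlock_order` reverses that sequence.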
Lock compatibility matrix (figure): rows and columns are the lock modes --, IS, IX, S, X.

Parent locked in    Child can be locked in
IS                  IS, S
IX                  IS, S, IX, X, SIX
S                   [S, IS] not necessary
SIX                 X, IX, [SIX]
X                   none

Example (figure): relation R1 with tuples t1-t4. T1 holds IS and T2 holds IX on R1; T1 holds S on one tuple and T2 holds X on another.

Multiple Granularity Locking Scheme
Transaction Ti can lock a node Q using the following rules:
(1) Follow the multiple-granularity compatibility function.
(2) Lock the root of the tree first, in any mode.
(3) Node Q can be locked by Ti in S or IS only if parent(Q) is locked by Ti in IX or IS.
(4) Node Q can be locked by Ti in X, SIX, or IX only if parent(Q) is locked by Ti in IX or SIX.
(5) Ti uses 2PL.
(6) Ti can unlock node Q only if none of Q’s children are locked by Ti.
Observe that locks are acquired in root-to-leaf order, whereas they are released in leaf-to-root order.

Examples (figures): lock trees with relation R at the root, tuples t1-t4 below it, and fields (f2.1, f2.2, f4.2, ...) at the leaves, showing T1 holding IX, IS, S, SIX, and X locks at different levels.
Can T2 access object f2.2 in X mode? What locks will T2 get?

Examples
- T1 scans R, and updates a few tuples: T1 gets an SIX lock on R, then repeatedly gets an S lock on tuples of R, and occasionally upgrades to X on the tuples that it needs to update.
- T2 uses an index to read only part of R: T2 gets an IS lock on R, and repeatedly gets an S lock on tuples of R.
- T3 reads all of R: T3 gets an S lock on R. OR, T3 could behave like T2; can use lock escalation to decide which.

Optimistic CC
- Locking is a conservative approach in which conflicts are prevented. Disadvantages:
  - Lock management overhead.
  - Deadlock detection/resolution.
  - Lock contention for heavily used objects.
- If conflicts are rare, we might be able to gain concurrency by not locking, and instead checking for conflicts before Xacts commit.

Validation-Based Protocol
Execution of transaction Ti is done in three phases.
1. Read and execution phase: Ti reads all values and makes copies to local variables (private workspace). Ti writes only to temporary local variables. No locking.
2. Validation phase: Ti performs a “validation test” to determine if local variables can be written without violating serializability.
3. Write phase: If Ti is validated, the updates are applied to the database; otherwise, Ti is rolled back.
Optimistic concurrency control: the transaction executes fully in the hope that all will go well during validation.

Validation-Based Protocol (Cont.)
Each transaction Ti has 3 timestamps:
- Start(Ti): the time when Ti started its execution
- Validation(Ti): the time when Ti entered its validation phase
- Finish(Ti): the time when Ti finished its write phase
Serializability order is based on Validation(Ti).
Key idea: validation is atomic!

Validation-Based Protocol
To implement validation, the system keeps the following sets:
- FIN = transactions that have finished phase 3 (and are all done)
- VAL = transactions that have successfully finished phase 2 (validation)
- For each transaction, its Read and Write Sets (RS and WS)

Example of what validation must prevent:
  RS(T1)={B}, WS(T1)={B,D}; RS(T2)={A,B}, WS(T2)={C}
  Timeline: T1 start, T2 start, T1 validated, T2 validated.
  T2 validation will fail! (WS(T1) ∩ RS(T2) = {B} ≠ ∅)

Example of what validation must allow:
  Same read and write sets.
  Timeline: T1 start, T1 validated, T1 finish phase 3, T2 start, T2 validated.

Another thing validation must prevent:
  RS(T1)={A}, WS(T1)={D,E}; RS(T2)={A,B}, WS(T2)={C,D}
  Timeline: T1 validated, T2 validated, finish T2, finish T1.
  BAD: w2(D) w1(D)

Another thing validation must allow:
  Same read and write sets.
  Timeline: T1 validated, finish T1, T2 validated.

Validation rules for Tj:
(1) When Tj starts phase 1: IGNORE(Tj) ← FIN
(2) At Tj validation: if Check(Tj) then [ VAL ← VAL ∪ {Tj}; do write phase; FIN ← FIN ∪ {Tj} ]

Check(Tj):
  For Ti ∈ VAL - IGNORE(Tj) DO
    IF [ WS(Ti) ∩ RS(Tj) ≠ ∅ OR Ti ∉ FIN ] THEN RETURN false;
  RETURN true;
(VAL - IGNORE(Tj): all transactions that either validated or finished after the start of Tj.)

Is this check too restrictive?
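The validation rules and Check(Tj) above can be sketched as a small bookkeeping class. This is a hedged illustration, not the notes' implementation: the class name `Validator` and its methods are hypothetical, validation and the write phase are split into separate `validate` and `finish` calls so that an unfinished writer can be observed, and the optional `improved` flag applies the relaxed test (also requiring overlapping write sets for an unfinished Ti) that the notes introduce next.

```python
class Validator:
    """Tracks VAL, FIN, IGNORE, RS, and WS for validation-based CC (sketch)."""

    def __init__(self, improved=False):
        self.improved = improved  # assumption: True selects the relaxed check
        self.val = []             # validated transactions, in validation order
        self.fin = set()          # transactions that finished their write phase
        self.ignore = {}          # IGNORE(Tj): snapshot of FIN when Tj started
        self.rs = {}              # read sets of validated transactions
        self.ws = {}              # write sets of validated transactions

    def start(self, tj):
        # Rule (1): IGNORE(Tj) <- FIN
        self.ignore[tj] = set(self.fin)

    def check(self, tj, rs, ws):
        # Examine every Ti in VAL - IGNORE(Tj).
        for ti in self.val:
            if ti in self.ignore[tj]:
                continue
            read_conflict = bool(self.ws[ti] & rs)
            unfinished = ti not in self.fin
            if self.improved:
                # Relaxed test: an unfinished Ti matters only if write sets overlap.
                unfinished = unfinished and bool(self.ws[ti] & ws)
            if read_conflict or unfinished:
                return False
        return True

    def validate(self, tj, rs, ws):
        # Rule (2), first half: on success, VAL <- VAL U {Tj}.
        rs, ws = set(rs), set(ws)
        if not self.check(tj, rs, ws):
            return False
        self.val.append(tj)
        self.rs[tj] = rs
        self.ws[tj] = ws
        return True

    def finish(self, tj):
        # Rule (2), second half: after the write phase, FIN <- FIN U {Tj}.
        self.fin.add(tj)
```

A caller would invoke `start` when a transaction begins phase 1, `validate` with its read and write sets at phase 2, and `finish` once its writes are applied; a `False` result from `validate` means the transaction is rolled back.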
Improving Check(Tj)
  For Ti ∈ VAL - IGNORE(Tj) DO
    IF [ WS(Ti) ∩ RS(Tj) ≠ ∅ OR (Ti ∉ FIN AND WS(Ti) ∩ WS(Tj) ≠ ∅) ] THEN RETURN false;
  RETURN true;

Example (timeline figure: start, validate, and finish points of each transaction):
  U: RS(U)={B}    WS(U)={D}
  T: RS(T)={A,B}  WS(T)={A,C}
  W: RS(W)={A,D}  WS(W)={A,C}
  V: RS(V)={B}    WS(V)={D,E}
  U, T, V successful; W aborts and rolls back.

Timestamp-Based Protocols
Idea:
- Decide in advance the ordering of xctions.
- Ensure the concurrent schedule serializes to the serial order decided.
Timestamps:
1. TS(Ti) is the time Ti entered the system.
2. Data item timestamps:
   1. W-TS(Q): largest timestamp of any xction that wrote Q
   2. R-TS(Q): largest timestamp of any xction that read Q
Timestamps -> serializability order

Timestamp CC
Idea: If action pi of Xact Ti conflicts with action qj of Xact Tj, and TS(Ti) < TS(Tj), then pi must occur before qj. Otherwise, restart the violating Xact.

When Xact T wants to read Object O
- If TS(T) < W-TS(O), this violates the timestamp order of T w.r.t. the writer of O. So, abort T and restart it with a new, larger TS. (If restarted with the same TS, T will fail again!)
- If TS(T) > W-TS(O): allow T to read O, and reset R-TS(O) to max(R-TS(O), TS(T)).
- The change to R-TS(O) on reads must be written to disk (log)! This and restarts represent overheads.
Timeline (figure): T start, U start, U writes O, T reads O.

When Xact T wants to write Object Q
1) If TS(T) < R-TS(Q), then the value of Q that T is producing was needed previously, and the system assumed that that value would never be produced. The write is rejected, and T is rolled back and restarts.
2) If TS(T) < W-TS(Q), then T is attempting to write an obsolete value of Q. Hence, this write operation is rejected, and T is rolled back.
3) Otherwise, the write operation is executed, and W-TS(Q) is set to TS(T).
Timeline (figure): T start, U start, U reads Q, T writes Q.
Another approach in 2) is to ignore the write and continue: the Thomas Write Rule.

Timestamp CC and Recoverability
Unfortunately, unrecoverable schedules are allowed:
  T1: W(A)
  T2: R(A), W(B), Commit
Timestamp CC can be modified to allow only recoverable schedules:
- Buffer all writes until the writer commits (but update W-TS(O) when the write is allowed).
- Block readers T (where TS(T) > W-TS(O)) until the writer of O commits.
Similar to writers holding X locks until commit, but still not quite 2PL.
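The read and write rules of timestamp CC can be summarized in a short sketch. The names `Item`, `Rollback`, `read`, and `write` are hypothetical, timestamps are plain integers, and the case TS(T) = W-TS(O) is treated as an allowed read (an assumption, since the notes only state the strict inequalities). The buffering and reader-blocking needed for recoverability are deliberately not modeled.

```python
class Rollback(Exception):
    """The operation violates timestamp order; restart the Xact with a larger TS."""


class Item:
    """A data item with its read and write timestamps."""

    def __init__(self, value=None):
        self.value = value
        self.r_ts = 0  # largest TS of any Xact that read the item
        self.w_ts = 0  # largest TS of any Xact that wrote the item


def read(item, ts):
    if ts < item.w_ts:
        # A younger Xact already overwrote the value this reader should have seen.
        raise Rollback("read too late")
    # Assumption: ts == w_ts (a Xact reading its own write) is allowed.
    item.r_ts = max(item.r_ts, ts)
    return item.value


def write(item, ts, value, thomas=False):
    """Apply the write rules; returns True if the value was installed."""
    if ts < item.r_ts:
        # Case 1: a younger Xact already read the item.
        raise Rollback("a younger Xact already read this item")
    if ts < item.w_ts:
        # Case 2: obsolete write.
        if thomas:
            return False  # Thomas Write Rule: skip the write and continue
        raise Rollback("obsolete write")
    # Case 3: install the write.
    item.value = value
    item.w_ts = ts
    return True
```

With `thomas=True`, an obsolete write is silently dropped and the caller keeps running; with the default, the caller is expected to catch `Rollback` and restart the transaction with a new timestamp.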