Concurrency III (Timestamps)

Schedulers
• A scheduler takes read and write requests from transactions and decides whether to let each one operate on the DB now or to defer it until it is safe to do so.
• Ideal: a scheduler forwards a request iff it cannot lead to inconsistency of the DB.

Timestamps
• A unique value representing a time.
– Example: clock time.
– Example: serial counter.

Serializability Via Timestamps
• Main idea: let things rumble along without any locking or scheduling (be optimistic).
• As transactions read and write, check that what they are doing makes sense if the serial order were the same as the order in which the transactions were initiated.
– Big gain if most transactions are read-only.
• Every transaction T is given a timestamp TS(T) when it is initiated.

Serializability Via Timestamps (Continued)
• Every DB element X has two timestamps:
1. RT(X) = highest timestamp of a transaction to read X.
2. WT(X) = highest timestamp of a transaction to write X.
• Every DB element X has a bit C(X) indicating whether the most recent writer of X has committed.
– Essential to avoid a “dirty read,” where a transaction reads data that was written by a transaction that later aborts.

Physically Unrealizable Behaviors
• The scheduler assumes the timestamp order of transactions is also the serial order in which they must appear to execute.
• The scheduler needs to check that, whenever a read or write occurs, what happens in real time could have happened if each transaction had executed instantaneously at the moment of its timestamp.
• If not, we say the behavior is physically unrealizable.

What is Physically Unrealizable?
• Read Too Late: Transaction T tries to read X, but TS(T) < WT(X).
– T would read a value that was written after T theoretically executed.

But Wait if Data is Dirty?
• T tries to read X, and TS(T) > WT(X), but C(X) = false.
– T would be reading dirty data, a risk we won't take.

What is Physically Unrealizable?
• Write Too Late: Transaction T tries to write X, but TS(T) < RT(X).
– Some other transaction read a value written earlier than T's write, when it should have read what was written by T.

What is Physically Unrealizable?
• Write Too Late: Transaction T tries to write X, but RT(X) > TS(T) > WT(X).
– When T tries to write X it finds RT(X) > TS(T). This means that X has already been read by some transaction that theoretically executed after T.
– We also find TS(T) > WT(X), which means that no other transaction wrote into X a value that would have overwritten T's value, negating T's responsibility for the value of X.
• Conversely, if TS(T) < WT(X), a write with a later write time is already in place and T's write can be skipped. This idea is called the Thomas write rule.

Thomas Write Rule
• Suppose T's write of X was skipped because a transaction U with a later timestamp had already written X, and T then commits.
• If U later aborts, then its value of X should be removed and the previous value and write time restored.
• Since T is committed, it would seem that the value of X should be the one written by T for future reading.
• However, we already skipped the write by T and it is too late to repair the damage.
– This is why a write with TS(T) < WT(X) is ignored only when C(X) = true.

Abort/Update Decision
• Legal = physically realizable.
• Illegal = physically unrealizable.
• If illegal, roll back T: abort T and restart it with a new timestamp.
• When a transaction finishes with no rollback, commit it by setting C(X) to true for every element X it wrote.

Rules, in detail…
Suppose the scheduler receives a request rT(X).
(a) If TS(T) ≥ WT(X), the read is physically realizable.
i. If C(X) = true, grant the request. If TS(T) > RT(X), set RT(X) := TS(T); otherwise do not change RT(X).
ii. If C(X) = false, delay T until C(X) becomes true or the transaction that wrote X aborts.
(b) If TS(T) < WT(X), the read is physically unrealizable. Roll back T; that is, abort T and restart it with a new, larger timestamp.

Rules, in detail…
Suppose the scheduler receives a request wT(X).
(a) If TS(T) ≥ RT(X), the write is physically realizable.
1. If TS(T) ≥ WT(X), the write must be performed:
i. Write the new value for X,
ii. Set WT(X) := TS(T), and
iii. Set C(X) := false.
2. If TS(T) < WT(X), then there is already a later value in X. If C(X) = true, then the previous writer of X is committed, and we simply ignore the write by T; otherwise, if C(X) = false, we must delay T.
(b) If TS(T) < RT(X), then the write is physically unrealizable, and T must be rolled back.

Rules, in detail…
Suppose the scheduler receives a request to commit T.
• It must find all the database elements X written by T and set C(X) := true.
• If any transactions are waiting for X to be committed, these transactions are allowed to proceed.
Suppose the scheduler receives a request to abort T or decides to roll back T.
• Then any transaction that was waiting on an element X that T wrote must repeat its attempt to read or write, and see whether the action is now legal after T's writes are cancelled.

• T2 aborts because it tries to write “at time 150” when another value of C was already read “at time 175.”
• T3 is allowed to “write A,” but since there is already a later write of A, the DB is not affected.

Timestamps Versus Locks
• Locking requires a lock table for currently locked items, while timestamping uses space for two timestamps in each DB element, locked or not.
• Locking may cause transactions to wait; timestamping doesn't cause waiting, but may abort transactions.
• Net effect: if most transactions are read-only or few transactions interfere, then timestamping gives better throughput; otherwise not.
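
Sketch: the scheduler rules in code
The read, write, and commit rules listed under “Rules, in detail…” can be written down compactly. What follows is a minimal Python sketch under stated assumptions, not a definitive implementation: the names Element, Decision, read, write, and commit are hypothetical, timestamps are plain integers, and the scheduler only returns its decision rather than actually blocking or restarting a transaction. The check wt == ts in commit is an added assumption, so that T's commit does not mark a later writer's value as committed.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Iterable


class Decision(Enum):
    GRANT = "grant"        # perform the read or write
    IGNORE = "ignore"      # Thomas write rule: skip the write
    DELAY = "delay"        # wait until C(X) is true or the writer aborts
    ROLLBACK = "rollback"  # abort T and restart it with a new timestamp


@dataclass
class Element:
    """Bookkeeping for one database element X (hypothetical layout)."""
    value: Any = None
    rt: int = 0             # RT(X): highest timestamp that read X
    wt: int = 0             # WT(X): highest timestamp that wrote X
    committed: bool = True  # C(X): has the most recent writer committed?


def read(ts: int, x: Element) -> Decision:
    """Handle rT(X) for a transaction T with TS(T) = ts."""
    if ts < x.wt:
        return Decision.ROLLBACK  # read too late
    if not x.committed:
        return Decision.DELAY     # granting now would be a dirty read
    x.rt = max(x.rt, ts)          # raise RT(X) only if TS(T) is larger
    return Decision.GRANT


def write(ts: int, x: Element, value: Any) -> Decision:
    """Handle wT(X) for a transaction T with TS(T) = ts."""
    if ts < x.rt:
        return Decision.ROLLBACK  # write too late
    if ts >= x.wt:
        # The write must be performed; X is now uncommitted.
        x.value, x.wt, x.committed = value, ts, False
        return Decision.GRANT
    # TS(T) < WT(X): a later value is already in place.
    return Decision.IGNORE if x.committed else Decision.DELAY


def commit(ts: int, written: Iterable[Element]) -> None:
    """Commit T: set C(X) for each X whose current value T wrote."""
    for x in written:
        if x.wt == ts:  # assumption: skip X if a later writer replaced T's value
            x.committed = True
```

For instance, with an element C that has been read at time 175, write(150, c, ...) returns Decision.ROLLBACK, matching the “write at time 150” remark above. A write at time 175 to an element already written at a hypothetical time 200 returns Decision.DELAY until that writer commits, and Decision.IGNORE afterwards. The sketch does not special-case a transaction reading its own uncommitted write, and it omits the abort path, which would also need the previous value and write time to be saved.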