18.8 Concurrency Control by Timestamps
Dongyi Jia - CS257 ID:116 - Spring 2008

Agenda
Timestamps
Physically Unrealizable Behaviors
Problems With Dirty Data
The Rules for Timestamp-Based Scheduling
Multiversion Timestamps
Timestamps and Locking

Timestamp TS(T)
The scheduler assigns each transaction T a unique number, its timestamp TS(T). The timestamp ordering protocol ensures that any conflicting read and write operations are executed in timestamp order.
Each transaction is issued a timestamp when it enters the system. If an old transaction T1 has timestamp TS(T1), a new transaction T2 is assigned timestamp TS(T2) such that TS(T1) < TS(T2).
Two approaches to generating timestamps:
1. Use the value of the system clock as the timestamp; that is, a transaction's timestamp is equal to the value of the clock when the transaction enters the system.
2. Use a logical counter that is incremented after a new timestamp has been assigned; that is, a transaction's timestamp is equal to the value of the counter when the transaction enters the system.

Implementation of Timestamp-Based Protocols
We associate with each database element X two timestamp values and an additional bit:
RT(X): the highest timestamp of a transaction that has read X.
WT(X): the highest timestamp of a transaction that has written X.
C(X): the commit bit for X, which is true if and only if the most recent transaction to write X has already committed.
The purpose of this bit is to avoid a situation where one transaction T reads data written by another transaction U, and U then aborts. This situation is called a "dirty read".
These timestamps are updated whenever a new Read(X) or Write(X) instruction is executed.

Physically Unrealizable Behaviors
Read too late: transaction T tries to read X, but a transaction U that started after T has already written a value for X.
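The bookkeeping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names TimestampAuthority, Element, and read_too_late are hypothetical and do not come from the slides, timestamps come from a logical counter (approach 2), and an element that has never been read or written is given RT(X) = WT(X) = 0.

```python
from dataclasses import dataclass
from itertools import count


class TimestampAuthority:
    """Hypothetical helper: issues strictly increasing timestamps
    from a logical counter (approach 2 above)."""

    def __init__(self) -> None:
        self._counter = count(1)  # first timestamp is 1

    def next_timestamp(self) -> int:
        return next(self._counter)


@dataclass
class Element:
    """Per-element state for one database element X."""
    value: object = None
    rt: int = 0             # RT(X): highest timestamp that has read X
    wt: int = 0             # WT(X): highest timestamp that has written X
    committed: bool = True  # C(X): has the latest writer committed?


def read_too_late(ts: int, x: Element) -> bool:
    """True if a reader with timestamp ts arrives after a later
    transaction has already written X."""
    return ts < x.wt


authority = TimestampAuthority()
ts_t = authority.next_timestamp()  # T enters the system first
ts_u = authority.next_timestamp()  # U enters later, so TS(T) < TS(U)

x = Element(value="old")
# U writes X before T gets around to reading it.
x.value, x.wt, x.committed = "new", ts_u, False

print(read_too_late(ts_t, x))  # True
```

Here T would have to see the value that existed before U's write, which is no longer in X, so the read cannot be granted.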
Figure: Transaction T tries to read too late (order of events: T starts, U starts, U writes X, T reads X)

Write too late: transaction T tries to write X, but a transaction U that started after T has already read X.
Figure: Transaction T tries to write too late (order of events: T starts, U starts, U reads X, T writes X)

Problems with Dirty Data
Dirty read: it is possible that after T reads the value of X written by U, transaction U will abort.
Figure: T could perform a dirty read if it reads X when shown (order of events: U starts, T starts, U writes X, T reads X, U aborts)

Thomas Write Rule: a write can be skipped when a write with a later write time is already in place.
Figure: A write is cancelled because of a write with a later timestamp, but the writer then aborts (order of events: T starts, U starts, U writes X, T writes X, T commits, U aborts)

A simple policy: when a transaction T writes a database element X, the write is "tentative" and may be undone if T aborts. The commit bit C(X) is set to false, and the scheduler makes a copy of the old value of X and its previous WT(X).

The Rules of Timestamp-Based Scheduling
The scheduler responds to four classes of requests:
1. A request r_T(X) by transaction T to read X.
2. A request w_T(X) by transaction T to write X.
3. A request to commit T.
4. A request to abort T, or a decision by the scheduler to roll back T.

The Rules of Timestamp-Based Scheduling (1)
The scheduler receives a read request r_T(X).
(a) If TS(T) >= WT(X), the read is physically realizable.
i. If C(X) is true, grant the request. If TS(T) > RT(X), set RT(X) := TS(T).
ii. If C(X) is false, delay T until C(X) becomes true or the transaction that wrote X aborts.
(b) If TS(T) < WT(X), the read is physically unrealizable. Abort T and restart it with a new, larger timestamp.

The Rules of Timestamp-Based Scheduling (2)
The scheduler receives a write request w_T(X).
(a) If TS(T) >= RT(X) and TS(T) >= WT(X), the write is physically realizable and must be performed.
i. Write the new value for X.
ii. Set WT(X) := TS(T).
iii. Set C(X) := false.
(b) If TS(T) >= RT(X) but TS(T) < WT(X), the write is physically realizable, but there is already a later value in X.
i. If C(X) is true, the previous writer of X has committed, so ignore the write by T.
ii. If C(X) is false, delay T.
(c) If TS(T) < RT(X), the write is physically unrealizable, and T must be rolled back.

The Rules of Timestamp-Based Scheduling (3 & 4)
3. The scheduler receives a request to commit T. It must find all the database elements X written by T and set C(X) := true.
4. The scheduler receives a request to abort T or decides to roll back T. Any transaction that was waiting on an element X that T wrote must repeat its attempt to read or write, and see whether the action is now legal after the aborted transaction's writes are cancelled.

Multiversion Timestamps
Multiversion schemes keep old versions of data items to increase concurrency. Two such schemes:
Multiversion Timestamp Ordering
Multiversion Two-Phase Locking
Each successful write creates a new version of the data item written. Timestamps are used to label versions.
When a read(X) operation is issued, the scheduler selects an appropriate version of X based on the timestamp of the transaction and returns the value of that version.
Reads never have to wait, because an appropriate version is returned immediately.

Timestamps and Locking
Generally, timestamping is superior to locking in the following situations:
1. Most transactions are read-only.
2. It is rare that concurrent transactions will try to read and write the same element.
In high-conflict situations, locking performs better than timestamps.
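To make the scheduling rules (1) to (4) concrete, here is a minimal single-threaded sketch in Python. It is an illustration under stated assumptions, not a definitive implementation: the names Scheduler, Element, and Outcome are hypothetical, timestamps are assumed to be positive integers, a delayed or aborted request is simply reported to the caller (who is assumed to retry or restart), and only one undo copy is kept per element, so undoing is only handled when the aborting transaction is still the most recent writer of X. The read rule is followed literally, so a transaction reading its own uncommitted write would also be delayed.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Outcome(Enum):
    GRANT = "grant"  # perform the read or write
    SKIP = "skip"    # Thomas write rule: ignore the write
    DELAY = "delay"  # retry after the writer of X commits or aborts
    ABORT = "abort"  # roll back T and restart it with a larger timestamp


@dataclass
class Element:
    value: object = None
    rt: int = 0                    # RT(X); real timestamps are assumed > 0
    wt: int = 0                    # WT(X)
    committed: bool = True         # C(X)
    saved: Optional[tuple] = None  # (old value, old WT(X), old C(X))


class Scheduler:
    def __init__(self) -> None:
        self.elements: dict = {}   # element name -> Element
        self.written: dict = {}    # TS(T) -> names of elements T wrote

    def _element(self, name: str) -> Element:
        return self.elements.setdefault(name, Element())

    def read(self, ts: int, name: str) -> Outcome:
        x = self._element(name)
        if ts < x.wt:              # rule 1(b): read too late
            return Outcome.ABORT
        if not x.committed:        # rule 1(a)ii: avoid a dirty read
            return Outcome.DELAY
        x.rt = max(x.rt, ts)       # rule 1(a)i
        return Outcome.GRANT

    def write(self, ts: int, name: str, value: object) -> Outcome:
        x = self._element(name)
        if ts < x.rt:              # rule 2(c): write too late
            return Outcome.ABORT
        if ts < x.wt:              # rule 2(b): a later value is in place
            return Outcome.SKIP if x.committed else Outcome.DELAY
        if x.wt != ts:             # first write by T: keep an undo copy
            x.saved = (x.value, x.wt, x.committed)
        x.value, x.wt, x.committed = value, ts, False  # rule 2(a)
        self.written.setdefault(ts, set()).add(name)
        return Outcome.GRANT

    def commit(self, ts: int) -> None:  # rule 3
        for name in self.written.pop(ts, set()):
            x = self.elements[name]
            if x.wt == ts:
                x.committed, x.saved = True, None
            elif x.saved is not None and x.saved[1] == ts:
                # T was overwritten by a later tentative write;
                # record that T's saved value is now committed.
                x.saved = (x.saved[0], ts, True)

    def abort(self, ts: int) -> None:   # rule 4
        for name in self.written.pop(ts, set()):
            x = self.elements[name]
            if x.wt == ts and x.saved is not None:
                x.value, x.wt, x.committed = x.saved
                x.saved = None


s = Scheduler()
t, u = 1, 2                        # TS(T) < TS(U)
r1 = s.write(u, "X", "from U")     # tentative write by U
r2 = s.read(t, "X")                # T reads too late
r3 = s.write(t, "X", "from T")     # later value in place, U uncommitted
s.commit(u)
r4 = s.write(t, "X", "from T")     # Thomas write rule applies
r5 = s.read(3, "X")                # a newer transaction reads X
r6 = s.write(u, "X", "again")      # now too late: RT(X) is 3
print([r.value for r in (r1, r2, r3, r4, r5, r6)])
# ['grant', 'abort', 'delay', 'skip', 'grant', 'abort']
```

Returning an outcome instead of blocking keeps the sketch free of threads; a real scheduler would queue delayed transactions and wake them when the writer of X commits or aborts.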