Concurrency Control Protocols
To ensure that concurrently executing transactions produce correct (serializable) results, a DBMS uses concurrency control protocols.
Classification of Concurrency Protocols:
1) Lock based Protocol
a) Two phase locking protocol
i) Basic 2PL
ii) Conservative 2PL
iii) Strict 2PL
iv) Rigorous 2PL
b) Graph based Protocol
2) Time Stamp based Protocol
a) Time Stamp Ordering
b) Thomas’s Write Rule
3) Multiple Granularity Protocol
4) Multi Version Protocol
Two-Phase-Locking Protocol
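Basic 2PL: A transaction is said to follow the two phase locking protocol if all of its locking
operations precede its first unlock operation.
Expanding (growing) phase = first phase: locks are acquired and none are released.
Shrinking phase = second phase: locks are released and no new locks can be acquired.
If lock conversion is allowed, downgrading a lock (write lock to read lock) is permitted during
the shrinking phase, but upgrading (read lock to write lock) is not.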
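The two phase rule can be checked mechanically. The following is a minimal sketch, assuming a transaction is encoded as a list of ("lock" | "unlock", item) pairs; that encoding and the function name are illustrative choices, not part of the protocol definition.

```python
def follows_two_phase_locking(operations):
    """Return True if no lock is acquired after the first unlock.

    `operations` is the lock/unlock sequence of ONE transaction, given as
    (action, item) pairs where action is "lock" or "unlock" (assumed encoding).
    """
    shrinking = False
    for action, _item in operations:
        if action == "unlock":
            shrinking = True      # the first unlock starts the shrinking phase
        elif action == "lock" and shrinking:
            return False          # acquiring a lock while shrinking violates 2PL
    return True


two_phase = follows_two_phase_locking(
    [("lock", "X"), ("lock", "Y"), ("unlock", "X"), ("unlock", "Y")]
)
not_two_phase = follows_two_phase_locking(
    [("lock", "X"), ("unlock", "X"), ("lock", "Y"), ("unlock", "Y")]
)
print(two_phase, not_two_phase)
```

The second sequence fails because Y is locked after X has already been unlocked.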
Conservative 2PL (or static 2PL): A transaction locks all the items it needs BEFORE it begins
execution, by predeclaring its read set and write set.
If any item in the read or write set is already locked by another transaction, the transaction
waits and does not acquire any locks.
Deadlock free, but not very realistic, because the read and write sets are rarely known in advance.
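The all-or-nothing acquisition step of conservative 2PL can be sketched as follows. This is a simplified illustration: the LockTable class and its method names are hypothetical, and it uses exclusive locks only instead of separate read and write sets.

```python
class LockTable:
    """Toy lock table with exclusive locks only (a simplifying assumption)."""

    def __init__(self):
        self.owner = {}  # item -> id of the transaction holding it

    def try_lock_all(self, txn, items):
        """Acquire every item in `items`, or none of them."""
        if any(self.owner.get(item, txn) != txn for item in items):
            return False          # some item is held by another transaction: wait
        for item in items:
            self.owner[item] = txn
        return True

    def release_all(self, txn):
        self.owner = {i: t for i, t in self.owner.items() if t != txn}


table = LockTable()
first = table.try_lock_all("T1", {"A", "B"})
second = table.try_lock_all("T2", {"B", "C"})     # B is held by T1, so T2 gets nothing
held_by_t2 = [i for i, t in table.owner.items() if t == "T2"]
table.release_all("T1")
third = table.try_lock_all("T2", {"B", "C"})      # succeeds once T1 has finished
print(first, second, held_by_t2, third)
```

Because a waiting transaction holds no locks at all, no cycle of transactions waiting on each other can form, which is why the variant is deadlock free.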
Strict 2PL: A transaction does not release any of its write locks until AFTER it commits or aborts.
Not deadlock free, but guarantees strict (and therefore recoverable) schedules. In a strict
schedule, a transaction can neither read nor write X until the last transaction that wrote X has
committed or aborted.
Most popular variation of 2PL.
Rigorous 2PL: No lock (read or write) is released until after the transaction commits or aborts.
The transaction is in its expanding phase until it ends.
Graph based protocols
The simplest graph based protocol is the tree locking protocol, which employs only exclusive
locks and applies when the database is organized as a tree of data items. In the tree locking
protocol, each transaction Ti can lock a data item at most once and must observe the following rules.
a) All locks are exclusive locks.
b) The first lock by Ti can be on any data item, including the root node.
c) Ti can subsequently lock a data item Q only if Ti currently holds a lock on the parent of Q.
d) A data item may be unlocked at any time.
e) Ti cannot relock a data item that it has already locked and unlocked.
Any schedule of transactions that follow the tree locking protocol can be shown to be
serializable. The transactions need not be two phase.
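The rules of the tree locking protocol can be turned into a small checker. This is a sketch under assumed encodings: the tree is a child-to-parent dictionary and a transaction is a list of ("lock" | "unlock", item) pairs; both are illustrative choices.

```python
def follows_tree_protocol(parent, operations):
    """Check one transaction's lock/unlock sequence against the tree rules.

    parent:     dict mapping each item to its parent (the root maps to None)
    operations: list of (action, item) pairs, action is "lock" or "unlock"
    """
    held, ever_locked = set(), set()
    for action, item in operations:
        if action == "lock":
            if item in ever_locked:
                return False      # an item may be locked at most once
            if ever_locked and parent.get(item) not in held:
                return False      # after the first lock, the parent must be held
            held.add(item)
            ever_locked.add(item)
        else:
            if item not in held:
                return False      # cannot unlock an item that is not held
            held.discard(item)
    return True


# Hypothetical tree: A is the root, B and C are children of A, D is a child of B.
tree = {"A": None, "B": "A", "C": "A", "D": "B"}

legal = follows_tree_protocol(
    tree,
    [("lock", "A"), ("lock", "B"), ("unlock", "A"),
     ("lock", "D"), ("unlock", "B"), ("unlock", "D")],
)
illegal = follows_tree_protocol(
    tree, [("lock", "B"), ("unlock", "B"), ("lock", "D")]
)
print(legal, illegal)
```

The legal sequence locks D after unlocking A, so it is not two phase, yet it still satisfies the tree rules.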
Advantage of tree locking control:
a. Compared to the two phase locking protocol, data items can be unlocked earlier. This leads
to shorter waiting times and an increase in concurrency.
Disadvantages of tree locking control:
a. A transaction may have to lock data items that it does not access: to lock a descendant, it
must first lock the parent. So the number of locks and the associated locking overhead are high.
Timestamp based protocols
The use of locks, combined with the two phase locking protocol, allows us to guarantee
serializability of schedules. The order of transactions in the equivalent serial schedule is based on
the order in which executing transactions lock the items they require. If a transaction needs an
item that is already locked, it may be forced to wait until the item is released. A different
approach that guarantees serializability involves using transaction timestamps to order
transaction execution for an equivalent serial schedule.
Time Stamps
A timestamp is a unique identifier created by the DBMS to identify a transaction. Timestamp
values are assigned in the order in which the transactions are submitted to the system, so a
timestamp can be regarded as the transaction start time. Each transaction Ti in the system is
assigned a unique timestamp, denoted by TS(Ti). If a new transaction Tj enters the system after
Ti, then TS(Ti) < TS(Tj). This is known as the timestamp ordering scheme. To implement this
scheme, each data item Q is associated with two timestamp values.
1. W-timestamp(Q) denotes the largest timestamp of any transaction that executed write(Q)
successfully.
2. R-timestamp(Q) denotes the largest timestamp of any transaction that executed read(Q)
successfully.
These timestamps are updated whenever a new read(Q) or write(Q) instruction is executed.
Timestamp ordering protocol
The timestamp ordering protocol ensures that any conflicting read and write operations are
executed in timestamp order. The protocol operates as follows.
A. Suppose transaction Ti issues read(Q).
a. If TS(Ti) < W-timestamp(Q), then Ti needs to read a value of Q that was already
overwritten. Hence, the read operation is rejected and Ti is rolled back.
b. If TS(Ti) >= W-timestamp(Q), then the read operation is executed and R-timestamp(Q)
is set to the maximum of R-timestamp(Q) and TS(Ti).
B. Suppose transaction Ti issues write(Q).
a. If TS(Ti) < R-timestamp(Q), a younger transaction is already using the current value
of the item and it would be an error to update it now. This occurs when Ti is late in
doing its write and a younger transaction has already read the old value. Hence, the
write operation is rejected and Ti is rolled back.
b. If TS(Ti) < W-timestamp(Q), Ti is asking to write an item whose value has already
been written by a younger transaction, i.e., Ti is attempting to write an obsolete
value of Q. So Ti should be rolled back and restarted with a later timestamp.
c. Otherwise, the write operation proceeds and W-timestamp(Q) is set to TS(Ti).
This scheme is called basic timestamp ordering. It guarantees that schedules are conflict
serializable, with results equivalent to a serial schedule in timestamp order.
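The read and write rules of basic timestamp ordering can be sketched in a few lines. This is an illustrative model only: the Item class, the Rollback exception, and the choice of 0 as the initial timestamp are assumptions made for the example.

```python
class Rollback(Exception):
    """Signals that the issuing transaction must be rolled back and restarted."""


class Item:
    def __init__(self, value=None):
        self.value = value
        self.r_ts = 0  # R-timestamp(Q); 0 is an assumed initial value
        self.w_ts = 0  # W-timestamp(Q)


def read(ts, q):
    if ts < q.w_ts:
        raise Rollback("value was already overwritten by a younger transaction")
    q.r_ts = max(q.r_ts, ts)
    return q.value


def write(ts, q, value):
    if ts < q.r_ts:
        raise Rollback("a younger transaction already read the old value")
    if ts < q.w_ts:
        raise Rollback("a younger transaction already wrote the item")
    q.value = value
    q.w_ts = ts


q = Item(10)
write(2, q, 20)            # TS = 2 writes Q
seen = read(3, q)          # TS = 3 reads Q, so R-timestamp(Q) becomes 3
try:
    write(2, q, 30)        # TS = 2 is now too late to write
    outcome = "written"
except Rollback:
    outcome = "rolled back"
print(seen, outcome)
```

Note that no operation ever blocks: a late operation is rejected immediately, which is why the scheme cannot deadlock.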
Advantages of timestamp ordering protocol
1) Conflicting operations are processed in timestamp order and therefore it ensures conflict
serializability.
2) Since the protocol does not use locks and transactions never wait, deadlocks cannot occur.
Disadvantages of timestamp ordering protocol
1) Starvation may occur if a transaction is continually aborted and restarted.
2) It does not ensure recoverable schedules.
Thomas’s Write Rule
The basic timestamp ordering protocol can be modified to relax conflict serializability and
provide greater concurrency by ignoring, rather than rejecting, obsolete write operations. This
extension is known as Thomas’s Write Rule.
Suppose transaction Ti issues read(Q): no change, same as in the timestamp ordering protocol.
Suppose transaction Ti issues write(Q).
a) If TS(Ti) < R-timestamp(Q), a younger transaction is already using the current value of
the item and it would be an error to update it now. This occurs when Ti is late in doing
its write and a younger transaction has already read the old value. As in basic timestamp
ordering, the write is rejected and Ti is rolled back.
b) If TS(Ti) < W-timestamp(Q), a younger transaction has already updated the value of the
item, and the value that the older transaction is writing must be based on an obsolete
value of the item. In this case the write operation can safely be ignored. This is
sometimes known as the ignore obsolete write rule and allows greater concurrency.
c) Otherwise, the write operation is executed and W-timestamp(Q) is set to TS(Ti).
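The difference from basic timestamp ordering lies only in how a late write is handled, which a short sketch makes concrete. The dictionary layout of the item, the function name, and the returned status strings are assumptions made for this example.

```python
class Rollback(Exception):
    """Signals that the issuing transaction must be rolled back and restarted."""


def thomas_write(ts, item, value):
    """Apply a write under Thomas's Write Rule.

    `item` is a dict with keys "value", "r_ts" and "w_ts" (assumed layout).
    Returns "written" or "ignored"; raises Rollback when the write is rejected.
    """
    if ts < item["r_ts"]:
        raise Rollback("a younger transaction already read the old value")
    if ts < item["w_ts"]:
        return "ignored"          # obsolete write: skip it, do not roll back
    item["value"] = value
    item["w_ts"] = ts
    return "written"


item = {"value": "initial", "r_ts": 0, "w_ts": 0}
first = thomas_write(5, item, "from T5")    # normal write
second = thomas_write(3, item, "from T3")   # older writer arrives late
print(first, second, item["value"])
```

Under basic timestamp ordering the second call would have forced a rollback; here the older transaction simply continues, and the item keeps the value written by the younger one.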
Multiple Granularity
In the concurrency control schemes described so far, each individual data item is the unit on
which synchronization is performed. However, it can be advantageous to group several data items
and to treat them as one individual unit.
For example, suppose a transaction Ti needs to access the entire database and a locking protocol
is used. Then Ti must lock each item in the database, which is a time consuming process. It would
be better if Ti could issue a single lock request to lock the entire database. On the other hand,
if transaction Ti needs to access only a few data items, it should not be required to lock the
entire database.
A data item can be one of the following.
1. A database record
2. Field value of database record
3. A disk block
4. Whole File
5. Whole database
The size of a data item is often called the data item granularity. Fine granularity refers to
small item sizes, whereas coarse granularity refers to large item sizes. The best item size
depends on the type of transaction.
A hierarchy of data granularities, where the small granularities are nested within larger ones,
can be represented graphically as a tree. In the tree, each node represents an independent data
item, and each non-leaf node represents the data associated with its descendants.
Figure: granularity hierarchy.
Level 0: DB (entire database)
Level 1: Files
Level 2: Pages
Level 3: Records
Level 4: Fields
The highest level represents the entire database, followed by files, pages, records and fields.
Shared and exclusive locks can be requested on any node; when a transaction locks a node, all the
descendants of that node are implicitly locked in the same lock mode. To make multiple
granularity level locking practical, additional types of locks called intention locks are needed.
The idea behind intention locks is for a transaction to indicate, along the path from the root to
the desired node, what type of lock it will require on one of the node’s descendants. There are
three types of intention locks:
1. Intention Shared (IS), to indicate that a shared lock will be requested on some descendant
node
2. Intention Exclusive (IX), to indicate that an exclusive lock will be requested on some
descendant node
3. Shared Intention Exclusive (SIX), to indicate that the current node is locked in shared mode
but an exclusive lock will be requested on some descendant node
Compatibility Matrix for multiple granularity locking (T = compatible, F = not compatible)
        IS      IX      S       SIX     X
IS      T       T       T       T       F
IX      T       T       F       F       F
S       T       F       T       F       F
SIX     T       F       F       F       F
X       F       F       F       F       F
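A lock manager can consult the compatibility matrix as a simple lookup. The sketch below encodes the matrix as sets of compatible modes; the names COMPATIBLE and can_grant are hypothetical and chosen only for this example.

```python
# For each lock mode, the set of modes that another transaction may hold
# on the same node at the same time (taken from the compatibility matrix).
COMPATIBLE = {
    "IS":  {"IS", "IX", "S", "SIX"},
    "IX":  {"IS", "IX"},
    "S":   {"IS", "S"},
    "SIX": {"IS"},
    "X":   set(),
}


def can_grant(requested, held_modes):
    """Return True if `requested` is compatible with every mode already held
    on the node by OTHER transactions."""
    return all(requested in COMPATIBLE[held] for held in held_modes)


print(can_grant("IX", ["IS", "IX"]))   # intention locks coexist
print(can_grant("S", ["IX"]))          # S conflicts with IX
print(can_grant("X", []))              # nothing held, so X can be granted
```

The relation is symmetric, so it does not matter whether the lookup is keyed by the requested mode or by the held mode.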
Multi version Schemes
In multi version database systems, each write operation on data item say Q creates a new version
of Q. When a read (Q) operation is issued, the system selects one of the versions of Q to read.
The concurrency control scheme must ensure that the selection of the version to be read is done
in a manner that ensures serializability.
There are two multi version schemes
1. Multi version Timestamp ordering
2. Multi version Two-phase locking
Multi version timestamp ordering
In this technique, several versions Q1, Q2, ..., Qm of each data item Q are kept by the system.
For each version Qk, the value of the version and the following two timestamps are kept.
1) W-timestamp(Qk) is the timestamp of the transaction that created version Qk.
2) R-timestamp(Qk) is the largest timestamp of any transaction that successfully read version
Qk.
The scheme operates as follows when transaction Ti issues a read(Q) or write(Q) operation.
Let Qk denote the version of Q whose write timestamp is the largest write timestamp less than or
equal to TS(Ti).
1) If transaction Ti issues a read(Q), then the value returned is the content of version Qk,
and R-timestamp(Qk) is set to the maximum of R-timestamp(Qk) and TS(Ti).
2) If transaction Ti issues a write(Q) and TS(Ti) < R-timestamp(Qk), then transaction Ti is
rolled back. Otherwise, if TS(Ti) = W-timestamp(Qk), the contents of Qk are overwritten;
otherwise a new version of Q is created.
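The version selection and the two rules can be sketched as follows. This is a simplified in-memory model: the Version class, the function names, and the initial version with timestamp 0 are assumptions made for the example, and a new version starts with its R-timestamp equal to its W-timestamp.

```python
class Rollback(Exception):
    """Signals that the issuing transaction must be rolled back and restarted."""


class Version:
    def __init__(self, value, w_ts):
        self.value = value
        self.w_ts = w_ts   # W-timestamp(Qk)
        self.r_ts = w_ts   # R-timestamp(Qk), assumed to start at the write timestamp


def select_version(versions, ts):
    """Return Qk: the version with the largest W-timestamp <= ts."""
    return max((v for v in versions if v.w_ts <= ts), key=lambda v: v.w_ts)


def mv_read(versions, ts):
    qk = select_version(versions, ts)
    qk.r_ts = max(qk.r_ts, ts)
    return qk.value            # a read never fails and never waits


def mv_write(versions, ts, value):
    qk = select_version(versions, ts)
    if ts < qk.r_ts:
        raise Rollback("a younger transaction already read version Qk")
    if ts == qk.w_ts:
        qk.value = value                       # overwrite the transaction's own version
    else:
        versions.append(Version(value, ts))    # create a new version


versions = [Version("a", 0)]          # assumed initial version of Q
mv_write(versions, 5, "b")            # TS = 5 creates a second version
old_read = mv_read(versions, 3)       # TS = 3 still sees the initial version
new_read = mv_read(versions, 7)       # TS = 7 sees the version written at 5
print(old_read, new_read, len(versions))
```

The reader with timestamp 3 is served from the old version even though a newer one exists, which is what lets reads proceed without failing.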
Advantages
1) The read request never fails and is never made to wait.
Disadvantages
1) It requires more storage to maintain multiple versions of the data item.
2) Reading of a data item also requires the update of the R-timestamp field, resulting in two
potential disk accesses.
3) Conflicts between transactions are resolved through rollbacks, rather than through waits.
This may be expensive.
Multi version two Phase Locking
The multi version two phase locking protocol attempts to combine the advantages of multi
version concurrency control with the advantages of two phase locking. In the standard locking
scheme, once a transaction obtains a write lock on an item, no other transaction can access that
item. Multi version two phase locking instead allows other transactions T' to read an item X
while a single transaction T holds a write lock on X. This is accomplished by allowing two
versions of each item X.
When an update transaction reads an item, it gets a shared lock on the item and reads the latest
version of that item. When an update transaction wants to write an item, it first gets an
exclusive lock on the item and then creates a new version of the data item. The write is
performed on the new version, and the timestamp of the new version is initially set to ∞.
Advantages
1) Reads can proceed concurrently with a write operation, which is not permitted in standard
two phase locking.
2) It avoids cascading aborts, since transactions are only allowed to read the version that
was written by a committed transaction.
Disadvantages
1) It requires more storage to maintain multiple versions of data item.
2) Deadlocks may occur.