Chapter 1 Transaction Management and Concurrency Control

• Outline
– Transaction
– Problems of concurrent sharing
– Desirable Properties of Transactions
– Characterizing Schedules based on Recoverability
– Characterizing Schedules based on Serializability
– Transaction support
– Concurrency control
– Concurrency control mechanism
– Database Recovery
– Transaction and Recovery
– Recovery techniques and facilities
Introduction
• A Transaction:
– Logical unit of database processing that includes one or more
access operations (read, retrieval, write, insert or update and
delete)
– A transaction (set of operations) may be stand-alone specified in
a high level language like SQL submitted interactively, or may be
embedded within a program.
• Transaction boundaries:
– One way of specifying transaction boundaries is using explicit Begin
and End transaction statements in an application program
– An application program may contain several transactions
separated by the Begin and End transaction boundaries
Introduction (cont…)
• Basic operations of Transaction are read and write
– read_item(X): Reads a database item named X into a program
variable. To simplify our notation, we assume that the program
variable is also named X.
– write_item(X): Writes the value of program variable X into the
database item named X.
– Can have one of the two outcomes for any transaction:
– Success - transaction commits and database reaches a new
consistent state
• Committed transaction cannot be aborted or rolled back.
– Failure: transaction aborts, and database must be restored to
consistent state before it started
Introduction (cont…)
Read and write operations:
• Basic unit of data transfer from the disk to the computer main
memory is one block.
– In general, a data item (what is read or written) will be the
field of some record in the database, although it may be a
larger unit such as a record or even a whole block
• read_item(X) command includes the following steps:
– Find the address of the disk block that contains item X.
– Copy that disk block into a buffer in main memory (if that
disk block is not already in some main memory buffer).
– Copy item X from the buffer to the program variable
named X.
Introduction (cont…)
Read and Write Operations (cont.):
• write_item(X) command includes the following steps:
– Find the address of the disk block that contains item X.
– Copy that disk block into a buffer in main memory (if that
disk block is not already in some main memory buffer)
– Copy item X from the program variable named X into its
correct location in the buffer.
– Store the updated block from the buffer back to disk (either
immediately or at some later point in time).
• The decision about when to store back a modified disk
block that is in main memory is handled by the recovery
manager of the DBMS in cooperation with the
underlying operating system
Introduction (cont…)
– Example of transactions
• (a) Transaction T1
• (b) Transaction T2
Problems of Concurrent Sharing
• Transactions submitted by the various users may execute
concurrently and may access and update the same database
items
• If this concurrent execution is uncontrolled, it may lead to
problems such as inconsistent database
• Why is concurrency control needed?
– Concurrency control is needed to respond to the effect of
the following problems on database consistency :
• The Lost Update Problem
• This occurs when two transactions that access the same
database items have their operations interleaved in a
way that makes the value of some database item
incorrect since the update made by the first
transaction is not used by the second transaction.
• In other words, the update made by the first transaction
is lost (overwritten) by the second transaction
(a) The lost update problem
Lost Update problem: Example
Time   T1               T2                 bal(X)
t1                      Begin Tx           100
t2     Begin Tx         R(balX)            100
t3     R(balX)          balx=balx+100      100
t4     balx=balx-10     W(balx)            200
t5     W(balx)          Commit             90
t6     Commit                              90
Lost update!!
This could have been avoided if we had prevented T1 from reading
until T2’s update had been completed
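The interleaving in the example above can be replayed in a few lines. This is only a sketch: the `run` helper and the tuple encoding of steps are assumptions made for illustration, not part of the slides. Each transaction works on a private copy of bal(X), exactly as in the table, so the last writer silently overwrites the other update.

```python
def run(schedule, balance):
    """Execute (transaction, operation) steps against one shared balance."""
    local = {}  # each transaction's private copy of bal(X)
    for txn, op in schedule:
        if op == "read":
            local[txn] = balance
        elif op == "write":
            balance = local[txn]
        else:  # op is an integer delta applied to the private copy
            local[txn] += op
    return balance

# Interleaved order from the example: T2 adds 100, T1 subtracts 10.
interleaved = [
    ("T2", "read"), ("T1", "read"), ("T2", 100),
    ("T1", -10), ("T2", "write"), ("T1", "write"),
]
# Serial order: T2 finishes before T1 reads.
serial = [
    ("T2", "read"), ("T2", 100), ("T2", "write"),
    ("T1", "read"), ("T1", -10), ("T1", "write"),
]

lost = run(interleaved, 100)     # T2's +100 is overwritten
correct = run(serial, 100)       # both updates survive
```

With the interleaved order the final balance matches the table (90), while the serial order keeps both updates.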
Introduction (cont…)
• The Temporary Update (Dirty Read) Problem
– This occurs when one transaction updates a database item
and then the transaction fails for some reason.
– The updated item is accessed by another transaction
before it is changed back to its original value.
Introduction (cont…)
(b) The temporary update problem.
The temporary update problem: Example
Time   T1               T2                 bal(X)
t1                      Begin Tx           100
t2                      R(balX)            100
t3                      balx=balx+100      100
t4     Begin Tx         W(balx)            200
t5     R(balX)                             200
t6     balx=balx-10     Rollback           200
t7     W(balx)                             190
t8     Commit                              190
• Temporary update!!
– Could have been avoided if we prevent T1 from reading until after the
decision to commit or rollback T2 has been made
Introduction (cont…)
• The Incorrect Summary Problem
– If one transaction is calculating an aggregate summary
function on a number of records while other
transactions are updating some of these records, the
aggregate function may calculate some values before
they are updated and others after they are updated.
(c) The incorrect summary problem.
The incorrect summary problem: Example
Time   T1               T2            Bal(x)   Bal(z)   Sum
t1                      Begin Tx      100      25       0
t2     Begin Tx         Sum=0         100      25       0
t3     R(balX)                        100      25       0
t4     balx=balx-10     R(balX)       100      25       0
t5     W(balx)          Sum+=balx     90       25       100
t6     R(balZ)                        90       25       100
t7     balz=balz+10                   90       25       100
t8     W(balz)                        90       35       100
t9     Commit           R(balz)       90       35       100
t10                     Sum+=balz     90       35       135
t11                     W(sum)        90       35       135
t12                     commit        90       35       135
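The arithmetic behind the example above can be checked directly. In this sketch (variable names are illustrative assumptions) the summing transaction reads bal(x) before the transfer is written and bal(z) after it, so the 10 units being moved are counted twice.

```python
balances = {"x": 100, "z": 25}
total = 0

total += balances["x"]   # T2 reads bal(x) before T1's write: adds 100
balances["x"] -= 10      # T1 writes bal(x) = 90
balances["z"] += 10      # T1 writes bal(z) = 35
total += balances["z"]   # T2 reads bal(z) after T1's write: adds 35

# The sum a serial execution would have produced, before or after the transfer.
true_total = balances["x"] + balances["z"]
```

The computed sum is 135 although the two accounts hold 125 in total at every consistent point.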
What causes a Transaction to fail?
1. A computer failure (system crash):
• A hardware or software error may occur in the
computer system during transaction execution. If the
hardware crashes, the contents of the computer’s
internal memory may be lost.
2. A transaction or system error:
• Some operation in the transaction may cause it to fail,
such as integer overflow or division by zero.
• Transaction failure may also occur because of
erroneous parameter values or because of a logical
programming error
Transaction and System Concepts
• Transaction states and additional operations
– Transaction states:
• Active state
• Partially committed state
• Committed state
• Failed state
• Terminated State
State transition diagram illustrating the states for
transaction execution
Transaction and System Concepts (cont…)
• Transaction operations
– For recovery purposes, the system needs to keep track of when the
transaction starts, terminates, and commits or aborts
– Recovery manager keeps track of the following operations:
• begin_transaction: This marks the beginning of transaction
execution
• read or write: These specify read or write operations on the
database items that are executed as part of a transaction
• end_transaction: This specifies that read and write transaction
operations have ended and marks the end limit of transaction
execution.
– At this point it may be necessary to check whether the changes
introduced by the transaction can be permanently applied to the
database or whether the transaction has to be aborted because it
violates concurrency control or for some other reason.
Transaction and System Concepts (cont…)
• commit_transaction:
– This signals a successful end of the transaction so that any
changes (updates) executed by the transaction can be
safely committed to the database and will not be undone.
• rollback (or abort):
– This signals that the transaction has ended unsuccessfully,
so that any changes or effects that the transaction may
have applied to the database must be undone.
Transaction and System Concepts (cont…)
• The System Log
– Log or Journal: The log keeps track of all
transaction operations that affect the values of
database items
• This information is needed to permit recovery from
transaction failures
• The log is kept on disk, so it is not affected by any type
of failure except for disk or catastrophic failure.
• In addition, the log is periodically backed up to archival
storage (tape) to guard against such catastrophic
failures.
Transaction and System Concepts (cont…)
• The System Log (cont):
– We use the notation T to refer to a unique transaction-id that is
generated automatically by the system and is used to identify each
transaction.
– Types of log record:
• [start_transaction,T]: Records that transaction T has
started execution.
• [write_item,T,X,old_value,new_value]: Records that
transaction T has changed the value of database item X
from old_value to new_value.
The System Log (cont):
• [read_item,T,X]: Records that transaction T has read the
value of database item X.
• [commit,T]: Records that transaction T has completed
successfully, and affirms that its effect can be
committed (recorded permanently) to the database.
• [abort,T]: Records that transaction T has been aborted.
Desirable Properties of Transactions
• Transactions should possess several properties. They are often
called the ACID properties and should be enforced by the
concurrency control and recovery methods of the DBMS.
ACID properties:
• Atomicity: A transaction is an atomic unit of processing; it is
either performed in its entirety or not performed at all.
• Consistency preservation: A correct execution of the transaction
must take the database from one consistent state to another.
• Isolation: A transaction should not make its updates visible to
other transactions until it is committed; this property, when
enforced strictly, solves the temporary update problem and
makes cascading rollbacks of transactions unnecessary
• Durability or permanency: Once a transaction changes the
database and the changes are committed, these changes must
never be lost because of subsequent failure.
Schedules
• Schedule (or history) of transaction
– When transactions are executing concurrently in an interleaved
fashion, the order of execution of operations from the various
transactions form what is known as a transaction schedule (or history)
• A schedule (or history) S of n transactions T1, T2,.. ,Tn:
– is an ordering of the operations of the transactions subject to the
constraint that, for each transaction Ti that participates in S, the
operations of Ti in S must appear in the same order in which they occur
in Ti.
• Note, however, that operations from other transactions Tj can be
interleaved with the operations of Ti in S.
Schedules (cont…)
• A shorthand notation for describing a schedule uses the
symbols:
– r: read_item
– w: write_item
– c: commit
– a: abort
• Transaction numbers are appended as subscript to each
operation in the schedule
• The database item X that is read or written follows the r and w
operations in parenthesis
– Example:
• Sa: r1(X); r2(X); w1(X); r1(Y); w2(X); w1(Y)
• Sb: r1(X); w1(X); r2(X); w2(X); r1(Y); a1
Conflicting operations
• Two operations in a schedule are said to conflict if they satisfy
all three of the following conditions:
– They belong to different transactions
– They access the same item X
– At least one of the operations is a write_item(X)
• For example, in schedule Sa, the operations
– r1(X) and w2(X) conflict, as do the operations r2(X) and
w1(X) and the operations w1(X) and w2(X)
Non conflicting operations
– The operations r1(x) and r2(x) do not conflict since both of
them are read operations
– r1(x) and w1(x) do not conflict because they belong to the same
transaction
– w2(x) and w1(y) do not conflict since they operate on distinct
data items x and y
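The three conflict conditions translate directly into a predicate. The sketch below is illustrative: the `Op` tuple and the `conflicts` function are names assumed here, not notation from the slides.

```python
from typing import NamedTuple

class Op(NamedTuple):
    kind: str   # "r" for read_item, "w" for write_item
    txn: int    # transaction number (the subscript in r1, w2, ...)
    item: str   # database item accessed

def conflicts(a: Op, b: Op) -> bool:
    """True if the two operations satisfy all three conflict conditions."""
    different_txn = a.txn != b.txn
    same_item = a.item == b.item
    one_is_write = "w" in (a.kind, b.kind)
    return different_txn and same_item and one_is_write

r1x, r2x = Op("r", 1, "x"), Op("r", 2, "x")
w1x, w2x = Op("w", 1, "x"), Op("w", 2, "x")
w1y = Op("w", 1, "y")
```

For instance, `conflicts(r1x, w2x)` holds, while `conflicts(r1x, r2x)`, `conflicts(r1x, w1x)` and `conflicts(w2x, w1y)` do not, matching the cases listed above.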
Characterizing Schedules based on Serializability
• Serial schedule:
– A schedule S is serial if, for every transaction T
participating in the schedule, all the operations of T
are executed consecutively in the schedule
• Otherwise, the schedule is called non serial schedule.
• Serializable schedule:
– A schedule S is serializable if it is equivalent to some
serial schedule of the same n transactions
• Conflict equivalent:
– Two schedules are said to be conflict equivalent if the
order of any two conflicting operations is the same in both
schedules
– Conflict serializable:
• A schedule S is said to be conflict serializable if it is
conflict equivalent to some serial schedule S’.
Characterizing Schedules based on Serializability (cont….)
• Being serializable is not the same as being serial
• Being serializable implies that the schedule is a
correct schedule
– It will leave the database in a consistent state.
– The interleaving is appropriate and will result in a
state as if the transactions were serially executed,
yet will achieve efficiency due to concurrent
execution.
Characterizing Schedules based on Serializability (cont…)
• It’s difficult to determine when a schedule begins
and when it ends.
– Hence, we reduce the problem of checking the
whole schedule to checking only a committed
projection of the schedule (i.e. operations from
only the committed transactions.)
• Current approach used in most DBMSs:
– Use of locks with two phase locking
Determining conflict serializability
– To determine serializability, first identify the pairs of conflicting
operations and check whether their order is preserved in one of the
possible serial schedules
– Schedule A:
• r1(x); w1(x); r1(y); w1(y); r2(x); w2(x) (serial schedule)
– Schedule B:
• r2(x); w2(x); r1(x); w1(x); r1(y); w1(y) (serial schedule)
– Schedule C:
• r1(x); r2(x); w1(x); w2(x); w1(y) (not serializable)
– Schedule D:
• r1(x); w1(x); r2(x); w2(x); r1(y); w1(y) (serializable, equivalent to
schedule A)
Serializability (cont…)
Testing for conflict serializability with precedence graphs: Algorithm
– For each transaction Ti participating in Schedule S, create a node
labeled Ti in the precedence graph
– For each case in S where Tj executes read_item(x) after Ti executes a
write_item(x), create an edge (Ti → Tj) in the precedence graph
– For each case in S where Tj executes write_item(x) after Ti executes a
read_item(x), create an edge (Ti → Tj) in the precedence graph
– For each case in S where Tj executes write_item(x) after Ti executes a
write_item(x), create an edge (Ti → Tj) in the precedence graph
– The schedule is serializable if and only if the precedence graph has no
cycles.
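The algorithm above can be sketched as follows. The helper names (`parse`, `precedence_edges`, `is_conflict_serializable`) and the string encoding of schedules are assumptions made for this illustration. The three edge rules collapse into one test, since each of them is a pair of conflicting operations from different transactions.

```python
import re

def parse(schedule):
    """Turn 'r1(x);w2(x)' into [('r', 1, 'x'), ('w', 2, 'x')]."""
    return [(kind, int(txn), item.lower())
            for kind, txn, item in re.findall(r"([rw])(\d+)\((\w+)\)", schedule)]

def precedence_edges(ops):
    """Add an edge Ti -> Tj whenever an operation of Ti conflicts with a later one of Tj."""
    edges = set()
    for pos, (k1, t1, x1) in enumerate(ops):
        for k2, t2, x2 in ops[pos + 1:]:
            if t1 != t2 and x1 == x2 and "w" in (k1, k2):
                edges.add((t1, t2))
    return edges

def is_conflict_serializable(schedule):
    ops = parse(schedule)
    nodes = {txn for _, txn, _ in ops}
    edges = precedence_edges(ops)
    # Repeatedly remove nodes without incoming edges. If none can be removed
    # while nodes remain, every remaining node has a predecessor: a cycle exists.
    while nodes:
        free = {n for n in nodes if not any(dst == n for _, dst in edges)}
        if not free:
            return False
        nodes -= free
        edges = {(src, dst) for src, dst in edges if src in nodes}
    return True

schedule_c = "r1(x);r2(x);w1(x);w2(x);w1(y)"
schedule_d = "r1(x);w1(x);r2(x);w2(x);r1(y);w1(y)"
```

Schedule C yields the edges T1 → T2 and T2 → T1, which form a cycle, so the check fails; schedule D yields only T1 → T2 and passes.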
Testing serializability with Precedence Graphs
[Figure: four precedence graphs, labeled Serial, Serial, Not Serializable, and Serializable]
Serializability (cont…)
• There is no cycle in the graph, so the schedule is serializable
• T2, T3, T1, T4 is the only equivalent serial schedule
Characterizing Schedules based on Recoverability
• Schedules classified based on recoverability:
– Recoverable schedule:
• Once a transaction T is committed, it should never be necessary to
rollback T
– The schedules that theoretically meet this criterion are called
recoverable and those that do not are nonrecoverable
• A schedule S is recoverable if no transaction T in S commits until all
transactions T’ that have written an item that T reads have committed
– A transaction T2 reads from transaction T1 in a schedule S if some
item X is first written by T1 and later read by T2
• In addition, T1 should not have been aborted before T2 reads
item X and there should be no transaction that writes X after T1
writes it and before T2 reads X
Characterizing Schedules based on Recoverability
• Consider the schedule Sa’ below, in which two commit operations
have been added to Sa:
• Sa’: r1(X); r2(X); w1(X); r1(Y); w2(X); c2; w1(Y); c1
– Sa’ is recoverable even though it suffers from the lost update problem
• However, consider the two schedules Sc and Sd below:
– Sc:r1(x);w1(x);r2(x);r1(y);w2(x);c2;a1
• Sc is not recoverable because T2 reads X from T1 and then
T2 commits before T1 commits.
• If T1 aborts after the c2 operation in Sc, then the value of
x that T2 read is no longer valid and T2 must be aborted
after it has been committed, leading to a schedule that is
not recoverable
Recoverability (cont…)
– For the above schedule to be recoverable, the c2
operation in Sc must be postponed until after T1
commits as shown in Sd
• Sd: r1(x); w1(x); r2(x); r1(y); w2(x); w1(y); c1; c2 (recoverable)
Recoverability (cont…)
• If T1 aborts instead of committing, then T2 should also abort as
shown in Se because the value of item X it read is no longer valid
• Se: r1(x); w1(x); r2(x); r1(y); w2(x); w1(y); a1; a2 (recoverable)
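The recoverability condition can be tested mechanically. The following is a simplified sketch under stated assumptions: the function name `is_recoverable` is hypothetical, a read is taken to "read from" the most recent writer of the item, and the undoing of an aborted transaction's writes is not modeled.

```python
import re

def is_recoverable(schedule):
    """Check that no transaction commits before every transaction it read from."""
    ops = re.findall(r"([rwca])(\d+)(?:\((\w+)\))?", schedule)
    last_writer = {}   # item -> transaction that wrote it most recently
    reads_from = {}    # transaction -> set of transactions it has read from
    committed, aborted = set(), set()
    for kind, txn, item in ops:
        item = item.lower()
        if kind == "w":
            last_writer[item] = txn
        elif kind == "r":
            writer = last_writer.get(item)
            if writer not in (None, txn) and writer not in aborted:
                reads_from.setdefault(txn, set()).add(writer)
        elif kind == "c":
            # Every transaction this one read from must already be committed.
            if not reads_from.get(txn, set()) <= committed:
                return False
            committed.add(txn)
        else:  # kind == "a"
            aborted.add(txn)
    return True

sa_prime = "r1(X);r2(x);w1(x);r1(Y);w2(x);c2;w1(Y);c1"
sc = "r1(x);w1(x);r2(x);r1(y);w2(x);c2;a1"
sd = "r1(x);w1(x);r2(x);r1(y);w2(x);w1(y);c1;c2"
```

Under these assumptions Sa’ and Sd pass the check, while Sc fails at c2 because T2 has read x from the still uncommitted T1.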
• Cascadeless schedule Vs cascading rollback
– Schedules requiring cascaded rollback:
• A schedule in which uncommitted transactions that read an
item from a failed transaction must be rolled back
– Cascadeless schedule:
• One where every transaction reads only the items that are
written by committed transactions
Concurrency Control
• Purpose of Concurrency Control
– To ensure Isolation property of concurrently executing
transactions
– To preserve database consistency
– To resolve read-write and write-write conflicts
• Example:
– In concurrent execution environment,
• if T1 conflicts with T2 over a data item A, then the existing
concurrency controller decides if T1 or T2 should get A and
which transaction should be rolled-back or waits
Concurrency Control (cont…)
1. Concurrency control using Locks
– A lock is a mechanism to control concurrent access to a data
item
• Locking is an operation which secures
(a) permission to read, or
(b) permission to write a data item for a transaction.
– Notation: Li(X): Transaction Ti requests a lock on database element X
– Unlocking is an operation which removes these permissions
from the data item.
– Notation: Ui(X): Transaction Ti releases (“unlocks”) its lock on
database element X
– Lock and Unlock are Atomic operations.
Concurrency Control (cont…)
Types of locks and system lock tables
• Binary locks
– Can have two states or values: locked and unlocked (1 or 0)
– A distinct lock is associated with each database item X
– If the value of the lock on X is 1, item X cannot be accessed by
a database operation that requests the item
– Too restrictive for database items because at most one
transaction can hold a lock on a given item
• Shared/Exclusive (or read /write) locks
– Allow several transactions to access the same item x if they all
access x for reading purposes
Concurrency Control(cont…)
• In shared/exclusive method, data items can be locked
in two modes :
– Shared mode: shared lock (X)
• More than one transaction can apply a shared lock on X for
reading its value, but no write lock can be applied on X
by any other transaction
– Exclusive mode: Write lock (X)
• Only one write lock on X can exist at any time and no shared lock
can be applied by any other transaction on X
Concurrency Control (cont…)
• When we use the shared/exclusive locking scheme, the system
must enforce the following rules:
1. A transaction T must issue the operation read_lock(x) or
write_lock(x) before any read_item(x) operation is performed in
T
2. A transaction T must issue the operation write_lock(x) before
any write_item(x) operation is performed in T
3. A transaction T must issue the operation unlock(x) after
all read_item(x) and write_item(x) operations are completed
in T
4. A transaction will not issue a read_lock(x) operation if it
already holds a read(shared) lock or a write (exclusive) lock on
item X
5. A transaction will not issue a write_lock(x) operation if it
already holds read(shared) lock or write(exclusive) lock on
item x
6. A transaction T will not issue an unlock(x) operation unless it
already holds a read(shared) lock or a write(exclusive) lock on
item x
Concurrency Control (cont…)
Lock conversion
• Sometimes it is desirable to relax conditions 4 and 5 in the preceding list in
order to allow lock conversions. That is:
– Under certain conditions, a transaction that already holds a lock on
item X is allowed to convert the lock from one lock state to another.
– For example, it is possible for a transaction T to issue a read_lock
(X) and then later on to upgrade the lock by issuing a write_lock(x)
operation
– If T is the only transaction holding a read lock on x at the time it
issues the write_lock (x) operation, the lock can be upgraded
;otherwise, the transaction must wait.
– It is also possible for a transaction T to issue a write_lock(X) and
then later on to downgrade the lock by issuing a read_lock(X)
operation
Concurrency Control (cont…)
– Lock upgrade: change existing read lock to write lock
if Ti has a read-lock (X) and no other Tj (i ≠ j) has a read-lock (X) then
convert read-lock (X) to write-lock (X)
else
force Ti to wait until Tj unlocks X
– Lock downgrade: change existing write lock to read lock
if Ti has a write-lock (X) (*no other transaction can have any lock on X*) then
convert write-lock (X) to read-lock (X)
Concurrency Control (cont…)
• Lock compatibility
– A transaction may be granted a lock on an item if the requested
lock is compatible with locks already held on the item by other
transactions
– Any number of transactions can hold shared locks on an item,
• But if any transaction holds an exclusive lock on the item no other
transaction may hold any lock on the item.
– If a lock cannot be granted, the requesting transaction will be
made to wait until all incompatible locks held by other
transactions have been released
• Lock compatibility matrix
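The compatibility rules stated above can be written as a small lookup table. This is a sketch only: the mode labels "S" (shared) and "X" (exclusive) and the `can_grant` helper are naming assumptions made here.

```python
# (lock already held by another transaction, lock requested) -> compatible?
COMPATIBLE = {
    ("S", "S"): True,    # any number of readers may share an item
    ("S", "X"): False,   # a writer must wait for readers
    ("X", "S"): False,   # a reader must wait for the writer
    ("X", "X"): False,   # only one writer at a time
}

def can_grant(requested, held_by_others):
    """Grant the request only if it is compatible with every lock held by others."""
    return all(COMPATIBLE[(held, requested)] for held in held_by_others)
```

For example, `can_grant("S", ["S", "S"])` is true, whereas `can_grant("X", ["S"])` and `can_grant("S", ["X"])` are false, so those requesters would be made to wait.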
Concurrency Control (cont…)
Using binary or read/write locks in transactions as described earlier does not by itself
guarantee serializability:
T1                     T2
read_lock (Y);         read_lock (X);
read_item (Y);         read_item (X);
unlock (Y);            unlock (X);
write_lock (X);        write_lock (Y);
read_item (X);         read_item (Y);
X:=X+Y;                Y:=X+Y;
write_item (X);        write_item (Y);
unlock (X);            unlock (Y);
Result
Initial values: X=20; Y=30
Result of serial execution T1 followed by T2: X=50, Y=80
Result of serial execution T2 followed by T1: X=70, Y=50
Concurrency Control (cont…)
(time runs downward)
T1                     T2
read_lock (Y);
read_item (Y);
unlock (Y);
                       read_lock (X);
                       read_item (X);
                       unlock (X);
                       write_lock (Y);
                       read_item (Y);
                       Y:=X+Y;
                       write_item (Y);
                       unlock (Y);
write_lock (X);
read_item (X);
X:=X+Y;
write_item (X);
unlock (X);
Result: X=50; Y=50
Nonserializable because it violated the two-phase policy.
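The three outcomes can be verified with a short calculation. The function names below are illustrative assumptions; the interleaved case assumes that T1 reads Y and T2 reads X before either transaction writes, which is the ordering that produces the stated result.

```python
def serial(order, x=20, y=30):
    """Run T1 (X := X + Y) and T2 (Y := X + Y) one after the other."""
    for txn in order:
        if txn == "T1":
            x = x + y
        else:
            y = x + y
    return x, y

def interleaved(x=20, y=30):
    """Both transactions read their first item, unlock it early, then write."""
    y_seen_by_t1 = y        # T1: read_item(Y), then unlock(Y)
    x_seen_by_t2 = x        # T2: read_item(X), then unlock(X)
    y = x_seen_by_t2 + y    # T2: Y := X + Y; write_item(Y)
    x = x + y_seen_by_t1    # T1: X := X + Y; write_item(X)
    return x, y
```

The interleaved result (50, 50) equals neither serial result, (50, 80) or (70, 50), which is why the schedule is not serializable.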
Concurrency Control (cont…)
• Guaranteeing serializability by the Two-Phase Locking Protocol (2PL)
– A transaction is said to follow two phase locking protocol if all
locking operations (either read_lock or write_lock) precede the
first unlock operation in the transaction
– This is a protocol which ensures conflict-serializable schedules.
• Phase 1: Growing Phase
– transaction may obtain locks
– transaction may not release locks
• Phase 2: Shrinking Phase
– transaction may release locks
– transaction may not obtain locks
– It can be proved that the transactions can be serialized in the
order of their lock points (i.e. the point where a transaction
acquired its final lock).
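The two-phase rule is a property of a single transaction's operation sequence, so it can be checked with one pass. The sketch below assumes operations are given as strings such as "read_lock(Y)"; the function name is hypothetical.

```python
def follows_two_phase_locking(ops):
    """True if no lock request appears after the first unlock."""
    unlocked = False
    for op in ops:
        name = op.split("(")[0].strip().lower()
        if name == "unlock":
            unlocked = True          # shrinking phase has begun
        elif name in ("read_lock", "write_lock") and unlocked:
            return False             # tried to grow again after shrinking
    return True

# T1 unlocks Y before locking X, so it breaks the rule.
T1 = ["read_lock(Y)", "read_item(Y)", "unlock(Y)", "write_lock(X)",
      "read_item(X)", "X:=X+Y", "write_item(X)", "unlock(X)"]
# The rearranged version acquires both locks before its first unlock.
T1_PRIME = ["read_lock(Y)", "read_item(Y)", "write_lock(X)", "unlock(Y)",
            "read_item(X)", "X:=X+Y", "write_item(X)", "unlock(X)"]
```

Here `follows_two_phase_locking(T1)` is false and `follows_two_phase_locking(T1_PRIME)` is true.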
Concurrency Control (cont…)
T’1                    T’2
read_lock (Y);         read_lock (X);
read_item (Y);         read_item (X);
write_lock (X);        write_lock (Y);
unlock (Y);            unlock (X);
read_item (X);         read_item (Y);
X:=X+Y;                Y:=X+Y;
write_item (X);        write_item (Y);
unlock (X);            unlock (Y);
T’1 and T’2 follow the two-phase policy but they are subject to deadlock
Concurrency Control (cont…)
• Two-phase locking policy generates two locking algorithms
– (a) Basic
– (b) Conservative
– Basic:
• Transaction locks data items incrementally. This may cause
deadlock
– Conservative:
• Prevents deadlock by locking all desired data items before
transaction begins execution.
Concurrency Control (cont…)
Dealing with Deadlock and Starvation
Deadlock: occurs when each transaction T in a set of two or more
transactions is waiting for an item that is locked by some other
transaction T’ in the set
Example of deadlock situation (time runs downward):
T’1                          T’2
read_lock (Y);
read_item (Y);
                             read_lock (X);
                             read_item (X);
write_lock (X);
(waits for X)
                             write_lock (Y);
                             (waits for Y)
T’1 and T’2 enter deadlock
Deadlock and Starvation (cont…)
• Deadlock prevention
1. A transaction locks all the needed data items before it
begins execution
• This way of locking prevents deadlock since a
transaction never waits for a data item
• If any of the items cannot be obtained, none of the
items are locked. Rather the transaction tries again to
lock all the items it needs
• This solution limits concurrency and is generally not a
practical assumption
Deadlock and Starvation (cont…)
• Deadlock prevention (cont…)
2. Making a decision about what to do with a transaction
involved in a possible deadlock situation:
– Should it be blocked and made to wait, or should it be
aborted?
– Should the transaction preempt and abort another
transaction?
• These concepts use transaction timestamp TS(T) which is a
unique identifier assigned to each transaction based on the
order they started
• If transaction T1 starts before transaction T2, then
TS (T1) < TS(T2)
Deadlock and Starvation (cont…)
• Deadlock prevention
– Two schemes that prevent deadlock based on timestamps
are wait-die and wound-wait
– Suppose that transaction Ti (say, TS(Ti) = 100) tries to lock an item X but is not
able to do so because X is locked by some other transaction Tj (say, TS(Tj) = 90)
with a conflicting lock. The rules followed by these two schemes
are as follows:
• Wait-die: If TS(Ti) < TS(Tj), i.e. Ti is older than Tj, then Ti is allowed to
wait; otherwise, abort Ti (Ti dies) and restart it later with the same
timestamp
• Wound-wait: If TS(Ti) < TS(Tj), abort Tj (Ti wounds Tj) and restart it
later with the same timestamp; otherwise Ti is allowed to wait.
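Both rules reduce to a single timestamp comparison. In the sketch below the function names and return strings are assumptions for illustration, and the values 100 and 90 are read as the timestamps of the requesting transaction Ti and the lock holder Tj.

```python
def wait_die(ts_requester, ts_holder):
    """An older requester waits; a younger requester dies (is aborted)."""
    return "wait" if ts_requester < ts_holder else "abort requester"

def wound_wait(ts_requester, ts_holder):
    """An older requester wounds (aborts) the holder; a younger requester waits."""
    return "abort holder" if ts_requester < ts_holder else "wait"

# Ti (timestamp 100) is younger than Tj (timestamp 90), which holds the lock.
younger_requests = (wait_die(100, 90), wound_wait(100, 90))
# Roles swapped: the older transaction requests a lock held by the younger one.
older_requests = (wait_die(90, 100), wound_wait(90, 100))
```

With Ti younger than Tj, Ti is aborted under wait-die and waits under wound-wait. In both schemes it is always the younger transaction that is aborted, so no cycle of waiting transactions can form.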
Deadlock and Starvation (cont…)
• Another group of protocols that prevent deadlock do not
require timestamps
• No waiting (NW) – If a transaction is unable to obtain a lock, it
is immediately aborted and restarted after some time
delay
– This method can cause transactions to abort and restart
needlessly
• Cautious waiting (CW) – Suppose that transaction Ti tries to lock
an item X but is not able to do so because X is locked by some other
transaction Tj with a conflicting lock; the cautious waiting rule is:
• If Tj is not blocked (not waiting for some other locked item), then Ti is
blocked and allowed to wait; otherwise abort Ti
Deadlock and Starvation (cont…)
• Starvation
– Starvation occurs when a particular transaction is repeatedly made
to wait or restarted and never gets a chance to proceed further
– In deadlock resolution, the same transaction may repeatedly
be selected as the victim and rolled back
– This limitation is inherent in all priority based scheduling
mechanisms
– In Wound-Wait scheme, a younger transaction may always be
wounded (aborted) by a long running older transaction which
may create starvation
Concurrency control (cont…)
2. Timestamp ordering technique
– Time stamp(TS) is a unique identifier created by the DBMS
to identify a transaction
– A larger timestamp value indicates a more recent event or
operation
– Timestamp based algorithm uses timestamp to serialize the
execution of concurrent transaction
Concurrency Control (cont…)
3. Multiversion concurrency control techniques
– This approach maintains a number of versions of a
data item and allocates the right version to a read
operation of a transaction.
– Unlike other mechanisms a read operation in this
mechanism is never rejected
– Side effects:
• More storage (RAM and disk) is required to
maintain multiple versions
• To check unlimited growth of versions, garbage
collection is run when some criterion is satisfied
Concurrency Control (cont…)
4. Validation (Optimistic) Concurrency Control Schemes
– In this technique, serializability is checked only at the time of commit
and transactions are aborted in case of non-serializable schedules
– No checking is done while transaction is executing
– In this scheme, updates in the transaction are not applied directly to
the database item until it reaches its commit point
– Three phases:
1. Read phase
2. Validation phase
3. Write phase
1. Read phase: A transaction can read values of committed data items. However,
updates are applied only to local copies (versions) of the data items (in
database cache)
4. Validation (Optimistic) Concurrency Control Schemes
2. Validation phase: Serializability is checked before
transactions write their updates to the database.
3. Write phase: On a successful validation transactions’
updates are applied to the database; otherwise,
transactions are restarted
Database Recovery
1 Purpose of Database Recovery
– To bring the database into the last consistent state, which
existed prior to the failure.
– To preserve transaction properties , particularly Durability
• Example:
– If the system crashes before a fund transfer transaction
completes its execution, then either one or both accounts
may have incorrect value. Thus, the database must be
restored to the state before the transaction modified any
of the accounts.
Recovery(cont…)
2. What causes a Transaction to fail?
1. Local errors or exception conditions detected by the
transaction:
– Certain conditions necessitate cancellation of the
transaction
• For example, data for the transaction may not
be found
– A programmed abort in the transaction causes it
to fail.
2. Concurrency control enforcement:
– The concurrency control method may decide to
abort the transaction, to be restarted later, because
it violates serializability or because several
transactions are in a state of deadlock
Recovery(cont…)
3. Disk failure:
• Some disk blocks may lose their data because of a read
or write malfunction or because of a disk read/write
head crash.
• This may happen during a read or a write operation of
the transaction.
4. Physical problems and catastrophes:
This refers to an endless list of problems that includes
power or air-conditioning failure, fire, theft, overwriting
disks or tapes by mistake, and mounting of a wrong tape
by the operator.
Recovery using log records
• If the system crashes, we can recover to a consistent
database state by examining the log records and using
recovery methods.
1. Because the log contains a record of every write
operation that changes the value of some database item,
it is possible to undo the effect of these write operations
of a transaction T by tracing backward through the log
and resetting all items changed by a write operation of T
to their old_values.
2. We can also redo the effect of the write operations of a
transaction T by tracing forward through the log and
setting all items changed by a write operation of T (that
did not get done permanently) to their new_values.
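The undo and redo passes described above can be sketched over a list of log records. The record layout follows the [write_item,T,X,old_value,new_value] form used earlier in the chapter; the `recover` function, the tuple encoding, and the sample values are assumptions made for illustration.

```python
def recover(db, log):
    """Undo writes of uncommitted transactions, then redo writes of committed ones."""
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    # Undo: trace backward, resetting items to their old values.
    for rec in reversed(log):
        if rec[0] == "write_item" and rec[1] not in committed:
            _, _, item, old_value, _ = rec
            db[item] = old_value
    # Redo: trace forward, setting items to their new values.
    for rec in log:
        if rec[0] == "write_item" and rec[1] in committed:
            _, _, item, _, new_value = rec
            db[item] = new_value
    return db

log = [
    ("start_transaction", "T1"),
    ("write_item", "T1", "X", 100, 200),
    ("start_transaction", "T2"),
    ("write_item", "T2", "Y", 50, 75),
    ("commit", "T1"),
]
# Hypothetical disk state at the crash: T1's write to X had not reached disk,
# while T2's uncommitted write to Y had.
state = recover({"X": 100, "Y": 75}, log)
```

After recovery X holds the committed value 200 and Y is back at its old value 50.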
Recovery(cont…)
Commit Point of a Transaction:
• Definition of a commit point:
– A transaction T reaches its commit point when all its
operations that access the database have been executed
successfully and the effect of all the transaction operations on
the database has been recorded in the log
– Beyond the commit point, the transaction is said to be
committed, and its effect is assumed to be permanently
recorded in the database.
• The transaction then writes a commit record
[commit,T] into the log
Recovery(cont…)
• Undoing transactions
– If a system failure occurs, we search back in the log for all
transactions T that have written a [start_transaction,T] record into
the log but have not yet written a [commit,T] record
• These transactions have to be rolled back to undo their
effects on the database during recovery process
• Redoing transactions:
– Transactions that have written their commit entry in the log
must also have recorded all their write operations in the log;
otherwise they would not be committed, so their effect on the
database can be redone from the log entries.
– Notice that the log file must be kept on disk. At the time of a
system crash, only the log entries that have been written back to
disk are considered in the recovery process, because the contents
of main memory may be lost.
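The classification described above can be sketched as a single pass over the log entries that reached disk. The tuple-based record format, the sample log and the function name classify are assumptions made for this example.

```python
def classify(log):
    """Return (redo_list, undo_list) from the log entries that reached disk."""
    started, committed = [], set()
    for rec in log:
        kind, txn = rec[0], rec[1]
        if kind == "start_transaction":
            started.append(txn)
        elif kind == "commit":
            committed.add(txn)
    # Committed transactions can be redone from the log.
    redo_list = [t for t in started if t in committed]
    # Started but uncommitted transactions must be rolled back.
    undo_list = [t for t in started if t not in committed]
    return redo_list, undo_list

# Hypothetical log contents on disk at the time of the crash.
log_on_disk = [
    ("start_transaction", "T1"),
    ("write_item", "T1", "A", 5, 7),
    ("start_transaction", "T2"),
    ("commit", "T1"),
    ("write_item", "T2", "B", 1, 2),
]

redo_list, undo_list = classify(log_on_disk)
```

Here T1 lands in the redo list because its [commit,T1] entry is on disk, while T2 lands in the undo list because it has a start entry but no commit entry.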
Recovery(cont…)
3 Transaction Log
– For recovery from any type of failure, the data values prior to
modification (BFIM - BeFore IMage) and the new values after
modification (AFIM - AFter IMage) are required.
– These values and other information are stored in a sequential file
called the transaction log. A sample log is given below.
P    Operation    Data item    BFIM       AFIM
     Begin
     Write        X            X = 100    X = 200
     Begin
     W            Y            Y = 50     Y = 100
     R            M            M = 200    M = 200
     R            N            N = 400    N = 400
     End
Database Recovery(cont…)
4 Data Update
– Immediate Update: As soon as a data item is modified in cache,
the disk copy is updated.
– Deferred Update: All modified data items in the cache are
written to disk either after a transaction ends its execution or
after a fixed number of transactions have completed their execution.
– Shadow update: The modified version of a data item does not
overwrite its disk copy but is written at a separate disk location.
– In-place update: The disk version of the data item is overwritten
by the cache version.
Database Recovery(cont…)
5 Data Caching
– Data items to be modified are first stored into
database cache by the Cache Manager (CM)
– After modification, they are flushed (written) to the
disk.
– The flushing is controlled by dirty and pin-unpin
bits.
• Dirty bit = 1: indicates that the data item has been modified.
• Pin-unpin bit: a page in the cache is pinned (bit value = 1)
if it cannot be written back to disk yet.
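How the two bits control flushing can be sketched as follows. The CachedPage class, the flush function and the sample values are hypothetical names and data chosen for illustration; a real cache manager is considerably more involved.

```python
from dataclasses import dataclass

@dataclass
class CachedPage:
    value: int
    dirty: bool = False   # True (bit = 1): modified since it was read from disk
    pinned: bool = False  # True (bit = 1): must not be written back to disk yet

def flush(cache, disk):
    """Write back every dirty, unpinned page and clear its dirty bit."""
    flushed = []
    for name, page in cache.items():
        if page.dirty and not page.pinned:
            disk[name] = page.value
            page.dirty = False
            flushed.append(name)
    return flushed

disk = {"X": 100, "Y": 50, "Z": 7}
cache = {
    "X": CachedPage(200, dirty=True),               # modified, may be flushed
    "Y": CachedPage(100, dirty=True, pinned=True),  # modified, but pinned
    "Z": CachedPage(7),                             # clean, nothing to write
}

flushed = flush(cache, disk)
```

Only X is written back: Y stays in the cache with its dirty bit still set until it is unpinned, and Z is skipped because it was never modified.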
Database Recovery
6 Transaction Roll-back (Undo) and Roll-Forward (Redo)
– To maintain atomicity, a transaction’s operations are
redone or undone.
• Undo: Restore all BFIMs onto disk (remove all AFIMs).
• Redo: Restore all AFIMs onto disk.
– Database recovery is achieved either by performing only
Undos or only Redos or by a combination of the two.
Database Recovery
[Figure: The read and write operations of three transactions]
Database Recovery
[Figure: System log at the time of crash]
Database Recovery
Write-Ahead Logging
• When in-place update (immediate or deferred) is used, a
log is necessary for recovery and it must be available to the
recovery manager. This is achieved by the Write-Ahead Logging
(WAL) protocol.
– WAL states that:
• For Undo: Before a data item’s AFIM is flushed to the database disk
(overwriting the BFIM), its BFIM must be written to the log and the
log must be saved on a stable store (log disk).
• For Redo: Before a transaction executes its commit operation, all
its AFIMs must be written to the log and the log must be saved on a
stable store.
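The two WAL rules can be sketched with a small in-memory model. The class WalStore, its method names and the record layout are assumptions made for this example. Stable storage is simulated with a Python list, so this shows only the ordering of actions, not real durability.

```python
class WalStore:
    def __init__(self, disk):
        self.disk = disk        # database disk: item -> value
        self.cache = {}         # item -> AFIM not yet flushed to the database disk
        self.log_buffer = []    # log records still in main memory
        self.stable_log = []    # log records saved on the stable store (log disk)

    def write(self, txn, item, afim):
        """Update the cached copy and log both the BFIM and the AFIM."""
        bfim = self.cache.get(item, self.disk.get(item))
        self.log_buffer.append(("write_item", txn, item, bfim, afim))
        self.cache[item] = afim

    def force_log(self):
        """Save all buffered log records on the stable store."""
        self.stable_log.extend(self.log_buffer)
        self.log_buffer.clear()

    def flush_item(self, item):
        # Undo rule: the BFIM must be in the stable log before it is overwritten.
        self.force_log()
        self.disk[item] = self.cache.pop(item)

    def commit(self, txn):
        # Redo rule: forcing the log here saves every AFIM of txn,
        # followed by its commit record, on the stable store.
        self.log_buffer.append(("commit", txn))
        self.force_log()

store = WalStore({"X": 100})
store.write("T1", "X", 200)
store.flush_item("X")     # the log record reaches the stable store first
store.commit("T1")
```

In this sketch the log is forced in flush_item before the disk copy is overwritten, so the old value 100 can always be recovered from the stable log if T1 later has to be undone.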
Database Recovery
7 Checkpointing
– Randomly or under some criteria, the database flushes its
buffer to the database disk to minimize the task of recovery. The
following steps define a checkpoint operation:
1. Suspend execution of transactions temporarily.
2. Force-write modified buffer data to disk.
3. Write a [checkpoint] record to the log and save the log to disk.
4. Resume normal transaction execution.
– During recovery, redo or undo is required only for transactions
appearing after the [checkpoint] record.
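The four steps, and the way a checkpoint limits the work of recovery, can be sketched as follows. The function names, the record format and the sample data are hypothetical. The cache is assumed to hold only modified items, and suspending or resuming transactions is represented only by comments because the sketch is single-threaded.

```python
def take_checkpoint(cache, disk, log):
    """Perform the four checkpoint steps on simple in-memory structures."""
    # 1. Suspend execution of transactions (not modelled here).
    # 2. Force-write modified buffer data to disk.
    disk.update(cache)
    cache.clear()
    # 3. Write a [checkpoint] record to the log.
    log.append(("checkpoint",))
    # 4. Resume normal transaction execution (not modelled here).

def transactions_after_checkpoint(log):
    """Transactions with at least one log record after the last [checkpoint]."""
    last = max((i for i, rec in enumerate(log) if rec[0] == "checkpoint"), default=-1)
    seen = []
    for rec in log[last + 1:]:
        if len(rec) > 1 and rec[1] not in seen:
            seen.append(rec[1])
    return seen

disk = {"A": 1}
cache = {"A": 2}   # A was modified by T1 but not yet flushed
log = [
    ("start_transaction", "T1"),
    ("write_item", "T1", "A", 2),
    ("commit", "T1"),
]

take_checkpoint(cache, disk, log)
log += [("start_transaction", "T2"), ("write_item", "T2", "A", 3)]

pending = transactions_after_checkpoint(log)
```

T1 committed before the checkpoint and its update was forced to disk, so only T2 needs attention during recovery.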
Database Recovery
8 Recovery Scheme
• For Deferred Update (No Undo/Redo)
– The data update goes as follows:
• A set of transactions record their updates in the log.
• At commit point under WAL scheme, these updates are
saved on database disk.
• After reboot from a failure the log is used to redo all
the transactions affected by this failure. No undo is
required because no AFIM is flushed to the disk before
a transaction commits.
Database Recovery
[Figure: An example of recovery using deferred update in a single-user environment]
Database Recovery
Deferred Update with concurrent users
• This environment requires some concurrency control
mechanism to guarantee the isolation property of transactions.
During system recovery, transactions that were recorded in the
log after the last checkpoint are redone. The recovery
manager may also scan some of the transactions recorded before
the checkpoint to get the AFIMs.
Database Recovery
[Figure: An example of recovery using deferred update with concurrent transactions]
Database Recovery
Deferred Update with concurrent users
• Two tables are required for implementing this protocol:
– Active table: All active transactions are entered in this
table.
– Commit table: Transactions to be committed are entered
in this table.
• During recovery, all transactions in the commit table are
redone and all transactions in the active table are ignored, since
none of their AFIMs reached the database.
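The use of the two tables during recovery can be sketched as follows. The function name recover_deferred, the record format [write_item,T,X,new_value] as a tuple, and the sample log are assumptions for illustration. For simplicity the sketch redoes every committed transaction in the log it is given, whereas a real recovery manager would normally start from the last checkpoint.

```python
def recover_deferred(log, disk):
    """Redo committed transactions; ignore active ones (no undo is needed)."""
    commit_table = {rec[1] for rec in log if rec[0] == "commit"}
    started = {rec[1] for rec in log if rec[0] == "start_transaction"}
    active_table = started - commit_table
    # Redo in log order so that the last committed write to an item wins.
    for rec in log:
        if rec[0] == "write_item" and rec[1] in commit_table:
            _, _, item, new_value = rec
            disk[item] = new_value
    return commit_table, active_table

# Hypothetical log and database state at the time of the crash.
log = [
    ("start_transaction", "T1"),
    ("write_item", "T1", "P", 5),
    ("commit", "T1"),
    ("start_transaction", "T2"),
    ("write_item", "T2", "Q", 9),
]
disk = {"P": 1, "Q": 2}

commit_table, active_table = recover_deferred(log, disk)
```

T2 never committed, so under deferred update its write to Q never reached the database and Q keeps its old value without any undo step.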
Exercise
• Consider the log records shown below for transactions T1, T2,
T3 and T4, with initial values of B=15, C=50, D=40 and E=25. Using deferred
update, show the final values of B, C, D and E after recovery from failure if
the crash occurred after the indicated point.
Initial values
B     C     D     E
15    50    40    25
[Start_transaction,T1]
[write_item,T1,B,12]
[write_item,T1,D,10]
[commit T1]
[checkpoint]
[Start_transaction,T3]
[write_item,T3,E,30]
[commit T3]
[Start_transaction,T4]
[write_item,T4,B,18]
[commit T4]
[Start_transaction,T2]
[write_item,T2,C,28]
Crash
Final values after recovery
B     C     D     E
?     ?     ?     ?