Transaction Processing
John Ortiz
Introduction
Transactions are motivated by two of the
properties of DBMS's discussed way back in
our first lecture:
Multi-user database access
Safe from system crashes
Main issues:
How to model concurrent execution of user
programs?
How to guarantee acceptable DB behavior?
How to deal with system crashes?
Lecture 19
Transaction Processing
2
Why Concurrency?
Allowing only serial execution of user programs
may cause poor system performance
Low throughput, long response time
Poor resource utilization (CPU, disks)
Concurrent execution of user programs is
essential for good DBMS performance.
Because disk accesses are frequent, and
relatively slow, it is important to keep the
CPU humming by working on several user
programs concurrently
Example: Why Concurrency?
Assume each user’s program uses CPU and I/O
resources (disks) in an interleaved fashion:
CPU, R(X), CPU, W(X)
Suppose each CPU request takes 1 time unit
and each I/O request takes 5 time units.
For a 2 GHz Machine, one clock tick is ½ ns
An 8 millisecond seek time is 8000 microseconds,
which is 8,000,000 ns
Clearly the CPU can get quite a bit done while the
disk is searching for a block
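A quick check of the arithmetic above, as a small Python sketch. The variable names are illustrative and not part of the lecture.

```python
# Figures from the slide: a 2 GHz clock and an 8 ms disk seek.
clock_hz = 2_000_000_000
seek_ms = 8

tick_ns = 1_000_000_000 / clock_hz   # length of one clock tick in ns
seek_ns = seek_ms * 1_000_000        # seek time converted from ms to ns
ticks_per_seek = seek_ns / tick_ns   # clock ticks that elapse during one seek

print(tick_ns, seek_ns, ticks_per_seek)
```

One seek lasts about 16 million clock ticks, which is the time the CPU could spend on other user programs.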
Example: Why Concurrency?
[Figure: CPU and I/O timelines for T1 to T4 under two schedules]
Serial schedule: time units = 48
Non-serial schedule: time units = 41
Example: Why Concurrency?
[Figure: CPU and I/O timelines for T1 to T4, with the non-serial schedule using 2 disks (I/O 1, I/O 2)]
Serial schedule: time units = 48
Non-serial schedule, using 2 disks: time units = 22
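The 48, 41 and 22 time units can be reproduced with a small simulation. This is a sketch under assumptions the slides do not state: there is a single CPU, requests for a resource are served first-come-first-served, and an I/O request goes to whichever disk becomes free first. The function name makespan and its parameters are made up for illustration.

```python
def makespan(n_txns, n_disks=1, serial=False, cpu_time=1, io_time=5):
    """Total time for n_txns programs, each doing CPU, R(X), CPU, W(X)."""
    steps = ["cpu", "io", "cpu", "io"]
    if serial:
        # One program at a time: nothing overlaps, so the costs simply add up.
        per_txn = sum(cpu_time if s == "cpu" else io_time for s in steps)
        return n_txns * per_txn

    ready = [0] * n_txns       # time at which each program can issue its next step
    next_step = [0] * n_txns   # index of each program's next step
    cpu_free = 0               # time at which the CPU becomes free
    disk_free = [0] * n_disks  # time at which each disk becomes free

    while any(i < len(steps) for i in next_step):
        # Serve the program that became ready first (ties go to the lower id).
        pending = [i for i in range(n_txns) if next_step[i] < len(steps)]
        t = min(pending, key=lambda i: (ready[i], i))
        if steps[next_step[t]] == "cpu":
            start = max(ready[t], cpu_free)
            cpu_free = ready[t] = start + cpu_time
        else:
            d = min(range(n_disks), key=lambda k: disk_free[k])
            start = max(ready[t], disk_free[d])
            disk_free[d] = ready[t] = start + io_time
        next_step[t] += 1
    return max(ready)


print(makespan(4, serial=True))   # 48
print(makespan(4, n_disks=1))     # 41
print(makespan(4, n_disks=2))     # 22
```

With one disk the disk is the bottleneck (8 I/O requests of 5 units each, plus the first CPU step), so adding a second disk helps far more than a faster CPU would.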
Transaction
A user program may carry out many operations
on data retrieved from the database, but the DBMS
is only concerned with what data is read from or
written to the database (on disk)
A transaction is a sequence of database actions
that is considered as a unit of work
DB actions: read (R(X)), write (W(X)),
commit, abort
Represents the DBMS’s abstract view of
Interactive user sessions
Execution of user programs
Example: Transaction
Account(Ano, Name, Type, Balance)
A user wants to
update Account set Balance = Balance - 50
where Ano = 10001
update Account set Balance = Balance + 50
where Ano = 12300
Let A be account w/ Ano=10001, B be account
w/ Ano=12300. The transaction is
R(A), W(A), R(B), W(B)
States of a Transaction
[Figure: transaction state diagram]
States: active, partially committed, committed, failed, aborted
Transition labels: begin transaction, read/write, end transaction, exception, failure, commit, abort
Consistency of Transaction
Each transaction must leave the database in a
consistent state if the DB is consistent when
the transaction begins.
The DBMS will enforce some integrity constraints
(ICs), depending on the ICs declared in
CREATE TABLE statements.
Beyond this, the DBMS does not really
understand the semantics of the data. (e.g.,
it does not understand how the interest on a
bank account is computed).
Atomicity of Transactions
A transaction might commit after completing
all its actions, or it could abort (or be aborted
by the DBMS) after executing some actions.
A very important property guaranteed by the
DBMS for all transactions is that they are
atomic. That is, a user can think of a
transaction as always executing all its actions
in one step, or not executing any actions at all.
DBMS logs all actions so that it can undo the
actions of aborted transactions.
Example: Why Atomicity?
Account(Ano, Name, Type, Balance)
A user wants to
update Account set Balance = Balance - 50
where Ano = 10001
update Account set Balance = Balance + 50
where Ano = 12300
System crashed in the middle
Possible outcome w/o recovery:
$50 transferred or lost
The operations must be done as a unit
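A minimal sketch of the transfer as one atomic unit, using Python's built-in sqlite3 module. The starting balances, the names and the fail_midway switch are assumptions for illustration. The failure here is an exception inside the program rather than a real system crash, which a DBMS handles through its log.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Account (Ano INTEGER PRIMARY KEY, Name TEXT, "
             "Type TEXT, Balance INTEGER)")
conn.executemany("INSERT INTO Account VALUES (?, ?, ?, ?)",
                 [(10001, "A", "checking", 500), (12300, "B", "checking", 200)])
conn.commit()


def transfer(conn, src, dst, amount, fail_midway=False):
    """Move amount from src to dst; both updates happen or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("UPDATE Account SET Balance = Balance - ? WHERE Ano = ?",
                         (amount, src))
            if fail_midway:
                raise RuntimeError("simulated failure between the two updates")
            conn.execute("UPDATE Account SET Balance = Balance + ? WHERE Ano = ?",
                         (amount, dst))
    except RuntimeError:
        pass  # the partial update has already been rolled back


def balances(conn):
    return dict(conn.execute("SELECT Ano, Balance FROM Account"))


transfer(conn, 10001, 12300, 50, fail_midway=True)
print(balances(conn))   # both balances unchanged: the first update was undone
transfer(conn, 10001, 12300, 50)
print(balances(conn))   # $50 moved from 10001 to 12300
```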
Durability
The DBMS often keeps data in a main-memory buffer
to improve system efficiency. Data in the buffer is
volatile (may get lost if the system crashes)
When a transaction commits, the DBMS must
guarantee that all updates made by the
transaction will not be lost even if the system
crashes later
DBMS uses the log to redo actions of
committed transactions if necessary
Isolation
Users submit transactions, and can think of
each transaction as executing by itself (in
isolation)
Concurrency is achieved by the DBMS, which
interleaves actions (reads/writes of DB
objects) of various transactions
The DBMS guarantees that interleaved
transactions do not interfere with each
other
Example: Why Isolation?
Two users (programs) do this at the same time
User 1: update Student set GPA = 3.7 where
SID = 123
User 2: update Student set Major = 'CS'
where SID = 123
Sequence of events: for each user, read tuple,
modify attribute, write tuple.
Possible outcomes w/o concurrency control: one
change or both
Example: Why Isolation?
Emp(EID, Name, Dept, Sal, Start, Loc)
User 1: update Emp set Dept = 'Sales' where
Loc = 'Downtown'
User 2: update Emp set Start = 3/1/00 where
Start = 2/29/00
Possible outcomes w/o concurrency control:
each tuple has one change or both, may be
inconsistent across tuples
Example: Interleaved Transactions
Consider two transactions:
T1: BEGIN A=A+100, B=B-100 END
T2: BEGIN A=1.06*A, B=1.06*B END
One possible interleaved execution (time runs left to right):
T1: A=A+100,            B=B-100
T2:           A=1.06*A,          B=1.06*B
This is OK. But what about another interleaving?
T1: A=A+100,                      B=B-100
T2:           A=1.06*A, B=1.06*B
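The two interleavings can be compared against both serial orders with a short Python sketch. The starting balances of 1000 are an assumption, and each assignment is treated as one indivisible step. Fractions are used so the 6% interest is computed exactly.

```python
from fractions import Fraction

RATE = Fraction(106, 100)  # the factor 1.06, kept exact

# The four actions of T1 and T2, each applied to a shared state.
def t1_a(s): s["A"] += 100
def t1_b(s): s["B"] -= 100
def t2_a(s): s["A"] *= RATE
def t2_b(s): s["B"] *= RATE


def run(steps, a=1000, b=1000):
    """Apply the steps in order and return the final (A, B)."""
    state = {"A": Fraction(a), "B": Fraction(b)}
    for step in steps:
        step(state)
    return state["A"], state["B"]


serial_t1_t2 = run([t1_a, t1_b, t2_a, t2_b])
serial_t2_t1 = run([t2_a, t2_b, t1_a, t1_b])
first = run([t1_a, t2_a, t1_b, t2_b])    # first interleaving on the slide
second = run([t1_a, t2_a, t2_b, t1_b])   # second interleaving on the slide

print(first == serial_t1_t2)                    # True
print(second in (serial_t1_t2, serial_t2_t1))   # False
```

The first interleaving ends in the same state as running T1 and then T2. The second matches neither serial order: A receives interest on the transferred 100, but B also receives interest on the same 100 before it is deducted.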
Schedule: Modeling Concurrency
Schedule: a sequence of operations from a set
of transactions, where operations from any one
transaction are in their original order
Notation:
Ri(X): read X by Ti
Wi(X): write X by Ti
Example: R1(A), W1(A), R2(B), W2(B), R1(C), W1(C)
T1        T2
R(A)
W(A)
          R(B)
          W(B)
R(C)
W(C)
Schedule (cont.)
Represents some actual sequence of database
actions.
In a complete schedule, each transaction ends
in commit or abort.
A schedule transforms database from an initial
state to a final state
[Figure: Initial state -> A schedule -> Final state]
Schedule (cont.)
Assume a consistent initial state
A representation of an execution of operations
from a set of transactions
Ignore:
aborted transactions
incomplete (not yet committed) transactions
Operations in a schedule conflict if
1. They belong to different transactions
2. They access the same data item
3. At least one of them is a write operation
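The three conditions translate directly into a check over pairs of operations. The sketch below assumes schedules are written in the Ri(X) / Wi(X) notation used in these slides; the helper names parse and conflicts are made up for illustration.

```python
import re


def parse(schedule):
    """Turn 'R1(A), W2(A)' into [('R', 1, 'A'), ('W', 2, 'A')]."""
    return [(op, int(txn), item)
            for op, txn, item in re.findall(r"([RW])(\d+)\((\w+)\)", schedule)]


def conflicts(schedule):
    """List every conflicting pair of operations, in schedule order."""
    ops = parse(schedule)
    pairs = []
    for i, (op1, t1, x1) in enumerate(ops):
        for op2, t2, x2 in ops[i + 1:]:
            # 1. different transactions, 2. same item, 3. at least one write
            if t1 != t2 and x1 == x2 and "W" in (op1, op2):
                pairs.append((f"{op1}{t1}({x1})", f"{op2}{t2}({x2})"))
    return pairs


print(conflicts("R1(A), W1(A), R2(B), W2(B), R1(C), W1(C)"))
print(conflicts("R1(A), R2(A), W2(A), W1(A)"))
```

The first schedule has no conflicts because T1 and T2 never touch the same item. The second has three conflicting pairs, all on A.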
Anomalies with Concurrency
Interleaving transactions may cause many kinds
of consistency problems
Reading Uncommitted Data (“dirty reads”):
R1(A), W1(A), R2(A), W2(A), C2, R1(B), A1
(Ci: Ti commits; Ai: Ti aborts)
Unrepeatable Reads:
R1(A), R2(A), W2(A), C2, R1(A), W1(A), C1
Overwriting Uncommitted Data (lost update):
R1(A), R2(A), W2(A), W1(A)
Anomalies with Concurrency
Incorrect Summary Problem
Data items may be changed by one transaction while
another transaction is in the process of calculating
an aggregate value
A correct “sum” may be obtained prior to any
change, or immediately after any change
Serial Schedule
An acceptable schedule must transform
database from a consistent state to another
consistent state
Serial schedule: one transaction runs entirely
before the next transaction starts.
T1: R(X), W(X)
T2: R(X), W(X)
Serial: R1(X) W1(X) C1 R2(X) W2(X) C2
Serial: R2(X) W2(X) C2 R1(X) W1(X) C1
Non-serial: R1(X) R2(X) W2(X) W1(X) C1 C2
Serial Schedule IS Acceptable
Serial schedules guarantee transaction
isolation & consistency
Different serial schedules can have different
final states
N transactions may form N! different serial
schedules
Any state from a serial schedule is
acceptable – DBMS makes no guarantee
about the order in which transactions are
executed
Example: Serial Schedules
T1: R(X), X=X+10, W(X)
T2: R(X), X=X*2, W(X)
Initial: X = 20
S1: R1(X) W1(X) C1 R2(X) W2(X) C2    Final: X = 60
S2: R2(X) W2(X) C2 R1(X) W1(X) C1    Final: X = 50
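Both final states can be checked by enumerating every serial order. This is a minimal Python sketch in which each transaction is modeled as a function on X; the name serial_outcomes is made up for illustration.

```python
from itertools import permutations

# T1 adds 10 to X, T2 doubles X (as in the example above).
transactions = {
    "T1": lambda x: x + 10,
    "T2": lambda x: x * 2,
}


def serial_outcomes(transactions, initial):
    """Final value of X for every serial order of the transactions."""
    outcomes = {}
    for order in permutations(transactions):  # N transactions give N! orders
        x = initial
        for name in order:
            x = transactions[name](x)
        outcomes[order] = x
    return outcomes


print(serial_outcomes(transactions, 20))
# {('T1', 'T2'): 60, ('T2', 'T1'): 50}
```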
Is Non-Serial Schedule Acceptable?
T1: R(X), X=X*2, W(X), R(Y), Y=Y-5, W(Y)
T2: R(X), X=X+10, W(X)
S1: R1(X) W1(X) R2(X) W2(X) R1(Y) W1(Y) C1 C2
S2: R1(X) W1(X) R1(Y) W1(Y) C1 R2(X) W2(X) C2
Initial: X=20, Y=35
Final (for both S1 and S2): X=50, Y=30
Serializable Schedules
Serializable schedule: Equivalent to a serial
schedule of committed transactions.
Non-serial (allow concurrent execution)
Acceptable (final state is what some serial
schedule would have produced)
Types of Serializable schedules: depend on how
the equivalency is defined
Conflict: based on conflicting operations
View: based on viewing of data
Ex: p.645, text does not show commits
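Conflict serializability is commonly tested with a precedence graph. The slides do not spell this test out, so treat it as a supplementary sketch: add an edge Ti -> Tj whenever an operation of Ti conflicts with a later operation of Tj, and the schedule is conflict serializable exactly when the graph has no cycle. The code assumes all transactions in the schedule commit, and the function names are made up for illustration.

```python
import re


def parse(schedule):
    """Turn 'R1(X) W2(X)' into [('R', 1, 'X'), ('W', 2, 'X')]; commits are skipped."""
    return [(op, int(txn), item)
            for op, txn, item in re.findall(r"([RW])(\d+)\((\w+)\)", schedule)]


def precedence_edges(schedule):
    """Edges (Ti, Tj): some operation of Ti conflicts with a later one of Tj."""
    ops = parse(schedule)
    edges = set()
    for i, (op1, t1, x1) in enumerate(ops):
        for op2, t2, x2 in ops[i + 1:]:
            if t1 != t2 and x1 == x2 and "W" in (op1, op2):
                edges.add((t1, t2))
    return edges


def is_conflict_serializable(schedule):
    edges = precedence_edges(schedule)
    nodes = {t for edge in edges for t in edge}
    # Repeatedly remove nodes with no incoming edge; a cycle leaves nodes behind.
    while nodes:
        sources = {n for n in nodes if not any(dst == n for _, dst in edges)}
        if not sources:
            return False
        nodes -= sources
        edges = {(a, b) for a, b in edges if a not in sources}
    return True


print(is_conflict_serializable("R1(X) W1(X) R2(X) W2(X) R1(Y) W1(Y) C1 C2"))  # True
print(is_conflict_serializable("R1(X) R2(X) W2(X) W1(X) C1 C2"))              # False
```

The first schedule has only the edge T1 -> T2, so it is equivalent to running T1 and then T2. The second has edges in both directions, so no serial order matches it.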
Lock-Based Concurrency Control
Strict Two-phase Locking (Strict 2PL)
Protocol:
Each transaction must obtain an S (shared)
lock on object before reading, and an X
(exclusive) lock on object before writing.
All locks held by a transaction are released
when the transaction completes
If a transaction holds an X lock on an
object, no other transaction can get a lock
(S or X) on that object.
Strict 2PL allows only serializable schedules.
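The locking rules can be sketched as a toy lock table in Python. This is an illustration under assumptions, not how a real DBMS implements locking: a refused request simply returns False instead of making the transaction wait, a sole holder may upgrade S to X, and the class and method names are made up.

```python
class LockManager:
    """Toy Strict 2PL lock table: grants or refuses locks, never queues."""

    def __init__(self):
        self.locks = {}  # object -> (mode, set of holder transaction ids)

    def acquire(self, txn, obj, mode):
        """Try to take an 'S' or 'X' lock on obj for txn; True if granted."""
        held = self.locks.get(obj)
        if held is None:
            self.locks[obj] = (mode, {txn})
            return True
        held_mode, holders = held
        if holders == {txn}:
            # Sole holder: keep the lock, upgrading S to X if requested.
            self.locks[obj] = ("X" if "X" in (mode, held_mode) else "S", holders)
            return True
        if mode == "S" and held_mode == "S":
            holders.add(txn)  # shared locks are compatible with each other
            return True
        return False  # a conflicting lock is held by another transaction

    def release_all(self, txn):
        """Called only at commit or abort, as Strict 2PL requires."""
        for obj in list(self.locks):
            _, holders = self.locks[obj]
            holders.discard(txn)
            if not holders:
                del self.locks[obj]


lm = LockManager()
print(lm.acquire(1, "A", "S"))  # True: first reader
print(lm.acquire(2, "A", "S"))  # True: readers can share
print(lm.acquire(2, "A", "X"))  # False: T1 still holds a shared lock
lm.release_all(1)               # T1 commits
print(lm.acquire(2, "A", "X"))  # True: T2 is now the only holder
```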
Cascading Aborts
When a transaction aborts, all its actions are
undone. DBMS uses a log to keep track of
actions of each transaction
If T2 reads uncommitted data written by T1
(dirty read) and T1 must be aborted, then T2 must
also be aborted (cascading aborts)
T1: R(A) W(A)             … Abort
T2:            R(A) W(A) …
Cascadeless schedule: transactions only read
data from committed transactions
Recoverability
If a transaction fails, the DBMS must return
the DB to its previous state
1. Computer failure – hw, sw, network, memory error
2. Transaction error – erroneous input, division by zero
3. Local errors – insufficient funds, data not found
4. Concurrency control enforcement – transaction aborted
5. Disk failure – hard disk crash (listed in text but not much different from 1.)
6. Physical catastrophe – power, theft, fire, etc.
Recoverability
If T1 reads data from T2, commits and then
T2 needs to abort, what should DBMS do?
This situation is undesirable!
A schedule is recoverable if every transaction
commits only after all transactions from which
it reads data commit.
Cascadeless schedules are recoverable (but not
vice-versa!).
Real systems typically ensure that only
recoverable schedules arise (through locking).
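The two definitions can be checked mechanically on a schedule written in the notation of these slides, with Ci for commit and Ai for abort. The sketch below is a simplified illustration: the function names are made up, and an abort is modeled by discarding the aborted transaction's writes.

```python
import re


def parse(schedule):
    """Turn 'W1(A), R2(A), C1' into [('W', 1, 'A'), ('R', 2, 'A'), ('C', 1, '')]."""
    return [(op, int(txn), item)
            for op, txn, item in re.findall(r"([RWCA])(\d+)(?:\((\w+)\))?", schedule)]


def classify(schedule):
    """Return (recoverable, cascadeless) for the schedule."""
    writers = {}         # item -> transactions that wrote it, oldest first
    reads_from = set()   # (reader, writer) pairs
    finished = {}        # transaction -> 'C' or 'A'
    recoverable = cascadeless = True
    for op, txn, item in parse(schedule):
        if op == "W":
            writers.setdefault(item, []).append(txn)
        elif op == "R":
            history = writers.get(item, [])
            writer = history[-1] if history else None
            if writer is not None and writer != txn:
                reads_from.add((txn, writer))
                if finished.get(writer) != "C":
                    cascadeless = False   # dirty read of uncommitted data
        elif op == "C":
            # Every transaction this one read from must already have committed.
            if any(r == txn and finished.get(w) != "C" for r, w in reads_from):
                recoverable = False
            finished[txn] = "C"
        else:  # abort: undo this transaction's writes
            for history in writers.values():
                history[:] = [t for t in history if t != txn]
            finished[txn] = "A"
    return recoverable, cascadeless


print(classify("R1(A), W1(A), R2(A), W2(A), C2, R1(B), A1"))  # (False, False)
print(classify("W1(A), R2(A), C1, C2"))                       # (True, False)
print(classify("W1(A), C1, R2(A), C2"))                       # (True, True)
```

The second schedule shows why the converse fails: T2 performs a dirty read, so the schedule is not cascadeless, yet T2 commits only after T1 does, so it is still recoverable.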
Summary
Transactions model the DBMS’s view of user
programs
Concurrency control and recovery are
important issues in DBMSs
Transactions must have ACID properties
Atomicity
Consistency
Isolation
Durability
C & I are guaranteed by concurrency control
A & D are guaranteed by crash recovery
Summary (cont.)
Schedule models concurrent execution of
transactions
Conflicts arise when two transactions access
the same object, and one of the transactions is
modifying it
Serial execution is our model of correctness
Serializability allows us to “simulate” serial
execution with better performance
Concurrent execution should avoid cascading
aborts and be recoverable