Uploaded by Kevin Kim

13 - Transactions and Concurrency V1 Read-Only

Transactions and
Concurrency
Topics for this lecture
• Transactions and Concurrency control
– Transactions
– Concurrency control
– Nested Transactions
– Locks
• Distributed Transactions
– Flat and nested distributed transactions
– Atomic commit protocols
– Concurrency control
– Distributed Deadlocks
– Transaction recovery
Transactions
• The goal of transactions
– the objects managed by a server must remain
in a consistent state
• when they are accessed by multiple transactions
and
• in the presence of server crashes
• Recoverable objects
– can be recovered after their server crashes
– objects are stored in permanent storage
Transaction Atomicity
1. All or nothing:
– Either every operation in the transaction
completes OR
– No operations complete at all
– Failure atomicity
• effects are atomic even when the server crashes
– Durability
• after a transaction has completed successfully
• all its effects are saved in permanent storage
Transaction Atomicity
2. Isolation
– Each transaction must be performed without
interference from other transactions
– Other transactions cannot see a transaction's
intermediate effects
– Only once ALL of the operations are complete
can other transactions see the results
ACID properties
• Atomicity
- All or nothing
• Consistent
- A transaction moves from one consistent state to
another consistent state
• Isolated
- Intermediate effects are isolated until the transaction
has completed
• Durable
- Once completed a transaction’s effects are
permanent
Concurrency
• Concurrency occurs when two or more
execution flows are able to run
simultaneously
• Concurrency can cause problems if it is
not managed
• We shall examine two problems
– Lost update
– Inconsistent retrieval
Serial execution of T and U
Opening balances
A=100, B = 200 and C = 300
Transaction T
balance = b.getBalance(); // balance = 200
b.setBalance(balance*1.1); // balance*1.1 = 220
a.withdraw(balance/10); // balance/10 = 20
Transaction U
balance = b.getBalance(); // balance = 220
b.setBalance(balance*1.1); // balance*1.1 = 242
c.withdraw(balance/10); // balance/10 = 22
Resulting balances
A=80, B = 242 and C = 278
Interleaved execution and a lost
update
Transaction T
balance = b.getBalance();
b.setBalance(balance*1.1);
a.withdraw(balance/10);
Transaction U
balance = b.getBalance();
b.setBalance(balance*1.1);
c.withdraw(balance/10);
Interleaving (opening balances A=100, B = 200 and C = 300)
T: balance = b.getBalance(); // $200
U: balance = b.getBalance(); // $200
U: b.setBalance(balance*1.1); // B = $220
T: b.setBalance(balance*1.1); // B = $220
T: a.withdraw(balance/10); // A = $80
U: c.withdraw(balance/10); // C = $280
Resulting balances from this interleaving
A=80, B = 220 and C = 280
Different to serial execution!!
Serial execution of V and W
Opening balances
A=100, B = 200
Transaction V
a.withdraw(100); // a.balance = 0
b.deposit(100); // b.balance = 300
Transaction W
total=a.getBalance(); // balance = 0
total=total+b.getBalance(); // balance = 300
// total = 300
Resulting balances
A=0, B = 300 and total = 300
Interleaved execution and an
inconsistent retrieval
Transaction V
a.withdraw(100);
b.deposit(100);
Transaction W
aBranch.branchTotal();
Interleaving (opening balances A=100, B = 200)
V: a.withdraw(100);
W: total=a.getBalance();
W: total=total+b.getBalance();
V: b.deposit(100);
Resulting balances from this interleaving
A=0, B = 300 and total=200 <<< WRONG!!
Interleaved execution that is
serially equivalent to T and U
Transaction T
balance = b.getBalance();
b.setBalance(balance*1.1);
a.withdraw(balance/10);
Transaction U
balance = b.getBalance();
b.setBalance(balance*1.1);
c.withdraw(balance/10);
Interleaving (opening balances A=100, B = 200 and C = 300)
T: balance = b.getBalance(); // $200
T: b.setBalance(balance*1.1); // B = $220
U: balance = b.getBalance(); // $220
U: b.setBalance(balance*1.1); // B = $242
T: a.withdraw(balance/10); // A = $80
U: c.withdraw(balance/10); // C = $278
Resulting balances from this interleaving
A=80, B = 242 and C = 278
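The executions of T and U shown on these slides (serial, lost update, and the serially equivalent interleaving) can be replayed with a small simulation. This is only a sketch: `Account`, `raise_and_withdraw` and `run` are hypothetical names, each transaction is split into its three operations with a generator, and integer arithmetic (`balance * 11 // 10`) stands in for `balance*1.1` to avoid floating-point rounding.

```python
class Account:
    """Minimal account object (illustrative, not from the slides)."""
    def __init__(self, balance):
        self.balance = balance

    def get_balance(self):
        return self.balance

    def set_balance(self, amount):
        self.balance = amount

    def withdraw(self, amount):
        self.balance -= amount


def raise_and_withdraw(b, other):
    """Raise b by 10% and withdraw that 10% from 'other', one operation per step."""
    balance = b.get_balance()
    yield
    b.set_balance(balance * 11 // 10)
    yield
    other.withdraw(balance // 10)
    yield


def run(schedule):
    """Run T and U in the order given, e.g. 'TUUTTU' (one letter per operation)."""
    a, b, c = Account(100), Account(200), Account(300)
    txns = {"T": raise_and_withdraw(b, a), "U": raise_and_withdraw(b, c)}
    for name in schedule:
        next(txns[name])  # execute the next operation of that transaction
    return a.balance, b.balance, c.balance


serial = run("TTTUUU")        # T completes before U starts
lost_update = run("TUUTTU")   # both read B before either writes it
equivalent = run("TTUUTU")    # interleaved, but same effect as serial
```

Here `serial` and `equivalent` both give A=80, B=242, C=278, while `lost_update` gives A=80, B=220, C=280 because U's update of B is overwritten by T.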
Conflicts in transactions
Operations within different transactions | Conflict? | Reason
read, read   | No  | No dependency between read operations
read, write  | Yes | Effects of read and write operations depend on their order
write, write | Yes | Effects of write and write operations depend on their order
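The conflict rules in the table can be turned into a simple check for serial equivalence. The sketch below assumes the rules apply to operations on the same object, represents each operation as a hypothetical `(transaction, kind, object)` tuple, and only handles two transactions: they are serially equivalent if every conflicting pair is ordered the same way.

```python
def conflicts(op1, op2):
    """Two operations conflict if they come from different transactions,
    touch the same object, and at least one of them is a write."""
    t1, kind1, obj1 = op1
    t2, kind2, obj2 = op2
    return t1 != t2 and obj1 == obj2 and "write" in (kind1, kind2)


def conflict_order(schedule):
    """Collect (earlier transaction, later transaction) for each conflicting pair."""
    order = set()
    for i, earlier in enumerate(schedule):
        for later in schedule[i + 1:]:
            if conflicts(earlier, later):
                order.add((earlier[0], later[0]))
    return order


def serially_equivalent(schedule):
    """For two transactions: no pair of conflicts may be ordered both ways."""
    order = conflict_order(schedule)
    return not any((b, a) in order for (a, b) in order)


# The lost-update interleaving: both transactions read B before either writes it.
lost_update = [("T", "read", "B"), ("U", "read", "B"), ("U", "write", "B"),
               ("T", "write", "B"), ("T", "write", "A"), ("U", "write", "C")]
# The serially equivalent interleaving: T finishes with B before U touches it.
good = [("T", "read", "B"), ("T", "write", "B"), ("U", "read", "B"),
        ("U", "write", "B"), ("T", "write", "A"), ("U", "write", "C")]
```

With more than two transactions the pairwise test is not enough; the ordering graph would need to be checked for cycles.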
Commit / Abort
• Commit
– Makes permanent (durable) the isolated
effects of the transaction
• Abort
– If any part fails (atomicity)
– Does not make permanent the isolated effects
of the transaction
– Intermediate effects are undone
Nested transactions
• Transactions that are themselves composed of
sub-transactions
• Form a hierarchy with parents and subtransactions
• Allow for concurrency within the transaction
• Failure with a flat transaction implies that the
whole transaction should be repeated
• With a nested transaction only the subtransaction that aborted needs to be repeated
Commit rules for nested
transactions
• The transaction may commit once children have
completed
• Sub-transactions make independent and final
commit / abort decisions
• Parent abort implies sub-transaction aborts
• Parents can still decide to commit in the
presence of sub-transaction aborts.
• Top-level transaction abort implies all subtransactions abort
Nested transactions
[Figure: top level transaction T1 uses OpenSubTransaction to start sub-transactions T11 and T12; the outcomes shown are provisional commit and abort]
Locks
• Locks provide a means of ensuring
serial equivalence
• Exclusive locking is where a transaction
locks objects exclusively until it commits
Serial equivalence with exclusive
locking
Transaction T
balance = b.getBalance();
b.setBalance(balance*1.1);
a.withdraw(balance/10);
Transaction U
balance = b.getBalance();
b.setBalance(balance*1.1);
c.withdraw(balance/10);
Execution with exclusive locks
T: openTransaction
T: balance = b.getBalance(); // Lock B
T: b.setBalance(balance*1.1);
T: a.withdraw(balance/10); // Lock A
T: closeTransaction // Unlock A, B
U: openTransaction
U: balance = b.getBalance(); // Waits for T to release Lock B, then Lock B
U: b.setBalance(balance*1.1);
U: c.withdraw(balance/10); // Lock C
U: closeTransaction // Unlock B, C
Strict two phase locking
• Strict two phase locking
– Transactions are not allowed to apply more locks
once they have released locks
– Growing phase = applying locks
– Shrinking phase = releasing locks
– Locks held until commit
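The locking behaviour of T and U above can be sketched with a tiny exclusive lock table. `LockTable` and its methods are hypothetical names, and the sketch is single-threaded: a refused request simply returns `False` where a real server would block the transaction. All locks are released together at commit, which is the "strict" part of strict two phase locking.

```python
class LockTable:
    """Exclusive locks only: one owning transaction per object."""
    def __init__(self):
        self.owner = {}  # object name -> name of the transaction holding the lock

    def acquire(self, txn, obj):
        """Grant the lock if it is free or already held by txn; otherwise txn must wait."""
        holder = self.owner.get(obj)
        if holder is None or holder == txn:
            self.owner[obj] = txn
            return True
        return False

    def release_all(self, txn):
        """Called at commit or abort: release every lock held by txn in one step."""
        for obj in [o for o, t in self.owner.items() if t == txn]:
            del self.owner[obj]


locks = LockTable()
trace = []
trace.append(locks.acquire("T", "B"))  # T reads b: Lock B granted
trace.append(locks.acquire("U", "B"))  # U reads b: refused, U waits for T
trace.append(locks.acquire("T", "A"))  # T withdraws from a: Lock A granted
locks.release_all("T")                 # T commits: Unlock A, B
trace.append(locks.acquire("U", "B"))  # U retries: Lock B granted
trace.append(locks.acquire("U", "C"))  # U withdraws from c: Lock C granted
locks.release_all("U")                 # U commits: Unlock B, C
```

Because T never acquires a lock after releasing one, its growing and shrinking phases do not overlap.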
Locks
• What do we lock?
– Lockable unit should be as small as possible
• Lock types
– Many transactions can read without conflict
– One writer implies possible conflict
– We use two different locks: read locks and write locks
Nested transaction locking
• Parents acquire locks from sub-transactions
once they commit
– The top level eventually holds all the locks
– Aim is to prevent sub-transaction inconsistencies
• Parents do not run concurrently with sub-transactions
– Sub-transactions acquire locks during their execution
from parents if they need them
– Thus, parent and sub-transaction can lock same data
Lock compatibility
Lock present | Lock requested: Read | Lock requested: Write
None         | OK                   | OK
Read         | OK                   | Wait
Write        | Wait                 | Wait
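The compatibility table can be written as a single decision function. This is a minimal sketch: `can_grant` is a hypothetical name, and `held_by_others` is assumed to list the lock modes that other transactions currently hold on the object.

```python
def can_grant(requested, held_by_others):
    """Return True for OK and False for Wait, following the compatibility table.

    requested: 'read' or 'write'
    held_by_others: list of 'read'/'write' modes held by other transactions
    """
    if not held_by_others:
        return True                      # no lock present: always OK
    if requested == "read":
        # a read can share with other reads, but not with a write
        return all(mode == "read" for mode in held_by_others)
    return False                         # a write must wait for any existing lock
```

For example, `can_grant("read", ["read", "read"])` is granted, while `can_grant("write", ["read"])` has to wait.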
Nested transaction locking
• Sub-transactions can acquire read locks provided that
no other sub-transactions hold current write locks
– We could have many read locks
– Once completed the parent acquires the read locks
– Requests for write locks in the presence of read locks would
require the sub-transactions to wait for the read locks to be released
• Once committed, locks are passed to parents
• Parents may hold write locks and may give them to sub-transactions as required and in accordance with lock
compatibility rules
Dead locks
• Two transactions waiting for each other to
release locks
[Figure: wait-for cycles among transactions T, U, V and W]
Dead lock detection
• A deadlock manager has responsibility for
detecting deadlocks
• It stores details of which transactions each transaction is waiting
for
• Uses the details to detect deadlocks
• Can abort transactions if deadlock is detected
Deadlock edge graph
[Table and figure: table with columns "Transactions" and "Wait for transaction", shown with the corresponding wait-for graph over T, U, V and W]
Dead lock timeouts
• Invulnerable locks
– Locks have an initial period of invulnerability
– During this time they have exclusive access to the
object
– Lock requests from other transactions are declined
• Vulnerable locks
– Once a time limit has been exceeded locks become
vulnerable
– Subsequent requests from other transactions cause
the vulnerable lock to be broken
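Deadlock detection on a wait-for graph amounts to finding a cycle. The sketch below is one possible way to do that with a depth-first search; `find_cycle` is a hypothetical name and the graph is assumed to be a dictionary mapping each transaction to the set of transactions it waits for.

```python
def find_cycle(wait_for):
    """Return a list of transactions forming a cycle, or None if there is no deadlock."""
    visiting = []   # transactions on the current search path, in order
    done = set()    # transactions already shown not to lead to a cycle

    def visit(txn):
        if txn in visiting:
            # walked back onto the current path: the tail of the path is the cycle
            return visiting[visiting.index(txn):] + [txn]
        if txn in done:
            return None
        visiting.append(txn)
        for nxt in sorted(wait_for.get(txn, ())):
            cycle = visit(nxt)
            if cycle:
                return cycle
        visiting.pop()
        done.add(txn)
        return None

    for txn in sorted(wait_for):
        cycle = visit(txn)
        if cycle:
            return cycle
    return None
```

A deadlock manager could call this after each new wait-for edge is recorded and abort one transaction in the returned cycle.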
Other locking schemes
• two-version locking
– allows writing of tentative versions with
reading of committed versions
• hierarchic locks
– Uses different lockable units
– e.g. the branchTotal operation locks all the
accounts with one lock whereas the other
operations lock individual accounts (reduces
the number of locks needed)
Concurrency
• Disadvantages of locks
– Lock maintenance is an overhead
– Locks can result in deadlock
– Locks do not fully use concurrency
• Optimistic concurrency control
– Presume that transactions do not interfere with
each other
– If a conflict does occur then abort and restart
Concurrency
• Working phase
– Transactions work on tentative objects (copies of the most recently
committed versions)
– Stores a list of which objects are used and how (read
and write lists)
• Validation phase
– Occurs once the working phase is complete
– The read / write lists are used to determine if conflicts
exist
• Update phase
– tentative objects committed
Transaction validation
• Transactions are assigned an incremental transaction number when
they start validation
• The following rules then ensure serialisability of transactions when
they overlap
Tv    | Ti    | Rule
Write | Read  | 1. Ti must not read objects written by Tv
Read  | Write | 2. Tv must not read objects written by Ti
Write | Write | 3. Ti must not write objects written by Tv and Tv must not write objects written by Ti
Where:
Tv is the transaction being validated
Tv and Ti are overlapping transactions
Transaction validation
[Figure: timeline showing the working, validation and commit phases of earlier transactions T1 and T2, the transaction being validated Tv, and the still active transactions Tactive1 and Tactive2]
Backward validation:
Tv work phase overlaps with T2 work phase
Rule 2: compares write set of T2 against read set of Tv
Tv aborts if a conflict is found
Forward validation:
Tv work phase overlaps with Tactive1 and Tactive2 work phases
Rule 1: compares write set of Tv against read set of Ti (Tactive1 and Tactive2)
choice of actions if a conflict is found (defer, abort Tv, abort Ti)
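Both validation styles reduce to set intersections between read and write lists. The following sketch assumes each transaction's lists are plain Python sets of object names; `backward_validate` and `forward_validate` are hypothetical names, and they return `True` when Tv passes validation.

```python
def backward_validate(tv_read_set, earlier_write_sets):
    """Rule 2: Tv must not have read any object written by an overlapping
    transaction that has already committed (for example T2)."""
    return all(tv_read_set.isdisjoint(ws) for ws in earlier_write_sets)


def forward_validate(tv_write_set, active_read_sets):
    """Rule 1: no still-active overlapping transaction (for example Tactive1,
    Tactive2) may have read an object that Tv wrote."""
    return all(tv_write_set.isdisjoint(rs) for rs in active_read_sets)


# Tv read A and B and wrote B; T2 committed a write to A while Tv was working.
tv_reads, tv_writes = {"A", "B"}, {"B"}
backward_ok = backward_validate(tv_reads, [{"A"}])          # conflict on A
forward_ok = forward_validate(tv_writes, [{"C"}, {"A"}])    # nobody active read B
```

On a failed backward validation Tv is aborted; on a failed forward validation the server can choose between deferring, aborting Tv, or aborting the conflicting active transactions.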
Distributed Transactions
• Distributed Transactions
– Flat and nested distributed transactions
– Atomic commit protocols
– Concurrency control
– Distributed Deadlocks
– Transaction recovery
Distributed transactions
• A distributed transaction is a transaction
that invokes operations in several different
servers
• Can be either:
– Flat or
– Nested
Flat distributed transactions
– Makes invocations to various
remote servers
– The flat transaction waits for
each request to be serviced
before continuing
– The flat transaction is
sequential
[Figure: flat transaction. The client's transaction T invokes operations at servers X, Y and Z]
Nested distributed transactions
– Makes invocations to
various remote servers
– Sub-transactions at the
same level can process
independently and
concurrently
[Figure: nested transaction. The client's transaction T opens sub-transactions T1 and T2, which open T11, T12, T21 and T22 at servers X, Y, M, N and P]
Coordination
• Coordination is required when committing
– A coordinator could exist in any server
– The coordinator is responsible for aborting or committing the distributed
transaction
– It requires a list of participants in the transaction
• The client begins its transaction with a request to the
coordinator
– The coordinator passes back a DIS unique transaction ID (TID)
– When the client makes requests of other servers it passes the TID with
the request in the transaction
• Servers that are accessed become participants
– They register their participation using the TID when they receive
invocations
Atomic commit protocols
• All nodes in a DIS agree to commit or all nodes
abort
• One phase commit protocol
– Coordinator keeps sending messages to participants
until they acknowledge
– Concurrency management can cause problems
– Coordinator cannot abort once a client has requested
commit
Atomic commit protocols
• Two phase commit protocol
– Allows any participant to abort its part
– Any abort implies all abort (atomicity)
• Two phases
– Phase 1: Participant votes (commit or abort)
• Once voted commit they must be able to meet that obligation
• To ensure this participants store their intermediate objects
– Phase 2: vote execution
• Coordinator collects decisions
• If no failures and all votes were yes then the coordinator tells
participants to commit
• Otherwise it tells them all to abort
Atomic commit protocols
• Distributed transactions are message
based and messages can get lost
[Figure: message exchange between Coordinator and Participant: canCommit?, Yes, doCommit, haveCommitted]
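The two phases of the two phase commit protocol can be sketched as follows. This is a simplified, in-process illustration: the method names mirror the slide's `canCommit?` and `doCommit` messages, but `Participant`, `do_abort` and `two_phase_commit` are hypothetical, and lost messages, time-outs and crashes are deliberately left out.

```python
class Participant:
    """A server taking part in the transaction (illustrative only)."""
    def __init__(self, name, vote_yes=True):
        self.name = name
        self.vote_yes = vote_yes
        self.state = "working"

    def can_commit(self):
        """Phase 1: vote. A 'yes' voter must be able to commit later, so it is now prepared."""
        self.state = "prepared" if self.vote_yes else "aborted"
        return self.vote_yes

    def do_commit(self):
        self.state = "committed"

    def do_abort(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Coordinator: collect every vote, then tell all participants the joint decision."""
    votes = [p.can_commit() for p in participants]   # phase 1: voting
    if all(votes):                                   # phase 2: act on the votes
        for p in participants:
            p.do_commit()
        return "committed"
    for p in participants:
        p.do_abort()
    return "aborted"


all_yes = [Participant("X"), Participant("Y"), Participant("Z")]
one_no = [Participant("X"), Participant("Y", vote_yes=False), Participant("Z")]
outcome_yes = two_phase_commit(all_yes)
outcome_no = two_phase_commit(one_no)
```

A single "no" vote is enough to abort every participant, which is what gives the protocol its atomicity.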
Atomic commit protocols
• Time-outs to detect failures
– No coordinator doCommit
• Participants can request decision from coordinator if they have been
waiting for a long time (getDecision)
– No coordinator canCommit
• Participants can abort if excessive time has elapsed since they
completed their portion of the transaction and the coordinator has not
requested voting
– Coordinator waiting on a participant vote
• Decides to abort
• Coordinator failure
– A big problem
• Only the coordinator knew the participants
• Retry with new coordinator or use a cooperative protocol
Atomic commit protocols
• Two phase commit protocols and nested transactions
[Figure: nested transaction. The client's transaction T has sub-transactions T1 (children T11, T12) and T2 (children T21, T22) at servers X, Y, M, N and P; T11 and T2 abort, while T1, T12, T21 and T22 provisionally commit]
Atomic commit protocols
• Top level transaction is the coordinator
• Sub-transactions become coordinators of other
sub-transactions
• Results propagate up the hierarchy
• When a sub-transaction aborts it merely passes
to its parent the abort decision
• When a sub-transaction provisionally commits it passes its
details to the parent
Atomic commit protocols
Transaction status: coordinator information
Coordinator | Child transactions | Participant         | Provisional commit list | Abort list
T           | T1, T2             | Yes                 | T1, T12                 | T11, T2
T1          | T11, T12           | Yes                 | T1, T12                 | T11
T2          | T21, T22           | No (aborted)        | -                       | T2
T11         | -                  | No (aborted)        | -                       | T11
T12, T21    | -                  | T12, not T21        | T21, T12                | -
T22         | -                  | No (parent aborted) | T22                     | -
Concurrency Control for Distributed
Transactions
• “Each server manages a set of objects and is
responsible for ensuring that they remain consistent
when accessed by concurrent transactions
– therefore, each server is responsible for applying
concurrency control to its own objects.
– the members of a collection of servers of distributed
transactions are jointly responsible for ensuring that they
are performed in a serially equivalent manner
– therefore if transaction T is before transaction U in their
conflicting access to objects at one of the servers then they
must be in that order at all of the servers whose objects are
accessed in a conflicting manner by both T and U”
Coulouris et al 2005
Locks
• Locks are held locally by a local lock manager
• This can grant locks and tell other transactions to
wait
• Releasing locks is less simple
• Must wait for all other servers to commit their
portions of a distributed transaction before
releasing its locks
• Locks remain through the A2PC (Atomic Two
Phase Commit)
Optimistic concurrency Control
• Recall that with optimistic concurrency control
transactions are validated before commit
• With distributed transactions
– transaction numbers assigned at start of validation
– transactions serialized according to transaction
numbers
– validation takes place in phase 1 of 2PC protocol
• Transaction is validated at many servers
• Each validates its objects against other ongoing
transactions
Commitment deadlock
Transaction T: Read(A) at X, Write(A); Read(B) at Y, Write(B)
Transaction U: Read(B) at Y, Write(B); Read(A) at X, Write(A)
• T is validated at X, U is validated at Y
• U cannot validate as T has not committed
• T cannot validate as U has not committed
• Thus, we have commitment deadlock
Optimistic concurrency Control
• Distributed validation can be slow as the A2PC
protocol waits for validation
• Parallel validation can help
• This involves forward and backward validation
– Rule 2 + Rule 3 is used for backward validation
• Parallel validation can lead to different servers
using different serialised orders (U->T and T->U)
• This can be avoided by checking and the use of
globally unique transaction IDs
Distributed deadlock
• Single server transactions can suffer deadlocks
– We can prevent, detect and resolve
– Timeouts can be used for this but are not ideal
– Detection is preferable possibly using wait-for graphs
• Distributed transactions lead to distributed deadlocks
– We can build global wait for graphs from local graphs
– However, with distributed transactions we can obtain
deadlocks that do not show in local wait for graph
– These are what are referred to as distributed deadlocks.
Distributed deadlock
Transaction U: d.Deposit(10) (Lock D); a.Deposit(20) (Lock A at X); b.Withdraw(30) (Wait at Y)
Transaction V: b.Deposit(10) (Lock B at Y); c.Withdraw(20) (Wait at Z)
Transaction W: c.Deposit(30) (Lock C at Z); a.Withdraw(20) (Wait at X)
[Figure: distributed wait-for graph across servers X, Y and Z. U waits for B (held by V), V waits for C (held by W), W waits for A (held by U); D is held by U]
Distributed deadlock
• Centralised deadlock detector
– Servers periodically send their local wait-for graphs to
the detector
– The detector joins them together to check for distributed
deadlock
– It then decides how best to resolve the problem
• Which transaction to abort
• Phantom deadlock
– Deadlocks that do not really exist
– Occur due to time delays in determining deadlocks
Distributed deadlock detection
• Edge chasing
– Does not construct a wait for graph
– Instead uses probes to detect deadlocks
• Algorithm consists of three steps
– Initiation
– Detection
– Resolution
Distributed deadlock detection
• Initiation
– If a server notices a transaction (TA) is waiting for another
transaction TB
• It sends a probe to the server where TB is blocked (server Sx)
• The probe contains the edge TA -> TB (A waiting for B)
• Detection
– Receivers of probes (Sx) check whether TB is waiting for
another transaction
• If it is (let's say TB is waiting on TC) then it sends a probe to the
server where TC is blocked (Sy)
• The new probe contains the edges TA->TB->TC
– Eventually a probe might contain a cycle
• TA->TB->TC->TA is a cycle as it loops back to itself
• Resolution
– This simply involves aborting one of the transactions and thus
breaking the deadlock
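The way a probe grows as it is forwarded can be sketched as below. This is a simplification under stated assumptions: `chase_edges` is a hypothetical name, each transaction is assumed to wait for at most one other transaction, and the forwarding of the probe between servers is collapsed into a loop over a single dictionary.

```python
def chase_edges(waits_for, start):
    """Follow wait-for edges from 'start', extending the probe one edge at a time.

    waits_for maps a transaction to the one transaction it is waiting for.
    Returns the cycle found (as a list that loops back to its first element),
    or None if the probe reaches a transaction that is not waiting.
    """
    probe = [start]                      # initiation: the probe starts at TA
    current = start
    while current in waits_for:          # detection: is the last transaction waiting?
        current = waits_for[current]
        if current in probe:
            # the probe has looped back on itself: deadlock
            return probe[probe.index(current):] + [current]
        probe.append(current)            # forward the extended probe
    return None


deadlocked = chase_edges({"TA": "TB", "TB": "TC", "TC": "TA"}, "TA")
not_deadlocked = chase_edges({"TA": "TB", "TB": "TC"}, "TA")
```

Resolution would then abort one of the transactions named in the returned cycle.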
Distributed transaction recovery
• Atomicity requirements
– All of the effects of completed transactions are permanent
– None of the effects of partially completed or aborted transactions are
permanent
• Durability
– Objects are saved in permanent storage
• Failure atomicity
– Effects of transactions are atomic even when servers crash
• Both aspects are governed by the recovery manager
Distributed transaction recovery
• Recovery manager responsibilities
– Saving objects in permanent storage
– Restoring server objects subsequent to crashes
– Managing the performance through reorganisation of recovery
files
– Reclaiming storage space from the recovery file
• For security we need to consider back-up recovery
facilities
– Preferably off-site and isolated
Distributed transaction recovery
• Intention lists
– Held at each server for each transaction
– Is a list of references to objects that were used within a
transaction
– Contains the values of the objects too
– The commit process uses the intention list to determine what
needs to be committed
– Changed objects are then written to a recovery file
– Intentions are also written to the recovery file
– Abort uses intention list to delete tentative objects
Distributed transaction recovery
• Logging technique
– A recovery file contains history of all the transactions
– It contains
• Intention lists, values of objects and status entries
– Status
• Prepared: ready to commit changes
• Committed: changes have been committed
• Aborted: aborted transaction
Distributed transaction recovery
Position   | Recovery file entry                             | Description
Position 0 | Object: A = 100, Object: B = 200, Object: C = 300 | The state of objects A, B and C prior to transaction T
Position 1 | Object: A = 80                                  | The tentative object A
Position 2 | Object: B = 220                                 | The tentative object B
Position 3 | Trans: T, Prepared, <A,P1> <B,P2>               | Transaction T is prepared
Position 4 | Trans: T, committed                             | T was committed
<A,P1> intention list contains reference
to tentative object A at P1 and <B,P2>
intention list contains reference to
tentative object B at P2
Rollback position prior to T is at P0
Distributed transaction recovery
• Recovery is performed by the recovery manager
• Objects are created and then their details filled by the
recovery manager
• It applies atomicity
– Transactions fully or not at all
– All in the same order
• Recovery could work forwards from the beginning of
recovery file
• Or backwards from the end (this can be more efficient)
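Reading the recovery file backwards can be sketched with the example entries for transaction T. The entry layout below (dictionaries with `type`, `status` and `intentions` keys, and a single entry at position 0 holding the earlier committed values) is an assumption made for illustration, as is the function name `recover`.

```python
RECOVERY_FILE = [
    {"type": "objects", "values": {"A": 100, "B": 200, "C": 300}},   # P0: state before T
    {"type": "object", "name": "A", "value": 80},                    # P1: tentative A
    {"type": "object", "name": "B", "value": 220},                   # P2: tentative B
    {"type": "status", "txn": "T", "status": "prepared",
     "intentions": {"A": 1, "B": 2}},                                # P3: <A,P1> <B,P2>
    {"type": "status", "txn": "T", "status": "committed"},           # P4
]


def recover(log):
    """Rebuild object values by scanning the recovery file from the end."""
    values = {}
    committed = set()
    for entry in reversed(log):
        if entry["type"] == "status":
            if entry["status"] == "committed":
                committed.add(entry["txn"])
            elif entry["status"] == "prepared" and entry["txn"] in committed:
                # apply the intention list of a transaction known to have committed
                for name, position in entry["intentions"].items():
                    values.setdefault(name, log[position]["value"])
        elif entry["type"] == "objects":
            # older committed state: only fills objects not restored already
            for name, value in entry["values"].items():
                values.setdefault(name, value)
    return values


after_commit = recover(RECOVERY_FILE)          # crash after T committed
before_commit = recover(RECOVERY_FILE[:4])     # crash after prepare, before commit
```

Working backwards means the newest committed value of each object is met first, so older entries for the same object can be skipped; a transaction that is only prepared leaves the values from position 0 in place.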
Summary
• Transactions and Concurrency control
– Transactions
– Concurrency control
– Nested Transactions
– Locks
• Distributed Transactions
– Flat and nested distributed transactions
– Atomic commit protocols
– Concurrency control
– Distributed Deadlocks
– Transaction recovery
END