Advanced Database Systems and
Data Warehousing
INTEGRITY AND CONCURRENCY
IN DATABASE SYSTEMS
By: Benmammass Mehdi
Outline
► Integrity
  ▪ Introduction
  ▪ Achieving integrity in a database system
    ► Integrity Subsystem Component
    ► Integrity Rules
► Concurrency
  ▪ Introduction
  ▪ Some important definitions
  ▪ Lock-Based Protocols
  ▪ Deadlock Avoidance in lock-based protocols
  ▪ Locking Granularity
  ▪ Optimistic Concurrency Control
► Conclusion
Introduction - Integrity
► The main features that a database system should exhibit are:
  ▪ Accuracy
  ▪ Correctness
  ▪ Validity
► An integrity constraint guards against accidental damage to the database.
► It ensures data consistency by allowing only authorized changes to the database.
► The Integrity Subsystem is a component of the DBMS.
Integrity subsystem
► The role of an integrity subsystem is to:
  ▪ Monitor transactions and detect integrity violations.
  ▪ Take appropriate action when a violation occurs.
► The integrity subsystem is provided with a set of rules that define:
  ▪ the errors to check for;
  ▪ when to check for these errors;
  ▪ what to do if an error occurs.
Integrity Rules
► Integrity rules are stored in the system dictionary by the Integrity Rule Compiler.
► Before being adopted, a new integrity rule must be consistent with all the existing rules.

RULE#1 : AFTER UPDATING sales.quantity :
         sales.quantity > 0
         ELSE
           DO ;
             set return code to "RULE#1 violated" ;
             REJECT ;
           END ;
Integrity Rules
► The general structure of an integrity rule is:
  ▪ Trigger condition (after updating, inserting, ...)
  ▪ Constraint (sales.quantity > 0)
  ▪ Violation response (else do ...)
► There are three types of integrity rules:
  ▪ Domain integrity rules
  ▪ Relation integrity rules
  ▪ Fanset integrity constraints
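The trigger / constraint / violation-response structure can be illustrated with a minimal Python sketch. The class and field names (`IntegrityRule`, `IntegrityViolation`, the event string) are hypothetical and chosen only for illustration; they are not part of any DBMS.

```python
class IntegrityViolation(Exception):
    """Raised as the violation response (the REJECT step of a rule)."""


class IntegrityRule:
    def __init__(self, name, trigger, constraint):
        self.name = name              # e.g. "RULE#1"
        self.trigger = trigger        # event that activates the rule
        self.constraint = constraint  # predicate the record must satisfy

    def check(self, event, record):
        # The rule is evaluated only when its trigger condition matches.
        if event == self.trigger and not self.constraint(record):
            # Violation response: set a return code and reject the change.
            raise IntegrityViolation(f"{self.name} violated")


# Hypothetical encoding of RULE#1: after updating sales.quantity, quantity > 0.
rule1 = IntegrityRule("RULE#1", "update sales.quantity",
                      lambda record: record["quantity"] > 0)


def try_update(record):
    """Return 'ok' if the update passes RULE#1, else the return code."""
    try:
        rule1.check("update sales.quantity", record)
        return "ok"
    except IntegrityViolation as exc:
        return str(exc)
```

Calling `try_update({"quantity": 3})` accepts the change, while `try_update({"quantity": 0})` returns the return code of the violated rule.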
Domain Integrity Rule (1)
DCL S# PRIMARY DOMAIN CHARACTER (5)
    SUBSTR (S#,1,1) = 'S'
    AND IS_NUMERIC (SUBSTR (S#,2,4))
    ELSE
      DO ;
        set return code to "S# domain rule violated" ;
        REJECT ;
      END ;
► S# is a string of 5 characters. The first character is an S and the last 4 characters are numeric.
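As an illustration, the S# domain rule can be written as a small Python predicate. The function name `is_valid_supplier_number` is an assumption made for this sketch.

```python
def is_valid_supplier_number(value):
    """Check the S# domain rule: 5 characters, 'S' followed by 4 digits."""
    return (
        len(value) == 5
        and value[0] == "S"
        and all(ch in "0123456789" for ch in value[1:])
    )
```

For instance, `is_valid_supplier_number("S1234")` holds, whereas `"S12A4"`, `"X1234"` and `"S123"` are rejected.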
Domain Integrity Rule (2)
► Composite domains: a domain DATE which is composed of three domains DAY, MONTH and YEAR.
► User-written procedures.
► Interdomain procedures: conversion rules (procedures) can be used, for example, to compare two values from two distinct domains (distances expressed in kilometres and in miles).
Relation Integrity Rule (1)
► Immediate record state constraints
  After updating or inserting sales.quantity, verify: sales.quantity > 0
► Immediate record transition constraints
  New_date > sales.date
► Immediate set state/transition constraints
  ▪ define key uniqueness and enforce non-null key values (Entity Integrity Rule)
  ▪ impose referential integrity (Foreign Key Integrity Rule)
Relation Integrity Rule (2)
► Deferred record state constraints
► Deferred record transition constraints
► Deferred set state constraints
  Applied at the end of the transaction (WHEN COMMITTING). This kind of constraint is needed because the set of updates in a transaction sometimes violates the rule temporarily.
► Deferred set transition constraints
Other Integrity Constraints
► Fanset Integrity Rules
  ▪ Used in network databases. They prevent integrity violations by providing referential integrity.
► Triggered procedures
  ▪ Integrity rules are a special case of triggered procedures.
  ▪ They are useful for carrying out the following tasks:
    ► Warning the user that deleting a client will delete all its sales.
    ► Access security.
    ► Performance measurement of the database.
    ► Controlling stored records (compressing and decompressing data when storing and retrieving it).
    ► Exception reporting (e.g. expiry dates for medicines).
Concurrency Control - Introduction
► Contention occurs when two or more users try to access the same record simultaneously.
► Concurrency occurs when multiple users can access the same resource and each user accesses the resource in isolation.
  ▪ Concurrency is high when there is no apparent wait time for a user's request to be served.
  ▪ Concurrency is low when wait times are evident.
► Consistency occurs when users access a shared resource and the resource exhibits the same characteristics and satisfies all the constraints across all operations.
Concurrency Control - Introduction
► Example (bank transactions):
  ▪ 2 accounts A and B (assume initial balances A = B = 100 DH)
  ▪ 2 transactions T1 and T2 that will be executed concurrently
    ► T1 : start, A=A+100, B=B-100, COMMIT
    ► T2 : start, A=A*1.05, B=B*1.05, COMMIT
Concurrency Control - Introduction
Consider these two different sequences of execution:

Sequence 1 (T1 then T2):
► T1
  A=A+100
  B=B-100
  COMMIT
► T2
  A=A*1.05
  B=B*1.05
  COMMIT
Result: A = 210 DH, B = 0 DH

Sequence 2 (T2 interleaved with T1):
► T1
  A=A+100
► T2 (INTERFERENCE)
  A=A*1.05
  B=B*1.05
  COMMIT
► T1
  B=B-100
  COMMIT
Result: A = 210 DH, B = 5 DH
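The two results can be checked with a short Python simulation. This is only a sketch of the arithmetic: the helper `run` and the use of `Fraction` (to avoid floating-point rounding) are choices made for this illustration.

```python
from fractions import Fraction

RATE = Fraction(105, 100)  # the factor 1.05 applied by T2


def run(schedule, balances):
    """Apply (account, operation) steps in order and return the final balances."""
    state = dict(balances)
    for account, operation in schedule:
        state[account] = operation(state[account])
    return state


def add_100(value):
    return value + 100


def sub_100(value):
    return value - 100


def interest(value):
    return value * RATE


start = {"A": Fraction(100), "B": Fraction(100)}

# Sequence 1: T1 runs completely, then T2.
serial = [("A", add_100), ("B", sub_100), ("A", interest), ("B", interest)]
# Sequence 2: T2 runs between the two updates of T1.
interleaved = [("A", add_100), ("A", interest), ("B", interest), ("B", sub_100)]

serial_result = run(serial, start)
interleaved_result = run(interleaved, start)
```

With these definitions, `serial_result` gives A = 210 and B = 0, while `interleaved_result` gives A = 210 and B = 5: the interleaving changes B by 5 DH.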
Concurrency Control - Solutions
► Lock-Based Protocols
► Timestamp Techniques
► Optimistic Concurrency Control
Concurrency Control – Lock Manager
► A lock manager can be implemented as a separate process to which transactions send lock and unlock requests.
► The lock manager replies to a lock request by sending a lock grant message (or a message asking the transaction to roll back, in case of a deadlock).
► The requesting transaction waits until its request is answered.
► The lock manager maintains a data structure called a lock table to record granted locks and pending requests.
Concurrency – LB Protocols (1)
Principle:
► Transactions ask for a lock on a record before updating it.
► After the update, the record is unlocked.
► There are two types of locks:
  ▪ exclusive (X) mode: data can be both read and written. An X-lock is requested using the lock-X instruction. Records are the unit of locking.
  ▪ shared (S) mode: data can only be read. An S-lock is requested using the lock-S instruction. Tables are the unit of locking.
LB Protocols (2)
[Figure: Shared and Exclusive Locks. Transaction A and Transaction B hold shared and exclusive locks on rows (Row 1 to Row 8) of the Accounts table.]
LB Protocols (3)
► Compatibility Matrix

                  Shared lock   Exclusive lock
Shared lock       True          False
Exclusive lock    False         False
Lock Table
► Black rectangles indicate granted locks, white ones indicate waiting requests.
► The lock table also records the type of lock granted or requested.
► A new request is added to the end of the queue of requests for the data item, and granted if it is compatible with all earlier locks.
► Unlock requests result in the request being deleted, and later requests are checked to see if they can now be granted.
► If a transaction aborts, all waiting or granted requests of the transaction are deleted.
  ▪ The lock manager may keep a list of locks held by each transaction, to implement this efficiently.
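The queueing behaviour described above can be sketched in Python. This is a simplified, single-threaded illustration: the class `LockTable` and its method names are hypothetical, and a real lock manager would also block and wake up the waiting transactions.

```python
from collections import defaultdict, deque

# Shared/exclusive compatibility: only S with S is compatible.
COMPATIBLE = {("S", "S"): True, ("S", "X"): False,
              ("X", "S"): False, ("X", "X"): False}


class LockTable:
    def __init__(self):
        # item -> queue of [transaction, mode, granted] in arrival order
        self.queues = defaultdict(deque)

    def request(self, txn, item, mode):
        """Append the request; grant it if compatible with all earlier ones."""
        queue = self.queues[item]
        granted = all(COMPATIBLE[(mode, entry[1])] for entry in queue)
        queue.append([txn, mode, granted])
        return granted

    def _regrant(self, item):
        """Re-check waiting requests, in order, after a deletion."""
        entries = list(self.queues[item])
        for i, entry in enumerate(entries):
            if not entry[2]:
                entry[2] = all(COMPATIBLE[(entry[1], earlier[1])]
                               for earlier in entries[:i])

    def release(self, txn, item):
        """Delete the transaction's requests on one item (unlock)."""
        self.queues[item] = deque(e for e in self.queues[item] if e[0] != txn)
        self._regrant(item)

    def abort(self, txn):
        """Delete all waiting or granted requests of an aborted transaction."""
        for item in list(self.queues):
            self.release(txn, item)

    def holders(self, item):
        return [e[0] for e in self.queues[item] if e[2]]
```

Checking a new request against every earlier request, including waiting ones, keeps a waiting X request from being overtaken indefinitely by later S requests.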
LB Protocols (4) – PX Protocol
► Any transaction that intends to update a record must first execute an exclusive lock request (X-lock) on that record.
► If the lock cannot be acquired, the transaction goes into a wait state.
► When the record becomes available, the lock can be granted and the transaction can resume processing.
LB Protocols (5) – PX Protocol
► Example: 2 transactions

Transaction 1:
  lock-X(B)
  read(B)
  B = B - 50
  write(B)
  unlock(B)
  lock-X(A)
  read(A)
  A = A + 50
  write(A)
  unlock(A)

Transaction 2:
  lock-S(A)
  read(A)
  unlock(A)
  lock-S(B)
  read(B)
  unlock(B)
  display(A+B)
LB Protocols (6) – PX Protocol
Execution sequence:

Transaction 1      Transaction 2      Concurrency control manager
lock-X(B)
                                      grant-X(B)
read(B)
B = B - 50
write(B)
unlock(B)
                   lock-S(A)
                                      grant-S(A)
                   read(A)
                   unlock(A)
                   lock-S(B)
                                      grant-S(B)
                   read(B)
                   unlock(B)
                   display(A+B)
lock-X(A)
                                      grant-X(A)
read(A)
A = A + 50
write(A)
unlock(A)
LB Protocols (7) – PX Protocol
► Serializability: an interleaved execution sequence of a set of transactions is serializable if it produces the same results as some serial execution of those transactions.
► We have to look at the lock requests of each transaction and find an order in which to execute them without any interference between them. If such a sequence exists, the two transactions are serializable.
► The PX protocol can then be applied.
LB Protocols (8) – PX Protocol
► With the lock-based mechanism, deadlock and starvation can occur. This is an example of deadlock:
[Figure: Accounts table (Row 1 to Row 8). Transaction A holds a shared lock, has already X-locked one row, and asks for an X-lock on a row already X-locked by Transaction B. Transaction B holds a shared lock and asks for an X-lock on the row already X-locked by Transaction A.]
LB Protocols – PXC Protocol
► Derived from the PX protocol.
► Exclusive locks are retained until the end of the transaction (COMMIT or ROLLBACK).
► PXC helps to avoid the loss of updates caused by a ROLLBACK.
  ▪ No transaction is allowed to update a record that carries an uncommitted change.
LB Protocols – PS / PSC Protocols
► Any transaction that updates a record must first ask for a shared lock on that record.
► During the transaction, just before the update command, it requests that the S-lock be changed to an X-lock.
► A transaction should not be allowed to lock itself out.
► The goal here is to limit the duration of X-locks.
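LB Protocols – PS Protocol
► Example (here deadlock occurs at T4):

Time   Transaction A     Transaction B
T1     SFIND Record1     --
T2     --                SFIND R1
T3     UPD Record1       --
T4     --                UPD R1

Both transactions hold an S-lock on the same record, so neither request to change it to an X-lock can be granted.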
LB Protocols – PU / PUC Protocol
► This protocol uses a third lock state: the update lock.
► Any transaction that intends to update a record is required to ask for a U-lock on that record. A U-lock is compatible with an S-lock but not with another U-lock.
► Replacing S-locks by U-locks will prevent deadlock.
LB Protocols – PU / PUC Protocol
► Compatibility matrix:

        X       S       U
X       False   False   False
S       False   True    True
U       False   True    False

► Example: compare PU and PS protocols
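A rough way to compare the two protocols is to encode the X/S/U compatibility rules in Python and replay the scenario in which two transactions both want to update the same record. The helper `can_grant` and the variable names are assumptions made for this sketch.

```python
# COMPAT[requested][held] is True when the requested mode can coexist
# with a lock already held by another transaction.
COMPAT = {
    "X": {"X": False, "S": False, "U": False},
    "S": {"X": False, "S": True, "U": True},
    "U": {"X": False, "S": True, "U": False},
}


def can_grant(requested, held_by_others):
    """A request is granted only if compatible with every lock held by others."""
    return all(COMPAT[requested][held] for held in held_by_others)


# PS protocol: A and B both obtain an S-lock, then each asks for an X-lock.
ps_b_gets_s = can_grant("S", ["S"])    # B's S-lock is granted alongside A's
ps_a_upgrade = can_grant("X", ["S"])   # blocked by B's S-lock
ps_b_upgrade = can_grant("X", ["S"])   # blocked by A's S-lock: deadlock

# PU protocol: A obtains a U-lock, B's U-lock request has to wait.
pu_b_gets_u = can_grant("U", ["U"])    # B waits from the start
pu_a_upgrade = can_grant("X", [])      # A can upgrade, nobody else holds a lock
```

Under PS both upgrades are blocked by the other transaction's S-lock, whereas under PU the second transaction waits before acquiring anything, so the first one can finish.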
LB Protocols – PU Protocol
► This protocol is more efficient than the previous ones.
► It considerably limits deadlock, because it decreases the number of S-locks.
LB Protocols – Two-Phase Locking Protocol
► 2PL ensures conflict-serializable schedules.
► 2PL includes two phases:
  ▪ Growing phase: a transaction may obtain locks but may not release locks.
  ▪ Shrinking phase: a transaction may release locks but may not obtain locks.
► Transactions can be serialized in the order of their lock points (the point at which a transaction acquires its final lock).
► If all transactions are two-phase, then all executions are serializable.
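Whether a single transaction respects the two phases can be checked mechanically. The following Python sketch assumes a transaction is given as a list of `("lock", item)` and `("unlock", item)` pairs; the function name `is_two_phase` is hypothetical.

```python
def is_two_phase(operations):
    """Return True if no lock is acquired after the first unlock."""
    shrinking = False
    for action, _item in operations:
        if action == "unlock":
            shrinking = True          # the shrinking phase has started
        elif action == "lock" and shrinking:
            return False              # a lock in the shrinking phase violates 2PL
    return True


# Unlocks B before locking A, so it is not two-phase.
not_2pl = [("lock", "B"), ("unlock", "B"), ("lock", "A"), ("unlock", "A")]
# Same locks, but every lock is acquired before any unlock.
two_phase = [("lock", "B"), ("lock", "A"), ("unlock", "B"), ("unlock", "A")]
```

Here `is_two_phase(not_2pl)` is false and `is_two_phase(two_phase)` is true.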
LB Protocols – 2PL Protocol
► There are many protocols derived from 2PL:
  ▪ Strict two-phase locking. Here a transaction must hold all its exclusive locks until it commits.
  ▪ Rigorous two-phase locking is even stricter: all locks (shared and exclusive) are held until commit. In this protocol transactions can be serialized in the order in which they commit.
  ▪ Graph-based protocol: an order of accessing data is fixed. If a transaction has to update Row2 and read Row1, it has to access these data in the predefined order.
LB Protocols – Deadlock Avoidance
► Deadlock prevention protocols ensure that the system will never enter a deadlock state. This can be achieved using different strategies:
  ▪ Transaction Scheduling
  ▪ Request Rejection
  ▪ Transaction Retry
LB-Protocols : Deadlock Prevention Strategies
► Timeout-based schemes:
  ▪ A transaction waits for a lock only for a specified amount of time. After that, the wait times out and the transaction is rolled back.
  ▪ Deadlocks are therefore not possible.
  ▪ Simple to implement, but starvation is possible. It is also difficult to determine a good value for the timeout interval.
LB-Protocols : Deadlock Prevention Strategies
► What to do when a deadlock is detected?
► Some transactions will have to be rolled back to break the deadlock. Select as victim the transaction whose rollback will incur the minimum cost.
► We have to determine how far to roll back the transaction. We can carry out either:
  ▪ Total rollback: abort the transaction and then restart it.
  ▪ Partial rollback: it is more effective to roll back the transaction only as far as necessary to break the deadlock.
Deadlock Avoidance : Transaction Scheduling
► Two transactions will not be run concurrently if their data requirements conflict.
► This requires knowing the data requirements of each transaction before run time, which is generally impossible: they are only known at run time.
► Consequently, the lock unit is a set of records, and locks are applied at transaction initiation instead of during execution.
Deadlock Avoidance : Request Rejection
► The system rejects any lock request that cannot be applied.
► It uses the deadlock detection algorithm.
► When trying to grant a lock request, if a deadlock is detected, the transaction is rejected.
Deadlock Avoidance : Transaction Retry
► Transactions are timestamped with their start time.
  Example: A requests a lock on a record already locked by B.
► wait-die scheme (non-preemptive)
  ▪ A waits if it is older than B; otherwise it dies: it is rolled back and automatically retried.
  ▪ A transaction may die several times before acquiring the data item it needs.
► wound-wait scheme (preemptive)
  ▪ A waits if it is younger than B; otherwise it wounds B (forces the rollback of the younger transaction) instead of waiting for it.
  ▪ Younger transactions may wait for older ones.
  ▪ Fewer rollbacks than in the wait-die scheme.
► Transactions retain their timestamps even if they are rolled back.
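The two schemes differ only in what happens for each age relationship, which a small Python sketch makes explicit. It assumes that a smaller timestamp means an older transaction; the function names and return strings are illustrative.

```python
def wait_die(requester_ts, holder_ts):
    """Non-preemptive: an older requester waits, a younger one dies."""
    return "wait" if requester_ts < holder_ts else "die"


def wound_wait(requester_ts, holder_ts):
    """Preemptive: an older requester wounds the holder, a younger one waits."""
    return "wound holder" if requester_ts < holder_ts else "wait"
```

For a requester A with timestamp 1 and a holder B with timestamp 2 (A is older), `wait_die(1, 2)` yields "wait" and `wound_wait(1, 2)` yields "wound holder". With the ages swapped the results are "die" and "wait" respectively.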
LB-Protocols : Deadlock Detection Algorithm
► The system is in a deadlock state if and only if the wait-for graph has a cycle.
► The system must invoke a deadlock-detection algorithm periodically to look for cycles.
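One common way to look for cycles is a depth-first search over the wait-for graph. The Python sketch below assumes the graph is a dictionary mapping each transaction to the transactions it is waiting for; `has_cycle` is a hypothetical name.

```python
def has_cycle(wait_for):
    """Return True if the wait-for graph contains a cycle (a deadlock)."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    nodes = set(wait_for) | {t for targets in wait_for.values() for t in targets}
    colour = {node: WHITE for node in nodes}

    def visit(node):
        colour[node] = GREY
        for nxt in wait_for.get(node, ()):
            if colour[nxt] == GREY:
                return True           # back edge: nxt is on the current path
            if colour[nxt] == WHITE and visit(nxt):
                return True
        colour[node] = BLACK
        return False

    return any(colour[node] == WHITE and visit(node) for node in nodes)


no_deadlock = {"T1": ["T2"], "T2": ["T3"], "T3": []}
deadlock = {"T1": ["T2"], "T2": ["T3"], "T3": ["T1"]}
```

`has_cycle(no_deadlock)` is false, while `has_cycle(deadlock)` is true because T1, T2 and T3 wait for one another in a circle.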
LB-Protocols : Deadlock Detection Algorithm
[Figure: wait-for graph without a cycle]
[Figure: wait-for graph with a cycle]
LB Protocols : Locking Granularity
► Allow data items to be of various sizes and define a hierarchy of data granularities, where the small granularities are nested within larger ones.
► The hierarchy can be represented graphically as a tree. When a transaction locks a node in the tree explicitly, it implicitly locks all the node's descendants in the same mode.
► Granularity of locking (level in the tree where locking is done):
  ▪ fine granularity (lower in the tree): high concurrency, high locking overhead
  ▪ coarse granularity (higher in the tree): low locking overhead, low concurrency
LB Protocols : Locking Granularity
► The highest level in the example hierarchy is the entire database.
► The levels below are of type area, file and record, in that order.
LB Protocols : Intent Locking Protocol
► In addition to the S and X lock modes, there are three additional lock modes with multiple granularity:
  ▪ intention-shared (IS): indicates explicit locking at a lower level of the tree, but only with shared locks.
  ▪ intention-exclusive (IX): indicates explicit locking at a lower level with exclusive or shared locks.
  ▪ shared and intention-exclusive (SIX): the subtree rooted at that node is locked explicitly in shared mode, and explicit locking is being done at a lower level with exclusive-mode locks.
► Intention locks allow a higher-level node to be locked in S or X mode without having to check all descendant nodes.
LB Protocols : Intent Locking Protocol
Compatibility Matrix:

        IS      IX      S       SIX     X
IS      True    True    True    True    False
IX      True    True    False   False   False
S       True    False   True    False   False
SIX     True    False   False   False   False
X       False   False   False   False   False
LB Protocols : Intent Locking Protocol
► Transaction Ti can lock a node Q using the following rules:
  1. The lock compatibility matrix must be observed.
  2. The root of the tree must be locked first, and may be locked in any mode.
  3. A node Q can be locked by Ti in S or IS mode only if the parent of Q is currently locked by Ti in either IX or IS mode.
  4. A node Q can be locked by Ti in X, SIX, or IX mode only if the parent of Q is currently locked by Ti in either IX or SIX mode.
  5. Ti can lock a node only if it has not previously unlocked any node (that is, Ti is two-phase).
  6. Ti can unlock a node Q only if none of the children of Q are currently locked by Ti.
► Locks are acquired in root-to-leaf order, whereas they are released in leaf-to-root order.
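For the simple case where a transaction wants a plain S or X lock on one node, rules 3 and 4 imply intention locks on every ancestor. The following Python sketch computes that sequence. The function names are hypothetical, and SIX is deliberately left out.

```python
def locks_for(path, mode):
    """Locks to acquire, root first, to lock the last node of `path`.

    `path` lists node names from the root down to the target;
    `mode` is "S" or "X" (the only target modes covered by this sketch).
    """
    if mode not in ("S", "X"):
        raise ValueError("this sketch only handles S and X on the target")
    intention = "IS" if mode == "S" else "IX"
    return [(node, intention) for node in path[:-1]] + [(path[-1], mode)]


def release_order(acquired):
    """Locks are released leaf-to-root, the reverse of acquisition."""
    return list(reversed(acquired))


path = ["database", "area", "file", "record"]
x_locks = locks_for(path, "X")
s_locks = locks_for(path, "S")
```

For the database / area / file / record hierarchy, an X-lock on the record requires IX on the database, area and file, while an S-lock requires IS on each of them.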
LB Protocols
► Default locking behavior for Oracle
  ▪ A pure SELECT will not lock any row.
  ▪ INSERT, UPDATE or DELETE will place a row Exclusive Lock (X-lock).
  ▪ SELECT ... FROM ... FOR UPDATE will place a row Shared Lock (S-lock).
LB Protocols
► Oracle syntax:
  LOCK TABLE [schema.]table [options] IN lock_mode MODE [NOWAIT]
  ▪ Options:
    ► PARTITION partition
    ► SUBPARTITION subpartition
    ► @dblink
  ▪ Lock modes:
    ► EXCLUSIVE
    ► SHARE
    ► ROW EXCLUSIVE
    ► SHARE ROW EXCLUSIVE
    ► ROW SHARE | SHARE UPDATE
Optimistic Concurrency Control
[Figure: Read → Validation → Write]
► A transaction in OCC is composed of three phases:
  ▪ Read phase
    Transactions access the database to load data, then they update data in a separate buffer.
  ▪ Validation phase
    For each transaction, the system checks whether there is any conflict with another transaction. If there is, the transaction is rolled back; otherwise the write phase can proceed.
  ▪ Write phase
    Updates are written from the buffer to the database.
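The three phases can be sketched in Python. This illustration assumes one particular validation strategy, per-item version numbers compared at commit time, which the slides do not prescribe. The classes `Store` and `OptimisticTransaction` are hypothetical.

```python
class Store:
    """A tiny in-memory database with a version counter per item."""

    def __init__(self, data):
        self.data = dict(data)
        self.version = {key: 0 for key in data}


class OptimisticTransaction:
    def __init__(self, store):
        self.store = store
        self.read_versions = {}  # item -> version seen during the read phase
        self.buffer = {}         # private copy of the updates

    def read(self, key):
        # Read phase: load data and remember which version was seen.
        if key in self.buffer:
            return self.buffer[key]
        self.read_versions.setdefault(key, self.store.version[key])
        return self.store.data[key]

    def write(self, key, value):
        # Updates go to the separate buffer, not to the database.
        self.buffer[key] = value

    def commit(self):
        # Validation phase: conflict if an item read has changed since.
        for key, seen in self.read_versions.items():
            if self.store.version[key] != seen:
                self.buffer.clear()  # roll back: discard buffered updates
                return False
        # Write phase: copy the buffer into the database.
        for key, value in self.buffer.items():
            self.store.data[key] = value
            self.store.version[key] = self.store.version.get(key, 0) + 1
        return True


store = Store({"A": 100})
t1, t2 = OptimisticTransaction(store), OptimisticTransaction(store)
t1.write("A", t1.read("A") + 100)
t2.write("A", t2.read("A") + 5)
t1_committed = t1.commit()
t2_committed = t2.commit()
```

In this run `t1` commits and `t2` fails validation because A changed after `t2` read it; a rolled-back transaction would normally be retried.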
CONCLUSION
► Locking is a pessimistic form of concurrency control, because it assumes maximum contention.
► OCC is deadlock-free because it does not implement locking.
Integrity and Concurrency
Control in Database System
Q&A