Granularity of Locks
and Degrees of Consistency
in a Shared Database
J.N. Gray, R.A. Lorie, G.R. Putzolu, I.L. Traiger
1977
(Presentation by Randy Ellis)
Topics to be Discussed:
* Transaction definition
* Database Object Hierarchy
* Locking Types
* Locking Compatibility
* Locking Granularity/Protocol
* Transaction Consistency
* Transaction Isolation
* Transaction Interdependence
* Transaction Recovery
An initial definition of “Transaction”:
A transaction is a logical unit of work performed against the database by a single client process/session.
For now, we will accept that a transaction can be made up of multiple statements/operations, and will assume that the break between one transaction and the next is well defined by the database.
The Tradeoff!
* Concurrency: the ability to have one transaction work with one set of resources while another transaction “simultaneously” works with others.
* Overhead: the amount of file structure, memory and processing time needed to keep those different transactions from affecting each other undesirably.
The DBMS Structure Waterfall:
Shows resources that different transactions must “share”:
* Database
* DB Spaces
* Table Spaces
* Tables
* Views
* Pages
* Indexes
* Cursors
* Rows
* Columns
* And probably more!
The “Simplified” DBMS Structure Waterfall:
Now this is something we can work with!
* Still shows resources that different transactions must “share”.
* Is a DIRECTED ACYCLIC GRAPH (DAG): Database, then Table Spaces (Areas), then Tables (Files/Relations) and Indexes, then Rows (Records).
* Each resource instance has a class type name and a unique ID/Name (e.g. Database:Accounting, Table:Salaries).
* Each resource must be reached THROUGH A PARENT!
* Note: Two ways to reach a record -- through its table or through an index!
* Transactions protect each other from changes by “Locking” the resources (generally).
Types of Database Locks:
* Null (NL): no lock held on the node.
* Exclusive/Write (X): prevents other transactions from overwriting changes that haven't been committed, and MAY prevent other transactions from reading changes that haven't been committed (depending on their isolation level).
* Share/Read (S): prevents other transactions from overwriting data the holder is currently looking at.
* Intention (I_): set on ancestors to denote we "intend" to set the real lock on one of their descendants. Intention locks therefore have no use on leaf nodes.
Note: The combination of intention locks on the ancestors and real locks on the descendants is equivalent to the use of real locks only on the desired node.
We have to set sentinels higher up in the digraph to prevent conflict with transactions arriving at our node via other paths. We could use real locks higher up, but "Intention" locks let other transactions keep working with that node while still avoiding changes that might conflict with the lock further down -- this increases our concurrency.
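As a concrete illustration of intention locks on ancestors plus a real lock on the target, here are the lock sets implied by this scheme for three common operations, written as a minimal Python sketch (the node and row names are illustrative, not from the paper):

    # Hierarchy assumed: Database -> Table -> Row.

    # Read one row: intention-share (IS) on the ancestors, share (S) on the row.
    read_one_row = [
        ("Database:Accounting", "IS"),
        ("Table:Salaries", "IS"),
        ("Row:42", "S"),
    ]

    # Update one row: intention-exclusive (IX) on the ancestors, exclusive (X) on the row.
    update_one_row = [
        ("Database:Accounting", "IX"),
        ("Table:Salaries", "IX"),
        ("Row:42", "X"),
    ]

    # Scan a whole table while updating a few rows: SIX on the table gives an
    # implicit S on every row plus the right to set X on individual rows.
    scan_and_update = [
        ("Database:Accounting", "IX"),
        ("Table:Salaries", "SIX"),
        ("Row:42", "X"),
    ]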
Lock Compatibility:
When a transaction attempts to create a new lock on an object that already has locks, the locking mechanism determines whether the new lock is “compatible” with the existing ones based on the types of the new and existing locks.
If compatible: the new lock will be set and the transaction will be allowed to proceed.
If not:
1) The transaction may be given an error (non-blocking) to relay to the user
2) OR the transaction may be suspended and enqueued on the object until the incompatible lock(s) are released
Note: If the locks are never removed, the locking mechanism will time out and check for deadlock, failing one of the two transactions to correct the situation.
If Tran 1 holds a lock of the given type (rows) and Tran 2 requests a lock of the given type (columns), will that request be granted?

Tran 1 \ Tran 2:  NL    IS    IX    S     SIX   X
NL                Yes   Yes   Yes   Yes   Yes   Yes
IS                Yes   Yes   Yes   Yes   Yes   No
IX                Yes   Yes   Yes   No    No    No
S                 Yes   Yes   No    Yes   No    No
SIX               Yes   Yes   No    No    No    No
X                 Yes   No    No    No    No    No
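The same matrix can be expressed as data plus a one-line test. A minimal Python sketch (a hypothetical lock-manager helper, not code from the paper):

    # COMPAT[held][requested] is True when the request can be granted at once.
    COMPAT = {
        "NL":  {"NL": True, "IS": True,  "IX": True,  "S": True,  "SIX": True,  "X": True},
        "IS":  {"NL": True, "IS": True,  "IX": True,  "S": True,  "SIX": True,  "X": False},
        "IX":  {"NL": True, "IS": True,  "IX": True,  "S": False, "SIX": False, "X": False},
        "S":   {"NL": True, "IS": True,  "IX": False, "S": True,  "SIX": False, "X": False},
        "SIX": {"NL": True, "IS": True,  "IX": False, "S": False, "SIX": False, "X": False},
        "X":   {"NL": True, "IS": False, "IX": False, "S": False, "SIX": False, "X": False},
    }

    def can_grant(held_modes, requested):
        """Grant only if the request is compatible with every lock other
        transactions currently hold on the node."""
        return all(COMPAT[held][requested] for held in held_modes)

    assert can_grant({"IS", "IX"}, "IS")   # intention modes coexist
    assert not can_grant({"S"}, "IX")      # a reader blocks intent-to-write

A real lock manager would enqueue the request and eventually run deadlock detection when it cannot grant immediately, as noted above.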
Lock Granularity:
A transaction may set locks on more than one node at a time, and those nodes are not restricted to be leaf nodes.
If a transaction sets a lock on a non-leaf node, that lock “implicitly” applies to all descendants of that node.
Locking at a lower (item) level allows other transactions to access the other items in the lot -- this HIGH (fine) GRANULARITY approach provides maximum concurrency but requires more overhead to maintain.
Locking at a higher (lot) level covers a whole group of items with a single lock, blocking other transactions from all of them -- this LOW (coarse) GRANULARITY approach provides minimum overhead but reduces concurrency.
A consistent locking protocol must be used:
* To create an “S” or “IS” lock, one must hold an “IX” or “IS” lock on the parent.
* To create an “X”, “SIX” or “IX” lock, one must hold an “SIX” or “IX” lock on the parent.
* Locks always proceed from the root node on down.
* Locks are always released from the leaf on up (see the sketch below).
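A minimal Python sketch of this root-down / leaf-up discipline, assuming a hypothetical lock(node, mode) / unlock(node) interface:

    def acquire(path, mode, lock):
        """Lock root-down: intention locks on the ancestors, the real lock on the target.
        `path` runs from the root to the target node, e.g.
        ["Database:Accounting", "Table:Salaries", "Row:42"]."""
        intent = "IS" if mode in ("IS", "S") else "IX"  # S/IS need IS or IX above; X/SIX/IX need SIX or IX
        for node in path[:-1]:
            lock(node, intent)
        lock(path[-1], mode)

    def release(path, unlock):
        """Unlock leaf-up: the exact reverse of the acquisition order."""
        for node in reversed(path):
            unlock(node)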
Who decides the granularity at which locks are established?
The optimizer does: by evaluating the needs of the query against the statistics and catalog, and passing the choice to the locking mechanism as part of the access plan.
A more thorough definition of Transaction:
* Some UNITS OF WORK may be a single statement, but others may span multiple statements that depend on each other (e.g. if one statement within the unit of work fails, so should the others).
* Therefore DBMS's give us the ability to bundle statements into logical groups called TRANSACTIONS, which have well-defined starting and ending points controlled by the DBMS client, so that the DBMS can guarantee the logical units of work move the database from one consistent state to another.
* As work is done within a transaction the original values are preserved, and clients can end a transaction by rolling back (undoing) all work performed since the transaction was begun OR by committing that work, which makes it available to all future transactions (and abdicates the right to roll back that work) -- see the sketch after this list.
* Data that has been added, changed or deleted within a transaction but has not yet been committed can be called "dirty" data.
* The transaction creates "locks" to signal other clients' transactions that this transaction is changing the data or requires data it has read to remain unchanged (usually).
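A minimal sketch of commit/rollback over preserved original values, assuming the "database" is just a Python dict (no real client API implied):

    class Transaction:
        def __init__(self, db):
            self.db, self.undo = db, []

        def write(self, key, value):
            self.undo.append((key, self.db.get(key)))  # preserve the "old" value
            self.db[key] = value                       # "dirty" until commit

        def rollback(self):
            for key, old in reversed(self.undo):       # undo in reverse order
                self.db[key] = old
            self.undo.clear()

        def commit(self):
            self.undo.clear()  # changes stay; the right to roll back is abdicated

    db = {"balance": 100}
    t = Transaction(db)
    t.write("balance", 50)   # dirty data
    t.rollback()             # balance is 100 again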
Transaction Consistency:
Transactions help to guarantee the database is VALID and RECOVERABLE by
providing mechanisms that:
1) Guarantee data the transaction is working with will not change during the
transaction.
2) Remember work the transaction performed so the database can be rolled back
to the last consistent state (before the transaction began) if a “bad thing”
happens while running the transaction.
Possible “bad things”:
1) Power Failure
2) Hardware or System Software failure
3) Deadlock conflict with another transaction’s lock(s)
4) Security Violation (Are you authorized to work with that table/object?)
5) Constraint violation (Did you put in a valid value for that column?)
6) Referential Integrity Violation (Did you delete a record another relation uses?)
7) Flow Integrity (Did a dependency step fail in a multi-step process?)
Degrees of consistency (Isolation):
The “degree of consistency” desired by a client depends on how it needs to isolate its view of the database from other transactions -- hence it is also called the “isolation level”.
DBMS’s use different combinations of locks and transaction/audit log entries to effect their degree of consistency.
Sometimes it is not required that a transaction’s data be “perfectly” consistent. When this is the case, we can reduce overhead and increase concurrency by telling the DBMS to use a lower isolation level when beginning a transaction.
The isolation level of the transaction then guides the DBMS when setting and respecting locks.
All transactions must adhere to level 0 and create “well-formed” locks (set the lock before touching the data).
General Isolation Levels:

Degree  Guarantee                                       Locks
0       Transactions don't overwrite other              Write locks: 1-phase
        transactions' dirty data
1       Transactions don't commit data until the        Write locks: 2-phase
        end of the transaction
2       Does not read dirty data from other             Write locks: 2-phase;
        transactions                                    read locks: 1-phase
3       Other transactions do not dirty any data        Write locks: 2-phase;
        read by this transaction                        read locks: 2-phase

1-Phase: data can be flushed and locks released in mid-transaction.
2-Phase: locks are held until end of transaction.
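The table reads naturally as data. A minimal Python sketch of the mapping (the names are mine, not the paper's):

    # None = no lock of that kind is taken; "1-phase" locks may be released
    # mid-transaction; "2-phase" locks are held until end of transaction.
    DEGREE_RULES = {
        0: {"write": "1-phase", "read": None},
        1: {"write": "2-phase", "read": None},
        2: {"write": "2-phase", "read": "1-phase"},
        3: {"write": "2-phase", "read": "2-phase"},
    }

    def hold_until_eot(degree, op):
        """Must a lock taken for `op` ('read' or 'write') be held to end of transaction?"""
        return DEGREE_RULES[degree][op] == "2-phase"

    assert hold_until_eot(3, "read")      # degree 3: repeatable reads
    assert not hold_until_eot(2, "read")  # degree 2: read lock released right after the read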
Transaction Interdependency:
“The principal application of dependency definition is as a proof technique and for discussing schedules and recovery issues.”
Transactions can work with the same data!
And the order in which transactions set locks (which is controlled by the degree of consistency) affects how well a transaction can be recovered when “something bad” happens.
The dependency created when T1's access to an item precedes T2's access:

T1      Relation  T2      Degree
Reads   <<<       Writes  3
Writes  <<        Reads   2
Writes  <         Writes  1
Reads   -         Reads   0 (no dependency)

Note: A schedule's consistency degree is guaranteed only if the closure of (<, <<, <<<) is a partial order.
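The table above maps directly onto a small classifier. A Python sketch (the function name is mine, not the paper's):

    def dependency(t1_op, t2_op):
        """Relation created when T1's access to an item precedes T2's access.
        Returns the symbol from the table above; '-' means no dependency."""
        if t1_op == "read" and t2_op == "read":
            return "-"    # degree 0: reads never conflict
        if t1_op == "write" and t2_op == "write":
            return "<"    # degree 1 concern: write-write
        if t1_op == "write" and t2_op == "read":
            return "<<"   # degree 2 concern: write-read (dirty read)
        return "<<<"      # degree 3 concern: read-write (unrepeatable read)

    assert dependency("read", "write") == "<<<"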
So -- along with maintaining locks, a DBMS also needs to keep track of WHEN locks are established and data is changed. Most DBMS’s employ a “Transaction Log” to handle this.
[Figure: a sample schedule interleaving two transactions' Lock/Read/Write/Unlock operations (L, R, W, U) on items A and B, from Begin to End. Both transactions read A (T1 - T2, T2 - T1: no dependency); T1 reads A before T2 writes it (T1 <<< T2); and T2 reads and writes B before T1 does (T2 <<< T1, T2 << T1, T2 < T1).]
Transaction Recovery:
When “something bad” happens. . .
Transactions follow an exact order: establish lock, update log, update record, commit, unlock. This “well-formed”, “two-phase” protocol guarantees changes are safely in the log if we crash.
Periodically, the DBMS suspends commits, does a quick flush of all committed changes, and records a “check point” in the log -- so the whole log need not be replayed again.
If we need to recover, we clear all the dirty data from the db and cache and replay the log from the last checkpoint according to the guidelines below:
Degree  Plan                                              Side Effects
0       Operations are "replayed" with no                 Since an operation may have used
        dependencies (in any order)                       uncommitted data, its "old" value
                                                          may not be accurate
1       Sort operations according to writes and           Since an operation may have used
        then time (< order); try to detect that the       uncommitted data, its "old" value
        change has not already been made                  may not be accurate
2       Transactions can be "replayed" in the order       Synchronous operations may suffer,
        they were created (data changes are               as reads can be performed in a
        guaranteed "idempotent" by the locking            different order
        protocol) -- make sure the log order is accurate
3       Sort operations by writes, then write-reads,      No side effects!
        then read-writes, then time (<<< order) --        (other than slowness)
        changes are guaranteed "idempotent" by the
        locking protocol
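A minimal Python sketch of the degree-3 replay idea, assuming a hypothetical log-entry format (this is an illustration, not the paper's algorithm):

    def recover(log, last_checkpoint_lsn, redo):
        """Replay committed changes recorded after the last checkpoint.
        At degree 3 the locking protocol makes each change idempotent,
        so re-applying an already-applied change is harmless."""
        for entry in sorted(log, key=lambda e: e["lsn"]):
            if entry["lsn"] <= last_checkpoint_lsn:
                continue         # already flushed at the checkpoint
            if entry["committed"]:
                redo(entry)      # idempotent re-application of the write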
Summary:
* Whenever multiple users have simultaneous access to the database, additional overhead (usually in the form of locking) must be added to avoid conflicts.
* Locks can be granularized at different levels of objects within the database -- those objects are organized into an acyclic hierarchy to promote consistency.
* Locks come in different flavors. The protocol suggested in this paper uses Read and Write locks with an Intention modifier. A table specifying the compatibility rules between these locks was established.
* A transaction is a logical grouping of all operations into a single unit of work. The DBMS automatically maintains locks and log entries for the transaction to keep it from conflicting with other transactions and to enable it to recover if failure is encountered.
* Transactions are “Isolated” from each other based on a desired degree of consistency given by the client. Isolation controls how locks are established and respected.
* The order of transactions differs at different isolation levels, and this ordering can affect the recoverability of the database. Dependency identification rules were addressed and recovery plans were discussed.
Look at the readings book, page 189 -- it pretty much sums it all up nice and neat!