Granularity of Locks and Degrees of Consistency in a Shared Data

advertisement
.
Modelling in Data Base Management Systems. G.M. Ni]ssen, (ed.)
North Holland Publishing Company, 1976
Granularity
J.N.
Gray,
of Locks and Degrees of C o n s i s t e n c y
in a Shared Data Base
R.A.
Lorie,
G.R. Putzolu,
I.L.
Traiger
÷
IBM ~esearch Laboratory
San 3ose, California
The problem of choosing the appropriate H r a n u l a r i t ~ (size)
of lockable objects is introduced and the tradeoff between
concurrency and overhead is discusseS. A locking protocol
which allows s i m u l t a n e o u s locking at v a r i o u s g r a n u l a r i t i e s
by different
t r a n s a c t i o n s is presented.
It is
based on
the
introduction of
additional
lock
m o d e s besides
the
conventional share
mode an5 exclusive
mode.
A proof is
given
of
the
equivalence
of this
protocol
to
a
conventional one.
.L
Next the issue
of consistency in a shared e n v i r o n m e n t is
analyze~. This discussion is
motivated by the r e a l i z a t i o n
that some
existing data base
systems use
a u t o m a t i c lock
protocols which insure protection
only from certain types
of
inconsistencies
(for
instance
those
arising
from
transaction
backup),
thereby a u t o m a t i c a l l y
providing
a
limited
degree
of consistency.
Four
~ S ~
~
consistency
are
introduced.
They
can
be
roughly
c h a r a c t e r i z e d as
follows: degree
0 protects
others from
your updates,
degree I additionally
p r o v i d e s protection
from
losing
updates,
degree
2 additionally
provides
protection from reading incorrect data iteas, and degree 3
additionally
provides protection
from r e a d i n g
incorrect
relationships among data items (i.e. total protection).
A
discussion
follows
on
the
relationships
of
the
four
degrees
to
locking
protocols,
concurrency,
overhead,
recovery and t r a n s a c t i o n structure.
Lastly,
these
ideas
management systems.
I. GRANULARITY
are
compared
with
existingdata
OF LOCKS:
An
important issue
which
arises
in the
design
of
a data
Dase
management system
is the
choice of lockable
unitE, i.e.
the data
aggregates
which
are
atomically
locked
to
insure
consistency.
Examples of
lockable units
are areas,
files, i n d i v i d u a l
records,
field values, and i n t e r v a l s of field values.
The choice of l o c k a b l e units presents a tradeoff between concurrency
and overhead,
which is r e l a t e d
to the
size or K r a n u l a r i t Z
of the
units themselves.
On.the one hand,
concurrency is increased
if a
fine lockable unit (for example a record or field) is chosen.
Such
unit is
appropriate for a "simple" t r a n s a c t i o n which
a c c e s s e s few
records.
On the other
hand a fine unit of locking
would be costly
for
a "complex"
transaction
which accesses
a
large
number
of
records.
Such
a t r a n s a c t i o n would
have to
set and r e s e t
a large
365
366
J.N. Gray, R.A. Lorie, G.R. Putzolu and.LL. Traiger
number
of
locks,
i n c u r r i n g the
computational
overhead
of
many
invocations
of the
lock
s u b s y s t e m , and
the
storage overhead
of
representing
many locks.
A coarse
l o c k a b l e unit
(for e x a m p l e
a
file) is p r o b a b l y
c o n v e n i e n t for a t r a n s a c t i o n
which a c c e s s e s many
records.
However,
such
a
coarse
unit
discriminates
against
transactions which only
want to lock one m e m b e r o f
t h e file.
From
this
~ i s c u s s i o n it
follows
that it
would
be
d e s i r a b l e to
have
lockable units
of d i f f e r e n t
granularities coexisting
in the
same
system.
This paper
p r e s e n t s a lock
and
discusses
the
related
g r a n t i n g and c o n v e r t i n g lock
Hierarchical
protocol satisfying
these
implementation
issues
of
requests.
requirements
scheduling,
locks:
We will
first a s s u m e
that t h e
set of
r e s o u r c e s to
be l o c k e d
is
o r g a n i z e d in a
hierarchy.
N o t e that t h i s h i e r a r c h y is
used in the
c o n t e x t of a c o l l e c t i o n of r e s o u r c e s and
has n o t h i n g to do with t h e
data model used
in a data base
system.
The h i e r a r c h y of
Figure I
may be
suggestive.
We a d o p t
the n o t a t i o s
that each l e v e l
of t h e
h i e r a r c h y is given a
n o d e t y p e w h i c h is a g e n e r i c
name for a l l the
n o d e i n s t a n c e s of
t h a t type.
For e x a m p l e , the d a t a
base has n o d e s
of type
area as its
immediate descendants,
e a c h area in
turn h a s
n o d e s of
type file as its
i m m e d i a t e d e s c e n d a n t s and e a c h
file has
n o ~ e s of type r e c o r d as its
i m m e d i a t e d e s c e n d a n t s in the h i e r a r c h y .
S i n c e it is a h i e r a r c h y , each node has a u n i q u e p a r e n t .
DATA
BASE
i
l
AREAS
i
I
FILES
l
i
RECORDS
Figure
1.
A sample
lock
hierarchy.
Each node of t h e h i e r a r c h y can be locked.
If one r e q u e s t s e x c l u s i v e
a c c e s s (X) to
a p a r t i c u l a r node, then when the
r e q u e s t is granted,
the r e q u e s t o r
has e x c l u s i v e a c c e s s to
that n o d e and
implicitly to
each of
its d e s c e n d a n t s .
If
one requests
s h a r e d a c c e s s (S)
to a
p a r t i c u l a r node, then when the r e q u e s t is g r a n t e d , the r e g u e s t o r h a s
s h a r e d a c c e s s to t h a t node and i_mp_l_icitly to e a c h d e s c e n d a n t of that
node.
These two
a c c e s s modes lock an e n t i r e s u b t r e e
r o o t e d at t h e
r e q u e s t e d node.
3ur goal is to find s o m e
t e c h n i q u e for i_m~_l_icitl~ l o c k i n g an e n t i r e
subtree.
In o r d e r
to lock a s u b t r e e
r o o t e d at n o d e R
in s h a r e or
e x c l u s i v e mode it is i m p o r t a n t to
prevent
s h a r e or e x c l u s i v e l o c k s
on
the
a n c e s t o r s of
R
which
would
implicitly
lock R and
its
descendants.
Hence
a new
access
mode,
i n t e n t i o n mode
(I),
is
introduced.
I n t e n t i o n mode is used to "tag" (lock) all a n c e s t o r s of
a node to
be l o c k e d in s h a r e
or e x c l u s i v e mode. These
tags s i g n a l
t h e fact that l o c k i n g
is being done at a " f i n e r "
level and t h e r e b y
prevents
implicit
or e x p l i c i t
exclusive
or
share l o c k s
on
the
ancestors.
Gra.ularity oflocks and degrees o f eonsistemy
367
The p r o t o c o l
to lock
a subtree rooted
at node
R in
e x c l u s i v e or
s h a r e m o d e is to first lock all a n c e s t o r s of R in i n t e n t i o n m o d e and
%hen to lock node R in e x c l u s i v e
or s h a r e mode.
For e x a m p l e , using
F i g u r e I,
to lock
a particular
f i l e one
should obtain
intention
~ c z e s s to the
data base, to t h e
a r e a c o n t a i n i n g the f i l e
and then
request
e x c l u s i v e (or
share)
a c c e s s to
the
file i t s e l f .
This
i m p l i c i t l y locks
all r e c o r d s
of t h e file
in e x c l u s i v e
(or share)
mode.
Access
modes
and
compatibility:
we say
t h a t t w o lock
r e q u e s t s for the
same n o d e by
two d i f f e r e n t
t r a n s a c t i o n s are
compatible if t h e y can be g r a n t e d
concurrently.
The mode of
the r e q u e s t d e t e r m i n e s its
c o m p a t i b i l i t y with r e q u e s t s
made
by
other t r a n s a c t i o n s .
The
three
modes
X,
S and
I
are
i n c o m p a t i b l e with one a n o t h e r but d i s t i n c t S r e q u e s t s may be g r a n t e d
t o g e t h e r and d i s t i n c t I r e q u e s t s m a y be g r a n t e d t o g e t h e r .
The c o m p a t i b i l i t i e s among m o d e s d e r i v e
from their s e m a n t i c s .
Share
mode
allows
reading
but not
modification
of
the
corresponding
r e s o u r c e by the r e q u e s t o r and
by other transactions.
The semantics
of
e x c l u s i v e mode
is
that t h e
g r a n t e e may
read
and modify
%he
r e s o u r c e but
no other t r a n s ~ c t i o n m a y
read or m o d i f y
the resource
while the e x c l u s i v e lock is set.
T h e r e a s o n for d i c h o t o m i z i n g share
and e x c l u s i v e a c c e s s
is that s e v e r a l share r e q u e s t s
can be g r a n t e d
concurrently
(are
c o m p a t i b l e ) w h e r e a s an
exclusive request
is not
c o m p a t i b l e with any other r e q u e s t .
I n t e n t i o n mode was i n t r o d u c e d to
be i n c o m p a t i b l e with s h a r e and e x c l u s i v e
mode (to p r e v e n t s h a r e and ~
e x c l u s i v e locks).
However, i n t e n t i o n mode is c o m p a t i b l e with itself
since
two
transactions having
intention
access
to a
node
will
explicitly
lock d e s c e n d a n t s
of t h e
node in
X,
S or
I mode
and
thereby
will either
be
c o m p a t i b l e with
one
another
or w i l l
be
s c h e d u l e d on
the basis of t h e i r
r e q u e s t s at the f i n e r
level.
For
example, two
t r a n s a c t i o n s can
s i m u l t a n e o u s l y be
granted the
data
base and some
area and s o m e f i l e
in i n t e n t i o n mode.
In
t h i s case
t h e i r e x p l i c i t locks on p a r t i c u l a r r e c o r d s
in the file w i l l r e s o l v e
a n y c o n f l i c t s among them.
The n o t i o n of i n t e n t i o n m o d e is r e f i n e d to i n t e n t i o n s h a r e mode (IS)
and i n t e n t i o n
e x c l u s i v e mode
(IX) for
two reasons:
the intention
s h a r e m o d e only r e q u e s t s share or i n t e n t i o n s h a r e locks at t h e l o w e r
n o d e s of the
tree (i.e. n e v e r r e q u e s t s an e x c l u s i v e
l o c k b e l o w the
intention share
node), hence IS is
c o m p a t i b l e with S
mode.
Since
read
o n l y is
a common
form of
access
it will
be p r o f i t a b l e
to
distinguish
this
for
greater
concurrency.
Secondly,
if
a
t r a n s a c t i o n has
an i n t e n t i o n share
lock on
a n o d e it
can c o n v e r t
% h i s to
a share
lock at a
l a t e r time, but
one c a n n o t
c o n v e r t an
i n t e n t i o n e x c l u s i v e lock to
a s h a r e lock on a node.
Rather to get
the c o m b i n e d rights
of s h a r e m o d e and i n t e n t i o n
e x c l u s i v e mode one
must o b t a i n an X or SIX mode
lock.
(This issue is d i s c u s s e d in t h e
s e c t i o n on r e r e q u e s t s below).
We
r e c o g n i z e one
further
r e f i n e m e n t of
modes,
namely share
and
intention exclusive
mode (SIX).
Suppose
one t r a n s a c t i o n
wants to
read
an e n t i r e
subtree
and to
update
particular
n o d e s of
that
subtree.
Using the modes p r o v i d e d so
far it would h a v e t h e o p t i o n s
of: (a) r e q u e s t i n g
e x c l u s i v e a c c e s s to the root of
%he s u b t r e e and
doing
no f u r t h e r
locking
or
(b) r e q u e s t i n g
intention
exclusive
a c c e s s to the
root of the s u b t r e e and e x p l i c i t l y
locking the lower
n o d e s in
i n t e n t i o n , s h a r e or
e x c l u s i v e mode.
Alternative
(a) h a s
low c o n c u r r e n c y .
If o n l y a
small f r a c t i o n
of the r e a d
n o d e s are
368
J.N. Gray, R.A. Lorie, G.R. Putzolu and J.L. Traiger
updated then alternative
(b) has high l o c k i n g o v e r h e a d .
The c o r r e c t
a c c e s s mode
w o u l d be share a c c e s s
to the s u b t r e e
thereby allowing
the t r a n s a c t i o n
to r e a d
all n o d e s of
the s u b t r e e
without further
locking
_and intention
exclusive access
to
the
subtree
thereby
a l l o w i n g the t r a n s a c t i o n
to set
e x c l u s i v e l o c k s on
t h o s e n o d e s in
the s u b t r e e
w h i c h are
to be
u p d a t e d and
IX or
SiX l o c k s
on the
i n t e r v e n i n g Nodes.
Since this is such
a c o m m o n case, SIX
m o d e is
i n t r o d u c e d for
this purpose.
It is
c o m p a t i b l e with IS
mode since
other t r a n s a c t i o n s
r e q u e s t i n g IS
m o d e will
e x p l i c i t l y lock
lower
nodes in IS
or S m o d e t h e r e b y
a v o i d i n g any u p d a t e s (IX
or X mode)
produced
by the
SIX mode
transaction.
H o w e v e r SIX
m o d e is
not
c o m p a t i b l e with IX, S, SIX or X m o d e r e q u e s t s .
Table 1
g i v e s the
c o m p a t i b i l i t y of
the request
modes, where
for
completeness
we
have also
introduced
the
null mode
(NL) which
r e p r e s e n t s t h e a b s e n c e of r e q u e s t s of a r e s o u r c e by a t r a n s a c t i o n .
I_
I
I
1
I
NL
NL
Y ES
IS
YES
IX
Y ES
S
YES
I S IX
Y ES
I_X___I_YES
Table
To s u m m a r i z e ,
IS
YES
YES
YES
YES
[ES
NO
IX
YES
YES
YES
NO
NO
NO
1. C o m p a t i b i l i t i e s
we r e c o g n i z e
six m o d e s
S
YES
YES
NO
YES
NO
NO
SIX
YES
YES
NO
NO
NO
NO
among
access
of
i.e.
access
X
YES
NO
NO
NO
NO
NO
to
represents
modes.
a resource:
NL:
Gives
no a c c e s s
to a node,
r e q u e s t of a r e s o u r c e .
the
absence
IS:
G i v e s •intention share a c c e s s to
the r e q u e s t o r
to lock d e s c e n d a n t
does no i m p l i c i t locking.)
IX:
Gives intention
exclusive
a c c e s s to
the
requested
a l l o w s the
r e g u e s t o r to
exRli____qcit_~l
x lock
descendants
SiX, IX or IS mode.
(It does n o i m p l i c i t locking.)
t h e r e q u e s t e d node
n o d e s in
S or IS
of
a
and a l l o w s
mode.
(It
node
and
in
X, S,
S: G i v e s s h a r e
access to the r e q u e s t e d n o d e and
to all d e s c e n d a n t s
of
the
requested
node
without
setting
further
locks.
(It
implicitly
sets S locks on
all d e s c e n d a n t s
of the
requested
node. )
SIX:
Gives
s h a r e and
intention exclusive
a c c e s s to
the requested
node.
(In p a r t i c u l a r it i m p l i c i t l y l o c k s all d e s c e n d a n t s of the
n o d e in s h a r e
mode ~nd a l l o w s the r e q u e s t o r
to e x p l i c i t l y lock
d e s c e n d a n t nodes in X, SIX or IX mode.)
X:
Gives
exclusive
access
to
the
requested
node
and
to
all
d e s c e n d a n t s of the r e q u e s t e d n o d e w i t h o u t s e t t i n g f u r t h e r locks.
(It i m p l i c i t l y s e t s
X l o c k s on all
descendants.
L o c k i n g lower
n o d e s in S or IS mode w o u l d g i v e no i n c r e a s e d access.)
IS m o d e is
the weakest n o n - n u l l form
of a c c e s s to a
resource.
It
c a r r i e s f e w e r p r i v i l e g e s than IX or S modes.
IX mode a l l o w s IS, IX,
S, S I X and X
mode lecks to be set on d e s c e n d a n t
n o d e s while S mode
allows
r e a d only
access to
all
d e s c e n d a n t s of
the n o d e
without
f u r t h e r locking.
SIX mode c a r r i e s
the privileges
of S and
of IX
Granularity oflocks and degrees of consistency
369
mode (hence the
n a m e SIX).
X mode
is th.= m o s t p r i v i l e g e d
form of
a z c e s s and a l l o w s
r e a d i n g and w r i t i n g of all d e s c e n d a n t s
of a node
without
f u r t h e r locking.
H e n c e the
mod~s
can be
r a n k e d in
the
p a r t i a l o r d e r (lattice) of p r i v i l e g e s s h o w n
in F i g u r e 2.
Note that
it is not a t o t a l o r d e r since IX and S are i n c o m p a r a b l e .
X
I
I
SIX
I
I
I
IX
I
I
I
I
IS
I
I
NL
Figure
Rules
for
2. The p a r t i a l
ordering
of
modes
by
their
privileges.
request_in S nodes:
Th-= i m p l i c i t
l o c k i n g of
nodes will
not w o r k
if t r a n s a c t i o n s
are
a l l o w e d to leap into the m i d d l e of
the tree and b e g i n l o c k i n g n o d e s
at
random.
The
implicit
locking implied
by t h e
S
and X
modes
d e p e n d s on all t r a n s a c t i o n s o b e y i n g the f o l l o w i n g protocol:
(a)
Before requesting
of the
requested
reguestor.
an S or IS
lock on a node, all a n c e s t o r n o d e s
node
m u s t be
held in
IX or
IS m o d e
by t h e
(b)
B e f o r e r e q u e s t i n g an
X, SIX
n o d e s of the
r e q u e s t e d node
or IX lock on
a node, a l l a n c e s t o r
must be
h e l d in SIX or
IX m o d e by
the requestor.
(c)
Locks
s h o u l d be r e l e a s e d e i t h e r
at t h e
(in any order) or in leaf to root order.
are not held to
end of t r a n s a c t i o n , one
after r e l e a s i n g i t s a n c e s t o r s .
end of
the t r a n s a c t i o n
In p a r t i c u l a r , if locks
s h o u l d not
hold a l o c k
To p a r a p h r a s e this,
l o c k s are requested, r o o t to
le_aaf, a_n__dd_released
leaf
to
root.
Notice
that
leaf
nodes
are never
requested
in
i n t e n t i o n mode since they have no d e s c e n d a n t s .
Several
examples:
To lock r e c o r d R for read:
lock d a t a - b a s e
with
lock a r e a c o n t a i n i n g R
with
l o c k file c o n t a i n i n g R
with
lock r e c o r d R
with
D o n ' t panic,
the t r a n s a c t i o n
a r e a and file lock.
mode = IS
mode = IS
mode = IS
mode = S
probably already
has
the
data
base,
J.N. Gray, R.A. Lorie, G.R. Putzolu and J.L. Traiger
370
To lock r e c o r d R for w r i t e - e x c l u s i v e
access:
lock data-base
with m o d e = IX
lock a r e a c o n t a i n i n g R
with mode = IX
lock f i l e c o n t a i n i n g R
w i t h m o d e = IX
lock r e c o r d R
with mode = X
Note
t h a t if
the
r e c o r d s of
t h i s and
the
previous
distinct, each
r e q u e s t can be
granted simultaneously
t r a n s a c t i o n s even t h o u g h both r e f e r to t h e s a m e file.
To lock a f i l e F for read and w r i t e access:
lock data-base
with mode = IX
lock a r e a c o n t a i n i n g F
w i t h m o d e = IX
lock f i l e F
with mode = X
Sinc~ t h i s
r e s e r v e s e x c l u s i v e a c c e s s to
the file,
uses t h e
s a m e file
as the p r e v i o u s
two e x a m p l e s
t r a n s a c t i o n s will h a v e to wait.
example
are
to d i f f e r e n t
if
this request
it or
the other
To l o c k a f i l e F for c o m p l e t e scan and o c c a s i o n a l update:
lock d a t a - b a s e
with mode = IX
lock a r e a c o n t a i n i n g F
w i t h m o d e = IX
lock f i l e F
with mode = SIX
Thereafter, particular
r e c o r d s in
F can
be l o c k e d
for update
by
locking
records
in
X
mode.
Notice
that
(unlike
the
previous
example)
this transaction
is c o m p a t i b l e
with
the f i r s t
example.
This is t h e r e a s o n for i n t r o d u c i n g SIX rood_
~.
To q u i e s c e t h e data base:
lock d a t a base with mode = X.
N o t e that t h i s l o c k s e v e r y o n e else
Directed
out.
acycl_~ic qra]~hs of locks:
The
notions
so far
introduced
can
be
g e n e r a l i z e d to
work
for
~irected
acyclic
graphs
(DAG) of
resources
rather
than
simply
hierarchies
of
resources.
A
tree is
a
simple
DAG.
The
key
o b s e r v a t i o n is
that to
i m p l i c i t l y or e x p l i c i t l y
lock a
node, one
should
l o c k _all
the p a r e n t s
of
the nod~
in
t h e DAG
and so
by
i n d u c t i o n l o c k all a n c e s t o r s of the
node.
In p a r t i c u l a r , to l o c k a
s u b g r a p h one must i m p l i c i t l y or e x p l i c i t l y l o c k all a n c e s t o r s of the
subgraph
in t h e
appropriate mode
(for a
tree t h e r e
is only
one
parent).
To
give
an e x a m p l e
of
a
non-hierarchical
structure,
i m a g i n e t h e l o c k s are o r g a n i z e d as in F i g u r e 3.
DATA
BASE
I
I
AREAS
I
I
I
FILES
INDICES
I
I
I
__I
I
I
R ECOR DS
Figure
3. A n o n - h i e r a r c h i c a l
lock
graph.
Gramdarity of looks and degrees of consisteney
371
We
p o s t u l a t e that
areas
are " p h y s i c a l "
nDtions
and that
files,
indices
and
r e c o r d s are
logical
notions.
The
data b a s e
is
a
collection
of
areas.
Each
area
is
a
c o l l e c t i o n of
files
and
indices.
Each
file has
a corresponding
i n d e x in
the same
area.
Each record b e l o n g s to s o m e f i l e
and t o its c o r r e s p o n d i n g index.
A
r e c o r d is c o m p r i s e d of field v a l u e s and so~e f i e l d is i n d e x e d by t h e
index
associated with
the file
containing the
record.
The
file
g i v e s a s e q u e n t i a l a c c e s s path to the r e c o r d s and the i n d e x gives an
a s s o c i a t i v e a c c e s s p a t h to t h e r e c o r d s based on field values.
Since
individual fields are
n e v e r locked, they ~o not a p p e a r
in t h e l o c k
graph.
To write a r e c o r d R in file F w i t h index I:
l o c k d a t a base
with mode = IX
lock area c o n t a i n i n g F
w i t h m o d e = IX
lock file F
with mode = IX
lock i n d e x I
w i t h m o d e = IX
lock r e c o r d E
with mode = X
Note that
a l l paths
to r e c o r d
could lock F and I in e x c l u s i v e
e x c l u s i v e mode.
R
are locked.
Alternaltively,
one
mode t h e r e b y i m p l i c i t l y l o c k i n g R in
To give a
more
c o m p l e t e e x p l a n a t i o n we o b s e r v e that a
node can be
l o c k e d e_/x~lici_t!~ (by
r e q u e s t i n g it) or implici_tl I
(by a p p r o p r i a t e
e x p l i c i t l o c k s o n the
a n c e s t o r s of the node) in o n e
of ~ive modes:
IS, IX,
S, SIX, X.
However,
t h e d e f i n i t i o n of i m p l i c i t
locks and
the protocols
for s e t t i n g
e x p l i c i t l o c k s have
to be
e x t e n d e d for
D A G ' s as f o l l o w s :
A n o d e is i_m~licit_!l~ g r a n t e d in S
mode to a t r a n s a c t i o n if at least
of
its p a r e n t s
is ( i m p l i c i t l y
or e x p l i c i t l y )
g r a n t e d to
the
t r a n s a z t i o n in S,
SIX or X mode.
By i n d u c t i o n that m e a n s
t h a t at
l e a s t one of
the n o d e ' s a n c e s t o r s m u s t be e x p l i c i t l y
g r a n t e d in S,
SIX or X mode to the t r a n s a c t i o n .
one
A node
is imDl~Gitl__[ ~ r a n t e d
in X mode if
~ ! ! of its
p a r e n t s are
( i m p l i c i t l y or e x p l i c i t l y )
g r a n t e d to the t r a n s a c t i o n in X mode.
By
i n d u c t i o n , this
is e q u i v a l e n t
to the c o n d i t i o n
that all
n o d e s in
some cut set of the c o l l e c t i o n of all paths l e a d i n g f r o m the node to
t h e r o o t s of the g r a p h are
e x p l i c i t l y g r a n t e d to the t r a n s a c t i o n in
X mode
and all
a n c e s t o r s of n o d e s
in the
c u t set
are e x p l i c i t l y
g r a n t e d in IX or SIX mode.
From Figure
2, a n o d e is i m p l i c i t l y
g r a n t e d in IS
m o d e if
it is
implicitly granted
in S mode, a n d
a n o d e is i m p l i c i t l y
g r a n t e d in
IS, IX, S and SIX mode if it is i m p l i c i t l y g r a n t e d in X mode.
~h~
protocol
for ~ ! ~ ! ~
~s~t~n~
locks
on
a DAG:
(a) B e f o r e r e q u e s t i n g an S or IS
lock on a node, one s h o u l d r e q u e s t
at least one
parent
(and by i n d u c t i o n
a path to a
root) in IS
(or greater) mode.
As a c o n s e q u e n c e none of the a n c e s t o r s a l o n g
this
path can
be
g r a n t e d to
another
transaction
in a
mode
i n c o m p a t i b l e with IS.
(b) B e f o r e r e q u e s t i n g IX, SIX or X m o d e a c c e s s to a node, one should
r e q u e s t all p a r e n t s of
the node in IX (or greater)
mode.
As a
c o n s e q u e n c e all a n c e s t o r s
will be held in IX
(or g r e a t e r mode)
and c a n n o t be held b y o t h e r
t r a n s a c t i o n s in a mode i n c o m p a t i b l e
with IX (i. e. S, S I X , X).
J.N. Gray, R.A. Lorie, G.R. Putzolu and J.L. Traiger
372
(c)
Locks
s h o u l d be r e l e a s e d e i t h e r
at the end of
the transaction
(in a n y
order) or
in leaf
to root
order.
In
particular,
zt
locks are
not held to
the end
of t r a n s a c t i o n ,
one
s h o u l d not
hold a lower lock after releasing
its ancestors.
To g i v ~
in f i l e
l o c k on
lock
lock
lock
an e x a m p l e u s i n g F i g u r e 3,
a sequential
scan
F n e e d n o t use
a n i n d e x so
o n e can g e t an
e a c h r e c o r d in t h e f i l e by:
data
area
file
base
containing
F
F
with
with
with
mode
mode
mode
of all records
implicit share
= IS
= IS
= S
This gives implicit S mode access
to all r e c o r d s i n F.
to r e a d a r e c o r d in a f i l e via t h e
i n d e x I f o r f i l e F,
g e t a n i m p l i c i t or e x p l i c i t l o c k o n f i l e F:
lock
lock
lock
data base
area containing
index I
R
with
with
with
mode
mode
mode
= IS
= IS
= S
implicit S mode access
to all records
This again gives
both these cases,
~n__II o n ~
9ath was
(in f i l e
F).
In
readinq.
But
one
Conversely,
o n e n e e d not
to insert,
d e l e t e or u p d a t e a r e c o r d
m u s t g e t an i m p l i c i t or e x p l i c i t
lock
in
index I
locked for
R in f i l e F
with index
on all ancestors
of R.
I
The first example of
r e c o r d is o b t a i n e d .
file one can simply
a r e a in X mode.
The
file
without further
implicitly
g r a n t e d in
this section showed how an explicit
X l o c k on a
To
g e t a n i m p l i c i t X l o c k on
a l l r e c o r d s in a
lock the index and file in X
m o d e , o r l o c k the
latter examples
a l l o w b u l k l o a d o r u p d a t e of a
locking since
all
r e c o r d s in
the file
are
X mode.
Proof
of the
of
!~uivalence
lock
protocol.
We will now prove that the
described
lock protocol is equivalent
to
a conventional
one
which uses only two
modes
(S a n d X),
and which
explicitly
locks atomic resources
(the l e a v e s
of a t r e e o r s i n k s of
a DAG).
Let G =
(N,A) be a f i n i t e
( d i r e c t e d a c y c l i c ) ~_raph w h e r e
N is t h e
set of nodes and
A i s t h e s e t of a r c s .
G is
a s s u m e d to b e w i t h o u t
circuits
(i.e. t h e r e
is no n o n - n u l l
path
leading from a node
n to
itselt~.
A n o d e p is a p a r e n t of a n o d e
a a n d n is a c h i l d o f p if
t h e r e is a n a r c f r o m
p t o n.
A node n is a
source
(sink) i f n has
no parents
(no c h i l d r e n ) .
L e t SI b e
th~ s e t o f
sinks of
G.
An
a n c e s t o r of n o d e n is any n o d e
( i n c l u d i n g n) i n a p a t h f r o m a s o u r c e
t o n.
A node-slice
o f a s i n k n is a c o l l e c t i o n
of nodes such that
each path
from a
source to
n contains
at l e a s t
one node
of the
slice.
We a l s o i n t r o d u c e
the set of lock modes M
= [NL, IS, I K , S , S I X , X }
and
the compatibility
matrix C
: MxM->{YES,N3}
described
in
T a b l e I.
Let c : mxm->{YES,NO}
be the r e s t r i c t i o n
of C t o m = [NL, S , X } .
373
Granularity of locks and degrees of consistency
A lock_-qr_a~h is a m a p p i n g L : N->M such that:
(a) if L(n) e {IS,S} then
e i t h e r n is
a s o u r c e or t h e r e
exists a
p a r e n t p of n such that
L(p) e [ I S , I X , S , S I X , X } .
By i n d u c t i o n
there e x i s t s a
path from a source
to n such that
L t a k e s only
v a l u e s in {IS,IX,S,SIX, X] on it.
E q u i v a l e n t l y L is not e q u a l to
NL on the path.
(by if L(n) e {IX,SIX,X] then e i t h e r n
is a root or for all p a r e n t s
pl...pk of n we h a v e L(pi) e {IX, SIX, X} (i=1...k).
By i n d u c t i o n
L t a k e s o n l y v a l u e s in {IX,SIX, X] on all the a n c e s t o r s of n.
The i n t e r p r e t a t i o n
of a l o c k - g r a p h
is t h a t it
gives a map
of the
explicit locks
held by a
particular transaction observing
the six
state lock p r o t o c o l d e s c r i b e d above.
The
n o t i o n of p r o j e c t i o n of a
l o c k - g r a p h is now
i n t r o d u c e d to m o d e l the set of
i m p l i c i t l o c k s on
a t o m i c r e s o u r c e s a c q u i r e d by a t r a n s a c t i o n .
The ~ r o j e c t i o n of a l o c k - g r a p h L is the m a p p i n g I: SI->m c o n s t r u c t e d
as f o l l o w s :
(a) l ( n ) = X
if there e x i s t s
a node-slice
[nl...ns} of n
such that
L(ni) =X for each node in t h e slice.
(b) 1 (n)=S if (a) is not s a t i s f i e d and there e x i s t s an a n c e s t o r a of
n such that L(a) q [S,SIX,X].
(c) I ( n ) = N L if (a) and (b) are not satisfied.
Two
lock - g r a p h s
LI
an d
L2
are
said
to
be
comma tible
C(LI(n) , L 2 ( n ) ) = Y E S for all n e N.
S i m i l a r l y two p r o j e c t i o n s 11
12 are c o m p a t i b l e if c(11 (n),12 (n))=YES for all n e SI.
if
and
Theorem:
If two l o c k - g r a p h s
LI and
11 and 12 are
compatible.
by two t r a n s a c t i o n s do not
i m p l i c i t l y a c q u i r e d do not
L2 are c o m p a t i b l e
then their p r o j e c t i o n s
In o t h e r words if the
e x p l i c i t locks set
conflict
t h e n a l s o the t h r e e - s t a t e l o c k s
conflict.
Proof: Assume
that 11 and
12 are
incompatible.
We want
to p r o v e
that LI
and L2
are i n c o m p a t i b l e .
By d e f i n i t i o n
of c o m p a t i b i l i t y
there must e x i s t
a sink n s u c h
that l l ( n ) = X and 12(n)
e [S,X} (or
vize
versa).
By
definition
of p r o j e c t i o n
there
must
exist
a
n o d e - s l i c e {nl...ns} of n such that L I ( n l ) = . , . = L I (ns)=X.
Also t h e r e
must exist an
a n c e s t o r nO of n
such t h a t L2 (nO) e
[S,SIX,X}. F r o m
the d e f i n i t i o n of l o c k - g r a p h t h e r e is a
path P1 from a source to nO
on w h i c h L2 d o e s not take the v a l u e NL.
If
P 1 intersects
the
node-slice
at
ni
then
i n c o m p a t i b l e s i n c e L l ( n i ) = X w h i c h is
incompatible
v a l u e of L2 (hi).
H e n c e the t h e o r e m is proved.
LI
and
L2
are
with the n o n - n u l l
Alternatively
t h e r e is
a
path P2
from
n0 to
the
sink n
which
i n t e r s e c t s the n o d e - s l i c e at hi.
F r o m the d e f i n f t i o n of l o c k - g r a p h
LI
takes
a value
in
{IX,SIX,X] on
all
ancestors
of
hi.
In
p a r t i c u l a r LI (nO)
e {IX, SiX, X] . S i n c e
L2 (n0) e [S,SIX,X}
we h a v e
C (L1(n0) ,L2 (n0)) =NO.
Q.E.D.
Dynamic lock
qr_a~hs:
Thus far we h a v e p r e t e n d e d that
the lock g r a p h is static.
However,
examination
of
Figure
3 suggests
otherwise.
Areas,
files
and
i n d i c e s are d y n a m i c a l l y c r e a t e d and d e s t r o y e d , and of c o u r s e r e c o r d s
are c o n t i n u a l l y i n s e r t e d ,
updated, and d e l e t e d .
(If
the data D a s e
is only read, t h e n there is no need for l o c k i n g at all.)
374
J.N. Gray, R.A. Lorie, G.R. Putzolu and J.L. Traiger
We
introduce
the
lock
p r o t o c o l for
dynamic
DAG' s
by
example.
C o n s i d e r the
i m p l e m e n t a t i o n of i n d e x
i__nterval locks.
Rather than
b e i n g f o r c e d to lock e n t i r e i n d i c e s or i n d i v i d u a l
r e c o r d s , we would
l i k e to be able to l o c k all
r e c o r d s with a c e r t a i n c o n t i g u o u s r a n g e
of i n d e x values;
for e x a m p l e , l o c k all r e c o r d s in
the bank account
f i l e with the l o c a t i o n f i e l d e q u a l to N a p a .
T h e r e f o r e , t h e index is
p a r t i t i o n e d into l o c k a b l e key v a l u e
intervals.
E a c h i n d e x e d record
"belongs', to a
p a r t i c u l a r i n d e x i n t e r v a l and all r e c o r d s
in a file
with the
same field v a l u e
on an i n d e x e d
field will belong
to the
s a m e key v a l u e
i n t e r v a l (i.e. all Napa a c c o u n t s will
b e l o n g to the
same i n t e r v a l ) .
This new s t r u c t u r e is d e p i c t e d in F i g u r e 4.
In [ I]
s u c h locks
were c a l l e d p r e d i c a t e l o c k s
and and an
a l t e r n a t e (more
g e n e r a l but less e f f i c i e n t ) i m p l e m e n t a t i o n was proposed.
DATA
BASE
I
i
AREAS
l
l
FILE
l
l
I
INDICES
l
l
INDEX
INTERVALS
........
I
I
I
UN-INDEXED
FIELDS
I
I
I
I
I
~ECORD
IDENTIFIERS
F i g u r e 4. The l o c k g r a p h
l
I
I
I
I
I
I
I
I
INDEXED
FIELDS
with index i n t e r v a l
locks.
The only subtle a s p e c t of F i g u r e
4 is the d i c h o t o m y b e t w e e n i n d e x e d
and u n - i n d e x e d
fields.
Since
t h e indexed
field value
and record
i d e n t i f i e r (logical address)
a p p e a r in the index, one
can r e a d the
i n d e x e d field d i r e c t l y (i.e. w i t h o u t
" t o u c h i n g " the record).
Hence
an index
i n t e r v a l is
a parent of
the c o r r e s p o n d i n g
field values.
F u r t h e r , the
index "points,, via
r e c o r d i ~ e a t i f i e r s to
all r e c o r d s
with that value and
so is a p a r e n t of all
such record identifiers.
On t h e o t h e r hand, one c a n read
and update u n - i n d e x e d fields of the
re=ord
without a f f e c t i n g
t h e index
and so
the f i l e
is the
only
p a r e n t of such fields.
When an i n d e x e d field is u p d a t e d ,
it and its r e c o r d i d e n t i f i e r move
from
one index
interval
to another.
For
example,
when a
Napa
a c c o u n t is
moved to the St.
H e l e n a branch, the a c c o u n t
r e c o r d and
its l o c a t i o n field
" l e a v e " t h e N a p a i n t e r v a l of
the l o c a t i o n i n d e x
and "join"
the St.
Helena index
interval.
When
a new
r e c o r d is
i n s e r t e d it "joins" the i n t e r v a l c o n t a i n i n g
the new f i e l d value and
a l s o it
"joins" the
file.
Deletion
rem3ves t h e
r e c o r d from
the
i n d e x i n t e r v a l and from
t h e file.
index is not a
l o c k a n c e s t o r ot
s u c h fields.
Granularity oflocks and degrees o f consistency
375
S i n c e F i g u r e 4 d e f i n e s a DAG, a l b e i t
a d y n a m i c DAG, the p r o t o c o l of
the p r e v i o u s
s e c t i o n can
be used
to lock
the n o d e s
of the
DaG.
However,
the
protocol should
be
extended
as f o l l o w s
to
handle
d y n a m i c c h a n g e s to the lock graph:
(d}
Before moving
a
nods in
the
lock graph,
the
n o d e must
be
i m p l i c i t l y or e x p l i c i t l y
g r a n t e d in X mode in both
its old and
its new
p o s i t i o n in the graph.
F u r t h e r , the n~de must
not be
moved in such a way as to c r e a t e a c y c l e in the graph.
Carrying
out the
example
of this
section, to
move
a Napa
Dank
a c c o u n t to the St. H e l e n a branch:
lock d a t a base
in m o d e = IX
lock a r e a c o n t a i n i n g a c c o u n t s in mode = IX
in m o d e = IX
lock a c c o u n t s file
in mode = IX
lock l o c a t i o n index
in m o d e = IX
lock Napa i n t e r v a l
in mode = IX
lock St. H e l e n a i n t e r v a l
in m o d e = IX
lock r e c o r d
in m o d e = X.
lock f i e l d
get
an i m p l i c i t
lock
on
the field
by
Alternatively,
one c o u l d
r e q u e s t i n g e x p l i c i t X mode l o c k s on the r e c o r d and index i n t e r v a l s .
Schedulinq and ~ran~n~
r~ues~s:
Thus far
we have
d e s c r i b e d the
s e m a n t i c s of
the v a r i o u s
request
m o d e s and h a v e d e s c r i b e d the
p r o t o c o l w h i c h r e q u e s t o r s must f o l l o w .
To c o m p l e t e the d i s c u s s i o n we d i s c u s s how r e q u e s t s are s c h e d u l e d and
granted.
The set
of a l l
r e q u e s t s for a p a r t i c u l a r r e s o u r c e
are kept
in a
queue sorted
by some
fair s c h e d u l e r .
By " f a i r "
we mean
t h a t no
particular
transaction
will
be
delayed
indefinitely.
First-in
first-out
is
the s i m p l e s t
fair
scheduler
and
we adopt
such
a
s c h e d u l e r for t h i s d i s c u s s i o n m o d u l o d e a d l o c k p r e e m p t i o n d e c i s i o n s .
The g r o u p of
m u t u a l l y c o m p a t i b l e r e q u e s t s for
a resource appearing
at t h e
head of the
queue is c a l l e d
the ~ r a n t e d ~ro_u~.
all t h e s e
requests
can
be
granted
concurrently,
assuming
that
each
transaction
has
at
most
one
request
in
the
queue
then
the
c o m p a t i b i l i t y of two r e q u e s t s by d i f f e r e n t t r a n s a c t i o n s d e p e n d s o n l y
on the
modes of
the r e q u e s t s and
may be
c o m p u t e d using
T a b l e I.
Associated
with the
granted g r o u p
is a
Wro_u~ m o d e
w h i c h is
the
s u p r e m u m mode
of the m e m b e r s of
the g r o u p which is
computed using
F i g u r e 2 or T a b l e 3.
T a b l e 2 gives
a list of the p o s s i b l e t y p e s of
r e q u e s t s t h a t can
c o e x i s t in a g r o u p and the
c o r r e s p o n d i n g m o d e of
the group.
T a b l e 2. P o s s i b l e r e q u e s t g r o u p s and t h e i r group mode.
Set b r a c k e t s i n d i c a t e that several such r e q u e s t s may be p r e s e n t .
[
MODES OF
L____RE.Q_B_E_S,TS
I
X
I
I
I
szx, {zs}
s, {s}, [Is}
I ....
Figure 5
i;
IX, ~{IX], [IS}
IS-[I_S}
....
d e p i c t s the queue for
M O D E OF
GR 0 UP
X
SIX
S
IX
IS
a particular
resource,
showing the
AN. Gray, R.A. Lorie, G.R. Putzolu and J.L. Traiger
376
requests
and
t h e i r modes.
The
granted
group consists
of
five
r e q u e s t s and has
g r o u p mode IX.
The
next r e q u e s t in the
q u e u e is
for S m o d e
w h i c h is i n c o m p a t i b l e w i t h
the g r o u p mode IX
and hence
must wait.
* G R A N T E D G R O U P : G R O U P ~ O D E = IX *
t is i--i ix i--i ZSl--i ZSl--i ZSl--,- i s [- i is [- ix i- iZSl- fix i
Figure
5. The q u e u e
of r e q u e s t s for
a resource.
When a new r e q u e s t for a r e s o u r c e arrives, t h e s c h e d u l e r a p p e n d s it
to the end
of the queue.
There
are two c a s e s to
consider: e i t h e r
someone
is a l r e a d y
w a i t i n g or
all o u t s t a n d i n g
r e q u e s t s for
this
r e s o u r c e are g r a n t e d (i.e. no one is w a i t i n g ) .
If no one is waiting
and the new
r e q u e s t is c o m p a t i b l e with the g r a n t e d
group m o d e t h e n
the
new r e q u e s t
can
be g r a n t e d
immediately.
O t h e r w i s e the
new
r e q u e s t must wait its turn in the
queue and in t h e c a s e of d e a d l o c k
it
may
preempt
some
incompatible
requests
in
the
queue.
( A l t e r n a t i v e l y the new
r e q u e s t could be c a n c e l e d .
In
F i g u r e 5 all
the r e q u e s t s d e c i d e d to wait.)
When a p a r t i c u l a r r e q u e s t l e a v e s the
g r a n t e d g r o u p the g r o u p
mode of the group may c h a n g e .
If the mode
of t h e first w a i t i n g r e q u e s t in the queue is c o m p a t i b l e w i t h t h e new
mode of the g r a n t e d group, t h e n
the w a i t i n g r e q u e s t is granted.
In
F i g u r e 5, if
the IX r e q u e s t l e a v e s
the g r o u p , then the
group mode
b e c o m e s IS which is
c o m p a t i b l e w i t h S and so the
S may be granted.
The new g r o u p
mode will be S
and s i n c e t h i s is
c o m p a t i b l e with IS
mode
the IS
request
f o l l o w i n g the
S request
may
also join
the
g r a n t e d group.
T h i s p r o d u c e s the s i t u a t i o n d e p i c t e d in F i g u r e 6:
•
Figure
GRANTED GROUP GROUPMODE = S
IIS I--IISI-IISl--I
ZSl--I Sl-,
6. The q u e u e a f t e r
*
I ZSl--*- I Xl- IIS I- IIX I
the IX r e q u e s t
is r e l e a s e d .
The X r e q u e s t of F i g u r e 6 w i l l not be g r a n t e d u n t i l all the r e q u e s t s
l e a v e the g r a n t e d g r o u p since it is not c o m p a t i b l e with any mode.
conversions:
A
transaction
might
re-request
the
same
resource
for
seve ~-al
reasons: P e r h a p s it has f o r g o t t e n that
it a l r e a d y has access to the
record: after all, if it is s e t t i n g
many l o c k s it may be s i m p l e r to
just a l w a y s
r e q u e s t a c c e s s to the
record r a t h e r than
first a s k i n g
i t s e l f "have I seen t h i s r e c o r d before".
The l o c k s u b s y s t e m has all
the i n f o r m a t i o n
to a n s w e r
t h i s q u e s t i o n and
it s e e m s
w a s t e f u l to
duplicate.
A l t e r n a t i v e l y , t h e t r a n s a c t i o n may k n o w it has a c c e s s to
t h e record, but want to i n c r e a s e its a c c e s s m o d e (for e x a m p l e from S
to X mod ~- if
it is in a read, test, and s o m e t i m e s
u p d a t e scan of a
file).
So the lock s u b s y s t e m must
be p r e p a r e d for r e - r e q u e s t s b y a
t r a n s a c t i o n for a lock.
We c a l l such r e - r e q u e s t s c o n v e r s i o n s .
When a r e q u e s t is
f o u n d to be a c o n v e r s i o n , the
old (granted) m o d e
of the r e g u e s t o r
to t h e r e s o u r c e and the n e w l y
r e q u e s t e d m o d e are
c o m p a r e d using T a b l e 3 to c o m p u t e the new m o d e w h i c h is the s u p r e m u m
of the old and the r e q u e s t e d m o d e ( r e f . F i g u r e 2).
Granu[arity of looks and degrees o f consistency
Table
3. The new mode g i v e n
I
I. . . . I
I IS
I
I IX
I
I S
I
I SIX I
l__X___i
So for e x a m p l e ,
mode is S I X .
IS
IS
IX
S
SIX
X
the
requested
NEW 8ODE
IX
S
IX
S
IX
SIX
SIX
S
SIX
SIX
X
X
if one has IX m o d e
SiX
SIX
SIX
SIX
SIX
X
377
and old mode.
X
X
X
X
X
X
and r e q u e s t s
S mode t h e n t h e new
If the new mode is equal to t h e o l d m o d e (note it is n e v e r l e s s than
the old
mode) then the r e q u e s t
can be g r a n t e d i m m e d i a t e l y
and the
g r a n t e d m o d e is
unchanged.
If t h e new mode is
c o m p a t i b l e with the
g r o u p m o d e of the other m e m b e r s of the g r a n t e d group (a r e q u e s t o r is
~lways
compatible
with himself)
then
again
the r e q u e s t
can
be
g r a n t e d i m m e d i a t e l y . The g r a n t e d mode is
the new mode and t h e g r o u p
mode
is
recomputed
using
T a b l e 2.
In
all
other
cases,
the
requested conversion
must wait
u n t i l the g r o u p
m o d e of
the other
granted requests
is c o m p a t i b l e w i t h t h e
new mode.
Note
that t h i s
i m m e d i a t e g r a n t i n g of
conversions over waiting requests
is a m i n o r
v i o l a t i o n of fair s c h e d u l i n g .
If two c o n v e r s i o n s
are waiting, e a c h of w h i c h
is i n c o m p a t i b l e with
an a l r e a d y g r a n t e d r e q u e s t of the o t h e r t r a n s a c t i o a , then a d e a d l o c k
e x i s t s and
the already
granted access
of one
must be
preempted.
3therwise
t h e r e is
a way
of s c h e d u l i n g
the w a i t i n g
conversions:
namely, g r a n t
a conversion
w h e n it
is c o m p a t i b l e
with all
other
granted modes
in the
g r a n t e d group.
(Since
there is
no d e a d l o c k
c y c l e t h i s is a l w a y s possible.)
The f o l l o w i n g e x a m p l e may h e l p to c l a r i f y
queue for a p a r t i c u l a r r e s o u r c e is:
G R O U P M O D E = is
IISI---IIS
these
points.
Suppose
the
*
I
Figure 7. A s i m p l e queue.
Now s u p p o s e
the first t r a n s a c t i o n w a n t s
to c o n v e r t to X mode.
It
m u s t wait
for the
s e c o n d (already
granted) request
to l e a v e t h e
queue.
If it d e c i d e s to wait t h e n t h e s i t a a t i o n becomes:
G R O U P M O D E = IS
IIS<-XI---IISI-
Figure
8. ~ c o n v e r s i o n
to X m o d e waits.
NO new
r e q u e s t may
e n t e r the g r a n t e d
group since
t h e r e is
now a
conversion request
waiting.
In g e n e r a l , c o n v e r s i o n s
are s c h e d u l e d
b e f o r e new requests.
If the s e c o n d
t r a n s a c t i o n now c o n v e r t s to I X ,
SIX, or
S mode it
may be g r a n t e d
immediately since this
d o e s not
c o n f l i c t with the ~_ranted (IS) m o d e
of the f i r s t t r a n s a c t i o n .
When
the
second
transaction
eventually leaves
the
queue,
the
first
c o n v e r s i o n can be made:
J.N. Gray, R.A. Lorie, G.R. Putzolu and J.L. Traiger
378
G R O U P M O D E = IS
IIXl . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Figure
9. One t r a n s a c t i o n
l e a v e s and
H o w e v e r , if
the s e c o n d
transaction
m o d e one o b t a i n s the queue:
the c o n v e r s i o n
tries
to convert
is granted.
to e x c l u s i v e
G R O U P Z O D E = IS
IIS<-XI-~-WIS<-XI
Figure
10. T w o c o n f l i c t i n g
conversions
are
waiting.
Since
X is
incompatible
with IS
(see
Table
I), this
situation
i m p l i e s that each t r a n s a c t i o n is w a i t i n g
for the o t h e r to l e a v e the
queue (i.e. deadlock) and so one
transaction mu~
be p r e e m p t e d .
In
all
other c a s e s
(i.e. when
no c y c l e
exists)
there is
a way
to
schedule
the
c o n v e r s i o n s so
that
no
already granted
access
is
violated.
Deadlock
and lock t h r a s h i n q :
W h e n e v e r a t r a n s a c t i o n waits for a
r e q u e s t to be granted,
it r u n s
the risk of
w a i t i n g f o r e v e r in a d e a d l o c k c y c l e .
For the p u r p o s e s
of d e a d l o c k
d e t e c t i o n it is
i m p o r t a n t to
k n o w who is
w a i t i n g for
whom.
The r e q u e s t
queues
give
this information.
Consider
any
w a i t i n g r e q u e s t R by t r a n s a c t i o n T.
There
are two cases: If R is a
conversion, r
is W A I T I N G _ F O R all t r a n s a c t i o n s
granted incompatible
r e q u e s t s to the queue.
If R is not a c o n v e r s i o n ,
r is W A I T I N G FOR
all t r a n s a c t i o n s
a h e a d of it
in the
q u e u e g r a n t e d or
waiting f o r
incompatible requests.
G i v e n this W A I T I N G _ F O R r e l a t i o n c o m p u t e d for
all
waiting t r a n s a c t i o n s ,
there
is no
deadlock
if
and o n l y
if
W A I T I N G _ F O R is a c y c l i c .
The W A I T I N G FOR
r e l a t i o n may c h a n g e
whenever a request
or r e l e a s e
o c c u r s and when a c o n v e r s i o n is
granted.
If a t r a n s a c t i o n may wait
for at most one r e q u e s t at a
time, then the d e a d l o c k s t a t e can only
c h a n g e when
some process
decides to
wait.
In
this s p e c i a l
case
( s y n c h r o n o u s c a l l s to l o c k system) , o n l y w a i t s r e q u i r e recomputation
of the
WAITING_FOR relation.
If
d e a d l o c k is
improbable, deadlock
t e s t i n g can be
done p e r i o d i c a l l y r a t h e r than on
each wait, f u r t h e r
r e d u c i n g c o m p u t a t i o n a l overhead.
One new
r e q u e s t may form
many c y c l e s and
e a c h such c y c l e
must be
broken.
When a c y c l e is detected, to break the
c y c l e some g r a n t e d
or w a i t i n g
request must
be p r e e m p t e d .
The l o c k
scheduler should
c h o o s e a m i n i m a l c o s t set of v i c t i m s
to p r e e m p t , so that all c y c l e s
are
broken, u n d o
all the
c h a n g e s to
the
d a t a base
m a d e by
the
victims since the p r e e m p t e d r e s o u r c e s w e r e g r a n t e d , and t h e n p r e e m p t
the r e s o u r c e and s i g n a l the v i c t i m s that t h e y h a v e been b a c k e d up.
The issues d i s c u s s e d so f a r - - l o c k s c h e d u l i n g , d e t e c t i n g and b r e a k i n g
deadlocks--are
low
level
scheduling
decisions.
They
must
be
c o n n e c t e d with
a high level
transaction scheduler
which regulates
the
load on
the s y s t e m
and r e g u l a t e s
the e n t r y
and p r o g r e s s
of
transactions
to p r e v e n t
long waits,
high
p r o b a b i l i t y o~
waiting
Granularity oflocks and degrees o f consistency
379
(lock
thrashing),
and
deadlock.
By
analogy,
a page
management
s y s t e m with only
a low l e v e l p a g e f r a m e
scheduler, which allocates
and p r e e m p t s page f r a m e s in a f a i r l y n a i v e way, is likely to produce
page thrashing
u n l e s s it
is c o u p l e d with
a working
set s c h e d u l e r
which r e g u l a t e s the n u m b e r and
c h a r a c t e r of p r o c e s s e s c o m p e t i n g f o r
p a g e frames.
II.
DEGREES
OF CONSISTENCY:
We now focus on how l o c k s can
be used to c o n s t r u c t t r a n s a c t i o n s out
of a t o m i c
actions.
The
data base consists
of e n t i t i e s
which are
r e l a t e d in c e r t a i n ways.
T h e s e r e l a t i o n s h i p s are best t h o u g h t o f as
a s s e r t i o n s about the data.
E x a m p l e s of s u c h a s s e r t i o n s are:
'Names is an index for T e l e p h o n e n u m b e r s . '
'The
value
of C o u n t of x g i v e s
the
number
of
employees
in
d e p a r t m e n t x.'
The data
base is
s a i d to
be c o n s i s t e n t
if it
satisfies all
its
assertions
[I].
In
some
cases,
the
data
base
must
become
temporarily
inconsistent
in
order
to
transform
it
to
a
new
consistent
state.
For
example,
adding
a new
employee
involves
s e v e r a l a t o m i c a c t i o n s and the u p d a t i n g of s e v e r a l fields.
T h e data
base
may
be
inconsistent
until
all
these
updates
have
been
completed.
To
c o p e with
these t e m p o r a r y i n c o n s i s t e n c i e s , s e q u e n c e s
of a t o m i c
actions
are g r o u p e d
to form
transactions.
Transactions are
the
u n i t s of
consistency.
T h e y are l a r g e r
a t o m i c a c t i o n s on
the data
base
which
transform
it
from
one
consistent
state
to
a new
consistent
state.
Transactions
preserve
consistency.
If
some
action
of
a
transaction
f a i l s then
the
entire
transaction
is
'undone'
thereby returning
t h e data
base to
a consistent
state.
T h u s t r a n s a c t i o n s are a l s o the u n i t s of r e c o v e r y .
Hardware failure,
s y s t e m error, d e a d l o c k , p r o t e c t i o n v i o l a t i o n s
and p r o g r a m e r r o r are
e a c h a s o u r c e of such failure.
If t r a n s a c t i o n s are run one a% a time then each t r a n s a c t i o n w i l l see
the c o n s i s t e n t state left b e h i n d by i t s p r e d e c e s s o r .
But if s e v e r a l
t r a n s a c t i o n s are s c h e d u l e d c o n c u r r e n t l y t h e n
locking is r e q u i r e d to
i n s u r e that the i n p u t s to e a c h t r a n s a c t i o n are c o n s i s t e n t .
Responsibility
for r e q u e s t i n g
and r e l e a s i n g
locks
can either
be
a s s u m e d by the user or be
d e l e g a t e d to the system.
User c o n t r o l l e d
locking
results
in
potentially
f e w e r locks
due
to
the
user's
k n o w l e d g e of
the s e m a n t i c s o f
the data.
O n the o t h e r
hand, user
controlled
locking r e q u i r e s
difficult
and p o t e n t i a l l y
unreliable
application programming.
H e n c e t h e a p p r o a c h t a k e n by some data base
s y s t e m s is to
use a u t o m a t i c l o c k p r o t o c o l s
which i n s u r e p r o t e c t i o n
from g e n e r a l types of i n c o n s i s t e n c y , w h i l e still r e l y i n g on the user
to p r o t e c t
himself against other
s o u r c e s of
inconsistencies.
For
example, a
system may
a u t o m a t i c a l l y lock
updated records
b u t not
r e c o r d s which are read.
Such a s y s t e m p r e v e n t s l o s t u p d a t e s a r i s i n g
from t r a n s a c t i o n
backup.
Still,
t h e user
should e x p l i c i t l y
lock
r e c o r d s in a r e a d - u p d a t e S e q u e n c e to i n s u r e that the r e a d v a l u e does
not c h a n g e
b e f o r e the
actual update.
In o t h e r
words, a
user is
g u a r a n t e e d a limited
a u t o m a t i c d e s r e e of c o n s i s t e n c y .
This d e g r e e
of c o n s i s t e n c y may be s y s t e m wide
or the system may p r o v i d e o p t i o n s
to s e l e c t it (for i n s t a n c e a l o c k
p r o t o c o l may be a s s o c i a t e d with a
t r a n s a c t i o n or with an e n t i t y ) .
J.N. Gray, R.A. Lorie, G.R. Putzolu and J.L. Traiger
380
we now
present several e~uivalent
d e f i n i t i o n s of
four consistency
degrees.
The first
d e f i n i t i o n is an o p e r a t i o n a l
and i n t u i t i v e one
useful
in d e s c r i b i n g
the
system behavior
to
users.
The
second
d e f i n i t i o n is
a p r o c e d u r a l one
in t e r m s
of lock protocols,
it is
useful
in
explaining
the
system
implementation.
The
third
d e f i n i t i o n is
in terms
of a t r a c e of
the s y s t e m
a c t i o n s , it
is
u s e f u l in
f o r m a l l y stating
and p r o v i n g
p r o p e r t i e s of
the v a r i o u s
c o n s i s t e n c y degrees.
Informal
definition
of c o n s i s t e n c i : ~
An o u t p u t (write) of a t r a n s a c t i o n is c o m m i t t e d when the t r a n s a c t i o n
a b d i c a t e s the right to 'undo' the w r i t e t h e r e b y making the new value
available
to
all
other
transactions.
Outputs
are
said
to
be
u n c o m m i t t e d or
d i r t y if they are
not yet c o m m i t t e d by
the writer.
Concurrent
e x e c u t i o n raises
the p r o b l e m
that
reading or
writing
other t r a n s a c t i o n s ' d i r t y data m a y y i e l d
i n c o n s i s t e n t data.
Using t h i s n o t i o n
defined as:
Definition
of dirty data,
the
degrees
of
consistency
may
be
1:
Degree 3: T r a n s a c t i o n T sees d e q r e e 3 c o n s i s t e n c y if:
(a) T does not o v e r w r i t e d i r t y data of other t r a n s a c t i o n s .
(b) T does not commit any writes
u n t i l it c o m p l e t e s all its w r i t e s
(i.e. u n t i l the end of t r a n s a c t i o n (EOT)).
(z) T does not read dirty data from other t r a n s a c t i o n s .
(d) O t h e r
t r a n s a c t i o n s do not
d i r t y any data
read by T b e f o r e T
completes.
Degree
(a) T
(b) T
(c) T
2: T r a n s a c t i o n T sees d e q r e e 2 c o n s i s t e n ~ x if:
d o e s not o v e r w r i t e dirty data of other t r a n s a c t i o n s .
does not c o m m i t a n y writes b e f o r e EOT.
d o e s not read dirty data of o t h e r t r a n s a c t i o n s .
Degree I: T r a n s a c t i o n T sees d e q r e e I c o n s i s t e n c y if:
(a) T does not o v e r w r i t e dirty data of other t r a n s a c t i o n s .
(b) T does not c o m m i t any w r i t e s b e f o r e EOT.
Degree 0: T r a n s a c t i o n T sees d e q r e e 0 c o n s i s t e n c y if:
(a) T d o e s n o t o v e r w r i t e dirty data o f o t h e r t r a n s a c t i o n s .
Note that if a t r a n s a c t i o n sees a high
also sees all the l o w e r degrees.
degree of c o n s i s t e n c y
then it
Degree 0 consistent transactions
commit writes
before the
end of
transaction.
Hence backing up a d e g r e e 0 c o n s i s t e n t t r a n s a c t i o n may
require
undoing
an
update
to
an
entity
locked
by
another
transaction.
In
this
sense,
degree
0
transactions
are
unrecoverable.
Degree 1 t r a n s a c t i o n s do not
committ writes
until the end
of the
transaction.
Hence one
may undo (back up) an
in-progress degree I
transaction
without
setting
a d d i t i o n a l locks.
This
means
that
t r a n s a c t i o n b a c k u p does not e r a s e o t h e r t r a n s a c t i o n s ' updates.
This
is the
p r i n c i p a l reason
one data
m a n a g e m e n t system
automatically
p r o v i d e s d e g r e e I c o n s i s t e n c y to all t r a n s a c t i o n s .
Degree
2
consistency
isolates
a transaction
from the
uncommitted
Granulari~ of locks and degrees of consistency
381
data of other t r a n s a c t i o n s .
With d e g r e e I c o n s i s t e n c y a t r a n s a c t i o n
m i g h t read u n c o m m i t t e d v a l u e s which
are s u b s e q u e n t l y u p d a t e d or are
undone.
In degree 2 no dirty d a t a values are read.
Degree
3 consistency
isolates
the
transaction
from
dirty
r e l a t i o n s h i p s among values.
R e a d s are r _ ~ e a t a b l e .
For
example, a
degree 2 consistent
t r a n s a c t i o n may read two
d i f f e r e n t (committed)
values
if it
reads
the
same e n t i t y
twice.
This
is b e c a u s e
a
t r a n s a c t i o n which u p d a t e s the e n t i t y could
begin, u p d a t e and end in
t h e i n t e r v a l of time b e t w e e n the two reads.
More e l a b o r a t e k i n d s of
a n o m a l i e s due to
c o n c u r r e n c y are p o s s i b l e if one
u p d a t e s an e n t i t y
a f t e r r e a d i n g it or if more than one e n t i t y is i n v o l v e d (see e x a m p l e
below).
Degree
3 consistency
completely isolates
the t r a n s a c t i o n
f r o m i n c o n s i s t e n c i e s due to c o n c u r r e n c y [ I].
Each t r a n s a c t i o n can e l e c t the
d e g r e e of c o n s i s t e n c y a p p r o p r i a t e to
its function.
When the t h i r d d e f i n i t i o n is g i v e n we will be able to
s t a t e the c o n s i s t e n c y and r e c o v e r y p r o p e r t i e s
of such a s y s t e m more
formally.
Briefly:
If one
elects d e g r e e i c o n s i s t e n c y then
consistent
s t a t e (so
l o n g as
all o t h e r
least degree 0 c o n s i s t e n t )
one sees a
degree i
t r a n s a c t i o n s run
at
If all
t r a n s a c t i o n s run at
least degree I consistent, system
b a c k u p (undoing all i n - p r o g r e s s
t r a n s a c t i o n s ) l o s e s no u p d a t e s
of c o m p l e t e d t r a n s a c t i o n s .
If
all
transactions
transaction
backup
produces a consistent
run
at
least
degree
2
consistent,
(undoing
any
in-progress
transaction)
state.
To
give an
example
which demonstrates
the
a p p l i c a t i o n of
these
s e v e r a l d e g r e e s of c o n s i s t e n c y , i m a g i n e
a p r o c e s s c o n t r o l system in
which
some
transaction
is
dedicated
to
reading
a gauge
and
periodically
w r i t i n g batches
of v a l u e s
into a
list.
Each
gauge
reading
is an
individual entity.
For
performance reasons,
this
t r a n s a c t i o n sees d e g r e e 0 c o n s i s t e n c y , c o m m i t t i n g all g a u g e r e a d i n g s
as
s o o n as
they
e n t e r the
d a t a base.
This
t r a n s a c t i o n is
not
recoverable
(can't
be
undone).
A
second
transaction
is
run
p e r i o d i c a l l y which r e a d s
all t h e r e c e n t g a u g e
readings, computes a
m e a n and
v a r i a n c e and w r i t e s these
c o m p u t e d v a l u e s as
e n t i t i e s in
t h e d a t a base.
S i n c e we want t h e s e two v a l u e s to be c o n s i s t e n t with
o n e a n o t h e r , they must be c o m m i t t e d t o g e t h e r (i.e. one cannot c o m m i t
the f i r s t
before the second
is w r i t t e n ) .
This
allows transaction
u n d o in the
case t h a t it a b o r t s
after writing only one
of the two
values.
Hence
this
statistical summary
transaction
should
see
d e g r e e I.
A third t r a n s a c t i o n which r e a d s the mean and w r i t e s it on
a d i s p l a y sees degree 2 c o n s i s t e n c y .
It
w i l l not read a mean which
m i g h t be 'undone' by a b a c k u p .
~ n o t h e r t r a n s a c t i o n w h i c h r e a d s both
the mean
and the v a r i a n c e must
see degree 3 consistency
to insure
t h a t the
mean and v a r i a n c e d e r i v e
from the same
c o m p u t a t i o n (i.e.
th~ same run which w r o t e the mean
a l s o w r o t e the variance).
Lock
protocol
definition
of consistenc_z:
W h e t h e r an i n s t a n t i a t i o n of
a t r a n s a c t i o n s e e s d e g r e e o, I,
2 or 3
consistency
depends
on
the
actions
of
other
concurrent
transactions.
Lock p r o t o c o l s are used by a t r a n s a c t i o n to g u a r a n t e e
itself a certain
d e g r e e of c o n s i s t e n c y i n d e p e n d e n t
of the b e h a v i o r
o f o t h e r t r a n s a c t i o n s (so long as
all t r a n s a c t i o n s at l e a s t o b s e r v e
11%'. Gray, R.A. Lorie, G.R. Putzolu and J.L. Traiger
382
the d e g r e e 0 protocol).
The d e g r e e s of
c o n s i s t e n c y can be p r o c e d u r a l l y d e f i a e d
by the l o c k
p r o t o c o l s which
p r o d u c e them.
A transaction
l o c k s its
i n p u t s to
guarantee their
c o n s i s t e n c y and l o c k s its
o u t p u t s to mark
them as
dirty (uncommitted).
For this s e c t i o n ,
l o c k s are d i c h o t o m i z e d as s h a r e
m o d e l o c k s which
allow multiple readers
of the s a m e e n t i t y and
e x c l u s i v e m o d e locks
which r e s e r v e
e x c l u s i v e access
to an
entity.
(This
is the
"two
mode" l o c k protocol.
Its g e n e r a l i z a t i o n to the
"six m o d e " p r o t o c o l
of
the p r e v i o u s
section should
be
obvious.)
Locks
may a l s o
be
c h a r a c t e r i z e d by
t h e i r d u r a t i o n : l o c k s held
for the d u r a t i o n
of a
single action
are c a l l e d s h o r t d u r a t i o n
l o c k s while l o c k s
held to
the end
of the t r a n s a c t i o n are
called lonq duration
locks.
Short
duration locks
are u s e d
to m a r k
or test
for dirty
data f o r
the
duration
of
an
action
rather
than
for
the
duration
of
the
transaction.
The
lock
p r o t o c o l s are:
Definition
2:
D e g r e e 3: t r a n s a c t i o n T o b s e r v e s d e ~ r e e 3 lock p r o t o c o l if:
(a) T s e t s a long e x c l u s i v e lock on any data it d i r t i e s .
(b) T s e t s a long share lock on a n y d a t a it reads.
D e g r e e 2: t r a n s a c t i o n T o b s e r v e s d e ~ r e e 2 lock ~ r o t o c o l if:
(a) T s e t s a long e x c l u s i v e lock on any d a t a it dirties.
(b) T sets a (possibly short) s h a r e lock on any data it reads.
D e g r e e I: t r a n s a c t i o n T o b s e r v e s de__gree I lock P [ ~ E ~ !
if:
(a) T sets a long e x c l u s i v e lock on any data it dirties.
D e g r e e 3: t r a n s a c t i o n T o b s e r v e s d e g r e e 0 lock p r o t o c o l
(a)
T sets
a
(possibly
short) e x c l u s i v e
lock
on
dirties.
if:
a n y data
it
The lock
p r o t o c o l d e f i n i t i o n s can be
stated m o r e t e r s e l y
w i t h the
introduction
of the f o l l o w i n g
notation.
A transaction
is
~!
formed with respect
to w r i t e s ([ead____ss) if it a l w a y s
l o c k s an e n t i t y
in e x c l u s i v e
(shared or
exclusive) mode
before writing
(reading)
it.
The t r a n s a c t i o n
is
well formed
if it
is
well f o r m e d
with
r e s p e c t to r e a d s and writes.
A t r a n s a c t i o n is _two p h a s e (with r_e_spect to r e a d s or _updates) if it
does not
(share or exclusive) l o c k
an e n t i t y after
unlocking some
entity.
A two p h a s e t r a n s a c t i o n has a g r o w i n g phase d u r i n g which it
acquires
locks
and a s h r i n k i n g
phase
d u r i n g which
it
releases
locks.
D e f i n i t i o n 2 is
too r e s t r i c t i v e in t h e s e n s e
that c o n s i s t e n c y will
not r e q u i r e t h a t a
t r a n s a c t i o n hold all l o c k s to the
EOT (i.e. the
EOT
is
the
shrinking
phase) . R a t h e r ,
the
constraint
that
the
t r a n s a c t i o n be two phase is a d e q u a t e
to i n s u r e c o n s i s t e n c y .
On the
o t h e r hand,
o n c e a t r a n s a c t i o n u n l o c k s an
u p d a t e d entity,
it has
committed
that entity
and so
c a n n o t be
undone without
cascading
backup
to any
t r a n s a c t i o n s which
m a y have
subsequently read
the
entity.
For that reason, the s h r i n k i n g p h a s e is u s u a l l y d e f e r r e d to
the
end
of
the
transaction;
thus,
the
transaction
is
always
recoverable
and
all
u p d a t e s are
committed
together.
The
lock
p r o t o c o l s can be r e d e f i n e d as:
Gramtlarity of looks and degrees of consistency
383
D e f i n i t i o n 3!:
Degree
3: T is well f o r m e d
and T is two phase.
Degree
2: T is well f o r m e d
and T is two p h a s e with r e s p e c t
Degree
Degree
to writes.
1: T is well f o r m e d with r e s p e c t to w r i t e s
and T is two p h a s e with r e s p e c t to writes.
0: T is well
f o r m e d w i t h r e s p e c t to writes.
All
transactions
are r e ~ i r e d
p r o t o c o l so
that t h e y
do not
others.
Degrees
I, 2 and 3
consistency.
Consisten~
to
observe
the d e g r e e
0 locking
updates
of
u p d a t e the
uncommitted
syst e s - g u a r a n t eed
provide increasing
of s c h e d u l e s :
The d e f i n i t i o n of what it m e a n s for a t r a n s a c t i o n to see a degree of
c o n s i s t e n c y was given in t e r m s of d i r t y
data.
In o r d e r to make the
notion
of dirty
data
e x p l i c i t it
is
necessary
to c o n s i d e r
the
e x e c u t i o n of a t r a n s a c t i o n in t h e c o n t e x t of a
set of c o n c u r r e n t l y
executing transactions.
To do
this we i n t r o d u c e
the n o t i o n
of a
s c h e d u l e for a set of t r a n s a c t i o n s .
A s c h e d u l e can be t h o u g h t of as
a history
or audit
trail of the
actions performed
by the
set of
transactions.
Given
a s c h e d u l e the
n o t i o n of a
particular entity
b e i n g d i r t i e d by a p a r t i c u l a r t r a n s a c t i o n is made e x p l i c i t and h e n c e
the n o t i o n of s e e i n g a c e r t a i n
d e g r e e of c o n s i s t e n c y is f o r m a l i z e d .
T h e s e n o t i o n s may t h e n be Used to c o n n e c t the v a r i o u s d e f i n i t i o n s of
c o n s i s t e n c y and show t h e i r e q u i v a l e n c e .
The
system directly
supports entities
and
actions.
Actions
are
categorized
as b e q ! n
actions,
en_dd actions, s h a r e
lock
actions,
exclusive
lock actions,
unlock actions,
read
a c t i o n s , and
write
actions.
An end action is p r e s u m e d to
u n l o c k any l o c k s held by the
t r a n s a c t i o n but not e x p l i c i t l y u n l o c k e d by the t r a n s a c t i o n .
F o r the
p u r p o s e s of the f o l l o w i n g d e f i n i t i o n s ,
s h a r e lock a c t i o n s and t h e i r
c o r r e s p o n d i n g u n l o c k a c t i o n s are a d d i t i o n a l l y
c o n s i d e r e d to be read
a c t i o n s and
e x c l u s i v e lock a c t i o n s
and t h e i r
corresponding unlock
a c t i o n s are a d d i t i o n a l l y c o n s i d e r e d to be w r i t e a c t i o n s .
A transaction
is any
a c t i o n and e n d i n g w i t h
or end actions.
s e q u e n c e of
actions beginning
with a b e g i n
an end
a c t i o n and not c o n t a i n i n g o t h e r b e g i n
Any
(sequence
preserving) merging
t r a n s a c t i o n s into a s i n g l e s e q u e n c e
of t r a n s a c t i o n s .
of
the
is c a l l e d
actions
of a set
a s c h e d u l e for the
of
set
A s c h e d u l e is a h i s t o r y of the
o r d e r in w h i c h a c t i o n s were e x e c u t e d
(it does not
record a c t i o n s which were u n d o n e due
to backup).
The
s i m p l e s t s c h e d u l e s run
all a c t i o n s of one t r a n s a c t i o n
and t h e n all
a c t i o n s of
another transaction,...
Such o n e - t r a n s a c t i o n - a t - a - t i m e
s c h e d u l e s are c a l l e d
s e r i a l b e c a u s e they h a v e
no c o n c u r r e n c y among
transactions.
Clearly, a s e r i a l s c h e d u l e has no c o n c u r r e n c y induced
i n c o n s i s t e n c y and no t r a n s a c t i o n sees d i r t y data.
Locking constrains the
s c h e d u l e is le_~a!l o n l y
set of a l l o w e d schedules.
if it does not s c h e d u l e a
In p a r t i c u l a r , a
lock a c t i o n on an
384
J.N. Gray, R.A. Lorie, G.R. Putzolu and J.L. Traiger
e n t i t y for
one t r a n s a c t i o n
when t h a t e n t i t y
some o t h e r t r a n s a c t i o n in a c o n f l i c t i n g mode.
is a l r e a d y
locked
by
An
initial state
and
a schedule
completely
d e f i n e the
system's
behavior.
At each s t e p of the
s c h e d u l e one can d e d u c e w h i c h entity
v a l u e s h a v e been c o m m i t t e d and w h i c h
are dirty: it l o c k i n g is used,
u p d a t e d d a t a is dirty until it is unlocked.
Since a
s c h e d u l e m a k e s the d e f i n i t i o n
of d ~ r t y data
can a p p l y D e f i n i t i o n 1 to d e f i n e c o n s i s t e n t s c h e d u l e s :
Definition
explicit,
one
3:
A t r a n s a c t i o n runs at ~ E ~ 9
~ (!, 2 or 3) c___onsistency in s c h e d u l e S
if T
sees degree
0 (I,
2 or
3) c o n s i s t e n c y
in S.
(Conversely,
t r a n s a c t i o n T sees degree i c o n s i s t e n c y if all legal s c h e d u l e s run T
at d e g r e e i c o n s i s t e n c y . )
If
all t r a n s a c t i o n s
run
at d e g r e e
0 (1,2
schedule S then
S is said to be
a _de_qree ~
schedule.
Given
these
Assertion
(a)
(b)
definitions
or
3) c o n s i s t e n c y
in
(I, 2 or 3) c o n s i s t e n t
one c a n show:
I:
If
each transaction
o b s e r v e s the
degree 0
(1, 2 or 3)
lock
p r o t o c o l (Definition 2) t h e n any legal s c h e d u l e is d e g r e e 0 (1,
2 or 3)
c o n s i s t e n t ( D e f i n i t i o n 3) (i.e,
each t r a n s a c t l o n sees
degree 0
(1, 2 or
3) c o n s i s t e n c y
in the s e n s e
of D e f i n i t i o n
1).
Unless transaction
T observes
the d e g r e e
I
(2 or 3)
lock
p r o t o c o l then it
is p o s s i b l e to d e f i n e
a n o t h e r t r a n s a c t i o n T'
w h i c h does
observe the d e g r e e
I (2 or 3) lock
protocol such
that T
and T' h a v e a l e g a l s c h e d u l e S
but T d o e s not
run at
d e g r e e 1 (2 or 3) c o n s i s t e n c y in S.
In
[ 1]
argument
we proved
Assertion
generalizes directly
I for
degree
to t h i s result.
3
consistency.
That
Assertion 1
says that if a
t r a n s a c t i o n o b s e r v e s the
lock p r o t o c o l
d e f i n i t i o n of c o n s i s t e n c y
( D e f i n i t i o n 2) then it is
a s s u r e d of the
i n f o r m a l d=_finition of c o n s i s t e n c y b a s e d on c o m m i t t e d and d i r t y ,data
(Definition
1).
Unless
a transaction
actually
sets
the
locks
prescribed
by degree
1
(2 or
3)
consistency
one can
construct
t r a n s a c t i o n m i x e s and s c h e d u l e s which
will c a u s e the t r a n s a c t i o n te
run at (see) a lower degree
of c o n s i s t e n c y .
However, in p a r t i c u l a r
c a s e s such t r a n s a c t i o n m i x e s may n e v e r occur due to the s t r u c t u r e or
use
of the
system.
In
these c a s e s
an a p p a r e n t l y
low d e g r e e
of
l o c k i n g may a c t u a l l y
provide degree 3 consistency.
For example, a
data base r e o r g a n i z a t i o n u s u a l l y need do
no l o c k l n g s i n c e it is run
as an
o f f - l i n e u t i l i t y w h i c h is
n e v e r run c o n c u r r e n t l y
with other
transactions.
Assertion
2:
If e a c h t r a n s a c t i o n
in a
d e g r e e 0 lock p r o t o c o l and
or 3)
lock p r o t o c o l then
(Definitions
1,
3)
in
transactions.
set of t r a n s a c t i o n s at
least o b s e r v e s the
if t r a n s a c t i o n T o b s e r v e s the d e g r e e I (2
T r u n s at d e g r e e
1 (2 or
3) c o n s i s t e n c y
any
legal
schedule
for
the
set
of
Granularity of locks and degrees of consistency
385
A s s e r t i o n 2 says t h a t
each transaction
can c h o o s e
its d e g r e e
of
c o n s i s t e n c y so
long as all t r a n s a c t i o n s
o b s e r v e at l e a s t
aegree 0
protocols.
Of
c o u r s e the o u t p u t s of
degree 0, I or
2
consistent
t r a n s a c t i o n s may be d e g r e e 0, I
or 2 c o n s i s t e n t (i.e. i n c o n s i s t e n t )
because
they were
c o m p u t e d with
potentially inconsistent
inputs.
3ne can i m a g i n e that
each data e n t i t y is t a g g e d with
t h e d e g r e e of
c o n s i s t e n c y of its
writer:
Degree 0 entities are
purple, d e g r e e I
e n t i t i e s are red, d e g r e e 2 e n t i t i e s are y e l l o w and degree 3 e n t l t i e s
are green.
The c o l o r of the o u t p u t s of a t r a n s a c t i o n is the m i n i m u m
of the t r a n s a c t i o n ' s
c o l o r and the c o l o r s of the
e n t i t i e s it r e a d s
( b e c a u s e they
are p o t e n t i a l l y i n c o n s i s t e n t ) .
Gradually
the s y s t e m
will turn p u r p l e or
red u n l e s s e v e r y o n e r u n s with a
high d e g r e e of
consistency.
If t h e t r a n s a c t i o n 's a u t h o r
k n o w s s o m e t h i n g about t h e
systems structure
w h i c h allows
an a p p a r e n t l y
degree 1 consistent
protocol
to produce
degree 3 c o n s i s t e n t r e s u l t s
then this
color
c o d i n g is p e s s i m i s t i c .
But, in g e n e r a l a t r a n s a c t i o n m u s t b e w a r e of
reading entities
t a g g e d with d e g r e e s l o w e r
than the d e g r e e
of the
transaction.
DeRendencies
amonq
transactions:
9ne t r a n s a c t i o n is said to depeP_~d on a n o t h e r if the first t a k e s s o m e
of its inputs from the second.
Thenotion
of d e p e n d e n c y is d e f i n e d
differently
for
each
degree
of
consistency.
These
dependency
r e l a t i o n s are c o m p l e t e l y d e f i n e d by a
s c h e d u l e and can be u s e f u l in
~ i s c u s s i n g c o n s i s t e n c y and r e c o v e r y .
Each s c h e d u l e d e f i n e s three
r e l a t i o n s : <, << and <<< on
the set of
t r a n s a c t i o n s as follows.
S u p p o s e that t r a n s a c t i o n T p e r f o r m s a c t i o n
a on entity e
at s o m e step in the s c h e d u l e
and that t r a n s a c t i o n T'
p e r f o r m s action
a' on
entity e at
a later
s t e p in
the schedule.
F u r t h e r s u p p o s e that T d o e s not e q u a l T'.
Then:
T <<<
T'
if
or
or
a is a write a c t i o n and a' is a w r i t e action
a is a w r i t e action and a' is a r e a d
action
a is a read
a c t i o n and a' is a w r i t e action
T << T'
if
or
a is a w r i t e action and a' is a w r i t e a c t i o n
a is a w r i t e action and a' is a read
action
T < T'
if
a is a write a c t i o n and
a' is a w r i t e a c t i o n
So degree I d o e s not care a b o u t
read d e p e n d e n c i e s at all.
Degree 2
c a r e s o n l y about one kind of
read dependency.
And degree 3 ignores
only r e a d - r e a d d e p e n d e n c i e s (reads c o m m u t e ) .
The f o l l o w i n g table is
n o t a t i o n a l l y c o n v e n i e n t way of s e e i n g t h e s e d e f i n i t i o n s :
<<<
:
W->W
I
W->R
<<
:
W->W
"I
W ->R
<
:
W ->W
I
R->W
m e a n i n g that (for e x a m p l e ) T
<<< T'
if T w r i t e s
read (R) by T'
or w r i t t e n (W) by T' or T
reads
w r i t t e n (W) by T'.
Let <~ be the t r a n s i t i v e c l o s u r e
BEFOBEI (T) = {T'I T' <~ T}
AFTERI (T} = {T' I T
<~ T'].
The sets B E F O R E 2 ,
AFTER2,
BEFOPF3
of <,
and
(W) s o m e t h i n g
(R) s o m e t h i n g
later
later
then define:
AFTER3
are d e f i n e d a n a l o g o u s l y
J.N. Gray, R.A. Lorie, G.R. Putzolu and J.L. Traiger
386
from
<< and
<<<.
The o b v i o u s i n t e r p r e t a t i o n f o r t h i s is
that each
B E F O R E set is the
set of t r a n s a c t i o n s which c o n t r i b u t e i n p u t s
to T and e a c h AFTER set
is the set of t r a n s a c t i o n s w h i c h t a k e their i n p u t s f r o m T (where the
o r d e r i n g only
c o n s i d e r s d=_pendencies
i n d u c e d by
the c o r r e s p o n d i n g
c o n s i s t e n c y degree).
If some t r a n s a c t i o n
is both b e f o r e T
and a f t e r T in
some s c h e d u l e
then
no s e r i a l
schedule could
give
such r e s u l t s .
In this
case
c o n c u r r e n c y has i n t r o d u c e d i n c o n s i s t e n c y .
On the o t h e r hand, if all
relevant transactions
are e i t h e r b e f o r e or
after T (but
not both)
then T will
see a c o n s i s t e n t s t a t e (of the c o r r e s p o n d i n g degree).
If all t r a n s a c t i o n s d i c h o t o m i z e o t h e r s in this way then t h e r e l a t i o n
<e (<<~ or <<<e) will be a p a r t i a l o r d e r and the w h o l e s c h e d u l e will
g i v e d e g r e e I (2 or 3) c o n s i s t e n c y .
This can be s t r e n g t h e n e d to:
Assertion
3:
A s c h e d u l e is degree I (2 or 3) c o n s i s t e n t if and o n l y
the r e l a t i o n <~ (<<e or <<<~) is a partial order.
if
The <,
<< and
<<< r e l a t i o n s
are v a r i a n t s
of the
d e p e n d e n c y sets
introduced
in [1].
In that
paper
only d e g r e e
3 consistency
is
i n t r o d u c e d and A s s e r t i o n 3 was proved
for that case.
In p a r t i c u l a r
such a s c h e d u l e is
e q u i v a l e n t to the
serial schedule
o b t a i n e d by
r u n n i n g the t r a n s a c t i o n s one at a t i m e
in <<< order.
T h e proofs of
[ I ] generalize
f a i r l y e a s i l y to h a n d l e
a s s e r t i o n 1 in the
case of
d e g r e e I or 2 c o n s i s t e n c y .
Consider
the
T1
TI
TI
T2
T2
T2
T2
T2
T2
TI
T1
T1
following example:
LOCK
A
READ
A
UNLOCK
A
LOCK
A
WRITE
A
LOCK
B
WRITE
B
UNLOCK
R
UNLOCK
B
LOCK
B
WHITE
B
UNLOCK
B
In this s c h e d u l e T2 g i v e s B to TI
and T2 u p d a t e s A a f t e r TI r e a d s A
so T2<TI,
T2<<T1, T 2 < < < T 1
and T I < < < T 2 .
The
s c h e d u l e is
degree 2
c o n s i s t e n t but
not d e g r e e
3 consistent.
It runs
TI at
degree 2
c o n s i s t e n c y and T2 at d e g r e e 3 c o n s i s t e n c y .
It w o u l d be
nice to d e f i n e a t r a n s a c t i o n
to s e e d e g r e e I
(2 or 3)
c o n s i s t e n c y if and only if the B E F O R E and A F T E R s e t s are d i s j o i n t in
some schedule.
However, t h i s is
not r e s t r i c t i v e enough; r a t h e r one
must
require that
the b e f o r e
and a f t e r
s e t s be
d i s j o i n t in all
s c h e d u l e s in order
to s t a t e D e f i n i t i o n 1 in
t e r m s of d e p e n d e n c i e s .
F u r t h e r , t h e r e seems to be no n a t u r a l way to d e f i n e the d e p e n d e n c i e s
of degree
0 consistency.
H e n c e the
principal application
of the
dependency definition
is as
a proof
t e c h n i q u e and
for discussing
s c h e d u l e s and r e c o v e r y issues.
!
Granularity of locks and degrees of consistency
R e l a t i o n s h i R t_~o t r a n s a c t i o n
backu E ~ n d ~
387
recover~:
A t r a n s a c t i o n T is s a i d to be r e c o v e r a b l e if it can be undone b e f o r e
'EOT' without u n d o i n g other t r a n s a c t i o n s '
updates.
A transaction T
is said to be r e p e a t a b l e if it will r e p r o d u c ~ the o r i g i n a l o u t p u t if
rerun f o l l o w i n g
recovery, assuming that
no l o c k s were
r e l e a s e d in
the b a c k u p
process.
Recoverability requires
s y s t e m wide
degree 1
consistency, repeatibility
r e q u i r e s that all o t h e r
t r a n s a c t i o n s be
~t l e a s t d e g r e e I and that the r e p e a t a b l e t r a n s a c t i o n be degree 3.
The no___rmal (i.e.
t r o u b l e free) o p e r a t i o n of a data
base s y s t e m c a n
be
d e s c r i b e d in
terms
of an
initial consistent
state
$0 and
a
schedule
of
transactions
mapping
the
data
base
into
a
final
consistent state
S3 (see
F i g u r e 11).
S1 is
a checkpoint
state,
since
t r a n s a c t i o n s are
in
p r o g r e s s , $1
may
be i n c o n s i s t e n t .
A
system crash leaves
the data b a s e in state
$2.
Since transactions
T3 and T5 w e r e
in p r o g r e s s at the time of
crash, S2 is p o t e n t i a l l y
inconsistent.
S y s t e m r e c o v e r y a m o u n t s to
b r i n g i n g the data b a s e in
a new c o n s i s t e n t s t a t e in one of the f o l l o w i n g ways:
(a)
Starting
from
state
S2, undo
all
i n - p r o g r e s s at the t i m e of t h e crash.
(by S t a r t i n g f r o m state S I
p r o g r e s s at the
time
before
SI) and
then
completed before
the
Sl) .
(c) s t a r t i n g
crash.
at S0 r e d o
observe
(a)
I
I
I
I
I
S0
that
and
first
of the
redo
crash
all
of
transactions
undo all a c t i o n s of t r a n s a c t i o n s in
crash
(i.e. a c t i o n s of T3
and T~
all a c t i o n s
of t r a n s a c t i o n s
which
(i.e.
actions
of T2 and
T3 a f t e r
transactions
(c) are d e g e n e r a t e
TII .............
I
T21 .......
:". . . .
T31 ............
actions
which
cases
of
completed
before
the
(b).
I
I--- I
I ...........
I T~ I--- I
I
T51 .....
>
<
> .... I
<
> .....
I
I
I
I
I
I
SI
$2
S3
F i g u r e 11.
S y s t e m states,
SO is
i n i t i a l state,
S1 is
checkpoint
state, S2 is a c r a s h and S3 is t h e s t a t e t h a t r e s u l t s in the a b s e n c e
of a crash.
U n l e s s all
t r a n s a c t i o n s run at
least degree 1 consistency, system
r e c o v e r y may l o s e
updates.
If for e x a m p l e , T3 w r i t e s
a record, r,
and t h e n T~ f u r t h e r u p d a t e s r then
u n d o i n g T3 w i l l c a u s e the u p d a t e
of
T~ to
r to
be lost.
This situation
can o n l y
a r i s e if
some
t r a n s a c t i o n d o e s not hold its w r i t e l o c k s to EOT.
(a) If
all the t r a n s a c t i o n s
run in
at l e a s t degree
1 consistency
then
system
recovery
loses
no
updates
of
complete
transactions.
However
t h e r e may
be no
s c h e d u l e which
would
give the sa~e r e s u l t b e c a u s e t r a n s a c t i o n s may have read o u t p u t s
of u n d o n e t r a n s a c t i o n s .
J.N. Gray, R.A. Lorie, G.R. Putzolu and J.L. Traiger
388
(b)
If all
the t r a n s a c t i o n s
run in
at l e a s t degree
2 then
the
recovered state
is consistent
and derives
from the
schedule
obtatined
from
the
original
system
schedule
by
deleting
incomplete
transactions.
Note
that
degree
2 prevents
read
dependencies on
t r a n s a c t i o n s which might
be undone
by system
recovery,
of all
the
Completed
t r a n s a c t i o n s results
in a
meaningful schedule.
(c)
If
a transaction
reproducible.
is
degree
3
consi stent
Transaction
crash
gives
rise
to
transaction
properties analogous to system recovery.
then
backu~
it
which
is
has
Cost of deqrees of consistencL:
The only advantage
of lower degrees of c o n s i s t e n c y is performance.
If less
is locked
then less computation
and storage
is consumed.
Further
if less
is locked,
c o n c u r r e n c y is
i n c r e a s e d since
fewer
conflicts appear.
(Note that
the granularity
lock scheme
of the
first section
was motivated
by minimizing
the number
of explicit
l o c k s set.)
We
will
make
some
vet Z
crude
estimates
of the
storage
and
computation
resources
consumed
by
the
locking
protocols
as a
function
of the c o n s i s t e n c y
degree.
For
the
r e m a i n d e r of
this
section assume
that all
t r a n s a c t i o n s are
identical.
Also
assume
that they
do R reads
and W writes
(and hence set
approximately R
share mode
locks and
W exclusive mode
locks}.
Further
we assume
that all the transactions run at the same c o n s i s t e n c y degree.
Each outstanding lock request consumes
per-transaction
space for
these queue
c o n s i s t e n c y degrees is:
Table 4. C o n s i s t e n c y
CONSISTENCY
0
I
2
3
DEGREE
degrees
a q u e u e element. The maximum
elements as
a function
of
vs storage consumption.
I
I STORAGE
I
I
I
I
I
I
I
I
J
I
I
I
(in queue elements)
1
~
W+1
W.,-l~
.I
I
I
I
Observe that
degrees I and 2 consume roughly
the same
amount of
s t o r a g e but that degree 3 c o n s u m e s substantially more storage.
This
observation is aggravated
by the fact that reads
are typically ten
times more common than writes.
The estimation
of c o m p u t a t i o n (CPU)
overhead is much
more subtle.
We make
only a crude e s t i m a t e
here.
First
one may
c o n s i d e r the
o v e r h e a d in requesting and releasing locks.
This is shown in TaDle
5 as a function of c o n s i s t e n c y degrees.
Granularity oflocks and degrees o f consistency
TABLE
5. C o m p u t a t i o n a l
CONSISTENCY
overhead
DEGREE
CPU
vs d e g r e e s
389
of c o n s i s t e n c y .
(in c a l l s to l o c k sys)
W
W
W+R
W+R
Table 5 i n d i c a t e s t h a t the c o m p u t a t i o n a l o v e r h e a d of d e g r e e s 2 and 3
are c o m p a r a b l e and are g r e a t e r than the
o v e r h e a d of d e g r e e s 0 or 1.
These p a i r s of d e g r e e s
set t h e s a m e locks, t h e y just
hold them f o r
different durations.
Table
5
ignores
the
observation
that
some
lock
requests
are
trivially
satisfied
(the r e q u e s t is
granted
immediately)
while
others require
a task
switch and h e n c e
are q u i t e
expensive.
The
probability that
a r e a d lock w i l l
h a v e to w a i t is
proportional to
the
n u m b e r of
conflicting locks
[write)
currently granted.
The
p r o b a b i l i t y that a w r i t e lock w i l l h a v e to w a i t
is p r o p o r t i o n a l to
the n u m b e r of
c o n f l i c t i n g (read or write) l o c k s
that are c u r r e n t l y
granted.
T a b l e 4 g i v e s a guess of
t h e m a x i m u m number of
i c ~ k s of
each t y p e held by e a c h t r a n s a c t i o n .
If t h e r e are 2~N+I t r a n s a c t i o n s
one can
m u l t i p l y the
e n t r i e s of
Table 4
by N
to get
an a v e r a g e
n u m b e r of locks held
by all Others.
If a wait
lock r e q u e s t is C÷I
times as e x p e n s i v e as an i m m e d i a t e l y g r a n t e d r e q u e s t and if P is t h e
probability that
two d i f f e r e n t r e q u e s t s
are for the
same r e s o u r c e
then the r e l a t i v e c o m p u t a t i o n a l c o s t s are r o u g h l y c o m p u t e d :
degree
0 overhead:
W
p~C~N~W
c o s t of s e t t i n g locks
c o s t of w a i t s
degree
I overhead:
w
p~C~N*W*W
c o s t of w r i t e s
c o s t of w a i t s
degree
2 overhead:
W+R
P ~ C ~ N ~ W = (W+I)
P~C~N*R~W
c o s t of s e t t i n g locks
cost of w r i t e waits
c o s t of r e a d waits
degree
3 overhead:
W+R
c o s t of s e t t i n g l o c k s
cost of w a i t i n g for w r i t e s
c o s t of w a i t i n g for r e a d s
p*C~N~R~W
TABLE
6. C o m p u t a t i o n a l
CONSISTENCY
overhead
DEGREE
CPU
vs d e g r e e s
(in c a l l s
of c o n s i s t e n c y .
to lock
sys)
I
W + P ~ C ~ N * N ~ (I)
W + R ÷ P ~ C ~ N ~W • (W+2~R)
To c o n s i d e r
five r e a d s
a s p e c i f i c example,
(R=5) and
six (W=6)
a simple
writes.
banking
transaction does
A transaction
accesses a
390
J.N. Gray, R.A. Lorie, G.R. Putzolu and J.L. Traiger
r a n d o m a c c o u n t and t h e r e are m i l l i o n s of a c c o u n t s s o the p r o b a b i l i t y
of c o l l i s i o n , P, is r o u g h l y .000001.
S u p p o s e t h e r e are o n e h u n d r e d
t r a n s a c t i o n s per second.
A lock
t a k e s one h u n d r e d i n s t r u c t i o n s and
a wait
r e q u i r e s five
thousand instructions;
hence, C=50.
So t h e
term P*C~N~W e v a l u a t e s
to 0.015.
This implies that Table
5 gave a
good e s t i m a t e of the
CPU o v e r h e a d b e c a u s e t h e last t e r m
in Table 6
is m i n i s c u l e c o m p a r e d
to the term W+R.
Of c o u r s e
t h i s a n a l y s i s is
very s e n s i t i v e
to P
and o n e must
design the data
base so
that P
t a k e s on a very small value.
The s t r i k i n g t h i n g about these e s t i m a t e s is t h a t d e g r e e 2 and d e g r e e
3 seem
to h a v e
similar computational
overhead which
seems to
be
substantially
larger
than
the
overhead
of
degree
0
or
I
consistency.
We s u s p e c t
that t h i s c o n c l u s i o n w o u l d
survive a more
c a r e f u l s t u d y of the problem.
Gmmdarity of locks and degrees of consistency
ISSUE
.
.
.
.
.
.
.
.
.
.
l
.
.
.
.
.
.
.
DEGREE
.
.
COMMITTED
DATA
.
.
.
.
.
.
.
.
.
.
0
.
.
.
]
.
.
.
.
DEGREE
.
.
.
.
.
.
.
.
.
.
.
I
.
.
.
]
.
.
.
.
.
391
DEGREE
.
.
.
.
.
.
.
.
.
.
.
.
2
.
.
.
.
I
.
DEGREE
3
.
WRITES ARE
COMMITTED
IMMEDIATELY
WRITES
ARE
COM M I T T E D
AT EOT
SAME
DIRTY
DATA
YOU D O N I T
UPDATE DIRTY
DATA
0 A N D NO O N E
ELSE UPDATES
YOUR DIRTY
DATA
0, I A N D Y O U
DON'T READ
DIRTY DATA
0,1 ,2 A N D
NO O N E E L S E
DIRTIES
DATA
YOU READ
LOCK
PROTOCOL
SET SHORT
EXCL. LOCKS
O N ANY DAT A
YOU WRITE
SET LONG
EXCL. LOCKS
ON ANY DATA
YOU WRITE
I AND SET SHORT
SHARE LOCKS
ON A N Y D A T A
YOU READ
I AND SET LONG
SHARE LOCKS
ON ANY DATA
YOU READ
TRANSACTION
STRUCTURE
WELL FORMED
WRT WRITES
(WELL FORMED
AND 2 PHASE)
WRT WRITES
CONCURRENCY
GREATEST :
ONLY WAIT
FOR SHORT
WRITE LOCKS
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
OVERHEAD
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
TRANSACTION BACKUP
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
WELL FORMED
AND TWO PHASE
LOWEST:
ANY DATA
TOUCHED
IS
L O C K E D T O ROT
.
.
.
.
MEDIUM:
SET BOTH KINDS
OF L O C K S BUT
NEED NOT STORE
SHORT LOCKS
.
.
.
.
HIGHEST :
SET AND STORE
BOTH KINDS OF
L O C KS
J
SAME
AS
2
0,1 AND CAN'T
READ BAD DATA
ITEMS
0, 1,2 A N D CAN' T
READ BAD DATA
RELATIONSHIPS
S A M E AS I: B U T
R E S U L T IS S A M E
AS SOME SCHEDULE
2 AND SCHEDULE
IS S ER IAL
.
0 AND CAN'T
LOSE WRITES
SYSTEM
RECOVERY
TECHNIQUE
A P P L Y LOG
IN O R D E R OF
ARRIVAL
APPLY LOG
IN < ORDER
W- >W
NONE
NONE
.............................
Table
.
1
UN-DO
ANY
INCO~PLE~
TRANSACTIONS
IN A N Y O R D E R
LETS OTHERS
RUN HIGHER
CONSISTENC Y
ORDERING
.
AS
UN-DO ALL
INCOMPLETE
TRANSACTIONS
IN A N Y O R D E R
PROTECTION
PROVIDED
DEPENDENCIES
SAME
MEDIUM:
ALSO WAIT FOR
READ LOCKS
SMALL:
ONLY SET
WRITE
LOCKS
CAN NOT UNDO
WITHOUT
CASCADING
TO
OT HE RS
.
.
1
WELL FORMED
(AND 2 P H A S E
WRT WRITES)
GREAT:
ONLY WAIT
FOR WRITE
LOCKS
LEAST:
ONLY S E T
SHORT WRITE
LOCKS
AS
7.
W- >W
W->R
W->W
W->R
R->W
< < < IS AN
< IS AN
<< IS AN
ORDERING
OF
ORDERING
OF
ORDERING
OF
THE TRANSTHE TRANSTHE TRANSACT I O N S
ACTIONS
ACTIONS
£ .............................................
Summary
of
consistency
degrees.
392
II!.
LOCK
SYSTEMS:
J.N. Gray, R.A. Lorie, G.R. Putzolu and J.L. Traiger
GRANULARITY
AND
DEGREES
OF
CONSISTENCY
IN
EXISTING
I M S / V S with the
p r o g r a m i s o l a t i o n f e a t u r e [2 ] h a s a
two level lock
hierarchy:
segment types
(sets of r e c o r d s ) , and
segment instances
(records) w i t h i n
a segment
type.
Segment types
may be
l o c k e d in
E X C L U S I V E (E) m o d e (which c o r r e s p o n d s to
our e x c l u s i v e (x) mode) or
in E X P R E S S
R E A D (R) , R E T R I E V E
(G) , or
U P D A T E (U) (each
of w h i c h
correspond
to
our
notion
of
intention
(I)
mode)
[ 2,
pages
3 . 1 8 - 3 . 2 7 ]. S e g m e n t i n s t a n c e s
can be l o c k e d in
s h a r e or e x c l u s i v e
mode.
S e g m e n t type
l o c k s are r e q u e s t e d at
t r a n s a c t i o n initiation,
u s u a l l y in i n t e n t i o n
mode.
Segment instance locks
are d y n a m i c a l l y
set
as the
transaction
proceeds.
In
addition
I M S / ~ S has
user
controlled share
l o c k s on s e g m e n t
i n s t a n c e s (the =Q
option) which
a l l o w other
read r e q u e s t s but not
other ~Q or
e x c l u s i v e requests.
I M S / V S has no n o t i o n of S or SIX l o c k s on s e g m e n t t y p e s (which w o u l d
a l l o w a scan of all m e m b e r s of
a s e g m e n t t y p e c o n c u r r e n t with o t h e r
r e a d e r s but w i t h o u t the o v e r h e a d
of l o c k i n g each s e g m e n t i n s t a n c e ) .
Since IMS/VS does not
s u p p o r t S mode on s e g m e n t t y p e s
one need not
distinguish
the two
intention modes
IS
a n d IX
(see the
section
i n t r o d u c i n g IS
and I X modes).
In
g e n e r a l , I M S / V S has a n o t i o n of
i n t e n t i o n m o d e and does i m p l i c i t l o c k i n g
but d o e s not r e c o g n i z e all
the modes d e s c r i b e d here.
It u s e s a static two l e v e l l o c k tree.
I M S / V S with the p r o g r a m i s o l a t i o n
f e a t u r e b a s i c a l l y p r o v i d e s degree
2 consistency.
However degree
1 c o n s i s t e n c y can
be o b t a i n e d
on a
segment type
b a s i s in a PCB
(view) by s p e c i f y i n g the
EXPRESS READ
option
for that
segment.
Similarly
degree 3
c o n s i s t e n c y can
be
o b t a i n e d by s p e c i f y i n g the E X C L U S I V E
or U P D A T E o p t i o n s . I M S / V S a l s o
has the user c o n t r o l l e d s h a r e l o c k s
discusseH above which a program
can
r e q u e s t on
selected
segment
i n s t a n c e s to
obtain
additional
consistency
over the
degree I
or
2 consistency
p r o v i d e d by
the
system.
I M S / V S w i t h o u t the p r o g r a m i s o l a t i o n
f e a t u r e (and a l s o the p r e v i o u s
v e r s i o n of
IMS n a m e l y
IMS/2) d o e s n ' t have
a lock
hierarchy since
l o c k i n g is done only
on a s e g m e n t t y p e basis. It
provides degree I
c o n s i s t e n c y with d e g r e e 3 c o n s i s t e n c y
o b t a i n a b l e for a s e g m e n t t y p e
in
a
view by
specifying
the
E X C L U S I V E option.
User
controlled
l o c k i n g is a l s o p r o v i d e d on a l i m i t e d basis via the H O L D option.
DMS 1100 has a two l e v e l lock h i e r a r c h y [ q]: a r e a s
and p a g e s w i t h i n
areas.
Areas
may be
locked in one
of s e v e n
m o d e s when
they are
OPENed:
EXCLUSIVE RETRIEVAL
(which c o r r e s p o n d s
to
our notion o f
e x c l u s i v e mode), P R O T E C T E D
U P D A T E (which c o r r e s p o n d s to
our n o t i o n
of share and
i n t e n t i o n e x c l u s i v e mode), P R O T E C T E D
R E T R I E V A L (which
we
call share
mode), U P D A T E
(which c o r r e s p o n d s
to our
intention
e x c l u s i v e mode), and R E T R I E V A L (which
is our i n t e n t i o n s h a r e mode).
G i v e n this
t r a n s l i t e r a t i o n , the
compatibility matrix
d i s p l a y e d in
T a b l e 1 is i d e n t i c a l to
the c o m p a t i b i l i t y
m a t r i x of DMS
1100 [3,
page 3.59].
However, DMS 1100 s e t s
only e x c l u s i v e l o c k s
on p a g e s
within
areas
(short term
share
locks
are i n v i s i b l y
set
during
internal pointer following).
F u r t h e r , even if a
t r a n s a c t i o n locks
an area in e x c l u s i v e mode, DMS 1100 c o n t i n u e s to set e x c l u s i v e locks
(and i n t e r n a l
share locks) on
the pages
in the area,
d e s p i t e the
f a c t that an e x c l u s i v e lock on an area p r e c l u d e s r e a d s or u p d a t e s of
t h e area by
other transactions.
Similar observations
apply to the
DMS 1100
i m p l e m e n t a t i o n of S and
SIX modes.
In g e n e r a l ,
DMS 1100
r e c o g n i z e s all the m o d e s d e s c r i b e d h e r e and
u s e s i n t e n t i o n m o d e s to
d e t e c t c o n f l i c t s but
does not u t i l i z e i m p l i c i t locking.
It uses a
s t a t i c two l e v e l lock tree.
Granularity oflocks and degrees o f consistency
393
DMS 1100 provides level 2 c o n s i s t e n c y
by setting exclusive l o c k s on
the
modified
pages and
and
a
t e m p o r a r y lock
on
the
page
corresponding
to the
page which
is
"current of
run unit".
The
t e m p o r a r y lock is r e l e a s e d when the
"current of run unit" is moved.
In a d d i t i o n a
run-unit can o b t a i n additional locks
via an explicit
K ~ P command.
The ideas presented
were d e v e l o p e d in the process
of d e s i g n i n g and
i m p l e m e n t i n g an
experimental data base system
at the IBM
San J o s e
Research Laboratory.
(We wish to e m p h a s i z e
that this s y s t e m
is a
vehicle
for
research
in
data base
architecture,
and
does
not
indicate plans for future IBN products.)
R subsystem which p r o v i d e s
the modes
of locks
herein described, plus
the n e c e s s a r y
logic to
schedule
requests
and
conversions,
and
to
detect
and
resolve
deadlocks
has
been
implemented
as
one
component
of
the
data
manager.
The lock subsystem is in turn
used by the data m a n a g e r to
a u t o m a t i c a l l y lock
the nodes
of its
lock graph
(see F i g u r e
12).
Users can be unaware of these lock protocols beyond the verbs " b e g i n
t r a n s a c t i o n " and "end transaction".
The
data base
is broken
into
s e v e r a l storage
areas.
Each
area
contains
a
set of
relations
(files),
their indices,
and
their
tuples(records) along with a c a t a l o g of
the area.
Each tuple has a
unique tuple identifier (data b a s e key) which can be used to q u i c k l y
(directly) address the tuple. Each tuple i d e n t i f i e r maps to a set of
field values.
All tuples are s t o r e d t o g e t h e r in an a r e a - w i d e heap
to allow
physical clustering
of t u p l e s
from different r e l a t i o n s .
The unused slots
in this heap are r e p r e s e n t e d by
an a r e a - w i d e pool
of free
tuple identifiers
(i.e. i d e n t i f i e r s
not aliocated
to any
relation).
Each tuple
"belongs"
to a unique \relation, and
all
t u p l e s in a relation
have the s a m e number and type
of fields.
One
may c o n s t r u c t an
index on any sub~et
of the fields of
a relation.
Tuple i d e n t i f i e r s give fast direct
access to tuples, while
indices
give
fast
associative
access
to
field
values
and
to
their
c o r r e s p o n d i n g tuples.
Each key value in an index is made a l o c k a b l e
object
in order
to solve the
p r o b l e m of
"phantoms" [1]
without
locking
the entire
index.
We
do not
e x p l i c i t l y lock
individual
fields or whole indices so t h o s e nodes
a p p e a r in Figure 12 only for
p e d a g o g i c a l reasons.
Figure 12 gives only the "logical" lock graph;
there is
also a graph for p h y s i c a l
page locks
and for
o t h e r low
level resources.
Rs can be seen,
Figure 12 is not a tree.
Heavy use
is made of the
techniques mentioned in the s e c t i o n
on locking DAG's.
For example,
one can
read via tuple i d e n t i f i e r
without setting any
index locks
but to lock a field for u p d a t e
its t u p l e i d e n t i f i e r and the old and
new index key values cowering the updated
field must be l o c k e d in X
mode.
Further,
the tree is
not static,
since data base
keys are
d y n a m i c a l l y allocated to relations:
field values d y n a m i c a l l y enter,
move around
in, and
leave index value
intervals when
records are
inserted, u p d a t e d and deleted; r e l a t i o n s and indices are d y n a m i c a l l y
created
and
destroyed
within areas;
and
areas
are
dynamically
allocated.
The implementation of s u c h
o p e r a t i o n s observes the lock
p r o t o c o l presented
in the
section on dynamic
graphs: when
a node
c h a n g e s parents, all old and new p a r e n t s must be held (explicitly or
implicitly) in
intention e x c l u s i v e
mode and the
n~de to
be moved
must be held in exclusive mode.
The d e s c r i b e d
system supports c o n c u r r e n t l y c o n s i s t e n c y
d e g r e e s 1,2
and 3 which can be
specified on
a transaction basis.
In a d d i t i o n
share locks on individual tuples can be acquired by the user.
J.N. Gray R.A. Lorie, G.R. Putzolu and J.L. Traiger
394
DATA
BASE
I
I
AREAS
I
I
I
RELATIONS
I
I
I
IN DICE S
I
I
INDEX KEY
INTERVALS
I
I
I
I
I
FREE
TUPLE
IDENTIFIERS
I!
ALLOCATED
TU PL E
IDENTIFIERS
I
I I
I I
INDEXED
FIELDS
I
I
UN-INDEXED
FIELDS
Figure 12.
~ lock graph.
ACKNOWLEDGMENT
We g r a t e f u l l y acknowledge many helpful
d i s c u s s i o n s with Phil Macri,
Jim Mehl and Brad Wade on how
locking works in existing systems and
how
these results
might be
better presented.
We are
especially
indebted to Paul McJones in this regard.
REFERENCES
[i]
K.P. Eswaran,
J.N. Gray,
R.A.
Lorie, I.L.
Traiger, On
the
Notions of C o n s i s t e n c y and
Predicate Locks,
technical Report
RJ.I~87, IBM Research Laboratory, San Jose, Ca., Nov. 1974. (to
appear CACM).
[2]
Information
Application
1975.
[3]
UNIVAC 1100
Series Data
Management System
(DMS 1100).
ANSI
COBOL
Field Data
M a n i p u l a t i o n Language.
Order No.
UP7908-2,
Sperry Rand Corp., May 1973.
Management System Virtual Storage (I~S/VS).
Design Guide,
Form
No.
SH20-9025-2, IBM
System
corp. ,
Download