Modelling in Data Base Management Systems. G.M. Nijssen (ed.), North Holland Publishing Company, 1976

Granularity of Locks and Degrees of Consistency in a Shared Data Base

J.N. Gray, R.A. Lorie, G.R. Putzolu, I.L. Traiger
IBM Research Laboratory
San Jose, California

The problem of choosing the appropriate granularity (size) of lockable objects is introduced and the tradeoff between concurrency and overhead is discussed. A locking protocol which allows simultaneous locking at various granularities by different transactions is presented. It is based on the introduction of additional lock modes besides the conventional share mode and exclusive mode. A proof is given of the equivalence of this protocol to a conventional one.

Next the issue of consistency in a shared environment is analyzed. This discussion is motivated by the realization that some existing data base systems use automatic lock protocols which insure protection only from certain types of inconsistencies (for instance those arising from transaction backup), thereby automatically providing a limited degree of consistency. Four degrees of consistency are introduced. They can be roughly characterized as follows: degree 0 protects others from your updates, degree 1 additionally provides protection from losing updates, degree 2 additionally provides protection from reading incorrect data items, and degree 3 additionally provides protection from reading incorrect relationships among data items (i.e. total protection). A discussion follows on the relationships of the four degrees to locking protocols, concurrency, overhead, recovery and transaction structure.

Lastly, these ideas are compared with existing data management systems.

I. GRANULARITY OF LOCKS:

An important issue which arises in the design of a data base management system is the choice of lockable units, i.e.
the data aggregates which are atomically locked to insure consistency. Examples of lockable units are areas, files, individual records, field values, and intervals of field values.

The choice of lockable units presents a tradeoff between concurrency and overhead, which is related to the size or granularity of the units themselves. On the one hand, concurrency is increased if a fine lockable unit (for example a record or field) is chosen. Such a unit is appropriate for a "simple" transaction which accesses few records. On the other hand a fine unit of locking would be costly for a "complex" transaction which accesses a large number of records. Such a transaction would have to set and reset a large number of locks, incurring the computational overhead of many invocations of the lock subsystem, and the storage overhead of representing many locks. A coarse lockable unit (for example a file) is probably convenient for a transaction which accesses many records. However, such a coarse unit discriminates against transactions which only want to lock one member of the file. From this discussion it follows that it would be desirable to have lockable units of different granularities coexisting in the same system.

This paper presents a lock protocol satisfying these requirements and discusses the related implementation issues of scheduling, granting and converting lock requests.

Hierarchical locks:

We will first assume that the set of resources to be locked is organized in a hierarchy. Note that this hierarchy is used in the context of a collection of resources and has nothing to do with the data model used in a data base system.
The hierarchy of Figure 1 may be suggestive. We adopt the notation that each level of the hierarchy is given a node type which is a generic name for all the node instances of that type. For example, the data base has nodes of type area as its immediate descendants, each area in turn has nodes of type file as its immediate descendants and each file has nodes of type record as its immediate descendants in the hierarchy. Since it is a hierarchy, each node has a unique parent.

        DATA BASE
            |
          AREAS
            |
          FILES
            |
         RECORDS

Figure 1. A sample lock hierarchy.

Each node of the hierarchy can be locked. If one requests exclusive access (X) to a particular node, then when the request is granted, the requestor has exclusive access to that node and implicitly to each of its descendants. If one requests shared access (S) to a particular node, then when the request is granted, the requestor has shared access to that node and implicitly to each descendant of that node. These two access modes lock an entire subtree rooted at the requested node.

Our goal is to find some technique for implicitly locking an entire subtree. In order to lock a subtree rooted at node R in share or exclusive mode it is important to prevent share or exclusive locks on the ancestors of R which would implicitly lock R and its descendants. Hence a new access mode, intention mode (I), is introduced. Intention mode is used to "tag" (lock) all ancestors of a node to be locked in share or exclusive mode.
These tags signal the fact that locking is being done at a "finer" level and thereby prevent implicit or explicit exclusive or share locks on the ancestors.

The protocol to lock a subtree rooted at node R in exclusive or share mode is to first lock all ancestors of R in intention mode and then to lock node R in exclusive or share mode. For example, using Figure 1, to lock a particular file one should obtain intention access to the data base, to the area containing the file and then request exclusive (or share) access to the file itself. This implicitly locks all records of the file in exclusive (or share) mode.

Access modes and compatibility:

We say that two lock requests for the same node by two different transactions are compatible if they can be granted concurrently. The mode of the request determines its compatibility with requests made by other transactions. The three modes X, S and I are incompatible with one another but distinct S requests may be granted together and distinct I requests may be granted together.

The compatibilities among modes derive from their semantics. Share mode allows reading but not modification of the corresponding resource by the requestor and by other transactions. The semantics of exclusive mode is that the grantee may read and modify the resource but no other transaction may read or modify the resource while the exclusive lock is set.
The reason for dichotomizing share and exclusive access is that several share requests can be granted concurrently (are compatible) whereas an exclusive request is not compatible with any other request. Intention mode was introduced to be incompatible with share and exclusive mode (to prevent share and exclusive locks). However, intention mode is compatible with itself since two transactions having intention access to a node will explicitly lock descendants of the node in X, S or I mode and thereby will either be compatible with one another or will be scheduled on the basis of their requests at the finer level. For example, two transactions can simultaneously be granted the data base and some area and some file in intention mode. In this case their explicit locks on particular records in the file will resolve any conflicts among them.

The notion of intention mode is refined to intention share mode (IS) and intention exclusive mode (IX) for two reasons. First, the intention share mode only requests share or intention share locks at the lower nodes of the tree (i.e. never requests an exclusive lock below the intention share node), hence IS is compatible with S mode. Since read only is a common form of access it will be profitable to distinguish this for greater concurrency. Secondly, if a transaction has an intention share lock on a node it can convert this to a share lock at a later time, but one cannot convert an intention exclusive lock to a share lock on a node.
Rather to get the combined rights of share mode and intention exclusive mode one must obtain an X or SIX mode lock. (This issue is discussed in the section on rerequests below.)

We recognize one further refinement of modes, namely share and intention exclusive mode (SIX). Suppose one transaction wants to read an entire subtree and to update particular nodes of that subtree. Using the modes provided so far it would have the options of: (a) requesting exclusive access to the root of the subtree and doing no further locking or (b) requesting intention exclusive access to the root of the subtree and explicitly locking the lower nodes in intention, share or exclusive mode. Alternative (a) has low concurrency. If only a small fraction of the read nodes are updated then alternative (b) has high locking overhead. The correct access mode would be share access to the subtree, thereby allowing the transaction to read all nodes of the subtree without further locking, and intention exclusive access to the subtree, thereby allowing the transaction to set exclusive locks on those nodes in the subtree which are to be updated and IX or SIX locks on the intervening nodes. Since this is such a common case, SIX mode is introduced for this purpose. It is compatible with IS mode since other transactions requesting IS mode will explicitly lock lower nodes in IS or S mode thereby avoiding any updates (IX or X mode) produced by the SIX mode transaction.
However SIX mode is not compatible with IX, S, SIX or X mode requests.

Table 1 gives the compatibility of the request modes, where for completeness we have also introduced the null mode (NL) which represents the absence of requests of a resource by a transaction.

           NL     IS     IX     S      SIX    X
    NL     YES    YES    YES    YES    YES    YES
    IS     YES    YES    YES    YES    YES    NO
    IX     YES    YES    YES    NO     NO     NO
    S      YES    YES    NO     YES    NO     NO
    SIX    YES    YES    NO     NO     NO     NO
    X      YES    NO     NO     NO     NO     NO

Table 1. Compatibilities among access modes.

To summarize, we recognize six modes of access to a resource:

NL: Gives no access to a node, i.e. represents the absence of a request of a resource.

IS: Gives intention share access to the requested node and allows the requestor to lock descendant nodes in S or IS mode. (It does no implicit locking.)

IX: Gives intention exclusive access to the requested node and allows the requestor to explicitly lock descendants in X, S, SIX, IX or IS mode. (It does no implicit locking.)

S: Gives share access to the requested node and to all descendants of the requested node without setting further locks. (It implicitly sets S locks on all descendants of the requested node.)

SIX: Gives share and intention exclusive access to the requested node. (In particular it implicitly locks all descendants of the node in share mode and allows the requestor to explicitly lock descendant nodes in X, SIX or IX mode.)

X: Gives exclusive access to the requested node and to all descendants of the requested node without setting further locks. (It implicitly sets X locks on all descendants.
Locking lower nodes in S or IS mode would give no increased access.)

IS mode is the weakest non-null form of access to a resource. It carries fewer privileges than IX or S modes. IX mode allows IS, IX, S, SIX and X mode locks to be set on descendant nodes while S mode allows read only access to all descendants of the node without further locking. SIX mode carries the privileges of S and of IX mode (hence the name SIX). X mode is the most privileged form of access and allows reading and writing of all descendants of a node without further locking. Hence the modes can be ranked in the partial order (lattice) of privileges shown in Figure 2. Note that it is not a total order since IX and S are incomparable.

             X
             |
            SIX
           /   \
          S     IX
           \   /
            IS
             |
            NL

Figure 2. The partial ordering of modes by their privileges.

Rules for requesting nodes:

The implicit locking of nodes will not work if transactions are allowed to leap into the middle of the tree and begin locking nodes at random. The implicit locking implied by the S and X modes depends on all transactions obeying the following protocol:

(a) Before requesting an S or IS lock on a node, all ancestor nodes of the requested node must be held in IX or IS mode by the requestor.

(b) Before requesting an X, SIX or IX lock on a node, all ancestor nodes of the requested node must be held in SIX or IX mode by the requestor.

(c) Locks should be released either at the end of the transaction (in any order) or in leaf to root order.
In particular, if locks are not held to the end of the transaction, one should not hold a lock after releasing its ancestors.

To paraphrase this, locks are requested root to leaf, and released leaf to root. Notice that leaf nodes are never requested in intention mode since they have no descendants.

Several examples:

To lock record R for read:
    lock data base with mode = IS
    lock area containing R with mode = IS
    lock file containing R with mode = IS
    lock record R with mode = S
Don't panic, the transaction probably already has the data base, area and file lock.

To lock record R for write-exclusive access:
    lock data base with mode = IX
    lock area containing R with mode = IX
    lock file containing R with mode = IX
    lock record R with mode = X
Note that if the records of this and the previous example are distinct, each request can be granted simultaneously to different transactions even though both refer to the same file.

To lock a file F for read and write access:
    lock data base with mode = IX
    lock area containing F with mode = IX
    lock file F with mode = X
Since this reserves exclusive access to the file, if this request uses the same file as the previous two examples it or the other transactions will have to wait.

To lock a file F for complete scan and occasional update:
    lock data base with mode = IX
    lock area containing F with mode = IX
    lock file F with mode = SIX
Thereafter, particular records in F can be locked for update by locking records in X mode.
Notice that (unlike the previous example) this transaction is compatible with the first example. This is the reason for introducing SIX mode.

To quiesce the data base:
    lock data base with mode = X.
Note that this locks everyone else out.

Directed acyclic graphs of locks:

The notions so far introduced can be generalized to work for directed acyclic graphs (DAG) of resources rather than simply hierarchies of resources. A tree is a simple DAG. The key observation is that to implicitly or explicitly lock a node, one should lock all the parents of the node in the DAG and so by induction lock all ancestors of the node. In particular, to lock a subgraph one must implicitly or explicitly lock all ancestors of the subgraph in the appropriate mode (for a tree there is only one parent). To give an example of a non-hierarchical structure, imagine the locks are organized as in Figure 3.

         DATA BASE
             |
           AREAS
           /    \
       FILES    INDICES
           \    /
          RECORDS

Figure 3. A non-hierarchical lock graph.

We postulate that areas are "physical" notions and that files, indices and records are logical notions. The data base is a collection of areas. Each area is a collection of files and indices. Each file has a corresponding index in the same area. Each record belongs to some file and to its corresponding index. A record is comprised of field values and some field is indexed by the index associated with the file containing the record.
The file gives a sequential access path to the records and the index gives an associative access path to the records based on field values. Since individual fields are never locked, they do not appear in the lock graph.

To write a record R in file F with index I:
    lock data base with mode = IX
    lock area containing F with mode = IX
    lock file F with mode = IX
    lock index I with mode = IX
    lock record R with mode = X
Note that all paths to record R are locked. Alternatively, one could lock F and I in exclusive mode thereby implicitly locking R in exclusive mode.

To give a more complete explanation we observe that a node can be locked explicitly (by requesting it) or implicitly (by appropriate explicit locks on the ancestors of the node) in one of five modes: IS, IX, S, SIX, X. However, the definition of implicit locks and the protocols for setting explicit locks have to be extended for DAGs as follows:

A node is implicitly granted in S mode to a transaction if at least one of its parents is (implicitly or explicitly) granted to the transaction in S, SIX or X mode. By induction that means that at least one of the node's ancestors must be explicitly granted in S, SIX or X mode to the transaction.

A node is implicitly granted in X mode if all of its parents are (implicitly or explicitly) granted to the transaction in X mode.
By induction, this is equivalent to the condition that all nodes in some cut set of the collection of all paths leading from the node to the roots of the graph are explicitly granted to the transaction in X mode and all ancestors of nodes in the cut set are explicitly granted in IX or SIX mode.

From Figure 2, a node is implicitly granted in IS mode if it is implicitly granted in S mode, and a node is implicitly granted in IS, IX, S and SIX mode if it is implicitly granted in X mode.

The protocol for explicitly requesting locks on a DAG:

(a) Before requesting an S or IS lock on a node, one should request at least one parent (and by induction a path to a root) in IS (or greater) mode. As a consequence none of the ancestors along this path can be granted to another transaction in a mode incompatible with IS.

(b) Before requesting IX, SIX or X mode access to a node, one should request all parents of the node in IX (or greater) mode. As a consequence all ancestors will be held in IX (or greater) mode and cannot be held by other transactions in a mode incompatible with IX (i.e. S, SIX, X).

(c) Locks should be released either at the end of the transaction (in any order) or in leaf to root order. In particular, if locks are not held to the end of the transaction, one should not hold a lower lock after releasing its ancestors.
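As an illustration, rules (a) and (b) of the DAG protocol can be sketched in Python. This is a minimal sketch under stated assumptions, not the paper's implementation: the names DOMINATES, at_least and may_request, and the dictionaries parents and held, are hypothetical and introduced only for this example. The graph is that of Figure 3, reduced to a single area, file, index and record.

```python
# Privilege order of Figure 2: DOMINATES[m] is the set of modes that m is
# at least as strong as (hypothetical name, used only in this sketch).
DOMINATES = {
    "NL": {"NL"},
    "IS": {"NL", "IS"},
    "IX": {"NL", "IS", "IX"},
    "S": {"NL", "IS", "S"},
    "SIX": {"NL", "IS", "IX", "S", "SIX"},
    "X": {"NL", "IS", "IX", "S", "SIX", "X"},
}


def at_least(held_mode, wanted):
    """True if held_mode is `wanted` or greater in the partial order."""
    return wanted in DOMINATES[held_mode]


def may_request(parents, held, node, mode):
    """Check rules (a) and (b) for a single transaction.

    parents maps a node to the list of its parents in the DAG;
    held maps a node to the mode this transaction already holds on it.
    """
    ps = parents.get(node, [])
    if not ps:  # a source of the graph has no parents to lock first
        return True
    if mode in ("IS", "S"):  # rule (a): at least one parent in IS or greater
        return any(at_least(held.get(p, "NL"), "IS") for p in ps)
    if mode in ("IX", "SIX", "X"):  # rule (b): all parents in IX or greater
        return all(at_least(held.get(p, "NL"), "IX") for p in ps)
    raise ValueError("unknown mode: " + mode)


# The lock graph of Figure 3, with one node of each type (assumed names).
parents = {
    "area": ["data base"],
    "file": ["area"],
    "index": ["area"],
    "record": ["file", "index"],
}

held = {"data base": "IX", "area": "IX", "file": "IX"}
print(may_request(parents, held, "record", "X"))  # False: index not yet held
held["index"] = "IX"
print(may_request(parents, held, "record", "X"))  # True: all parents in IX
```

The check looks only at the immediate parents. It relies on the parents themselves having been acquired under the same rules, which is how the induction in the text reaches all ancestors.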
To give an example using Figure 3, a sequential scan of all records in file F need not use an index so one can get an implicit share lock on each record in the file by:
    lock data base with mode = IS
    lock area containing F with mode = IS
    lock file F with mode = S
This gives implicit S mode access to all records in F. Conversely, to read a record in a file via the index I for file F, one need not get an implicit or explicit lock on file F:
    lock data base with mode = IS
    lock area containing R with mode = IS
    lock index I with mode = S
This again gives implicit S mode access to all records in index I (in file F). In both these cases, only one path was locked for reading. But to insert, delete or update a record R in file F with index I one must get an implicit or explicit lock on all ancestors of R.

The first example of this section showed how an explicit X lock on a record is obtained. To get an implicit X lock on all records in a file one can simply lock the index and file in X mode, or lock the area in X mode. The latter examples allow bulk load or update of a file without further locking since all records in the file are implicitly granted in X mode.

Proof of equivalence of the lock protocol.

We will now prove that the described lock protocol is equivalent to a conventional one which uses only two modes (S and X), and which explicitly locks atomic resources (the leaves of a tree or sinks of a DAG).

Let G = (N,A) be a finite (directed acyclic) graph where N is the set of nodes and A is the set of arcs. G is assumed to be without circuits (i.e. there is no non-null path leading from a node n to itself).
A node p is a parent of a node n and n is a child of p if there is an arc from p to n. A node n is a source (sink) if n has no parents (no children). Let SI be the set of sinks of G. An ancestor of node n is any node (including n) in a path from a source to n. A node-slice of a sink n is a collection of nodes such that each path from a source to n contains at least one node of the slice.

We also introduce the set of lock modes M = {NL, IS, IX, S, SIX, X} and the compatibility matrix C : M x M -> {YES, NO} described in Table 1. Let c : m x m -> {YES, NO} be the restriction of C to m = {NL, S, X}.

A lock-graph is a mapping L : N -> M such that:

(a) if L(n) ∈ {IS, S} then either n is a source or there exists a parent p of n such that L(p) ∈ {IS, IX, S, SIX, X}. By induction there exists a path from a source to n such that L takes only values in {IS, IX, S, SIX, X} on it. Equivalently L is not equal to NL on the path.

(b) if L(n) ∈ {IX, SIX, X} then either n is a source or for all parents p1...pk of n we have L(pi) ∈ {IX, SIX, X} (i = 1...k). By induction L takes only values in {IX, SIX, X} on all the ancestors of n.

The interpretation of a lock-graph is that it gives a map of the explicit locks held by a particular transaction observing the six state lock protocol described above. The notion of projection of a lock-graph is now introduced to model the set of implicit locks on atomic resources acquired by a transaction.
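Conditions (a) and (b) of the lock-graph definition can be checked mechanically. The following Python sketch is illustrative only. The function name is_lock_graph and the dictionary encoding of G and L are assumptions of this example, and nodes absent from the mapping are treated as NL.

```python
NON_NULL = {"IS", "IX", "S", "SIX", "X"}
WRITE_INTENT = {"IX", "SIX", "X"}


def is_lock_graph(parents, L):
    """Return True if the mapping L satisfies conditions (a) and (b).

    parents maps each node to the list of its parents (sources map to []
    or are absent); L maps nodes to modes, missing nodes count as NL.
    """
    for n, mode in L.items():
        ps = parents.get(n, [])
        if not ps:  # n is a source: nothing to check
            continue
        parent_modes = [L.get(p, "NL") for p in ps]
        if mode in ("IS", "S"):
            # (a) some parent must carry a non-null mode
            if not any(m in NON_NULL for m in parent_modes):
                return False
        elif mode in WRITE_INTENT:
            # (b) every parent must be in IX, SIX or X
            if not all(m in WRITE_INTENT for m in parent_modes):
                return False
    return True


# The graph of Figure 3 with one node of each type (assumed names).
parents = {
    "area": ["data base"],
    "file": ["area"],
    "index": ["area"],
    "record": ["file", "index"],
}

writer = {"data base": "IX", "area": "IX", "file": "IX",
          "index": "IX", "record": "X"}
scanner = {"data base": "IS", "area": "IS", "file": "S"}
broken = {"data base": "IX", "area": "IX", "file": "IX", "record": "X"}

print(is_lock_graph(parents, writer))   # True
print(is_lock_graph(parents, scanner))  # True
print(is_lock_graph(parents, broken))   # False: index is NL
```

Because every node with a non-null mode is itself checked, testing only immediate parents is enough: the "by induction" clauses of (a) and (b) follow from applying the local test at each node.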
The projection of a lock-graph L is the mapping l : SI -> m constructed as follows:

(a) l(n) = X if there exists a node-slice {n1...ns} of n such that L(ni) = X for each node in the slice.

(b) l(n) = S if (a) is not satisfied and there exists an ancestor a of n such that L(a) ∈ {S, SIX, X}.

(c) l(n) = NL if (a) and (b) are not satisfied.

Two lock-graphs L1 and L2 are said to be compatible if C(L1(n), L2(n)) = YES for all n ∈ N. Similarly two projections l1 and l2 are compatible if c(l1(n), l2(n)) = YES for all n ∈ SI.

Theorem: If two lock-graphs L1 and L2 are compatible then their projections l1 and l2 are compatible. In other words if the explicit locks set by two transactions do not conflict then also the three-state locks implicitly acquired do not conflict.

Proof: Assume that l1 and l2 are incompatible. We want to prove that L1 and L2 are incompatible. By definition of compatibility there must exist a sink n such that l1(n) = X and l2(n) ∈ {S, X} (or vice versa). By definition of projection there must exist a node-slice {n1...ns} of n such that L1(n1) = ... = L1(ns) = X. Also there must exist an ancestor n0 of n such that L2(n0) ∈ {S, SIX, X}. From the definition of lock-graph there is a path P1 from a source to n0 on which L2 does not take the value NL.

If P1 intersects the node-slice at ni then L1 and L2 are incompatible since L1(ni) = X which is incompatible with the non-null value of L2(ni). Hence the theorem is proved.

Alternatively there is a path P2 from n0 to the sink n which intersects the node-slice at ni.
From the definition of lock-graph L1 takes a value in {IX, SIX, X} on all ancestors of ni. In particular L1(n0) ∈ {IX, SIX, X}. Since L2(n0) ∈ {S, SIX, X} we have C(L1(n0), L2(n0)) = NO. Q.E.D.

Dynamic lock graphs:

Thus far we have pretended that the lock graph is static. However, examination of Figure 3 suggests otherwise. Areas, files and indices are dynamically created and destroyed, and of course records are continually inserted, updated, and deleted. (If the data base is only read, then there is no need for locking at all.)

We introduce the lock protocol for dynamic DAGs by example. Consider the implementation of index interval locks. Rather than being forced to lock entire indices or individual records, we would like to be able to lock all records with a certain contiguous range of index values; for example, lock all records in the bank account file with the location field equal to Napa. Therefore, the index is partitioned into lockable key value intervals. Each indexed record "belongs" to a particular index interval and all records in a file with the same field value on an indexed field will belong to the same key value interval (i.e. all Napa accounts will belong to the same interval). This new structure is depicted in Figure 4. In [1] such locks were called predicate locks and an alternate (more general but less efficient) implementation was proposed.

[Figure 4. The lock graph with index interval locks. Nodes: DATA BASE, AREAS, FILE, INDICES, INDEX INTERVALS, UN-INDEXED FIELDS, RECORD IDENTIFIERS, INDEXED FIELDS.]

The only subtle aspect of Figure 4 is the dichotomy between indexed and un-indexed fields. Since the indexed field value and record identifier (logical address) appear in the index, one can read the indexed field directly (i.e. without "touching" the record). Hence an index interval is a parent of the corresponding field values. Further, the index "points" via record identifiers to all records with that value and so is a parent of all such record identifiers. On the other hand, one can read and update un-indexed fields of the record without affecting the index and so the file is the only parent of such fields. (The index is not a lock ancestor of such fields.)

When an indexed field is updated, it and its record identifier move from one index interval to another. For example, when a Napa account is moved to the St. Helena branch, the account record and its location field "leave" the Napa interval of the location index and "join" the St. Helena index interval. When a new record is inserted it "joins" the interval containing the new field value and also it "joins" the file. Deletion removes the record from the index interval and from the file.

Since Figure 4 defines a DAG, albeit a dynamic DAG, the protocol of the previous section can be used to lock the nodes of the DAG.
However, the protocol should be extended as follows to handle dynamic changes to the lock graph:

   (d) Before moving a node in the lock graph, the node must be implicitly or explicitly granted in X mode in both its old and its new position in the graph. Further, the node must not be moved in such a way as to create a cycle in the graph.

Carrying out the example of this section, to move a Napa bank account to the St. Helena branch:

   lock data base                 in mode = IX
   lock area containing accounts  in mode = IX
   lock accounts file             in mode = IX
   lock location index            in mode = IX
   lock Napa interval             in mode = IX
   lock St. Helena interval       in mode = IX
   lock record                    in mode = IX
   lock field                     in mode = X

Alternatively, one could get an implicit lock on the field by requesting explicit X mode locks on the record and index intervals.

Scheduling and granting requests:

Thus far we have described the semantics of the various request modes and have described the protocol which requestors must follow. To complete the discussion we discuss how requests are scheduled and granted.

The set of all requests for a particular resource is kept in a queue sorted by some fair scheduler. By "fair" we mean that no particular transaction will be delayed indefinitely. First-in first-out is the simplest fair scheduler and we adopt such a scheduler for this discussion modulo deadlock preemption decisions.

The group of mutually compatible requests for a resource appearing at the head of the queue is called the granted group.
All these requests can be granted concurrently. Assuming that each transaction has at most one request in the queue, the compatibility of two requests by different transactions depends only on the modes of the requests and may be computed using Table 1. Associated with the granted group is a group mode, which is the supremum mode of the members of the group and is computed using Figure 2 or Table 3. Table 2 gives a list of the possible types of requests that can coexist in a group and the corresponding mode of the group.

Table 2. Possible request groups and their group mode. Set brackets indicate that several such requests may be present.

   MODES OF REQUESTS      |  MODE OF GROUP
   -----------------------+----------------
   X                      |  X
   SIX, {IS}              |  SIX
   S, {S}, {IS}           |  S
   IX, {IX}, {IS}         |  IX
   IS, {IS}               |  IS

Figure 5 depicts the queue for a particular resource, showing the requests and their modes. The granted group consists of five requests and has group mode IX. The next request in the queue is for S mode, which is incompatible with the group mode IX and hence must wait.

   GRANTED GROUP: GROUP MODE = IX
   |IS|--|IX|--|IS|--|IS|--|IS|--*--|S|--|IS|--|X|--|IS|--|IX|

   Figure 5. The queue of requests for a resource.

When a new request for a resource arrives, the scheduler appends it to the end of the queue. There are two cases to consider: either someone is already waiting or all outstanding requests for this resource are granted (i.e. no one is waiting).
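The granted group and its group mode can be sketched in a few lines. This is only an illustration under stated assumptions: the compatibility sets transcribe Table 1 without the null mode, and the names COMPATIBLE, RIGHTS, supremum, granted_group and group_mode are hypothetical, as is the encoding of each mode as a set of "rights" whose union yields the supremum.

```python
from functools import reduce

# Table 1 (compatibility), transcribed without the null mode.
COMPATIBLE = {
    "IS":  {"IS", "IX", "S", "SIX"},
    "IX":  {"IS", "IX"},
    "S":   {"IS", "S"},
    "SIX": {"IS"},
    "X":   set(),
}

# Assumed encoding: each mode is a set of rights, and the supremum of two
# modes is the mode whose rights are the union of theirs.
RIGHTS = {
    "IS":  frozenset({"is"}),
    "IX":  frozenset({"is", "ix"}),
    "S":   frozenset({"is", "s"}),
    "SIX": frozenset({"is", "ix", "s"}),
    "X":   frozenset({"is", "ix", "s", "x"}),
}
MODE_OF = {rights: mode for mode, rights in RIGHTS.items()}


def supremum(a, b):
    """Least mode that is at least as strong as both a and b."""
    return MODE_OF[RIGHTS[a] | RIGHTS[b]]


def granted_group(queue):
    """Longest prefix of the FIFO queue whose requests are mutually compatible."""
    group = []
    for mode in queue:
        if all(mode in COMPATIBLE[g] for g in group):
            group.append(mode)
        else:
            break
    return group


def group_mode(group):
    """Supremum of the modes in a non-empty granted group."""
    return reduce(supremum, group)


# The queue of Figure 5, head first.
figure5 = ["IS", "IX", "IS", "IS", "IS", "S", "IS", "X", "IS", "IX"]
group = granted_group(figure5)
print(len(group), group_mode(group))  # 5 IX
```

The scan stops at the S request because S is not compatible with the granted IX request, which matches the description of Figure 5.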
If no one is waiting and the new request is compatible with the granted group mode then the new request can be granted immediately. Otherwise the new request must wait its turn in the queue and in the case of deadlock it may preempt some incompatible requests in the queue. (Alternatively the new request could be canceled. In Figure 5 all the requests decided to wait.)

When a particular request leaves the granted group the group mode of the group may change. If the mode of the first waiting request in the queue is compatible with the new mode of the granted group, then the waiting request is granted. In Figure 5, if the IX request leaves the group, then the group mode becomes IS which is compatible with S and so the S may be granted. The new group mode will be S and since this is compatible with IS mode the IS request following the S request may also join the granted group. This produces the situation depicted in Figure 6:

   GRANTED GROUP: GROUP MODE = S
   |IS|--|IS|--|IS|--|IS|--|S|--|IS|--*--|X|--|IS|--|IX|

   Figure 6. The queue after the IX request is released.

The X request of Figure 6 will not be granted until all the requests leave the granted group since it is not compatible with any mode.

Conversions:

A transaction might re-request the same resource for several reasons. Perhaps it has forgotten that it already has access to the record; after all, if it is setting many locks it may be simpler to just always request access to the record rather than first asking itself "have I seen this record before".
The lock subsystem has all the information to answer this question and it seems wasteful to duplicate it. Alternatively, the transaction may know it has access to the record, but want to increase its access mode (for example from S to X mode if it is in a read, test, and sometimes update scan of a file). So the lock subsystem must be prepared for re-requests by a transaction for a lock. We call such re-requests conversions.

When a request is found to be a conversion, the old (granted) mode of the requestor to the resource and the newly requested mode are compared using Table 3 to compute the new mode, which is the supremum of the old and the requested mode (ref. Figure 2).

Table 3. The new mode given the requested and old mode.

                         NEW MODE
             |         REQUESTED MODE
   OLD MODE  |  IS    IX    S     SIX   X
   ----------+------------------------------
   IS        |  IS    IX    S     SIX   X
   IX        |  IX    IX    SIX   SIX   X
   S         |  S     SIX   S     SIX   X
   SIX       |  SIX   SIX   SIX   SIX   X
   X         |  X     X     X     X     X

So for example, if one has IX mode and requests S mode then the new mode is SIX.

If the new mode is equal to the old mode (note it is never less than the old mode) then the request can be granted immediately and the granted mode is unchanged. If the new mode is compatible with the group mode of the other members of the granted group (a requestor is always compatible with himself) then again the request can be granted immediately. The granted mode is the new mode and the group mode is recomputed using Table 2.
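The two immediate-grant rules for conversions can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name convert, the set-of-rights encoding used for the supremum, and the representation of the other granted requests as a plain list of modes are all assumptions made here.

```python
from functools import reduce

# Table 1 (compatibility), transcribed without the null mode.
COMPATIBLE = {
    "IS":  {"IS", "IX", "S", "SIX"},
    "IX":  {"IS", "IX"},
    "S":   {"IS", "S"},
    "SIX": {"IS"},
    "X":   set(),
}

# Assumed encoding that reproduces Table 3: the supremum of two modes is
# the mode whose set of rights is the union of theirs.
RIGHTS = {
    "IS":  frozenset({"is"}),
    "IX":  frozenset({"is", "ix"}),
    "S":   frozenset({"is", "s"}),
    "SIX": frozenset({"is", "ix", "s"}),
    "X":   frozenset({"is", "ix", "s", "x"}),
}
MODE_OF = {rights: mode for mode, rights in RIGHTS.items()}


def supremum(a, b):
    return MODE_OF[RIGHTS[a] | RIGHTS[b]]


def convert(old, requested, other_granted):
    """Return (new_mode, granted_now) for a re-request.

    old           -- mode the requestor already holds
    requested     -- mode asked for in the re-request
    other_granted -- modes held by the other members of the granted group
    """
    new = supremum(old, requested)
    if new == old:
        return new, True              # nothing stronger is being asked for
    if not other_granted:
        return new, True              # the requestor is alone in the group
    others_mode = reduce(supremum, other_granted)
    return new, new in COMPATIBLE[others_mode]


print(convert("IX", "S", []))         # ('SIX', True)
print(convert("IS", "X", ["IS"]))     # ('X', False)
print(convert("IS", "S", ["IS"]))     # ('S', True)
```

A result of False means the conversion cannot be granted immediately; what happens to it then is a scheduling decision outside this sketch.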
In all other cases, the requested conversion must wait until the group mode of the other granted requests is compatible with the new mode. Note that this immediate granting of conversions over waiting requests is a minor violation of fair scheduling.

If two conversions are waiting, each of which is incompatible with an already granted request of the other transaction, then a deadlock exists and the already granted access of one must be preempted. Otherwise there is a way of scheduling the waiting conversions: namely, grant a conversion when it is compatible with all other granted modes in the granted group. (Since there is no deadlock cycle this is always possible.)

The following example may help to clarify these points. Suppose the queue for a particular resource is:

   GROUP MODE = IS
   |IS|--|IS|--*

   Figure 7. A simple queue.

Now suppose the first transaction wants to convert to X mode. It must wait for the second (already granted) request to leave the queue. If it decides to wait then the situation becomes:

   GROUP MODE = IS
   |IS<-X|--|IS|--*

   Figure 8. A conversion to X mode waits.

No new request may enter the granted group since there is now a conversion request waiting. In general, conversions are scheduled before new requests. If the second transaction now converts to IX, SIX, or S mode it may be granted immediately since this does not conflict with the granted (IS) mode of the first transaction. When the second transaction eventually leaves the queue, the first conversion can be made:

   GROUP MODE = X
   |X|--*

   Figure 9. One transaction leaves and the conversion is granted.

However, if the second transaction tries to convert to exclusive mode one obtains the queue:

   GROUP MODE = IS
   |IS<-X|--|IS<-X|--*

   Figure 10. Two conflicting conversions are waiting.

Since X is incompatible with IS (see Table 1), this situation implies that each transaction is waiting for the other to leave the queue (i.e. deadlock) and so one transaction must be preempted. In all other cases (i.e. when no cycle exists) there is a way to schedule the conversions so that no already granted access is violated.

Deadlock and lock thrashing:

Whenever a transaction waits for a request to be granted, it runs the risk of waiting forever in a deadlock cycle. For the purposes of deadlock detection it is important to know who is waiting for whom. The request queues give this information. Consider any waiting request R by transaction T. There are two cases:

   If R is a conversion, T is WAITING_FOR all transactions which have been granted incompatible requests in the queue.

   If R is not a conversion, T is WAITING_FOR all transactions ahead of it in the queue which have been granted, or are waiting for, incompatible requests.

Given this WAITING_FOR relation computed for all waiting transactions, there is no deadlock if and only if WAITING_FOR is acyclic.

The WAITING_FOR relation may change whenever a request or release occurs and when a conversion is granted.
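The two WAITING_FOR cases and the acyclicity test can be sketched for a single resource queue. The sketch makes several assumptions that are not in the text: the Request tuple (transaction, granted mode or None, wanted mode or None), the function names waiting_for and has_deadlock, and the use of a depth-first search to find a cycle. A full lock manager would take the union of these edges over all resource queues.

```python
from collections import namedtuple

# Table 1 (compatibility), transcribed without the null mode.
COMPATIBLE = {
    "IS":  {"IS", "IX", "S", "SIX"},
    "IX":  {"IS", "IX"},
    "S":   {"IS", "S"},
    "SIX": {"IS"},
    "X":   set(),
}

# granted is None for a waiting new request; wanted is None once nothing is
# pending; both are set for a waiting conversion (wanted is the new mode).
Request = namedtuple("Request", "txn granted wanted")


def conflicts(a, b):
    return a not in COMPATIBLE[b]


def waiting_for(queue):
    """Map each waiting transaction to the set of transactions it waits for."""
    edges = {}
    for i, r in enumerate(queue):
        if r.wanted is None:
            continue                      # fully granted, not waiting
        if r.granted is not None:
            # Conversion: blocked by other granted, incompatible requests.
            blockers = {o.txn for o in queue
                        if o.txn != r.txn and o.granted is not None
                        and conflicts(r.wanted, o.granted)}
        else:
            # New request: blocked by incompatible requests ahead of it,
            # whether those are granted or still waiting.
            blockers = {o.txn for o in queue[:i]
                        if (o.granted is not None and conflicts(r.wanted, o.granted))
                        or (o.wanted is not None and conflicts(r.wanted, o.wanted))}
        edges[r.txn] = blockers
    return edges


def has_deadlock(edges):
    """True if the WAITING_FOR relation contains a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = {}

    def visit(t):
        color[t] = GREY
        for u in edges.get(t, ()):
            c = color.get(u, WHITE)
            if c == GREY or (c == WHITE and visit(u)):
                return True
        color[t] = BLACK
        return False

    return any(color.get(t, WHITE) == WHITE and visit(t) for t in list(edges))


figure8 = [Request("T1", "IS", "X"), Request("T2", "IS", None)]
figure10 = [Request("T1", "IS", "X"), Request("T2", "IS", "X")]
print(has_deadlock(waiting_for(figure8)))    # False
print(has_deadlock(waiting_for(figure10)))   # True
```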
If a transaction may wait for at most one request at a time, then the deadlock state can only change when some process decides to wait. In this special case (synchronous calls to the lock system), only waits require recomputation of the WAITING_FOR relation. If deadlock is improbable, deadlock testing can be done periodically rather than on each wait, further reducing computational overhead.

One new request may form many cycles and each such cycle must be broken. When a cycle is detected, some granted or waiting request must be preempted to break it. The lock scheduler should choose a minimal cost set of victims to preempt so that all cycles are broken, undo all the changes to the data base made by the victims since the preempted resources were granted, and then preempt the resources and signal the victims that they have been backed up.

The issues discussed so far (lock scheduling, detecting and breaking deadlocks) are low level scheduling decisions. They must be connected with a high level transaction scheduler which regulates the load on the system and regulates the entry and progress of transactions to prevent long waits, high probability of waiting (lock thrashing), and deadlock.
By analogy, a page management system with only a low level page frame scheduler, which allocates and preempts page frames in a fairly naive way, is likely to produce page thrashing unless it is coupled with a working set scheduler which regulates the number and character of processes competing for page frames.

II. DEGREES OF CONSISTENCY:

We now focus on how locks can be used to construct transactions out of atomic actions. The data base consists of entities which are related in certain ways. These relationships are best thought of as assertions about the data. Examples of such assertions are:

   'Names is an index for Telephone numbers.'

   'The value of Count of x gives the number of employees in department x.'

The data base is said to be consistent if it satisfies all its assertions [1]. In some cases, the data base must become temporarily inconsistent in order to transform it to a new consistent state. For example, adding a new employee involves several atomic actions and the updating of several fields. The data base may be inconsistent until all these updates have been completed.

To cope with these temporary inconsistencies, sequences of atomic actions are grouped to form transactions. Transactions are the units of consistency. They are larger atomic actions on the data base which transform it from one consistent state to a new consistent state. Transactions preserve consistency. If some action of a transaction fails then the entire transaction is 'undone', thereby returning the data base to a consistent state. Thus transactions are also the units of recovery.
Hardware failure, system error, deadlock, protection violations and program error are each a source of such failure.

If transactions are run one at a time then each transaction will see the consistent state left behind by its predecessor. But if several transactions are scheduled concurrently then locking is required to insure that the inputs to each transaction are consistent.

Responsibility for requesting and releasing locks can either be assumed by the user or be delegated to the system. User controlled locking results in potentially fewer locks due to the user's knowledge of the semantics of the data. On the other hand, user controlled locking requires difficult and potentially unreliable application programming. Hence the approach taken by some data base systems is to use automatic lock protocols which insure protection from general types of inconsistency, while still relying on the user to protect himself against other sources of inconsistencies. For example, a system may automatically lock updated records but not records which are read. Such a system prevents lost updates arising from transaction backup. Still, the user should explicitly lock records in a read-update sequence to insure that the read value does not change before the actual update. In other words, a user is guaranteed a limited automatic degree of consistency.
This degree of consistency may be system wide or the system may provide options to select it (for instance a lock protocol may be associated with a transaction or with an entity).

We now present several equivalent definitions of four consistency degrees. The first definition is an operational and intuitive one, useful in describing the system behavior to users. The second definition is a procedural one in terms of lock protocols; it is useful in explaining the system implementation. The third definition is in terms of a trace of the system actions; it is useful in formally stating and proving properties of the various consistency degrees.

Informal definition of consistency:

An output (write) of a transaction is committed when the transaction abdicates the right to 'undo' the write, thereby making the new value available to all other transactions. Outputs are said to be uncommitted or dirty if they are not yet committed by the writer. Concurrent execution raises the problem that reading or writing other transactions' dirty data may yield inconsistent data.

Using this notion of dirty data, the degrees of consistency may be defined as:

Definition 1:

Degree 3: Transaction T sees degree 3 consistency if:
   (a) T does not overwrite dirty data of other transactions.
   (b) T does not commit any writes until it completes all its writes (i.e. until the end of transaction (EOT)).
   (c) T does not read dirty data from other transactions.
   (d) Other transactions do not dirty any data read by T before T completes.

Degree 2: Transaction T sees degree 2 consistency if:
   (a) T does not overwrite dirty data of other transactions.
   (b) T does not commit any writes before EOT.
   (c) T does not read dirty data of other transactions.

Degree 1: Transaction T sees degree 1 consistency if:
   (a) T does not overwrite dirty data of other transactions.
   (b) T does not commit any writes before EOT.

Degree 0: Transaction T sees degree 0 consistency if:
   (a) T does not overwrite dirty data of other transactions.

Note that if a transaction sees a high degree of consistency then it also sees all the lower degrees.

Degree 0 consistent transactions commit writes before the end of transaction. Hence backing up a degree 0 consistent transaction may require undoing an update to an entity locked by another transaction. In this sense, degree 0 transactions are unrecoverable.

Degree 1 transactions do not commit writes until the end of the transaction. Hence one may undo (back up) an in-progress degree 1 transaction without setting additional locks. This means that transaction backup does not erase other transactions' updates. This is the principal reason one data management system automatically provides degree 1 consistency to all transactions.

Degree 2 consistency isolates a transaction from the uncommitted data of other transactions.
With degree 1 consistency a transaction might read uncommitted values which are subsequently updated or are undone. In degree 2 no dirty data values are read.

Degree 3 consistency isolates the transaction from dirty relationships among values. Reads are repeatable. By contrast, a degree 2 consistent transaction may read two different (committed) values if it reads the same entity twice. This is because a transaction which updates the entity could begin, update and end in the interval of time between the two reads. More elaborate kinds of anomalies due to concurrency are possible if one updates an entity after reading it or if more than one entity is involved (see example below). Degree 3 consistency completely isolates the transaction from inconsistencies due to concurrency [1].

Each transaction can elect the degree of consistency appropriate to its function. When the third definition is given we will be able to state the consistency and recovery properties of such a system more formally. Briefly:

   If one elects degree i consistency then one sees a degree i consistent state (so long as all other transactions run at least degree 0 consistent).

   If all transactions run at least degree 1 consistent, system backup (undoing all in-progress transactions) loses no updates of completed transactions.

   If all transactions run at least degree 2 consistent, transaction backup (undoing any in-progress transaction) produces a consistent state.
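The difference between degree 2 and degree 3 for repeated reads can be made concrete with a toy, single-threaded simulation. Everything in it is an assumption made for illustration: the function read_twice, the single entity 'e', and the simplification that a blocked writer gives up instead of waiting. A short share lock on reads stands in for degree 2 behavior and a share lock held to the end stands in for degree 3.

```python
def read_twice(long_read_locks):
    """Read entity 'e' twice while a writer tries to update it in between.

    Returns (first_value, second_value, writer_succeeded).
    """
    db = {"e": 1}
    share_holders = set()          # transactions holding a share lock on 'e'

    def read(txn):
        share_holders.add(txn)             # share lock taken for the read
        value = db["e"]
        if not long_read_locks:
            share_holders.discard(txn)     # short lock: dropped after the action
        return value

    def try_write(txn, value):
        if share_holders - {txn}:          # exclusive conflicts with others' share locks
            return False                   # the writer would have to wait
        db["e"] = value                    # writer begins, updates and ends
        return True

    first = read("reader")
    wrote = try_write("writer", 2)
    second = read("reader")
    return first, second, wrote


print(read_twice(long_read_locks=False))   # (1, 2, True)
print(read_twice(long_read_locks=True))    # (1, 1, False)
```

With short read locks the reader sees two different committed values; with long read locks the writer cannot slip in between the two reads, so the read is repeatable.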
To give an example which demonstrates the application of these several degrees of consistency, imagine a process control system in which some transaction is dedicated to reading a gauge and periodically writing batches of values into a list. Each gauge reading is an individual entity. For performance reasons, this transaction sees degree 0 consistency, committing all gauge readings as soon as they enter the data base. This transaction is not recoverable (can't be undone). A second transaction is run periodically which reads all the recent gauge readings, computes a mean and variance and writes these computed values as entities in the data base. Since we want these two values to be consistent with one another, they must be committed together (i.e. one cannot commit the first before the second is written). This allows transaction undo in the case that it aborts after writing only one of the two values. Hence this statistical summary transaction should see degree 1. A third transaction which reads the mean and writes it on a display sees degree 2 consistency. It will not read a mean which might be 'undone' by a backup. Another transaction which reads both the mean and the variance must see degree 3 consistency to insure that the mean and variance derive from the same computation (i.e. the same run which wrote the mean also wrote the variance).

Lock protocol definition of consistency:

Whether an instantiation of a transaction sees degree 0, 1, 2 or 3 consistency depends on the actions of other concurrent transactions.
Lock protocols are used by a transaction to guarantee itself a certain degree of consistency independent of the behavior of other transactions (so long as all transactions at least observe the degree 0 protocol).

The degrees of consistency can be procedurally defined by the lock protocols which produce them. A transaction locks its inputs to guarantee their consistency and locks its outputs to mark them as dirty (uncommitted).

For this section, locks are dichotomized as share mode locks, which allow multiple readers of the same entity, and exclusive mode locks, which reserve exclusive access to an entity. (This is the "two mode" lock protocol. Its generalization to the "six mode" protocol of the previous section should be obvious.) Locks may also be characterized by their duration: locks held for the duration of a single action are called short duration locks while locks held to the end of the transaction are called long duration locks. Short duration locks are used to mark or test for dirty data for the duration of an action rather than for the duration of the transaction.

The lock protocols are:

Definition 2:

Degree 3: Transaction T observes degree 3 lock protocol if:
   (a) T sets a long exclusive lock on any data it dirties.
   (b) T sets a long share lock on any data it reads.

Degree 2: Transaction T observes degree 2 lock protocol if:
   (a) T sets a long exclusive lock on any data it dirties.
   (b) T sets a (possibly short) share lock on any data it reads.

Degree 1: Transaction T observes degree 1 lock protocol if:
   (a) T sets a long exclusive lock on any data it dirties.

Degree 0: Transaction T observes degree 0 lock protocol if:
   (a) T sets a (possibly short) exclusive lock on any data it dirties.

The lock protocol definitions can be stated more tersely with the introduction of the following notation. A transaction is well formed with respect to writes (reads) if it always locks an entity in exclusive (shared or exclusive) mode before writing (reading) it. The transaction is well formed if it is well formed with respect to reads and writes.

A transaction is two phase (with respect to reads or updates) if it does not (share or exclusive) lock an entity after unlocking some entity. A two phase transaction has a growing phase during which it acquires locks and a shrinking phase during which it releases locks.

Definition 2 is too restrictive in the sense that consistency will not require that a transaction hold all locks to the EOT (i.e. the EOT is the shrinking phase). Rather, the constraint that the transaction be two phase is adequate to insure consistency. On the other hand, once a transaction unlocks an updated entity, it has committed that entity and so cannot be undone without cascading backup to any transactions which may have subsequently read the entity. For that reason, the shrinking phase is usually deferred to the end of the transaction; thus, the transaction is always recoverable and all updates are committed together.
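The two properties just defined can be checked mechanically on a list of actions. The sketch below is an illustration only: the action names ("slock", "xlock", "unlock", "read", "write"), the functions well_formed and two_phase, and the wrt_reads flag are hypothetical. In particular, reading "two phase with respect to writes" as "only exclusive locks and their unlocks count" is an interpretation made here, not a statement from the text.

```python
def well_formed(actions, wrt_reads=True):
    """True if every write is covered by an exclusive lock and, when
    wrt_reads is set, every read by a share or exclusive lock."""
    held = {}                                  # entity -> "S" or "X"
    for op, entity in actions:
        if op == "slock":
            held.setdefault(entity, "S")       # keep an existing X lock
        elif op == "xlock":
            held[entity] = "X"
        elif op == "unlock":
            held.pop(entity, None)
        elif op == "write" and held.get(entity) != "X":
            return False
        elif op == "read" and wrt_reads and entity not in held:
            return False
    return True


def two_phase(actions, wrt_reads=True):
    """True if no lock is acquired after a lock has been released.

    With wrt_reads=False only exclusive locks are considered (assumed
    meaning of "two phase with respect to writes")."""
    held = {}
    shrinking = False
    for op, entity in actions:
        if op in ("slock", "xlock"):
            mode = "S" if op == "slock" else "X"
            if shrinking and (wrt_reads or mode == "X"):
                return False
            held[entity] = mode
        elif op == "unlock":
            mode = held.pop(entity, None)
            if mode == "X" or (mode == "S" and wrt_reads):
                shrinking = True
    return True


# Short share lock on a, then an exclusive lock on b.
t = [("slock", "a"), ("read", "a"), ("unlock", "a"),
     ("xlock", "b"), ("write", "b"), ("unlock", "b")]
print(well_formed(t))                    # True
print(two_phase(t))                      # False
print(two_phase(t, wrt_reads=False))     # True
```

The example transaction releases its share lock early, so it is well formed but two phase only when share locks are ignored.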
The lock protocols can be redefined as:

Definition 2':

Degree 3: T is well formed and T is two phase.

Degree 2: T is well formed and T is two phase with respect to writes.

Degree 1: T is well formed with respect to writes and T is two phase with respect to writes.

Degree 0: T is well formed with respect to writes.

All transactions are required to observe the degree 0 locking protocol so that they do not update the uncommitted updates of others. Degrees 1, 2 and 3 provide increasing system-guaranteed consistency.

Consistency of schedules:

The definition of what it means for a transaction to see a degree of consistency was given in terms of dirty data. In order to make the notion of dirty data explicit it is necessary to consider the execution of a transaction in the context of a set of concurrently executing transactions. To do this we introduce the notion of a schedule for a set of transactions. A schedule can be thought of as a history or audit trail of the actions performed by the set of transactions. Given a schedule the notion of a particular entity being dirtied by a particular transaction is made explicit and hence the notion of seeing a certain degree of consistency is formalized. These notions may then be used to connect the various definitions of consistency and show their equivalence.

The system directly supports entities and actions. Actions are categorized as begin actions, end actions, share lock actions, exclusive lock actions, unlock actions, read actions, and write actions. An end action is presumed to unlock any locks held by the transaction but not explicitly unlocked by the transaction. For the purposes of the following definitions, share lock actions and their corresponding unlock actions are additionally considered to be read actions and exclusive lock actions and their corresponding unlock actions are additionally considered to be write actions.

A transaction is any sequence of actions beginning with a begin action and ending with an end action and not containing other begin or end actions.

Any (sequence preserving) merging of the actions of a set of transactions into a single sequence is called a schedule for the set of transactions.

A schedule is a history of the order in which actions were executed (it does not record actions which were undone due to backup). The simplest schedules run all actions of one transaction and then all actions of another transaction, and so on. Such one-transaction-at-a-time schedules are called serial because they have no concurrency among transactions. Clearly, a serial schedule has no concurrency induced inconsistency and no transaction sees dirty data.

Locking constrains the set of allowed schedules. In particular, a schedule is legal only if it does not schedule a lock action on an entity for one transaction when that entity is already locked by some other transaction in a conflicting mode.

An initial state and a schedule completely define the system's behavior. At each step of the schedule one can deduce which entity values have been committed and which are dirty: if locking is used, updated data is dirty until it is unlocked.

Since a schedule makes the definition of dirty data explicit, one can apply Definition 1 to define consistent schedules:

Definition 3: A transaction T runs at degree 0 (1, 2 or 3) consistency in schedule S if T sees degree 0 (1, 2 or 3) consistency in S. (Conversely, transaction T sees degree i consistency if all legal schedules run T at degree i consistency.)

If all transactions run at degree 0 (1, 2 or 3) consistency in schedule S then S is said to be a degree 0 (1, 2 or 3) consistent schedule.

Given these definitions one can show:

Assertion 1:

(a) If each transaction observes the degree 0 (1, 2 or 3) lock protocol (Definition 2) then any legal schedule is degree 0 (1, 2 or 3) consistent (Definition 3) (i.e., each transaction sees degree 0 (1, 2 or 3) consistency in the sense of Definition 1).

(b) Unless transaction T observes the degree 1 (2 or 3) lock protocol then it is possible to define another transaction T' which does observe the degree 1 (2 or 3) lock protocol such that T and T' have a legal schedule S but T does not run at degree 1 (2 or 3) consistency in S.

In [1] we proved Assertion 1 for degree 3 consistency. That argument generalizes directly to this result.

Assertion 1 says that if a transaction observes the lock protocol definition of consistency (Definition 2) then it is assured of the informal definition of consistency based on committed and dirty data (Definition 1). Unless a transaction actually sets the locks prescribed by degree 1 (2 or 3) consistency one can construct transaction mixes and schedules which will cause the transaction to run at (see) a lower degree of consistency. However, in particular cases such transaction mixes may never occur due to the structure or use of the system. In these cases an apparently low degree of locking may actually provide degree 3 consistency. For example, a data base reorganization usually need do no locking since it is run as an off-line utility which is never run concurrently with other transactions.

Assertion 2: If each transaction in a set of transactions at least observes the degree 0 lock protocol and if transaction T observes the degree 1 (2 or 3) lock protocol then T runs at degree 1 (2 or 3) consistency (Definitions 1, 3) in any legal schedule for the set of transactions.

Assertion 2 says that each transaction can choose its degree of consistency so long as all transactions observe at least degree 0 protocols. Of course the outputs of degree 0, 1 or 2 consistent transactions may be degree 0, 1 or 2 consistent (i.e. inconsistent) because they were computed with potentially inconsistent inputs.
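The lock-protocol form of the four degrees can be checked mechanically for a single transaction. The following Python sketch is only an illustration under stated assumptions: the action names (`slock`, `xlock`, `unlock`, `read`, `write`) and the helpers `well_formed`, `two_phase` and `protocol_degree` are hypothetical, begin and end actions are omitted, and "well formed" and "two phase" are read as "locks an entity before reading or writing it" and "acquires no lock after releasing one".

```python
from typing import Dict, List, Optional, Tuple

Action = Tuple[str, str]  # (kind, entity); kinds are illustrative names only


def well_formed(actions: List[Action], writes_only: bool = False) -> bool:
    """True if every write holds an exclusive lock and, unless
    writes_only is set, every read holds a share or exclusive lock."""
    held: Dict[str, str] = {}
    for kind, entity in actions:
        if kind == "slock":
            held.setdefault(entity, "S")
        elif kind == "xlock":
            held[entity] = "X"
        elif kind == "unlock":
            held.pop(entity, None)
        elif kind == "write":
            if held.get(entity) != "X":
                return False
        elif kind == "read":
            if not writes_only and entity not in held:
                return False
    return True


def two_phase(actions: List[Action], writes_only: bool = False) -> bool:
    """True if no lock is requested after an unlock. With writes_only,
    only exclusive locks and their unlocks are considered."""
    held: Dict[str, str] = {}
    shrinking = False
    for kind, entity in actions:
        if kind in ("slock", "xlock"):
            counts = kind == "xlock" or not writes_only
            if shrinking and counts:
                return False
            if kind == "xlock":
                held[entity] = "X"
            else:
                held.setdefault(entity, "S")
        elif kind == "unlock":
            mode = held.pop(entity, None)
            if mode == "X" or (mode == "S" and not writes_only):
                shrinking = True
    return True


def protocol_degree(actions: List[Action]) -> Optional[int]:
    """Highest degree whose lock protocol the action sequence observes."""
    if well_formed(actions) and two_phase(actions):
        return 3
    if well_formed(actions) and two_phase(actions, writes_only=True):
        return 2
    wf_w = well_formed(actions, writes_only=True)
    if wf_w and two_phase(actions, writes_only=True):
        return 1
    return 0 if wf_w else None


short_read = [("slock", "A"), ("read", "A"), ("unlock", "A"),
              ("xlock", "B"), ("write", "B"), ("unlock", "B")]
long_locks = [("xlock", "A"), ("write", "A"), ("xlock", "B"),
              ("write", "B"), ("unlock", "A"), ("unlock", "B")]
print(protocol_degree(short_read), protocol_degree(long_locks))
```

A transaction that releases its share lock before taking a later exclusive lock is classified as degree 2 here, while one that holds every lock until all locks have been acquired is classified as degree 3.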
One can imagine that each data entity is tagged with the degree of consistency of its writer: degree 0 entities are purple, degree 1 entities are red, degree 2 entities are yellow and degree 3 entities are green. The color of the outputs of a transaction is the minimum of the transaction's color and the colors of the entities it reads (because they are potentially inconsistent). Gradually the system will turn purple or red unless everyone runs with a high degree of consistency. If the transaction's author knows something about the system's structure which allows an apparently degree 1 consistent protocol to produce degree 3 consistent results then this color coding is pessimistic. But, in general a transaction must beware of reading entities tagged with degrees lower than the degree of the transaction.

Dependencies among transactions:

One transaction is said to depend on another if the first takes some of its inputs from the second. The notion of dependency is defined differently for each degree of consistency. These dependency relations are completely defined by a schedule and can be useful in discussing consistency and recovery.

Each schedule defines three relations: <, << and <<< on the set of transactions as follows. Suppose that transaction T performs action a on entity e at some step in the schedule and that transaction T' performs action a' on entity e at a later step in the schedule. Further suppose that T does not equal T'. Then:

    T <<< T'  if     a is a write action and a' is a write action
              or     a is a write action and a' is a read action
              or     a is a read action and a' is a write action

    T << T'   if     a is a write action and a' is a write action
              or     a is a write action and a' is a read action

    T < T'    if     a is a write action and a' is a write action

So degree 1 does not care about read dependencies at all. Degree 2 cares only about one kind of read dependency. And degree 3 ignores only read-read dependencies (reads commute). The following table is a notationally convenient way of seeing these definitions:

    <<<  :  W->W  |  W->R  |  R->W
    <<   :  W->W  |  W->R
    <    :  W->W

meaning that (for example) T <<< T' if T writes (W) something later read (R) by T' or written (W) by T' or T reads (R) something later written (W) by T'.

Let <* be the transitive closure of <, and then define:

    BEFORE1(T) = {T' | T' <* T}
    AFTER1(T)  = {T' | T <* T'}.

The sets BEFORE2, AFTER2, BEFORE3 and AFTER3 are defined analogously from << and <<<.

The obvious interpretation for this is that each BEFORE set is the set of transactions which contribute inputs to T and each AFTER set is the set of transactions which take their inputs from T (where the ordering only considers dependencies induced by the corresponding consistency degree).

If some transaction is both before T and after T in some schedule then no serial schedule could give such results. In this case concurrency has introduced inconsistency.
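The three relations and the BEFORE and AFTER sets can be computed directly from a schedule. The sketch below is an illustration, not the authors' implementation: the function names, the tuple encoding of a schedule, and the small two-transaction schedule are assumptions, and lock and unlock actions (which the definitions also count as reads or writes) are left out for brevity.

```python
from itertools import product
from typing import Dict, List, Set, Tuple

Step = Tuple[str, str, str]   # (transaction, "read" or "write", entity)
Pair = Tuple[str, str]        # (T, T') meaning T precedes T'

# Action pairs (a, a') that create a dependency at each degree.
ALLOWED = {
    1: {("write", "write")},
    2: {("write", "write"), ("write", "read")},
    3: {("write", "write"), ("write", "read"), ("read", "write")},
}


def relations(schedule: List[Step]) -> Dict[int, Set[Pair]]:
    """Return the relations <, << and <<< keyed by degree 1, 2 and 3."""
    rel: Dict[int, Set[Pair]] = {1: set(), 2: set(), 3: set()}
    for i, (t, a, e) in enumerate(schedule):
        for t2, a2, e2 in schedule[i + 1:]:
            if t != t2 and e == e2:
                for degree, pairs in ALLOWED.items():
                    if (a, a2) in pairs:
                        rel[degree].add((t, t2))
    return rel


def closure(pairs: Set[Pair]) -> Set[Pair]:
    """Transitive closure of a relation given as a set of pairs."""
    closed = set(pairs)
    changed = True
    while changed:
        changed = False
        for (a, b), (c, d) in product(list(closed), repeat=2):
            if b == c and (a, d) not in closed:
                closed.add((a, d))
                changed = True
    return closed


def before(pairs: Set[Pair], t: str) -> Set[str]:
    return {a for a, b in closure(pairs) if b == t}


def after(pairs: Set[Pair], t: str) -> Set[str]:
    return {b for a, b in closure(pairs) if a == t}


# Hypothetical schedule: T1 reads A, T2 then writes A and B, T1 then writes B.
schedule = [("T1", "read", "A"), ("T2", "write", "A"),
            ("T2", "write", "B"), ("T1", "write", "B")]
rel = relations(schedule)
for degree in (1, 2, 3):
    overlap = before(rel[degree], "T1") & after(rel[degree], "T1")
    print(degree, sorted(rel[degree]), "both before and after T1:", sorted(overlap))
```

In this schedule only the degree 3 relation contains a cycle, so T1's BEFORE3 and AFTER3 sets overlap while its degree 1 and degree 2 sets stay disjoint.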
On the other hand, if all relevant transactions are either before or after T (but not both) then T will see a consistent state (of the corresponding degree). If all transactions dichotomize others in this way then the relation <* (<<* or <<<*) will be a partial order and the whole schedule will give degree 1 (2 or 3) consistency. This can be strengthened to:

Assertion 3: A schedule is degree 1 (2 or 3) consistent if and only if the relation <* (<<* or <<<*) is a partial order.

The <, << and <<< relations are variants of the dependency sets introduced in [1]. In that paper only degree 3 consistency is introduced and Assertion 3 was proved for that case. In particular such a schedule is equivalent to the serial schedule obtained by running the transactions one at a time in <<< order. The proofs of [1] generalize fairly easily to handle Assertion 1 in the case of degree 1 or 2 consistency.

Consider the following example:

    T1   LOCK A
    T1   READ A
    T1   UNLOCK A
    T2   LOCK A
    T2   WRITE A
    T2   LOCK B
    T2   WRITE B
    T2   UNLOCK A
    T2   UNLOCK B
    T1   LOCK B
    T1   WRITE B
    T1   UNLOCK B

In this schedule T2 gives B to T1 and T2 updates A after T1 reads A so T2 < T1, T2 << T1, T2 <<< T1 and T1 <<< T2. The schedule is degree 2 consistent but not degree 3 consistent. It runs T1 at degree 2 consistency and T2 at degree 3 consistency.

It would be nice to define a transaction to see degree 1 (2 or 3) consistency if and only if the BEFORE and AFTER sets are disjoint in some schedule.
However, this is not restrictive enough; rather one must require that the before and after sets be disjoint in all schedules in order to state Definition 1 in terms of dependencies. Further, there seems to be no natural way to define the dependencies of degree 0 consistency. Hence the principal application of the dependency definition is as a proof technique and for discussing schedules and recovery issues.

Relationship to transaction backup and system recovery:

A transaction T is said to be recoverable if it can be undone before 'EOT' without undoing other transactions' updates. A transaction T is said to be repeatable if it will reproduce the original output if rerun following recovery, assuming that no locks were released in the backup process. Recoverability requires system wide degree 1 consistency; repeatability requires that all other transactions be at least degree 1 and that the repeatable transaction be degree 3.

The normal (i.e. trouble free) operation of a data base system can be described in terms of an initial consistent state S0 and a schedule of transactions mapping the data base into a final consistent state S3 (see Figure 11). S1 is a checkpoint state; since transactions are in progress, S1 may be inconsistent. A system crash leaves the data base in state S2. Since transactions T3 and T5 were in progress at the time of crash, S2 is potentially inconsistent.
System recovery amounts to bringing the data base to a new consistent state in one of the following ways:

(a) Starting from state S2, undo all actions of transactions in progress at the time of the crash.

(b) Starting from state S1, first undo all actions of transactions in progress at the time of the crash (i.e. actions of T3 and T4 before S1) and then redo all actions of transactions which completed before the crash (i.e. actions of T2 and T3 after S1).

(c) Starting at S0, redo all transactions which completed before the crash.

Observe that (a) and (c) are degenerate cases of (b).

[Figure 11: timeline of transactions T1 through T5 drawn against the states S0, S1, S2 and S3.]

Figure 11. System states: S0 is the initial state, S1 is the checkpoint state, S2 is a crash and S3 is the state that results in the absence of a crash.

Unless all transactions run at least degree 1 consistency, system recovery may lose updates. If, for example, T3 writes a record, r, and then T4 further updates r then undoing T3 will cause the update of T4 to r to be lost. This situation can only arise if some transaction does not hold its write locks to EOT.

(a) If all the transactions run in at least degree 1 consistency then system recovery loses no updates of complete transactions. However there may be no schedule which would give the same result because transactions may have read outputs of undone transactions.

(b) If all the transactions run in at least degree 2 then the recovered state is consistent and derives from the schedule obtained from the original system schedule by deleting incomplete transactions. Note that degree 2 prevents read dependencies on transactions which might be undone by system recovery, ... of all the completed transactions results in a meaningful schedule.

(c) If a transaction is degree 3 consistent then it is reproducible.

Transaction crash gives rise to transaction backup which has properties analogous to system recovery.

Cost of degrees of consistency:

The only advantage of lower degrees of consistency is performance. If less is locked then less computation and storage is consumed. Further if less is locked, concurrency is increased since fewer conflicts appear. (Note that the granularity lock scheme of the first section was motivated by minimizing the number of explicit locks set.)

We will make some very crude estimates of the storage and computation resources consumed by the locking protocols as a function of the consistency degree. For the remainder of this section assume that all transactions are identical. Also assume that they do R reads and W writes (and hence set approximately R share mode locks and W exclusive mode locks). Further we assume that all the transactions run at the same consistency degree.

Each outstanding lock request consumes a queue element. The maximum per-transaction space for these queue elements as a function of consistency degrees is:

Table 4. Consistency degrees vs storage consumption.

    CONSISTENCY DEGREE     STORAGE (in queue elements)
            0                       1
            1                       W
            2                       W+1
            3                       W+R

Observe that degrees 1 and 2 consume roughly the same amount of storage but that degree 3 consumes substantially more storage.
This observation is aggravated by the fact that reads are typically ten times more common than writes.

The estimation of computation (CPU) overhead is much more subtle. We make only a crude estimate here. First one may consider the overhead in requesting and releasing locks. This is shown in Table 5 as a function of consistency degrees.

Table 5. Computational overhead vs degrees of consistency.

    CONSISTENCY DEGREE     CPU (in calls to lock sys)
            0                       W
            1                       W
            2                       W+R
            3                       W+R

Table 5 indicates that the computational overhead of degrees 2 and 3 are comparable and are greater than the overhead of degrees 0 or 1. These pairs of degrees set the same locks; they just hold them for different durations.

Table 5 ignores the observation that some lock requests are trivially satisfied (the request is granted immediately) while others require a task switch and hence are quite expensive. The probability that a read lock will have to wait is proportional to the number of conflicting (write) locks currently granted. The probability that a write lock will have to wait is proportional to the number of conflicting (read or write) locks that are currently granted. Table 4 gives a guess of the maximum number of locks of each type held by each transaction. If there are 2*N+1 transactions one can multiply the entries of Table 4 by N to get an average number of locks held by all others.
If a wait lock request is C+1 times as expensive as an immediately granted request and if P is the probability that two different requests are for the same resource then the relative computational costs are roughly computed:

    degree 0 overhead:   W                  cost of setting locks
                         P*C*N*W            cost of waits

    degree 1 overhead:   W                  cost of writes
                         P*C*N*W*W          cost of waits

    degree 2 overhead:   W+R                cost of setting locks
                         P*C*N*W*(W+1)      cost of write waits
                         P*C*N*R*W          cost of read waits

    degree 3 overhead:   W+R                cost of setting locks
                         P*C*N*W*(W+R)      cost of waiting for writes
                         P*C*N*R*W          cost of waiting for reads

Table 6. Computational overhead vs degrees of consistency.

    CONSISTENCY DEGREE     CPU (in calls to lock sys)
            0                  W + P*C*N*W
            1                  W + P*C*N*W*W
            2                  W+R + P*C*N*W*(W+R+1)
            3                  W+R + P*C*N*W*(W+2*R)

To consider a specific example, a simple banking transaction does five reads (R=5) and six writes (W=6). A transaction accesses a random account and there are millions of accounts so the probability of collision, P, is roughly .000001. Suppose there are one hundred transactions per second. A lock takes one hundred instructions and a wait requires five thousand instructions; hence, C=50. So the term P*C*N*W evaluates to 0.015. This implies that Table 5 gave a good estimate of the CPU overhead because the last term in Table 6 is minuscule compared to the term W+R. Of course this analysis is very sensitive to P and one must design the data base so that P takes on a very small value.
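The arithmetic of the banking example can be reproduced with a few lines of Python. This is a sketch under stated assumptions: the function names are hypothetical, the wait terms follow the per-degree cost breakdown as read here (with the degree 3 write-wait term taken as P*C*N*W*(W+R), the value that makes the breakdown sum to the Table 6 entry W+R+P*C*N*W*(W+2*R)), and N = 50 is inferred because it is the value that yields the quoted 0.015.

```python
import math


def lock_calls(degree: int, R: int, W: int) -> int:
    """Calls to the lock subsystem: write locks only for degrees 0 and 1,
    read and write locks for degrees 2 and 3."""
    return W if degree in (0, 1) else W + R


def wait_factor(degree: int, R: int, W: int) -> int:
    """Coefficient of P*C*N in the estimated cost of waits."""
    if degree == 0:
        return W
    if degree == 1:
        return W * W
    if degree == 2:
        return W * (W + 1) + R * W
    if degree == 3:
        return W * (W + R) + R * W
    raise ValueError("degree must be 0, 1, 2 or 3")


def overhead(degree: int, R: int, W: int, P: float, C: float, N: int) -> float:
    """Estimated CPU overhead in units of one lock call."""
    return lock_calls(degree, R, W) + P * C * N * wait_factor(degree, R, W)


# Banking example; N = 50 is an inferred value, not stated in the text.
R, W, P, C, N = 5, 6, 0.000001, 50, 50
print("P*C*N*W =", P * C * N * W)
for degree in range(4):
    print("degree", degree, "overhead:", overhead(degree, R, W, P, C, N))
```

With these inputs the wait terms stay well below one lock call at every degree, so the totals are dominated by W for degrees 0 and 1 and by W+R for degrees 2 and 3.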
The striking thing about these estimates is that degree 2 and degree 3 seem to have similar computational overhead which seems to be substantially larger than the overhead of degree 0 or 1 consistency. We suspect that this conclusion would survive a more careful study of the problem.

ISSUE: COMMITTED DATA
    DEGREE 0:  WRITES ARE COMMITTED IMMEDIATELY
    DEGREE 1:  WRITES ARE COMMITTED AT EOT
    DEGREE 2:  SAME AS 1
    DEGREE 3:  SAME AS 1

ISSUE: DIRTY DATA
    DEGREE 0:  YOU DON'T UPDATE DIRTY DATA
    DEGREE 1:  0 AND NO ONE ELSE UPDATES YOUR DIRTY DATA
    DEGREE 2:  0,1 AND YOU DON'T READ DIRTY DATA
    DEGREE 3:  0,1,2 AND NO ONE ELSE DIRTIES DATA YOU READ

ISSUE: LOCK PROTOCOL
    DEGREE 0:  SET SHORT EXCL. LOCKS ON ANY DATA YOU WRITE
    DEGREE 1:  SET LONG EXCL. LOCKS ON ANY DATA YOU WRITE
    DEGREE 2:  1 AND SET SHORT SHARE LOCKS ON ANY DATA YOU READ
    DEGREE 3:  1 AND SET LONG SHARE LOCKS ON ANY DATA YOU READ

ISSUE: TRANSACTION STRUCTURE
    DEGREE 0:  WELL FORMED WRT WRITES
    DEGREE 1:  (WELL FORMED AND 2 PHASE) WRT WRITES
    DEGREE 2:  WELL FORMED (AND 2 PHASE WRT WRITES)
    DEGREE 3:  WELL FORMED AND TWO PHASE

ISSUE: CONCURRENCY
    DEGREE 0:  GREATEST: ONLY WAIT FOR SHORT WRITE LOCKS
    DEGREE 1:  GREAT: ONLY WAIT FOR WRITE LOCKS
    DEGREE 2:  MEDIUM: ALSO WAIT FOR READ LOCKS
    DEGREE 3:  LOWEST: ANY DATA TOUCHED IS LOCKED TO EOT

ISSUE: OVERHEAD
    DEGREE 0:  LEAST: ONLY SET SHORT WRITE LOCKS
    DEGREE 1:  SMALL: ONLY SET WRITE LOCKS
    DEGREE 2:  MEDIUM: SET BOTH KINDS OF LOCKS BUT NEED NOT STORE SHORT LOCKS
    DEGREE 3:  HIGHEST: SET AND STORE BOTH KINDS OF LOCKS

ISSUE: TRANSACTION BACKUP
    DEGREE 0:  CAN NOT UNDO WITHOUT CASCADING TO OTHERS
    DEGREE 1:  UN-DO ANY INCOMPLETE TRANSACTIONS IN ANY ORDER
    DEGREE 2:  UN-DO ALL INCOMPLETE TRANSACTIONS IN ANY ORDER
    DEGREE 3:  SAME AS 2

ISSUE: PROTECTION PROVIDED
    DEGREE 0:  LETS OTHERS RUN HIGHER CONSISTENCY
    DEGREE 1:  0 AND CAN'T LOSE WRITES
    DEGREE 2:  0,1 AND CAN'T READ BAD DATA ITEMS
    DEGREE 3:  0,1,2 AND CAN'T READ BAD DATA RELATIONSHIPS

ISSUE: SYSTEM RECOVERY TECHNIQUE
    DEGREE 0:  APPLY LOG IN ORDER OF ARRIVAL
    DEGREE 1:  APPLY LOG IN < ORDER
    DEGREE 2:  SAME AS 1: BUT RESULT IS SAME AS SOME SCHEDULE
    DEGREE 3:  2 AND SCHEDULE IS SERIAL

ISSUE: DEPENDENCIES
    DEGREE 0:  NONE
    DEGREE 1:  W->W
    DEGREE 2:  W->W, W->R
    DEGREE 3:  W->W, W->R, R->W

ISSUE: ORDERING
    DEGREE 0:  NONE
    DEGREE 1:  < IS AN ORDERING OF THE TRANSACTIONS
    DEGREE 2:  << IS AN ORDERING OF THE TRANSACTIONS
    DEGREE 3:  <<< IS AN ORDERING OF THE TRANSACTIONS

Table 7. Summary of consistency degrees.

III. LOCK GRANULARITY AND DEGREES OF CONSISTENCY IN EXISTING SYSTEMS:

IMS/VS with the program isolation feature [2] has a two level lock hierarchy: segment types (sets of records), and segment instances (records) within a segment type. Segment types may be locked in EXCLUSIVE (E) mode (which corresponds to our exclusive (X) mode) or in EXPRESS READ (R), RETRIEVE (G), or UPDATE (U) (each of which corresponds to our notion of intention (I) mode) [2, pages 3.18-3.27]. Segment instances can be locked in share or exclusive mode. Segment type locks are requested at transaction initiation, usually in intention mode. Segment instance locks are dynamically set as the transaction proceeds. In addition IMS/VS has user controlled share locks on segment instances (the *Q option) which allow other read requests but not other *Q or exclusive requests.
IMS/VS has no notion of S or SIX locks on segment types (which would allow a scan of all members of a segment type concurrent with other readers but without the overhead of locking each segment instance). Since IMS/VS does not support S mode on segment types one need not distinguish the two intention modes IS and IX (see the section introducing IS and IX modes). In general, IMS/VS has a notion of intention mode and does implicit locking but does not recognize all the modes described here. It uses a static two level lock tree.

IMS/VS with the program isolation feature basically provides degree 2 consistency. However degree 1 consistency can be obtained on a segment type basis in a PCB (view) by specifying the EXPRESS READ option for that segment. Similarly degree 3 consistency can be obtained by specifying the EXCLUSIVE or UPDATE options. IMS/VS also has the user controlled share locks discussed above which a program can request on selected segment instances to obtain additional consistency over the degree 1 or 2 consistency provided by the system.

IMS/VS without the program isolation feature (and also the previous version of IMS, namely IMS/2) doesn't have a lock hierarchy since locking is done only on a segment type basis. It provides degree 1 consistency with degree 3 consistency obtainable for a segment type in a view by specifying the EXCLUSIVE option. User controlled locking is also provided on a limited basis via the HOLD option.
DMS 1100 has a two level lock hierarchy [4]: areas and pages within areas. Areas may be locked in one of seven modes when they are OPENed: EXCLUSIVE RETRIEVAL (which corresponds to our notion of exclusive mode), PROTECTED UPDATE (which corresponds to our notion of share and intention exclusive mode), PROTECTED RETRIEVAL (which we call share mode), UPDATE (which corresponds to our intention exclusive mode), and RETRIEVAL (which is our intention share mode). Given this transliteration, the compatibility matrix displayed in Table 1 is identical to the compatibility matrix of DMS 1100 [3, page 3.59]. However, DMS 1100 sets only exclusive locks on pages within areas (short term share locks are invisibly set during internal pointer following). Further, even if a transaction locks an area in exclusive mode, DMS 1100 continues to set exclusive locks (and internal share locks) on the pages in the area, despite the fact that an exclusive lock on an area precludes reads or updates of the area by other transactions. Similar observations apply to the DMS 1100 implementation of S and SIX modes. In general, DMS 1100 recognizes all the modes described here and uses intention modes to detect conflicts but does not utilize implicit locking. It uses a static two level lock tree.

DMS 1100 provides level 2 consistency by setting exclusive locks on the modified pages and a temporary lock on the page corresponding to the page which is "current of run unit".
The temporary lock is released when the "current of run unit" is moved. In addition a run-unit can obtain additional locks via an explicit KEEP command.

The ideas presented were developed in the process of designing and implementing an experimental data base system at the IBM San Jose Research Laboratory. (We wish to emphasize that this system is a vehicle for research in data base architecture, and does not indicate plans for future IBM products.) A subsystem which provides the modes of locks herein described, plus the necessary logic to schedule requests and conversions, and to detect and resolve deadlocks has been implemented as one component of the data manager. The lock subsystem is in turn used by the data manager to automatically lock the nodes of its lock graph (see Figure 12). Users can be unaware of these lock protocols beyond the verbs "begin transaction" and "end transaction".

The data base is broken into several storage areas. Each area contains a set of relations (files), their indices, and their tuples (records) along with a catalog of the area. Each tuple has a unique tuple identifier (data base key) which can be used to quickly (directly) address the tuple. Each tuple identifier maps to a set of field values. All tuples are stored together in an area-wide heap to allow physical clustering of tuples from different relations. The unused slots in this heap are represented by an area-wide pool of free tuple identifiers (i.e. identifiers not allocated to any relation). Each tuple "belongs" to a unique relation, and all tuples in a relation have the same number and type of fields. One may construct an index on any subset of the fields of a relation.
Tuple identifiers give fast direct access to tuples, while indices give fast associative access to field values and to their corresponding tuples. Each key value in an index is made a lockable object in order to solve the problem of "phantoms" [1] without locking the entire index. We do not explicitly lock individual fields or whole indices so those nodes appear in Figure 12 only for pedagogical reasons. Figure 12 gives only the "logical" lock graph; there is also a graph for physical page locks and for other low level resources.

As can be seen, Figure 12 is not a tree. Heavy use is made of the techniques mentioned in the section on locking DAG's. For example, one can read via tuple identifier without setting any index locks but to lock a field for update its tuple identifier and the old and new index key values covering the updated field must be locked in X mode. Further, the tree is not static, since data base keys are dynamically allocated to relations; field values dynamically enter, move around in, and leave index value intervals when records are inserted, updated and deleted; relations and indices are dynamically created and destroyed within areas; and areas are dynamically allocated. The implementation of such operations observes the lock protocol presented in the section on dynamic graphs: when a node changes parents, all old and new parents must be held (explicitly or implicitly) in intention exclusive mode and the node to be moved must be held in exclusive mode.

The described system supports concurrently consistency degrees 1, 2 and 3 which can be specified on a transaction basis. In addition share locks on individual tuples can be acquired by the user.

[Figure 12: lock graph with nodes DATA BASE, AREAS, RELATIONS, INDICES, INDEX KEY INTERVALS, FREE TUPLE IDENTIFIERS, ALLOCATED TUPLE IDENTIFIERS, INDEXED FIELDS and UN-INDEXED FIELDS.]

Figure 12. A lock graph.

ACKNOWLEDGMENT

We gratefully acknowledge many helpful discussions with Phil Macri, Jim Mehl and Brad Wade on how locking works in existing systems and how these results might be better presented. We are especially indebted to Paul McJones in this regard.

REFERENCES

[1] K.P. Eswaran, J.N. Gray, R.A. Lorie, I.L. Traiger, On the Notions of Consistency and Predicate Locks, Technical Report RJ.1487, IBM Research Laboratory, San Jose, Ca., Nov. 1974. (to appear CACM).

[2] Information Management System Virtual Storage (IMS/VS). System Application Design Guide, Form No. SH20-9025-2, IBM Corp., 1975.

[3] UNIVAC 1100 Series Data Management System (DMS 1100). ANSI COBOL Field Data Manipulation Language. Order No. UP7908-2, Sperry Rand Corp., May 1973.