CS4411/9538, Set 9, Part 2: Timestamp Ordering, Distributed CC and OO CC

Outline of notes
- Set 1: Introduction ✔
- Set 2: Architecture ✔ (Centralized Relational, Distributed DBMS, Object-Oriented DBMS)
- Set 3: Database Design ✔ (Centralized Relational, Distributed DBMS)
- Set 4: Data Modeling ✔
- Set 5: Querying ✔
- Set 6: XML Model and Querying ✔
- Set 7: Algebraic Query Optimization ✔ (Centralized Relational, Distributed DBMS, Object-Oriented DBMS)
- Set 8: Storage, Indexing, and Execution Strategies ✔
- Set 8, Part 2: Costs and OO Implementation ✔
- Set 8, Part 3: XML Implementation Issues ✔
- Set 9: Transactions and Concurrency Control ✔
- Set 9, Part 2: CC with Timestamps, Distributed DBMS, Object-Oriented DBMS
- Set 10: Recovery (Centralized Relational, Distributed DBMS)
- Set 11: Database Security

Concurrency Control with Timestamps
The transaction manager assigns a unique timestamp to each transaction when it arrives. How?
- With a centralized database, i.e. a single transaction manager (TM), use a counter or a clock time.
- With many TMs, i.e. a distributed database, use the site or TM ID plus a counter or a clock.
- If a clock is used, the TM cannot issue the next timestamp until the clock ticks.
- With a counter, the database would have to be restarted periodically to prevent the values from growing in length.

Timestamp Ordering
Rule: if a read or write from transaction i conflicts with a read or write from transaction j, then process the operations in timestamp order.
Theorem: a log representing a concurrent execution of a set of transactions that follows the above rule is serializable.
Basic Timestamp Ordering: abort any transaction for which the rule cannot be followed because a read or write arrives at the scheduler "too late".

Implementation
- For each item x in the database, maintain, in a table:
  - the maximum TS of any transaction which has read x, and
  - the maximum TS of any transaction which has written x.
- It is slightly more complicated than that. The RM does not process operations instantaneously, so the scheduler keeps
  - max-r-scheduled(x)
  - max-w-scheduled(x)
- To make sure the RM does the operations in the right order, the scheduler has to queue them up, in timestamp order, with one queue per data item.
- When a transaction is aborted, it is restarted with a new, larger (later) timestamp, to avoid being rejected again.

Algorithm (simple version)
Transaction T, with timestamp TS(T), tries to do operation p(x), where p is either read or write, and x is a data item.
1. Compare TS(T) with max-r-scheduled(x) and/or max-w-scheduled(x), depending on which operations p(x) conflicts with (read-write and write-write are the conflicting combinations).
2. If TS(T) is less than the timestamp of anything it conflicts with, then T is aborted and restarted with a larger timestamp.
3. If TS(T) is greater than the timestamps on x of all the operations it conflicts with, then update max-r-scheduled(x) or max-w-scheduled(x) and put the operation in the queue of operations scheduled for x. If the system doesn't crash, these operations will eventually be carried out.
Note: with locking, we don't have to worry about this queuing, because the scheduler issues the locks and therefore only allows one write or multiple reads on a data item x to be passed on to the RM at any one time. A small code sketch of this basic check follows.
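Below is a minimal sketch of the basic timestamp-ordering check just described, assuming a separate data manager (RM) drains each per-item queue in timestamp order. The names (BasicTOScheduler, max_r, max_w) are illustrative, not from any particular system.

```python
from collections import defaultdict, deque

class AbortAndRestart(Exception):
    """The operation arrived 'too late'; the transaction must restart with a larger TS."""

class BasicTOScheduler:
    def __init__(self):
        self.max_r = defaultdict(int)    # max-r-scheduled(x)
        self.max_w = defaultdict(int)    # max-w-scheduled(x)
        self.queue = defaultdict(deque)  # one queue of scheduled operations per data item

    def read(self, ts, x):
        # A read conflicts only with writes: reject it if a later writer is already scheduled.
        if ts < self.max_w[x]:
            raise AbortAndRestart(f"read({x}) with TS {ts} < max-w-scheduled {self.max_w[x]}")
        self.max_r[x] = max(self.max_r[x], ts)
        self.queue[x].append((ts, "read"))   # the RM drains this queue in TS order

    def write(self, ts, x):
        # A write conflicts with both reads and writes.
        if ts < self.max_r[x] or ts < self.max_w[x]:
            raise AbortAndRestart(f"write({x}) with TS {ts} arrived too late")
        self.max_w[x] = max(self.max_w[x], ts)
        self.queue[x].append((ts, "write"))
```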
Revised Timestamp Ordering Algorithm
Assume in this discussion that
- tR(x) = max-r-scheduled(x)
- tW(x) = max-w-scheduled(x)
- t is the timestamp of transaction T

1. W-R synchronization. Transaction T with timestamp t wants to read x:
   if t ≥ tW(x) then { put the read in the queue for x; if t > tR(x) then set tR(x) to t }
   else abort T (and restart it with a new TS)

2. R-W synchronization. Transaction T wants to write x:
   if t ≥ tR(x) and t ≥ tW(x) then { put the write in the queue for x; if t > tW(x) then set tW(x) to t }
   else if t < tR(x) then abort T (and restart it with a new TS)

3. W-W synchronization (still transaction T wanting to write x):
   if tR(x) ≤ t < tW(x) then do nothing, i.e. ignore the write (no harm is done, so there is no need to abort T).
   This is slightly different from the previous algorithm. This last step is called the Thomas write rule. A code sketch of these revised rules appears after the example below.

Another View

Reading: transaction T with timestamp t reads x (the value of tR(x) is not relevant)
  t < tW(x):  abort
  t ≥ tW(x):  do the read; update tR(x) if t > tR(x)

Writing: transaction T with timestamp t writes x
                t < tR(x)    t ≥ tR(x)
  t < tW(x):    abort        ignore the write (Thomas write rule)
  t ≥ tW(x):    abort        do the write; update tW(x) if t > tW(x)

Maintaining timestamps on data items
- Keep a table of triples: (x, max-r-scheduled(x), max-w-scheduled(x)).
- If the data items are small, this table could be very large.
- Periodically purge the table of all entries whose timestamps are "too old" for any older transaction to be likely to be still in the system: pick a time interval δ, and purge everything older than the current time t − δ. Tag the table with this timestamp, t − δ.
- Modify the scheduler so that if a data item x is not in the table, its timestamp is assumed to be t − δ, and act accordingly. This may occasionally reject a transaction unnecessarily.
- There is a tradeoff between δ and the size of the table.

Example using timestamps
Let TS(T1) = 1, TS(T2) = 2, TS(T3) = 3.

  Schedule      tR(x)  tW(x)  tR(y)  tW(y)  tR(z)  tW(z)
  Initially       0      0      0      0      0      0
  readT2(x)       2      0      0      0      0      0
  writeT3(x)      2      3      0      0      0      0
  writeT1(y)      2      3      0      1      0      0
  readT2(y)       2      3      2      1      0      0
  writeT2(z)      2      3      2      1      0      2

Comments on this Schedule
- If locks were used instead, then, to produce the same schedule or transaction log, T2 cannot possibly be two-phase: it would have had to release its lock on x (so that T3 can write x) before getting the lock on y, which T1 needs in the meantime.
- This is, however, a serializable schedule. It is just not one that can be achieved by two-phase locking.
- Therefore, two-phase locking is not the same as timestamp ordering. They both guarantee serializability, but they generate different schedules.
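A minimal sketch of the revised rules, including the Thomas write rule, assuming the same tR/tW bookkeeping as in the earlier sketch; the names (RevisedTOScheduler, t_r, t_w) are illustrative. The usage at the end replays the example schedule above.

```python
from collections import defaultdict, deque

class AbortAndRestart(Exception):
    pass

class RevisedTOScheduler:
    def __init__(self):
        self.t_r = defaultdict(int)      # tR(x)
        self.t_w = defaultdict(int)      # tW(x)
        self.queue = defaultdict(deque)  # scheduled operations, drained by the RM in TS order

    def read(self, t, x):                # W-R synchronization
        if t < self.t_w[x]:
            raise AbortAndRestart(f"read({x}) by TS {t}")
        self.queue[x].append((t, "read"))
        self.t_r[x] = max(self.t_r[x], t)

    def write(self, t, x):               # R-W and W-W synchronization
        if t < self.t_r[x]:
            raise AbortAndRestart(f"write({x}) by TS {t}")
        if t < self.t_w[x]:
            return                       # Thomas write rule: obsolete write, silently ignored
        self.queue[x].append((t, "write"))
        self.t_w[x] = max(self.t_w[x], t)

# Replaying the example schedule above, with TS(T1)=1, TS(T2)=2, TS(T3)=3:
s = RevisedTOScheduler()
s.read(2, "x"); s.write(3, "x"); s.write(1, "y"); s.read(2, "y"); s.write(2, "z")
print(s.t_r["x"], s.t_w["x"], s.t_r["y"], s.t_w["y"], s.t_w["z"])   # 2 3 2 1 2, as in the table
```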
All Serializable Schedules
- The 2PL schedules and the TO schedules both lie inside the set of all serializable schedules, but they are different sets.
- * The previous schedule falls into the set of schedules generated by timestamp ordering and not by 2PL.
- ? Is there a schedule in 2PL which is not possible with TO?

Here's a 2PL schedule for the "?" above:

  Step   T0           T1
   1     Slock(A)
   2     Read(A)
   3                  Xlock(B)
   4                  write(B)
   5                  Unlock(B)
   6     Slock(B)
   7     Read(B)
   8     Unlock all

Assuming from the schedule that T0 starts first, TS of T0 < TS of T1. Suppose TS of T0 is 0 and TS of T1 is 1; then this schedule is not possible under timestamp ordering, because Read(B) at step 7 would not be allowed.

Multiversion Timestamp Ordering
- The basic idea is to keep more than one version of each data item: a sequence of versions with increasing (write) timestamps, say xk, xv, xw.
- If a transaction with timestamp t wants to read x, it reads the version with the largest write TS < t; if t falls between the write timestamps of xv and xw, it reads xv.
- In other words, it behaves as if it had executed "on time".
- So, reads are never rejected.

Multiversion TO, cont'd
- To write, for a transaction with TS t falling between the write timestamps of xv and xw: if there are no reads with timestamps between t and the write timestamp of xw, then the write is OK.
- If there was a read in that interval, the write is not allowed.

Why is two-phase locking used?
- It interferes very little with the design and programming of transactions.
- It has much less overhead (a lock table in main memory vs. timestamps maintained on potentially all data items).
- I believe that in performance analyses, it gives much better throughput.

Serialization Graph Testing (pessimistic version)
- The idea is to keep a graph with the active transactions and some recently committed transactions as the nodes.
- If transaction Ti wishes to do an operation on a data item x, put an edge from Tj to Ti for every transaction Tj for which a conflicting operation has been previously scheduled, showing that Tj precedes Ti.
- If the resulting graph contains a cycle involving Ti, abort Ti and delete it from the graph.
- If the graph is acyclic, schedule the operation by queuing it up for the RM as in basic timestamp ordering.
- Nodes can be deleted from the graph when the corresponding transaction has committed and has no incoming edges left in the graph.
- The scheduler never schedules any operations that are not serializable.

Optimistic Techniques
- Two-phase locking and timestamp ordering are classified as pessimistic techniques: they assume something will conflict and guard against it.
- Optimistic techniques assume that in most cases nothing will conflict, and therefore just let the transactions run. When a transaction is ready to commit, it checks what has happened to see if the execution is serializable. If not, the transaction is aborted, rolled back and restarted.
- If there are very few conflicts, this works well. If there are a lot of conflicts, it has poor performance because a lot of work gets repeated.
- There is an optimistic version of 2PL, of timestamp ordering and of serialization graph testing. The optimistic serialization graph testing algorithm works as follows: hold all writes until the commit point, at which time the graph is built to see if there were any conflicts. A small sketch of the pessimistic graph test appears below.
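Here is a minimal sketch of the pessimistic serialization-graph test, assuming we track only which transactions have read and written each item; the names (SGTScheduler, _has_cycle) are illustrative, and victim cleanup, node garbage collection and the operation queues are omitted.

```python
from collections import defaultdict

class Abort(Exception):
    pass

class SGTScheduler:
    def __init__(self):
        self.edges = defaultdict(set)     # edges[Tj] contains Ti when Tj precedes Ti
        self.readers = defaultdict(set)   # item -> transactions that have read it
        self.writers = defaultdict(set)   # item -> transactions that have written it

    def _has_cycle(self, start):
        # Depth-first search: is there a path from `start` back to `start`?
        stack, seen = [start], set()
        while stack:
            node = stack.pop()
            for nxt in self.edges[node]:
                if nxt == start:
                    return True
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return False

    def schedule(self, txn, op, item):
        # Add a precedence edge from every transaction with a conflicting earlier operation.
        conflicting = set(self.writers[item])
        if op == "write":
            conflicting |= self.readers[item]
        for other in conflicting - {txn}:
            self.edges[other].add(txn)
        if self._has_cycle(txn):
            for other in conflicting - {txn}:
                self.edges[other].discard(txn)   # Ti is aborted; a full version removes all its edges
            raise Abort(f"{txn} would create a cycle")
        (self.readers if op == "read" else self.writers)[item].add(txn)
```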
Newest SQL Server versions
- To serve a variety of workloads, with emphasis on regular transaction processing; for data warehousing, column stores have been added.
- Optimized for larger (cheaper) main memory and multiple cores.
- Uses multiversion timestamp ordering, with an optimistic approach.
- Pros: no overhead for locking; highly parallelizable.
- Cons: there is overhead for validation; more frequent aborts.

Distributed Concurrency Control
The correctness criterion becomes one-copy serializability. This means: the resulting schedule, with read and write steps that refer to individual copies, must be equivalent to a serial schedule with only one copy per data item.

System Architecture (figure)

Taxonomy of Distributed Concurrency Control Mechanisms
- Pessimistic
  - Locking: Centralized, Primary Copy, Distributed
  - Timestamp Ordering: Basic TO, Multi-version TO, Conservative TO
  - Hybrid
- Optimistic
  - Locking
  - Timestamp Ordering

Locking-Based Protocols
All use two-phase locking in order to guarantee local serializability.

Centralized 2PL (Primary Site 2PL)
- The lock manager for the database is at one site, i.e. one scheduler looks after the entire database.
- In this case the coordinating TM sends messages directly to the various RMs once locks are granted.
- Obviously a tremendous bottleneck, and there are reliability problems if this site fails.

Primary Copy 2PL
- Various schedulers (lock managers) are responsible for certain data items. Each data item is handled by one scheduler, which grants locks on it. So there is a primary copy for each data item.
- Fewer bottlenecks than the previous method.
- This was used in the Distributed Ingres prototype.
- In this case, the directories need to know the primary site location for each data item.

Distributed 2PL
- Expects a scheduler to be present at each site, handling the locks for the local data items.
- Without replicated data, it is similar to primary copy.
- If data is replicated, then usually a ROWA (Read One, Write All) replica control protocol is used: for reading, a lock on one copy is needed; for writing, a lock on all copies is needed (a small sketch appears below, after the DB2 and Oracle note).
- Was used in the R* prototype, and in Tandem's NonStop SQL.
- In order to ensure that two-phase locking is going on at all sites (i.e. it is not the case that locks are released at one site before some locks are granted at another for a given transaction), strict two-phase locking can be used (i.e. hold all locks until the commit point).
- There are issues to resolve if the network becomes partitioned.

DB2 and Oracle
- Looking at the web pages for these two systems, it appears that both use a master (primary) copy idea when there are replicas.
- Updates are made on the master copy and later propagated out to the replicas.
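As promised above, a minimal sketch of the ROWA read/write locking discipline. The helpers copies(item) and lock_manager_at(site) are hypothetical; lock waiting, two-phase release and failure handling are omitted.

```python
# Hypothetical helpers, supplied by the caller:
#   copies(item)          -> list of sites holding a replica of `item`
#   lock_manager_at(site) -> an object with s_lock(txn, item) and x_lock(txn, item) methods

def rowa_read(txn, item, copies, lock_manager_at):
    """Read One: a shared lock on any single copy is enough."""
    site = copies(item)[0]                  # pick any replica, e.g. the closest one
    lock_manager_at(site).s_lock(txn, item)
    return site                             # the read is then sent to this site's RM

def rowa_write(txn, item, copies, lock_manager_at):
    """Write All: exclusive locks on every copy before the write is scheduled anywhere."""
    sites = copies(item)
    for site in sites:
        lock_manager_at(site).x_lock(txn, item)
    return sites                            # the write is then sent to all replicas
```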
Distributed Deadlocks, Centralized Detection
- Locking implies that deadlocks are possible.
- Each scheduler keeps a waits-for graph in which the nodes correspond to transactions. There is an edge from Ti to Tj if Ti is waiting for a lock which conflicts with a lock on the same data item held by Tj.
- We must take the union of these various local graphs.
- Centralized detection means that all the waits-for graphs are periodically shipped to a single site and the deadlock detection algorithm is run there (a small sketch appears after the deadlock-avoidance discussion below).
- The cost of detecting deadlocks is much greater than in the centralized case because of transmission costs and delays.
- The global deadlock detector must also get extra information in order to select a victim.

Phantom Deadlocks
- Some of the edges in a cycle of the waits-for graph may no longer be true, because the lock being waited for has since been granted. This happens because of delays in sending the information to the global deadlock detector.
- Note: if there really is a deadlock, it will remain until something intervenes. The problem with phantom deadlocks is that they may cause unnecessary aborts.

Distributed Deadlock Detection
- All sites run deadlock detection as required.
- More than 90% of cycles in a waits-for graph are of length 2, and centralized deadlock detection is very slow at detecting these short cycles.
- One problem with distributed deadlock detection is that if more than one site finds the deadlock, they may select different victims.
- A very cheap method is just to abort a transaction if it times out waiting for a lock, i.e. assume it is involved in a deadlock.

Path Pushing
- Send paths in the waits-for graph to other sites for deadlock detection.
- To reduce traffic, assign a total order to the transactions, and only send a path from the waits-for graph, Ti → ... → Tj, if Ti < Tj in the ordering.
- If, furthermore, we assume that each transaction is only active at one site at a time (i.e. it does all its operations at one site, then goes to the next for some more -- very restrictive on transaction design), then when a transaction Tj moves from site A to site B, send all the paths ending at Tj along with it. So if, at site B, Tj needs a lock held by some transaction on one of these paths, the deadlock can be detected there.
- None of these techniques is very good.

Timestamp-Based Deadlock Avoidance
- The basic idea is that if Ti is going to wait for a lock held by Tj, it should only do so if we can guarantee that a deadlock will not result.
- Transactions will still be forced to abort.
- First approach: assign priorities to transactions. Ti is allowed to wait for a lock held by Tj only if Ti has a higher priority; otherwise, Ti aborts and restarts with a higher priority.
- Cycles cannot occur in the waits-for graph, so we don't need it.
- This method can lead to cyclic restart, or livelock, as in the following example:
  1. Ti starts with priority 1
  2. Ti sets a write lock on x
  3. Tj starts with priority 2
  4. Tj sets a write lock on y
  5. Ti tries to lock y, and is forced to abort
  6. Ti restarts with priority 3
  7. Ti write locks x
  8. Tj tries to lock x, and is forced to abort
  9. Tj restarts with priority 4
  10. Tj write locks y; go back to step 5
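As promised above, a minimal sketch of centralized detection: the local waits-for graphs are unioned at one site and checked for a cycle. The representation (each graph as a dict of edge sets) and the victim rule (abort the youngest transaction on the cycle) are illustrative assumptions.

```python
from collections import defaultdict

def global_deadlock_check(local_graphs, timestamp):
    """local_graphs: list of {Ti: {Tj, ...}} waits-for graphs shipped from each site.
    timestamp: dict mapping transaction -> timestamp, used only to pick a victim.
    Returns a victim transaction to abort, or None if the global graph is acyclic."""
    # 1. Union the local graphs into one global waits-for graph.
    g = defaultdict(set)
    for local in local_graphs:
        for ti, waits_for in local.items():
            g[ti] |= set(waits_for)

    # 2. Look for a cycle with a depth-first search.
    WHITE, GREY, BLACK = 0, 1, 2
    colour = defaultdict(int)

    def visit(node, path):
        colour[node] = GREY
        path.append(node)
        for nxt in g[node]:
            if colour[nxt] == GREY:                  # back edge: a cycle was found
                cycle = path[path.index(nxt):]
                return max(cycle, key=lambda t: timestamp[t])   # youngest on the cycle
            if colour[nxt] == WHITE:
                victim = visit(nxt, path)
                if victim is not None:
                    return victim
        path.pop()
        colour[node] = BLACK
        return None

    for node in list(g):
        if colour[node] == WHITE:
            victim = visit(node, [])
            if victim is not None:
                return victim
    return None
```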
A Better Approach
- Assign a unique (over the distributed database) timestamp to each transaction.
- Any transaction that stays alive long enough will eventually have the smallest (oldest) timestamp.
- Therefore, assign the priorities to be the inverse of the timestamps.
- If Ti is trying to obtain a lock for which Tj holds a conflicting lock, then there are two main strategies (use one of them):

Wait-Die
  if Timestamp(Ti) < Timestamp(Tj) then Ti waits
  else abort Ti (Ti dies)

Wound-Wait
  if Timestamp(Ti) < Timestamp(Tj) then abort Tj (try to kill it)
  else Ti waits

(A small sketch of both rules appears after the OODB overview below.)

Furthermore
- If a transaction is aborted, it is restarted with its original timestamp. Thus it eventually has the highest priority and runs to completion, so there are no livelocks and no deadlocks.
- Wounding is actually only an attempt to kill Tj: it may not work if Tj has committed in the meantime. In either case, the locks Ti wants are released.
- An older, active transaction is never aborted.

Behaviour of Wait-Die
- Older transactions may wait for different locks held by younger and younger transactions; thus this method favours younger transactions.
- Once a transaction has all its locks, it never aborts.
- This method is also called non-preemptive.

Behaviour of Wound-Wait
- Older transactions force their way through to completion.
- Younger ones abort and get restarted with their original timestamp. Meanwhile the older transaction got its lock, so the restarted young transaction now waits.
- This method is also called preemptive: older transactions preempt younger ones.

Distributed Timestamp-Based Concurrency Control
- Distributed timestamp-based schedulers can behave exactly like the centralized ones: as long as distributed timestamps are issued correctly, each site has enough information to proceed.
- To ensure unique timestamps over the network, some site-ID bits have to be appended to the timestamp, which is otherwise issued in one of the ways discussed for centralized databases.
- If one site's clock is slow or fast, its transactions could have trouble.

For Cloud-Based Systems: the CAP Theorem
- Consistency: all nodes see the same data at the same time (note this is different from the definition of consistency for ACID).
- Availability: a guarantee that every request receives a response about whether it was successful or failed.
- Partition tolerance: the system continues to operate despite arbitrary message loss or failure of part of the system.
- The CAP theorem says that a distributed system can only satisfy two of these properties at a time.

BASE
Basically Available, Soft-state services with Eventual consistency: this provides a lot of availability and can be achieved by optimistic techniques.

Concurrency Control for Object-Oriented Databases
Characteristics of work done in OODBs:
- Design environments, say for program development or computer-aided design.
- We still want sharing and concurrent access.
- There may be short transactions mixed with long transactions; design activities can last for the working day of the designer.
- The commit point is under user control; there may be a commit button on an interactive interface.
- We may want to support co-operative work, where there is simultaneous access by more than one user.

Do not want (want to minimize):
- work blocked because another transaction holds a lock
- work rolled back at the commit point because something is not serializable
Do want:
- concurrent work with some notion of correctness.
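Returning to the wait-die and wound-wait rules above, here is a minimal sketch of the decision a scheduler makes when a transaction requests a lock held by another; the function names and the Outcome values are illustrative.

```python
from enum import Enum

class Outcome(Enum):
    WAIT = "requester waits for the lock"
    ABORT_SELF = "requester aborts (restarted later with its original timestamp)"
    WOUND_OTHER = "try to abort the lock holder"

def wait_die(ts_requester, ts_holder):
    # Non-preemptive: only an older requester is allowed to wait.
    return Outcome.WAIT if ts_requester < ts_holder else Outcome.ABORT_SELF

def wound_wait(ts_requester, ts_holder):
    # Preemptive: an older requester wounds the younger holder; a younger requester waits.
    return Outcome.WOUND_OTHER if ts_requester < ts_holder else Outcome.WAIT

# Example: T1 (ts = 1, older) requests a lock held by T2 (ts = 2, younger).
print(wait_die(1, 2))     # Outcome.WAIT
print(wound_wait(1, 2))   # Outcome.WOUND_OTHER
```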
Multiple Granularity Locking
- Same idea as with relational databases.
- Complications arise from schema updates, inheritance and shared subobjects.
- An object is highly likely to be the target of an update, and is therefore the smallest granule. It is more likely to be the object of an individual lock than an individual tuple is in a relational database.

Consider the following tree of kinds of granules (granule hierarchy figure).

Some Details
- On the objects themselves, only S and X locks are possible.
- The following lock modes are available on classes and sets of instances (or other collections, e.g. lists):
  - S lock on a set: the set is locked in S mode, and all the instances are implicitly locked in S mode.
  - X lock on a set: the set is locked in X mode, and all instances in the set can be read or written.
  - IS lock (Intention Shared) on a set: instances will be explicitly locked in S mode as necessary.
  - IX lock (Intention eXclusive) on a set: instances will be locked in S or X mode as necessary.
  - SIX lock (Shared, Intent eXclusive) on a set: the set is locked in S mode, and all instances too; instances to be written will be locked in X mode as necessary.
- These are almost the same as we had before, except that here we explicitly talk about which granules they apply to.

Rules
The following rules are also familiar, but slightly different.
1. To set an explicit S lock on a lockable granule, first set an IS lock on all direct ancestors, along any one ancestor chain, of the lockable granule in the DAG.
2. To set an X lock on a lockable granule, first set an IX or SIX lock on all direct ancestors, along every ancestor chain, of the lockable granule in the DAG.
3. Set all locks in root-to-leaf order.
4. Release all locks in leaf-to-root order, or all at once at the end of the transaction.

Lock Compatibility Matrix

        IS   IX   S    SIX  X
  IS    Y    Y    Y    Y    -
  IX    Y    Y    -    -    -
  S     Y    -    Y    -    -
  SIX   Y    -    -    -    -
  X     -    -    -    -    -

This is what we had before, without the Update locks. This version is symmetric. (A small sketch of a compatibility check based on this matrix appears after the method-definition example below.)

OODB example classes

  Class Employee
    public type tuple (name : string,
                       ID : string,
                       jobtitle : string,
                       worksFor : Department,
                       startDate : Date)
  end;

  Class Department
    public type tuple (deptName : string,
                       deptLoc : string,
                       empsOf : set(Employee))
  end;

  Class WorksOn
    public type tuple (who : Employee,
                       what : Project,
                       howMuch : real)
  end;

  Class Project
    public type tuple (projName : string,
                       targetDate : Date,
                       projLocation : string)
  end;

Possible Granules (granule DAG figure)
MySchema, MyDatabase, ClassDefinitions, Employee, EmpCollection, Project, ProjectCollection, SmithObject, ProjectP1

Suppose we have the following commands in the interactive tool:
  set schema MySchema
  method x in class Employee ...
defining a new method for class Employee. This has to lock the schema and the class definition for class Employee so that the class definition can be changed: at the very least, IX mode for schema MySchema and the ClassDefinitions node, and X mode to write (update) the class definition of class Employee.
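A minimal sketch of how a lock manager might use the compatibility matrix above when deciding whether a requested mode on a granule can be granted; the table encoding and function names are illustrative.

```python
# Compatibility matrix from above: COMPAT[requested][held] is True when the two modes
# can coexist on the same granule.
MODES = ["IS", "IX", "S", "SIX", "X"]
_ROWS = {
    "IS":  "YYYY-",
    "IX":  "YY---",
    "S":   "Y-Y--",
    "SIX": "Y----",
    "X":   "-----",
}
COMPAT = {m: {n: _ROWS[m][i] == "Y" for i, n in enumerate(MODES)} for m in MODES}

def can_grant(requested, held_modes):
    """Grant `requested` on a granule only if it is compatible with every lock already held."""
    return all(COMPAT[requested][held] for held in held_modes)

# Example: an IS request is compatible with an existing SIX lock, but an IX request is not.
print(can_grant("IS", ["SIX"]))   # True
print(can_grant("IX", ["SIX"]))   # False
```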
If we begin a session with
  set schema MySchema
  set (data)base MyDatabase
  query ...
then by the time it gets to the query statement, the system should know it needs to lock the whole schema for reading at the lower levels. Before that, it may have assumed it was the previous kind of session and locked things in IX mode. Note that the schema definitions need to be read to compile the queries.

So, if a query is written which accesses SmithObject, the locking might be:
  IS on schema MySchema
  S on class Employee (to read the class definition to compile the query)
  IS on database MyDatabase
  IS on EmpCollection
  S on SmithObject
  Unlock everything

For a transaction to write a new Employee object:
  IX on schema MySchema
  S on class Employee (need to verify the structure of the new object)
  IX on (data)base MyDatabase
  X on EmpCollection (adding a new object reference to the set)
  X on the page that the new Employee object will be put on
What is wrong with getting this X lock on EmpCollection? We have not got the required locks on all its ancestors (we were supposed to get IX on the class definition).

More OO lock modes
- Schemes have been devised, based on multiple granularity locking, to deal with OODBs with shared subobjects and how they can be updated.
- All are based on introducing more lock modes and coming up with the lock compatibility table. Here's the worst one:
  - ISOS: Intent Shared, where the subObjects are Shared
  - IXOS: Intent eXclusive, where the subObjects are Shared
  - SIXOS: Shared, Intent eXclusive access, where the subObjects are Shared

          IS   IX   S    SIX  X    ISO  IXO  SIXO ISOS IXOS SIXOS
  IS      Y    Y    Y    Y    -    Y    -    -    Y    -    -
  IX      Y    Y    -    -    -    -    -    -    -    -    -
  S       Y    -    Y    -    -    Y    -    -    Y    -    -
  SIX     Y    -    -    -    -    -    -    -    -    -    -
  X       -    -    -    -    -    -    -    -    -    -    -
  ISO     Y    -    Y    -    -    Y    Y    Y    Y    Y    Y
  IXO     -    -    -    -    -    Y    Y    -    Y    Y    -
  SIXO    -    -    -    -    -    Y    -    -    Y    -    -
  ISOS    Y    -    Y    -    -    Y    Y    Y    Y    -    -
  IXOS    -    -    -    -    -    Y    Y    -    -    -    -
  SIXOS   -    -    -    -    -    Y    -    -    -    -    -

Concurrency Control for Long-Running Transactions
- Long-running transactions are typical in object-oriented database applications, which include things like software development and computer-aided design.
- Solutions can be "arranged" by users so that work can be carried out in tandem, but the operations might not technically be serializable.
- We will look briefly at one idea which allows more concurrent work while preserving theoretical properties.

Sagas
- This idea is used for workflow systems.
- Workflows are usually represented by a directed graph.
- A workflow is made up of (small) actions and the special actions commit and abort. There are no edges leaving a commit or abort node. There is usually one start node.
- For each action A, there must be a compensating action A⁻¹ which reverses the effect of A.
(figure)
- Each action is executed like a short transaction, with standard concurrency control.
- If the execution leads to the abort termination, all the actions which have been performed, say A1 ... An, are undone by their compensating actions in reverse order: An⁻¹, A(n-1)⁻¹, ..., A1⁻¹. (A small sketch appears at the end of these notes.)
- The individual actions need to be of a type that is reversible.

What scenario do XML databases fall into?
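To close, returning to the saga idea above, a minimal sketch of its execution: each action runs as a short transaction, and on abort the completed actions are compensated in reverse order. The Action container and run_saga names are illustrative.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Action:
    name: str
    do: Callable[[], None]          # the forward action A, run as a short transaction
    compensate: Callable[[], None]  # its compensating action A⁻¹

def run_saga(actions: List[Action]) -> bool:
    done: List[Action] = []
    for a in actions:
        try:
            a.do()                   # each action commits on its own
            done.append(a)
        except Exception:
            # Abort termination: undo everything already done, in reverse order.
            for finished in reversed(done):
                finished.compensate()
            return False
    return True                      # commit termination

# Example: book a flight and a hotel; if the hotel booking fails, cancel the flight.
def fail():
    raise RuntimeError("no rooms")   # simulated failure while booking the hotel

log = []
flight = Action("flight", lambda: log.append("book flight"), lambda: log.append("cancel flight"))
hotel = Action("hotel", fail, lambda: log.append("cancel hotel"))
print(run_saga([flight, hotel]), log)   # False ['book flight', 'cancel flight']
```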