Timestamp Ordering (TSO) Concurrency Control Protocols in Distributed Database Systems
CSC536 – Barton Price
5/2/2006 – Final Presentation

Table of Contents
• Introduction.
• Background: DDBMS architecture, transaction processing, concurrency control.
• Classic TSO protocols: the 1981 state of the art.
• TSO performance: TSO vs. 2PL.
• TSO today: a new view on TSO performance; new uses of TSO in DDBMSs.
• Conclusions: topic summary.
• References: in order of appearance.

1. INTRODUCTION
• Is the TSO protocol a viable option for Concurrency Control (CC) in DDBSs?
• Quite a few of our CSC536 readings discussed (or at least mentioned) CC, but only 3 papers mentioned the Timestamp Ordering (TSO) approach to CC, and none of them presented a study of TSO algorithms:
– Our 02/14 reading, "Mobile Agent Model for Transaction Processing in DDBSs" (Komiya, 2003), states: "The traditional approaches are either using a 2-phase locking protocol (2PL), or using timestamp-ordering (TSO)"; however, their proposed method uses a blocking approach.
– Our 04/11 reading, "RTDBSs and Data Services" (Ramamritham, Son, DiPippo, 2004), states: "For conflict-resolution in RTDBs various time-cognizant extensions of two-phase locking (2PL), optimistic and timestamp-based protocols have been proposed."
– Our 04/18 reading, "A Study of CC in RT, Active DBSs" (Datta & Son, 2002), mentions that OCC-TI (a timestamp variant of Optimistic CC based on Dynamic Adjustment of Serialization Order) is shown to perform very well; they compare it in simulation experiments to 2PL-HP (2PL-High-Priority), WAIT-50, and their proposed OCC-APFO (Adaptive Priority Fan Out – Optimistic protocol).

2. BACKGROUND
2.1 Distributed Database Systems Architecture
• A distributed database (DDB) is a collection of multiple, logically interrelated databases distributed over a computer network.
• A distributed database management system (DDBMS) is the software that manages the DDB and provides an access mechanism that makes the distribution transparent to the users.
• Distributed database system (DDBS) = DDB + DDBMS.

BACKGROUND cont… 2.1 Distributed Database Systems Architecture
(Figure: the difference between a centralized DBMS on a network and a distributed DBMS environment.)

BACKGROUND cont… 2.2 Transaction Processing in DDBSs
• A transaction is a collection of actions that transform system states while preserving system consistency.
• Transaction processing has to ensure:
– Concurrency transparency
– Failure transparency

BACKGROUND cont… 2.3 Concurrency Control (CC)
• CC = coordinating concurrent accesses to data while preserving concurrency transparency.
• Main difficulty: preventing DB updates (writes) by one user from interfering with DB retrievals (reads) or updates (writes) performed by another.
• Without CC in place, problems such as lost updates and inconsistent retrievals arise.
• CC in DDBSs is harder than in a centralized DBS:
– Users may access data stored at many different nodes.
– A CC mechanism running at one node cannot instantly know about interactions at other nodes.

BACKGROUND cont… 2.3 Concurrency Control (CC) cont…
• In centralized DBMSs, CC is well understood.
– Studies started in the early 1970s; by the end of the 1970s, the Two-Phase Locking (2PL) approach (based on critical sections) had been accepted as the standard solution.
• In DDBSs, the choice of CC was still debated in the early 1980s, and a large number of algorithms were proposed.
• Bernstein & Goodman (1981) surveyed the state of the art in DDBMS CC, presenting 48 CC methods:
– Structure and correctness of the algorithms;
– Little emphasis on performance issues;
– Standard terminology for DDBMS CC algorithms and a standard model for the DDBMS.

BACKGROUND: Concurrency Control (CC) cont… 2.3.1 DDBMS Model
• Each site in a DDBMS is a computer running:
– a transaction manager (TM), and
– a data manager (DM).
• Correctness criteria for a CC algorithm; the users expect that:
– each transaction submitted will eventually be executed;
– the computation performed by each transaction will be the same whether it executes alone (in a dedicated system) or in parallel with other transactions in a multiprogrammed system.
• A DDBMS has four components: transactions, TMs, DMs, and data.
• There are 4 operations defined in the interface between a transaction (T) and its TM:
– BEGIN: the TM creates a private workspace for T.
– READ(X): data item X is read.
– WRITE(X, new-value): X in T's private workspace is updated to new-value.
– END: two-phase commit (2PC) takes place, and T's execution is finished.

BACKGROUND: Concurrency Control (CC) cont… 2.3.2 CC Problem Decomposition
• Serializability: an execution is serializable if it is computationally equivalent to a non-concurrent, serial execution.
• The CC problem is decomposed into 2 sub-problems (B&G):
– read-write (rw) synchronization, and
– write-write (ww) synchronization.
• Only 2 types of operations access the stored data: dm-read and dm-write.
• Two operations conflict if they operate on the same data item and at least one is a dm-write; the conflicts are either rw or ww.
• B&G determined that all the algorithms they examined were variations of only 2 basic techniques: two-phase locking (2PL) and timestamp ordering (TSO).

3. CLASSIC TSO PROTOCOLS
In Timestamp Ordering (TSO):
• The serialization order is selected a priori.
• Transaction execution is forced to obey this order.
• Each transaction is assigned a unique timestamp (TS) by its TM.
• The TM attaches the TS to all dm-reads and dm-writes issued on behalf of the transaction.
• DMs process conflicting operations in TS order.
• The timestamp of operation O is denoted ts(O).

3. Classic TSO Protocols cont… RW and WW Conflicts
• For rw synchronization, 2 operations conflict if both operate on the same data item, and one is a dm-read and the other is a dm-write.
• For ww synchronization, 2 operations conflict if both operate on the same data item, and both are dm-writes.

3. Classic TSO Protocols cont… 3.1 Basic TSO Implementation
• An implementation of TSO needs a TSO scheduler (S):
– a software module that receives dm-reads and dm-writes and outputs these operations according to the TSO rules.
– In DDBMSs, for two-phase commit (2PC) to work properly, prewrites must also be processed through the TSO scheduler.
– The basic TSO implementation distributes the schedulers along with the database.
• In centralized DBMSs, 2PC can be ignored, and thus the basic TSO scheduler is very simple:
– At each DM, and for each data item x stored at that DM, the scheduler records the largest timestamp of any dm-read(x) or dm-write(x) that has been processed.

3. Classic TSO Protocols cont… 3.1 Basic TSO Implementation cont…
• In distributed DBMSs, 2PC is incorporated by:
– timestamping prewrites, and accepting/rejecting prewrites instead of dm-writes.
– Once the scheduler (S) accepts a prewrite, it must guarantee to accept its corresponding dm-write.
• RW (or WW) synchronization:
– Once S accepts a prewrite(x) with timestamp TS, and until the corresponding dm-write(x) is output, S must not output any dm-read(x) or dm-write(x) with a timestamp newer than TS.
– The effect is similar to setting a write lock on data item x for the duration of two-phase commit.
• To implement the above rules, S buffers dm-reads, dm-writes, and prewrites.

3. Classic TSO Protocols cont… 3.1 Basic TSO Implementation – The Algorithm
• Let min-r-ts(x) be the minimum timestamp of the buffered dm-reads on x, and let min-w-ts(x) and min-p-ts(x) be defined analogously.
• RW synchronization:
– Let R be a dm-read(x):
If ts(R) < w-ts(x), R is rejected;
else if ts(R) > min-p-ts(x), R is buffered;
else R is output.
– Let P be a prewrite(x):
If ts(P) < r-ts(x), P is rejected;
else P is buffered.
– Let W be a dm-write(x); W is never rejected:
If ts(W) > min-r-ts(x), W is buffered;
else W is output, and the corresponding prewrite is de-buffered.

3. Classic TSO Protocols – 3.1 Basic TSO Implementation cont…
(Figure: buffer emptying for basic T/O rw synchronization.)

3. Classic TSO Protocols – 3.1 Basic TSO Implementation cont… – Author's note
A typo in the previous pseudocode figure?
• The pseudocode given for when an R operation is "ready" is: R is ready if it strictly precedes the earliest prewrite request, i.e., if ts(R) < min-p-ts(x).
• I found that the following modification is needed: R is ready if it does not follow the earliest prewrite request, i.e., if ts(R) <= min-p-ts(x).

Author's note (cont…) – Test case, before the correction
Input (all operations on data item 0):
Time 4: Prewrite (value 4)
Time 6: Prewrite (value 6)
Time 5: Read
Time 5: Prewrite (value 5)
Time 5: Write (value 5)
Time 4: Write (value 4)
Time 6: Write (value 6)
Output:
Time 4: Write, value 4, Status: SUCCEEDED, Conflict: No_Conflict
As I expected, the output shows only transaction 4 being executed, even though logically I would expect transactions 5 and 6 to succeed as well.
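The basic T/O rw rules above, with the corrected buffer-emptying test (R is ready if ts(R) <= min-p-ts(x)), can be sketched in Python. This is a minimal sketch, not B&G's exact formulation: the class and method names, the buffer-then-flush structure, and the tie-break that a dm-read outputs before a dm-write carrying the same timestamp are all my own assumptions.

```python
import math

class BasicTSORW:
    """Minimal sketch of a basic T/O rw scheduler (illustrative names)."""

    def __init__(self):
        self.r_ts = {}    # x -> largest timestamp of any output dm-read(x)
        self.w_ts = {}    # x -> largest timestamp of any output dm-write(x)
        self.data = {}    # x -> current value
        self.pre = {}     # x -> buffered prewrite timestamps
        self.reads = {}   # x -> buffered dm-read timestamps
        self.writes = {}  # x -> {timestamp: value} of buffered dm-writes
        self.log = []     # operations in output order

    def _min_p(self, x):
        return min(self.pre.get(x, set()), default=math.inf)

    def _min_r(self, x):
        return min(self.reads.get(x, set()), default=math.inf)

    def dm_read(self, ts, x):
        if ts < self.w_ts.get(x, 0):
            return "rejected"                 # R arrived too late
        self.reads.setdefault(x, set()).add(ts)
        self._flush(x)
        return "output" if ts not in self.reads[x] else "buffered"

    def prewrite(self, ts, x):
        if ts < self.r_ts.get(x, 0):
            return "rejected"
        self.pre.setdefault(x, set()).add(ts)
        return "buffered"

    def dm_write(self, ts, x, value):
        # a dm-write is never rejected: its prewrite was already accepted
        self.writes.setdefault(x, {})[ts] = value
        self._flush(x)
        return "output" if ts not in self.writes[x] else "buffered"

    def _flush(self, x):
        progress = True
        while progress:
            progress = False
            # R is ready if ts(R) <= min-p-ts(x): the corrected test
            for ts in sorted(self.reads.get(x, set())):
                if ts <= self._min_p(x):
                    self.reads[x].discard(ts)
                    self.r_ts[x] = max(self.r_ts.get(x, 0), ts)
                    self.log.append(("R", ts, x, self.data.get(x)))
                    progress = True
            # W is ready once no buffered dm-read has an equal or smaller
            # timestamp (an equal-timestamp read goes first)
            for ts in sorted(self.writes.get(x, {})):
                if ts < self._min_r(x):
                    value = self.writes[x].pop(ts)
                    self.pre.get(x, set()).discard(ts)  # de-buffer prewrite
                    self.w_ts[x] = max(self.w_ts.get(x, 0), ts)
                    self.data[x] = value
                    self.log.append(("W", ts, x, value))
                    progress = True
                    break   # min-p-ts(x) may have grown; rescan the reads
```

Replaying the test case from the slides (prewrites at 4, 6, 5; a read at 5; writes at 5, 4, 6) against this sketch yields the output order W4, R5 (reading value 4), W5, W6, i.e., the corrected trace rather than the single-write output.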
Classic TSO Protocols – 3.1 Basic TSO Implementation cont… – Author's note (cont…)
Therefore, I made the following correction to the source code: R is ready if ts(R) <= min-p-ts(x).
New output:
Time 4: Write, value 4, SUCCESS, No_Conflict
Time 5: Read, value 4, SUCCESS, RW_Conflict
Time 5: Write, value 5, SUCCESS, RW_Conflict
Time 6: Write, value 6, SUCCESS, No_Conflict

3. Classic TSO Protocols cont… 3.1 Basic TSO Implementation cont…
• WW synchronization:
– Let P be a prewrite(x): if ts(P) < w-ts(x), P is rejected; else P is buffered.
– Let W be a dm-write(x); W is never rejected: if ts(W) > min-p-ts(x), W is buffered; else W is output.
– When W is output, the corresponding prewrite is de-buffered. If min-p-ts(x) increased, buffered dm-writes are retested to see whether any can now be output.

3.2 The Thomas Write Rule (TWR)
• Basic TSO can be optimized for ww synchronization using TWR:
– Let W be a dm-write(x), and suppose ts(W) < w-ts(x).
– Instead of rejecting W, it can simply be ignored.
• TWR applies to dm-writes that try to place obsolete data into the DB.
• TWR guarantees that applying a set of dm-writes to x has the same effect as if the dm-writes were applied in TS order.
• If TWR is used, there is no need to incorporate 2PC into the ww synchronization algorithm; the ww scheduler always accepts prewrites and never buffers dm-writes.

3.3 Multiversion TSO
• Basic TSO can be improved for rw synchronization using multiversion data items.
• For each data item x there is a set of r-ts's and a set of (w-ts, value) pairs, called versions.
• The r-ts's of x record the timestamps of all executed dm-read(x) operations, and the versions record the timestamps and values of all executed dm-write(x) operations.
• In practice one cannot store r-ts's and versions forever; techniques for deleting old versions and timestamps are needed.
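The slides describe the state that multiversion T/O keeps per data item; the read/write rules over that state can be sketched as follows. This is a minimal sketch of the standard multiversion T/O rules, with names of my own and timestamps assumed unique and positive: a dm-read is never rejected (it reads the newest version not newer than its own timestamp), and a dm-write is rejected only if a later dm-read was already served a version this write would have superseded.

```python
import bisect

class MVItem:
    """Minimal sketch of multiversion T/O state for one data item."""

    def __init__(self, initial=None):
        self.versions = [(0, initial)]   # (w-ts, value) pairs, sorted by w-ts
        self.read_ts = []                # timestamps of executed dm-reads

    def _version_before(self, ts):
        # index of the version with the largest w-ts <= ts
        keys = [w for w, _ in self.versions]
        return bisect.bisect_right(keys, ts) - 1

    def dm_read(self, ts):
        # never rejected: read the latest version written before ts(R)
        self.read_ts.append(ts)
        return self.versions[self._version_before(ts)][1]

    def dm_write(self, ts, value):
        # reject only if a later dm-read already "read around" this write:
        # some r >= ts was served a version older than ts
        for r in self.read_ts:
            if r >= ts and self.versions[self._version_before(r)][0] < ts:
                return "rejected"
        bisect.insort(self.versions, (ts, value))
        return "output"
```

For example, after a write at timestamp 2 and a read at timestamp 5, a late write at timestamp 3 is rejected (the read at 5 should have seen it), while a write at timestamp 7 and a read at timestamp 4 both succeed.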
3.4 Conservative TSO
• Eliminates restarts during TSO scheduling.
• Requires that each scheduler receive dm-reads (or dm-writes) from each TM in TS order.
• Since the network is assumed FIFO, this ordering is accomplished by requiring TMs to send dm-reads (or dm-writes) to schedulers in TS order.
• The scheduler buffers dm-reads and dm-writes as part of its normal operation; when it buffers an operation, it remembers which TM sent it.

3.5 Timestamp Management
• A common critique of TSO schedulers: too much memory is needed to store timestamps.
– This can be overcome by "forgetting" old timestamps.
• Timestamps are used in basic TSO to reject operations that "arrive late"; for example, a dm-read(x) with timestamp TS1 is rejected if it arrives after a dm-write(x) with timestamp TS2, where TS1 < TS2.
• In principle, TS1 and TS2 can differ by an arbitrary amount; in practice it is unlikely that timestamps will differ by more than a few minutes.
• Consequently, timestamps can be stored in small tables that are periodically purged.

3.5 Timestamp Management cont…
• R-ts's are stored in R-table entries of the form (x, R-ts(x)); for any data item x, there is at most one entry.
• A variable R-min holds the maximum value of any timestamp that has been purged from the table.
• To update R-ts(x), the scheduler modifies the (x, R-ts(x)) entry in the table if one exists; otherwise, a new entry is created.
• When the R-table is full, the scheduler selects an appropriate value for R-min and deletes all entries with smaller timestamps.
• The W-ts's are managed similarly; analogous techniques can be devised for multiversion TSO databases.
• Maintaining timestamps for conservative TSO is even cheaper, since it requires only timestamped operations, not timestamped data as well.
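The R-table management above can be sketched as follows. All names are illustrative, and the purge policy shown (cutting at the median stored timestamp) is just one possible choice of "an appropriate value for R-min". Because R-min over-approximates every purged timestamp, lookups for forgotten items are conservative: the scheduler may reject a prewrite it could have accepted, but never accepts one it should have rejected.

```python
class RTable:
    """Sketch of a bounded R-table with an R-min purge floor."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = {}    # x -> R-ts(x)
        self.r_min = 0       # upper bound on every purged timestamp

    def r_ts(self, x):
        # items not in the table are conservatively assumed to have
        # been read as recently as R-min
        return self.entries.get(x, self.r_min)

    def record_read(self, x, ts):
        self.entries[x] = max(self.r_ts(x), ts)
        if len(self.entries) > self.capacity:
            self._purge()

    def _purge(self):
        # choose a cutoff (here: the median stored timestamp), raise
        # R-min to it, and delete every entry below the cutoff
        cutoff = sorted(self.entries.values())[len(self.entries) // 2]
        self.r_min = max(self.r_min, cutoff)
        self.entries = {x: t for x, t in self.entries.items() if t >= cutoff}
```

After a purge, asking for a forgotten item's R-ts returns R-min, which is at least as large as the timestamp that was discarded.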
3.6 Integrated CC Methods
• An integrated CC method consists of:
– two components (an rw and a ww synchronization technique), and
– an interface between the components to ensure serializability.
• Bernstein & Goodman list 48 CC methods that can be constructed by combining 2PL and/or TSO synchronization techniques:
– pure 2PL methods,
– pure TSO methods, or
– methods that combine 2PL and TSO techniques.
• This presentation discusses only the pure TSO methods.

3.6 Integrated CC Methods cont…
Pure TSO Method | RW technique      | WW technique
1               | Basic T/O         | Basic T/O
2               | Basic T/O         | Thomas Write Rule (TWR)
3               | Basic T/O         | Multiversion T/O
4               | Basic T/O         | Conservative T/O
5               | Multiversion T/O  | Basic T/O
6               | Multiversion T/O  | TWR
7               | Multiversion T/O  | Multiversion T/O
8               | Multiversion T/O  | Conservative T/O
9               | Conservative T/O  | Basic T/O
10              | Conservative T/O  | TWR
11              | Conservative T/O  | Multiversion T/O
12              | Conservative T/O  | Conservative T/O

4. TSO PERFORMANCE
• Main performance metrics for CC algorithms:
– system throughput
– transaction response time
• 4 cost factors influence these metrics:
– inter-site communication
– local processing
– transaction restarts
– transaction blocking
• The impact of each cost factor varies with the algorithm, the system, and the application type.
• B&G state, at the time of their 1981 paper: "impact detail is not understood yet, and a comprehensive quantitative analysis is beyond the state of the art".

4. TSO PERFORMANCE cont…
• Since 1981 more research has been done on distributed CC performance.
• Carey & Livny (1988) studied the performance of 4 algorithms: distributed 2PL, Wound-Wait (WW), basic TSO (BTO), and a distributed optimistic algorithm (OPT). They:
– examined various degrees of contention, "distributedness" of the workload, and data replication;
– found that 2PL and OPT dominated BTO and WW;
– concluded that optimistic locking (where transactions lock remote copies of data only as they enter the commit protocol, at the risk of end-of-transaction deadlocks) is the best performer in replicated DBs where messages are costly.

5. TSO TODAY
• Until recently, TSO was viewed as the black sheep of the family of CC algorithms. This seems to be changing…
• Nørvåg et al. (1997) present a simulation study examining the performance and response times of two CC algorithms, TSO and 2PL.
• Their results show the following:
– For a mix of short and long transactions, throughput is significantly higher for TSO than for the 2PL scheduler.
– For short transactions only, the performance is almost identical.
– For long transactions only, 2PL performs better.
• The authors comment:
– TSO throughput was not expected to be higher than 2PL's (most previous studies used centralized DBs).
– In tests done using the centralized version of the simulator, 2PL indeed performed better than TSO in most cases.

5. TSO TODAY cont…
• Srinivasa et al. (2001) take a new look at TSO CC:
– They state that during the 1980s and 1990s, the popular conception was that TSO techniques like basic TSO (BTO) perform poorly compared to dynamic 2PL (D-2PL); that is because previous studies concentrated on centralized DBs and low-data-contention scenarios.
– Under those conditions, BTO's high reject rate (and thus high restart rate) causes it to reach hardware resource limits.
– But today's processors are faster, and workload characteristics have changed.
– The authors show that BTO's performance is much better than D-2PL's over a wide range of conditions, especially under high data contention.
– D-2PL outperforms BTO only when both data contention and message latency are low.
– With increasing throughput demands, high data contention becomes increasingly important, and BTO becomes an attractive choice for concurrency control.

5. TSO TODAY cont… 5.1 New TSO Uses
• Pacitti et al. (1999, 2001) proposed refreshment algorithms using TSO:
– They address the central problem of maintaining replica consistency in a lazy-master replicated system.
– In their related-work section they also mention other relatively recent work whose authors propose 2 new lazy update protocols, also based on timestamp ordering.

5. TSO TODAY: 5.1 New TSO Uses cont…
• Jensen & Lomet (2001) provide a new approach to transaction timestamping, both for the choice of the timestamp and for the CC protocol:
– Their CC method combines TSO with 2PL.
– They delay the choice of a transaction's timestamp until the moment the timestamp is needed by a statement in the transaction, or until the transaction commits.
– They delay the choice because the classical approach of fixing the timestamp at transaction start increases the chance that timestamp consistency checking will fail, resulting in aborts.

6. CONCLUSION
• This presentation discussed Timestamp Ordering (TSO) Concurrency Control (CC), from its beginnings in the 1970s to the present (2000s).
• The "classic" TSO methods were described thoroughly by Bernstein & Goodman in a 1981 survey of the state of the art in DDBMS CC.
– B&G design a system model and define terminology and concepts for a variety of CC algorithms. They introduce the decomposition of CC algorithms into read-write and write-write synchronization sub-algorithms. They do not focus on performance.
• The main CC performance metrics are:
– system throughput
– transaction response time
CONCLUSION cont…
• As foreseen in the Bernstein & Goodman 1981 survey, many studies of CC algorithm performance have been carried out since then (e.g., the 1988, 1997, and 2001 studies mentioned in this presentation).
• Even if in the 1970s and 1980s TSO was viewed, performance-wise, as the black sheep of the family of CC algorithms, this view is changing today, following recent findings that TSO actually performs better in many situations in today's DDBMSs.

7. REFERENCES
[1] T. Komiya, H. Ohshida, M. Takizawa. Mobile Agent Model for Transaction Processing in Distributed Database Systems. Information Sciences, vol. 154, issues 1-2, Aug 2003.
[2] R. Ramakrishnan, J. Gehrke. Database Management Systems. McGraw-Hill, 2003. Chapter 22 (Parallel and Distributed Databases).
[3] A. Tanenbaum, M. van Steen. Distributed Systems: Principles and Paradigms. Prentice-Hall, 2002. Chapters 1, 2, and 5.
[4] Philip A. Bernstein, Nathan Goodman. Concurrency Control in Distributed Database Systems. ACM Computing Surveys (CSUR), vol. 13, issue 2, June 1981.
[5] Michael J. Carey, Miron Livny. Distributed Concurrency Control Performance: A Study of Algorithms, Distribution, and Replication. Proceedings of the 14th International Conference on Very Large Data Bases (VLDB), Los Angeles, California, USA, August 29 – September 1, 1988.
[6] Kjetil Nørvåg, Olav Sandstå, Kjell Bratbergsengen. Concurrency Control in Distributed Object-Oriented Database Systems. Advances in Databases and Information Systems, 1997.
[7] Rashmi Srinivasa, Craig Williams, Paul F.
Reynolds Jr. A New Look at Timestamp Ordering Concurrency Control. Database and Expert Systems Applications: 12th International Conference, DEXA 2001, Munich, Germany, September 3-5, 2001, Proceedings.
[8] C. Jensen, D. Lomet. Transaction Timestamping in (Temporal) Databases. VLDB Conference, Rome, Italy, Sept. 2001. Available at ftp://ftp.research.microsoft.com/users/lomet/pub/temporaltime.pdf
[9] E. Pacitti, P. Minet, E. Simon. Fast Algorithms for Maintaining Replica Consistency in Lazy Master Replicated Databases. Proceedings of the 25th International Conference on Very Large Data Bases (VLDB'99), Edinburgh, Scotland, UK, September 7-10, 1999, pp. 126-137. Also published in Distributed and Parallel Databases, 9, 237-267, 2001, Kluwer Academic Publishers.
[10] Stéphane Gançarski, Hubert Naacke, Esther Pacitti, Patrick Valduriez. Parallel Processing with Autonomous Databases in a Cluster System. Proc. CoopIS 2002, Irvine, California.
[11] M. Tamer Özsu, Patrick Valduriez. Principles of Distributed Database Systems, Second Edition. Prentice Hall, ISBN 0-13-659707-6, 1999. Notes available at http://www.cs.ualberta.ca/~database/ddbook.html
[12] J. N. Gray. How High is High Performance Transaction Processing? Presentation at the High Performance Transaction Processing Workshop (HPTS99), Asilomar, California, September 26-29, 1999.
[13] Y. Breitbart, R. Komondoor, R. Rastogi, S. Seshadri. Update Propagation Protocols for Replicated Databases. ACM SIGMOD Int. Conference on Management of Data, Philadelphia, PA, May 1999.