International Journal of Engineering Trends and Technology (IJETT) – Volume 16 Number 9 – Oct 2014

On Real-Time Optimistic Concurrency Control

Faiz Baothman
Department of Computer Science, College of Computers and Information Technology, Taif University, Taif, Saudi Arabia
f.baothman@tu.edu.sa

Muzammil H Mohammed
Department of Information Technology, College of Computers and Information Technology, Taif University, Taif, Saudi Arabia
m.muzammil@tu.edu.sa

Abstract: The performance of a database transaction processing system can be profoundly affected by the concurrency control mechanism employed, since database integrity must be preserved in a multi-user environment. In many applications, a database management system has to operate under real-time constraints, where it must satisfy timing constraints in addition to maintaining database integrity. In this paper we investigate the effect of system resource availability on the performance of optimistic concurrency control in a firm real-time database system, the performance gain with a memory-resident database, and the effect of a virtual run policy.

Keywords: Real-Time Databases, Scheduling, Concurrency Control, Transaction Processing.

I. INTRODUCTION

The task of a concurrency control (CC) mechanism is to ensure the consistency of the database while allowing a set of transactions to execute concurrently [1]. A real-time database system (RTDBS) is a database system in which transactions have explicit timing constraints, such as deadlines; the primary performance criterion is timeliness, and the scheduling of transactions is driven by priority rather than fairness considerations [2, 3, 12]. CC is one of the main issues in the study of RTDBS. In a hard RTDBS, missing a deadline may result in a catastrophe. In a soft RTDBS, transactions have deadlines, but there is still some (decreasing) value in allowing them to complete even after their deadlines. When that value drops to zero as soon as a transaction misses its deadline, but missing the deadline has no catastrophic consequences, the system is referred to as a firm RTDBS [4, 5, 6, 13]. Real-time database systems have grown larger and become more critical. For example, they are used in stock trading, telephone switching systems, virtual environment systems, network management, automated factory management, and command and control systems.

II. OPTIMISTIC CONCURRENCY CONTROL

Optimistic concurrency control (OCC) schemes [7] are designed to avoid locking overhead. They are optimistic in the sense that they rest on the explicit assumption that conflicts among transactions are rare events; thus, they rely for efficiency on the expectation that conflicts will not occur. Since no locks are taken, such schemes are deadlock-free. Under OCC, conflict detection and resolution are both performed at certification time: when a transaction completes its execution, it asks the CC manager to validate all the data objects it has accessed (in contrast to the pessimistic approach of two-phase locking, which detects conflicts before a transaction accesses a data object). The basic idea of optimistic CC schemes is that the execution of a transaction consists of three phases, read, validation and write, as shown in Fig. 1. During the read phase, a read access is first directed to the transaction's private workspace. If the data object is not found there, the database has to be accessed, that is, the system buffer or the copy stored on disk. Updated or modified data objects are kept in the transaction's private workspace.
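To make the read-phase bookkeeping concrete, the following C sketch shows how a private workspace might serve reads and buffer updates. It is only an illustration: the names (ws_entry, occ_txn, db_fetch), the fixed-size workspace and the fixed object size are assumptions made here, not part of any particular system.

/* Illustrative sketch of a transaction's private workspace for the
 * read phase described above.  All names and sizes are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_WS   64          /* assumed workspace capacity            */
#define OBJ_SIZE 128         /* assumed size of one data object       */

typedef struct {
    int  oid;                /* object (page) identifier              */
    char value[OBJ_SIZE];    /* private copy of the object            */
    bool dirty;              /* set once the transaction modifies it  */
} ws_entry;

typedef struct {
    ws_entry ws[MAX_WS];     /* the private workspace                 */
    size_t   ws_len;
} occ_txn;

/* Assumed hook: copy the object from the system buffer or from disk. */
extern void db_fetch(int oid, char out[OBJ_SIZE]);

static ws_entry *ws_lookup(occ_txn *t, int oid)
{
    for (size_t i = 0; i < t->ws_len; i++)
        if (t->ws[i].oid == oid)
            return &t->ws[i];
    return NULL;
}

static ws_entry *ws_add(occ_txn *t, int oid)
{
    assert(t->ws_len < MAX_WS);
    ws_entry *e = &t->ws[t->ws_len++];
    e->oid = oid;
    e->dirty = false;
    db_fetch(oid, e->value);      /* miss: go to the database          */
    return e;
}

/* Read: served from the workspace if present, fetched otherwise. */
const char *occ_read(occ_txn *t, int oid)
{
    ws_entry *e = ws_lookup(t, oid);
    if (e == NULL)
        e = ws_add(t, oid);
    return e->value;
}

/* Write: only the private copy changes; the database itself is not
 * touched until the write phase, after successful validation.       */
void occ_write(occ_txn *t, int oid, const char new_val[OBJ_SIZE])
{
    ws_entry *e = ws_lookup(t, oid);
    if (e == NULL)                /* in the model a write always follows */
        e = ws_add(t, oid);       /* a read, but guard the general case  */
    memcpy(e->value, new_val, OBJ_SIZE);
    e->dirty = true;
}

During the write phase, only the entries marked dirty would be copied back to the database.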
After completing its read phase, a transaction enters the validation phase, where the CC manager checks whether the transaction intending to commit conflicts with any of the transactions running in parallel. If so, some conflict resolution policy has to be applied. If no conflict is detected, the transaction is prepared to commit and enters its write phase, where it writes all its updates to the database, i.e., makes all its modifications generally visible to other concurrent transactions. The key component among these three phases is the validation phase, where a transaction's fate is decided. Validation is based on the following principles to ensure serializability: if a transaction Ti is serialized before transaction Tj, the following two rules must be observed [8].

Rule 1. No overwriting. The writes of Ti should not overwrite the writes of Tj, and vice versa.

Rule 2. No read dependency. The writes of Ti should not affect the read phase of Tj.

Rule 1 is automatically ensured in most OCC schemes because the I/O operations of the write phase are required to be performed sequentially in a critical section. Rule 2 can be enforced in one of two ways [9]: backward validation and forward validation. In backward validation, the validating transaction either commits or aborts depending on whether or not it conflicts with transactions that have already committed. There is thus no way to take transaction timing constraints into account, and the only way to resolve a conflict, if one exists, is to restart the validating transaction; backward validation is therefore not suitable for real-time database systems. In forward validation, a transaction is validated against the concurrently running transactions, and either the validating transaction or the active conflicting transactions can be aborted and restarted to resolve the conflict. Forward validation thus provides flexibility for conflict resolution, because it may sometimes be preferable not to commit the validating transaction, depending on the timing characteristics or priorities of the validating transaction and the active conflicting transactions. In addition, forward validation generally detects and resolves data conflicts earlier than backward validation and hence wastes less resources and time.

This work was motivated by a desire to investigate the performance of optimistic concurrency control and some of its variants in a firm real-time database system under different resource-related assumptions. Selecting a concurrency control scheme for an RTDBS is strongly resource dependent; as hardware costs continue to fall dramatically, the dominant factor in the near future will not be cost, but rather the return obtained from the available resources.

Fig. 1. The three phases of a transaction: read, validation and write, shown against time.

III. ADAPTATION OF OCC FOR RTDBS

Optimistic concurrency control is non-blocking and deadlock-free. These properties make OCC attractive for real-time transaction processing systems. To adapt optimistic concurrency control schemes to a real-time database environment, the key issue is how to incorporate transaction priorities or timing constraints into the conflict resolution mechanism.
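All four variants described below share the same detection step of forward validation and differ only in whom they abort. A rough C sketch of that detection step is given here; the txn_sets structure and the sorted-array representation of read and write sets are assumptions made purely for the illustration.

/* Sketch of the forward-validation detection step, assuming each
 * transaction records its read and write sets as sorted arrays of
 * object ids.  Type and function names are illustrative only.      */
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    const int *read_set;
    const int *write_set;
    size_t     n_read;
    size_t     n_write;
} txn_sets;

/* True if two sorted id arrays share at least one object. */
static bool sets_intersect(const int *a, size_t na, const int *b, size_t nb)
{
    size_t i = 0, j = 0;
    while (i < na && j < nb) {
        if (a[i] == b[j]) return true;
        if (a[i] < b[j])  i++;
        else              j++;
    }
    return false;
}

/* The validating transaction Tv conflicts with an active read-phase
 * transaction Tr if Tv's write set intersects Tr's read set.  The
 * indices of the conflicting transactions are returned; the policy
 * (OCC-FV, OCC-HP100, OCC-HP50, OCC-HP) then decides whom to abort. */
size_t forward_validate(const txn_sets *tv,
                        const txn_sets active[], size_t n_active,
                        size_t conflicts[])
{
    size_t n_conf = 0;
    for (size_t k = 0; k < n_active; k++)
        if (sets_intersect(tv->write_set, tv->n_write,
                           active[k].read_set, active[k].n_read))
            conflicts[n_conf++] = k;
    return n_conf;
}

Under OCC-FV, for example, every transaction whose index is returned would simply be aborted; the priority-cognizant variants first compare the priorities of Tv and the conflicting transactions.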
The validating transaction may conflict with a set of transactions, some of which have higher priority and others lower priority than the validating transaction itself. We describe below the four optimistic schemes used in the experiments.

IV. OCC-FORWARD VALIDATION (OCC-FV)

In this scheme, the transaction that reaches its validation phase is allowed to commit, and all active conflicting transactions that are in their read phases are aborted. This scheme does not take transaction timing constraints into account; it favours the validating transaction in order to save the progress it has already made, and a validating transaction therefore always completes, since it is never restarted.

if (Tv conflicts with TR1, ..., TRi) then
    abort all conflicting transactions

Fig. 2. Conflict resolution of OCC-FV

Here TR1, ..., TRi denote transactions in their read phase and Tv denotes the transaction in its validation phase.

V. OCC-HIGH PRIORITY100 (OCC-HP100)

The validating transaction is aborted and restarted if all conflicting read-phase transactions have higher priority than the validating one; otherwise it commits and all the conflicting transactions are restarted.

if (Tv conflicts with TR1, ..., TRi) then
    if (all conflicting transactions have higher priority than Tv) then
        abort Tv
    else
        abort all conflicting transactions

Fig. 3. Conflict resolution of OCC-HP100

VI. OCC-HIGH PRIORITY50 (OCC-HP50)

In this scheme, when a transaction reaches its validation phase, its priority is checked against those of the conflicting transactions in their read phases. If more than 50 percent of the conflicting transactions have higher priority, the validating transaction is aborted and all conflicting transactions are allowed to continue; otherwise the validating transaction commits and all conflicting transactions are restarted.

if (Tv conflicts with TR1, ..., TRi) then
    if (more than 50% of the conflicting transactions have higher priority than Tv) then
        abort Tv
    else
        abort all conflicting transactions

Fig. 4. Conflict resolution of OCC-HP50

VII. OCC-HIGH PRIORITY (OCC-HP)

OCC-HP is an optimistic protocol that uses priority-driven aborts for conflict resolution. When a transaction reaches its validation phase, it is aborted if one or more conflicting transactions have higher priority than the validating one; otherwise it commits and all the conflicting transactions are restarted immediately. This protocol uses transaction priority (timing constraints) in such a way that the validating transaction sacrifices itself for the sake of a conflicting transaction with higher priority.

if (Tv conflicts with one or more of TR1, ..., TRi) then
    if (Tv has higher priority than all conflicting transactions) then
        abort all conflicting transactions
    else
        abort Tv

Fig. 5. Conflict resolution of OCC-HP

VIII. PERFORMANCE MODEL

The simulated RTDBS is a single-site database system, modelled both as disk resident and as memory resident, operating on a shared-memory multiprocessor. The CPUs share a single queue whose service discipline is priority scheduling without pre-emption. Each disk has its own queue and is also scheduled by priority [9]. In this model, the execution of a transaction consists of multiple instances of alternating data access requests and data operation steps, until all of its data operations complete or it is aborted for some reason.
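A minimal sketch of this alternating pattern is shown below; request_access, maybe_disk_io and cpu_operation are hypothetical stand-ins for the simulator's CC manager, buffer model and CPU model, not a real API.

/* Sketch of the simulated read phase: alternating data access
 * requests and data operation steps.  The extern functions are
 * assumed hooks into the simulator.                              */
#include <stdbool.h>

extern bool request_access(int txn_id, int oid);  /* CC request; granted at once under OCC      */
extern void maybe_disk_io(int oid);               /* queue disk I/O if the page is not buffered */
extern void cpu_operation(int oid);               /* consume per-object CPU time                */

/* Returns false if the CC manager aborted the transaction mid-phase. */
bool read_phase(int txn_id, const int objects[], int n)
{
    for (int i = 0; i < n; i++) {
        if (!request_access(txn_id, objects[i]))
            return false;            /* aborted: back to the CC queue   */
        maybe_disk_io(objects[i]);   /* data access step                */
        cpu_operation(objects[i]);   /* data operation step             */
    }
    return true;                     /* read phase done: validate next  */
}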
For optimistic concurrency control, the first CC request is granted immediately and all object accesses are then performed with no intervening CC requests; only after the last object access is finished does a transaction return to the CC manager [10]. When a transaction completes its data access requests, it asks the concurrency control manager to validate them. If it is validated, it enters the write phase with its priority raised to the maximum, so that it can complete as fast as possible. Whenever a transaction passes through concurrency control for a data access request, or whenever it is restarted, it undergoes a deadline test. If it has missed its deadline, it is terminated and permanently discarded from the system. When the CC manager decides to validate a transaction and restart the active conflicting transactions (a transaction in its read phase is considered active), or vice versa, a restarted transaction that has not yet missed its deadline re-enters the CC queue and then performs all of its data accesses and operations again from the beginning, over the same read and write sets (a real restart).

The database is modelled as a set of pages, each of which contains a single data object. The database size is fixed at 200 pages to investigate performance under high data contention, that is, to create a situation in which conflicts are frequent. The small database also allows us to study the effect of hot spots, in which a small part of the database is accessed frequently by most of the transactions. A transaction consists of a mixed sequence of read and write operations. We assume that a write operation is always preceded by a read, i.e., the write set of a transaction is always a subset of its read set. The database buffer pool is simulated probabilistically: when a transaction attempts to read a data item, the system determines whether the page is in memory or on disk using the probability DISK ACCESS PROB. If the page is in memory, the transaction continues processing without a disk access; otherwise, an I/O service request is created and placed in the input queue of the appropriate disk. The database is partitioned equally over the disks, and an object i is mapped to the disk D = ⌈(i × Number of disks) / DB size⌉ on which it is stored.

Transactions arrive in a Poisson stream, i.e., their inter-arrival times are exponentially distributed; the mean arrival time parameter specifies the mean inter-arrival time between transactions. The number of data objects accessed by a transaction is determined by a normal distribution with a mean transaction length of 10, and the actual items are chosen at random from among all the data objects in the database. A data item that is read is updated with probability Update probability. We also assume that the cost of executing concurrency control operations is included in the parameter that specifies how much CPU time is needed per data object accessed. The assignment of deadlines is controlled by a minimum slack factor of 2 and a maximum slack factor of 8, which set a lower and an upper bound, respectively, on a transaction's slack, together with AT and ET, which denote the transaction's arrival time and execution time. A deadline is assigned by choosing a slack factor uniformly from the range specified by these bounds.
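The deadline assignment and disk mapping just described might look as follows. This is only a sketch: the formulation deadline = AT + slack × ET is assumed (it is the usual one for these slack-factor parameters), uniform() stands in for the simulator's random number source, and object identifiers are taken to be zero-based.

/* Sketch of deadline assignment and object-to-disk mapping under the
 * parameter settings given above.  Names are illustrative only.      */
#include <stdlib.h>

#define MIN_SLACK 2.0     /* minimum slack factor  */
#define MAX_SLACK 8.0     /* maximum slack factor  */
#define DB_SIZE   200     /* pages in the database */
#define NUM_DISKS 4       /* disks in the LSR case */

/* Stand-in for the simulator's uniform random number generator. */
static double uniform(double lo, double hi)
{
    return lo + (hi - lo) * ((double)rand() / (double)RAND_MAX);
}

/* Deadline = arrival time + slack factor * estimated execution time,
 * with the slack factor drawn uniformly from [MIN_SLACK, MAX_SLACK]. */
double assign_deadline(double arrival_time, double est_exec_time)
{
    return arrival_time + uniform(MIN_SLACK, MAX_SLACK) * est_exec_time;
}

/* Object i (0-based) lives on disk i / (DB_SIZE / NUM_DISKS): the
 * database is split into equal contiguous ranges of 50 pages per disk,
 * matching D = ceil(i * Number of disks / DB size) for 1-based ids.    */
int disk_of(int i)
{
    return i / (DB_SIZE / NUM_DISKS);
}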
The execution time used for a transaction is not its actual execution time, but an estimate computed from the parameters mean transaction length, mean CPU computation time and mean disk access time. Transaction priorities are assigned by the Earliest Deadline First policy, which uses only deadline information to decide a transaction's priority.

Our RTDBS simulator was written in C. For each of the following experiments, the simulation was run with the same parameters for at least 20 different random number seeds to generate each data point, and each run continued until 2000 transactions had been executed. The statistics reported in this paper have 90% confidence intervals whose endpoints are within 10% of the reported mean values, which are the values plotted in the graphs. The most important goal of an RTDBS is to meet the time constraints of its activities; the primary performance metric is therefore the percentage of transactions that miss their deadlines, referred to as the Miss Percentage and calculated as: Miss Percentage = 100 × (number of deadline-missing transactions / total number of transactions processed). We also report the average number of restarts per transaction, i.e., the average number of times a transaction has to restart before it either completes or misses its deadline and is permanently discarded from the system. It is computed as the ratio of the number of transaction-restart events to the number of processed transactions.

IX. EXPERIMENTS AND RESULTS

We investigate the performance of the four optimistic concurrency control schemes described above to show the impact of system resource availability on their performance, since the conflict resolution method used by the concurrency control mechanism has a direct effect on the utilisation of system resources.

A. Limited System Resources (LSR) versus Unlimited System Resources (USR)

In this experiment we evaluate performance under limited system resources, with the number of CPUs and disks fixed at 2 and 4, respectively. The results show the miss percentage behaviour of the four schemes under different levels of system workload, where workload is controlled by the transaction arrival rate; the update probability is set to 0.25 in this experiment. For very low arrival rates there is not much difference among the four protocols. However, as the arrival rate increases, OCC-FV, OCC-HP100 and OCC-HP50 do better than OCC-HP, and OCC-FV does even slightly better than OCC-HP100 and OCC-HP50. The performance difference becomes clearer as the update probability increases to 0.5. OCC-FV, OCC-HP100 and OCC-HP50 outperform OCC-HP because they avoid (OCC-FV) or try to avoid wasting the work done by validating transactions, in contrast to OCC-HP, where a validating transaction is aborted for the sake of even a single higher-priority conflicting transaction that is still in its read phase and may itself later be aborted. OCC-FV performs slightly better than OCC-HP100 and OCC-HP50 because every transaction that reaches its validation phase is allowed to commit. There is also a slight gain for OCC-HP100 over OCC-HP50, since OCC-HP100 imposes a stricter condition for aborting the validating transaction than OCC-HP50 does.
Thus, the results obtained are biased in favour of the schemes that save, or try to save, the validating transactions. Since the system in this experiment operates with limited resources, it exhibits a high level of resource contention. The average number of restarts therefore starts to decrease once resource contention, rather than data contention, dominates in causing deadline-missing transactions to be discarded; at a certain workload point, when the system saturates, its value becomes almost constant and close to zero, because transactions miss their timing constraints while waiting in the resource queues for their turn to be served. OCC-HP incurs the highest number of restarts, since it restarts a validating (near-completion) transaction for the sake of one or more higher-priority conflicting transactions that may themselves later be restarted. This also explains its inferior performance relative to the other schemes: with limited system resources, a small number of restarts leads to better performance, because wasted resources are avoided and the resources remain available for useful work.

Since, in an RTDBS, meeting the time constraints of real-time transactions is more important than cost considerations, and in order to eliminate the effect of resource contention on the performance of the concurrency control schemes, we also simulate an unlimited-resources situation in which there is always a free CPU when one is needed, i.e., queueing for the system resources (CPUs and disks) is eliminated. The results show the miss percentage behaviour of the four schemes, and the performance differences here are due solely to their different conflict resolution mechanisms, since resource contention no longer affects performance. Again, with a small update probability and a low workload level, there is not much difference among the schemes. As the arrival rate or the update probability increases, OCC-HP is the worst, for the same reasons explained above, although it improves significantly compared with the limited-resources experiment: the wastage of system resources, which previously led to high resource contention, is reduced or tolerated here thanks to the unlimited availability of resources. The degradation in its performance as the number of arriving transactions increases occurs because the situation in which a validating transaction conflicts with at least one higher-priority transaction becomes more frequent, which in turn increases the number of transactions missing their deadlines.

The results also show that, under low data contention (small update probability) and a low workload level, OCC-HP50 gives the best performance: in such a situation, restarting the validating transaction only when more than 50% of its conflicting transactions have higher priority balances increasing the number of transactions that meet their deadlines against saving the progress made by validating transactions. As the workload level or the update probability increases, however, OCC-HP50 becomes inferior to OCC-FV and OCC-HP100, because the chance that a restarted validating transaction will face the same fate again (another restart) is higher.
Since there is no resource contention, the average number of restarts of the four schemes increases as the system workload increases. The performance gain without resource contention is considerable compared with the limited-resources case.

B. Memory Resident Database (MRDB) versus Disk Resident Database (DRDB)

In this experiment we study the performance of the four schemes with the database assumed first to be memory resident and then disk resident, to show the impact of I/O operations on performance. Studying memory-resident database (MRDB) systems is important because many existing real-time systems already hold all their data in memory; memory prices are dropping drastically, memory sizes are growing, and memory residence is becoming less of a restriction. The performance gain is significant as the impact of I/O is reduced (as with DISK ACCESS PROB = 0.5) or eliminated, as in memory-resident databases, since the writes that keep the on-disk copies of data objects up to date occur after a transaction commits and thus have no effect on transaction tardiness, and no transaction reads from disk. Similar results are obtained for the other schemes. To isolate the effect of resource contention under the memory-resident assumption, we also perform this experiment with unlimited system resources. The results show excellent performance, especially for OCC-FV, OCC-HP100 and OCC-HP50, since these schemes allow the maximum number of transactions to meet their timing constraints, which is the primary goal of an RTDBS and more important than cost considerations, particularly in situations where a large negative value is imparted to the system if a deadline is missed. The results also show that OCC-HP50 with a small update probability gives the best performance, as explained above. And, since there is no resource contention, the average number of restarts of the schemes increases as the workload level increases.

X. VIRTUAL RUN POLICY (VRP)

In the previous experiments we did not consider the effect of buffering on rerun transactions; the buffer hit ratios of rerun and first-run transactions were taken to be the same. With a sufficiently large buffer and a high retention effect, data blocks referenced by aborted transactions continue to be retained in memory and are available for access during the rerun [11]. In this experiment, when the CC manager decides to restart the active conflicting transactions, those that are in their first run are not aborted immediately; instead they enter a virtual run mode and continue their read phases to bring the data objects they require into the buffer, assuming a sufficient buffer with a high retention effect, so that the referenced data blocks remain in memory and are available during the rerun. When a virtual-run transaction completes its read phase, it is aborted and resubmitted to the system to start its real second run. There is no point in allowing a restarted (rerun) transaction to complete its read phase in virtual mode, since its data items are already in memory.
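A rough sketch of this restart decision follows; the txn structure and function names are hypothetical, and the buffer retention effect itself is assumed rather than modelled here.

/* Sketch of conflict resolution under the virtual run policy: a
 * conflicting first-run transaction is switched to virtual mode so it
 * can finish prefetching its pages, instead of being aborted at once.
 * All names are illustrative.                                         */
#include <stdbool.h>
#include <stddef.h>

typedef enum { FIRST_RUN, VIRTUAL_RUN, RERUN } run_mode;

typedef struct {
    run_mode mode;
    bool     read_phase_done;
} txn;

extern void abort_and_resubmit(txn *t);   /* real restart from the beginning */

void restart_conflicting(txn *conflicting[], size_t n)
{
    for (size_t i = 0; i < n; i++) {
        txn *t = conflicting[i];
        if (t->mode == FIRST_RUN && !t->read_phase_done) {
            /* Keep running in virtual mode so the remaining pages are
             * brought into the buffer; the abort happens at the end of
             * the read phase.                                          */
            t->mode = VIRTUAL_RUN;
        } else {
            /* A rerun transaction already has its pages in memory, so a
             * virtual run would gain nothing: restart it immediately.   */
            abort_and_resubmit(t);
        }
    }
}

/* Called by the simulator when a virtual-run transaction finishes its
 * read phase: it is then aborted and resubmitted for its real rerun.   */
void end_of_read_phase(txn *t)
{
    t->read_phase_done = true;
    if (t->mode == VIRTUAL_RUN) {
        t->mode = RERUN;
        abort_and_resubmit(t);
    }
}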
The results show that, when the system operates with limited resources (two resource units, each with one CPU and two disks), the OCC-FV scheme with the virtual run policy does better at low workload levels, but as the number of arriving transactions increases its performance degrades somewhat. This is because the first-run transactions marked for restart continue their read phases in virtual run mode and compete for system resources while fetching the remainder of their database requirements into the buffer, which further increases the already high resource contention; accordingly, the average number of restarts of OCC-FV with the virtual run policy begins to decrease sooner than that of plain OCC-FV. When we operate the system with five resource units, performance improves below the 20 transactions/sec workload point, as expected, but, as before, it degrades when the workload increases further. The sensitivity of the schemes to resource availability is clear, as is the improvement in their performance under the virtual run policy as the number of resource units increases: with sufficient resources, the extra load imposed by the marked-for-restart first-run transactions can be tolerated, and the virtual run policy with a high-retention buffer then helps the schemes achieve better performance. Similar results are obtained for the other schemes. The results also show good performance, especially for OCC-FV, OCC-HP100 and OCC-HP50, under the unlimited-resources assumption with the virtual run policy, comparable to that of memory-resident databases with unlimited resources. And, since there is no resource contention in this experiment, the average number of restarts increases as the number of arriving transactions increases.

XI. CONCLUSION

We have investigated the performance of four optimistic concurrency control schemes, OCC-FV, OCC-HP100, OCC-HP50 and OCC-HP, under alternative assumptions about database system resources. We showed that, under the policy that discards tardy transactions (i.e., transactions that miss their deadlines) from the system, OCC-FV, OCC-HP100 and OCC-HP50 outperform the remaining optimistic scheme, OCC-HP, in that they incur a lower miss percentage, and that there is a slight performance gain for OCC-FV over OCC-HP100 and OCC-HP50 due to its policy of always saving the work done by the validating transaction.

To isolate the effect of resource contention on the performance of the schemes, we assumed unlimited system resources and, as expected, obtained a significant performance gain, especially for OCC-FV, OCC-HP100 and OCC-HP50. We also showed the impact of I/O operations on performance and the improvement obtained by making the database memory resident. The excellent performance of the above three schemes was obtained under the assumption of unlimited system resources with the virtual run policy or with memory-resident databases. This assumption is reasonable, since the primary goal of an RTDBS is to maximise the number of transactions that meet their timing constraints, which is more important than cost considerations, particularly in critical situations.
Finally, the specific conclusion drawn from the resource-related performance results is that, as the effect of resource contention is removed (as in the USR experiment) and the impact of I/O operations is reduced (as in the VRP experiment) or eliminated (as in the MRDB experiment), the performance gain is very significant, especially for those schemes that save, or try to save, the validating (near-completion) transactions.

REFERENCES

[1] P. Bernstein and N. Goodman, "Concurrency control in distributed database systems," ACM Computing Surveys, vol. 13, no. 2, June 1981.
[2] R. K. Abbott and H. Garcia-Molina, "Scheduling real-time transactions: A performance evaluation," ACM Trans. Database Syst., vol. 17, no. 3, Sept. 1992.
[3] S. H. Son and S. Park, "A priority-based scheduling algorithm for real-time databases," Journal of Information Science and Engineering, Nov. 1995.
[4] K. Ramamritham, "Real-time databases," International Journal of Distributed and Parallel Databases, vol. 1, no. 1, 1993.
[5] J. Huang, J. Stankovic, D. Towsley and K. Ramamritham, "Real-time transaction processing: Design, implementation and performance evaluation," COINS Technical Report, Department of Computer Science, University of Massachusetts at Amherst, May 1990.
[6] J. Haritsa, M. Carey and M. Livny, "Data access scheduling in firm real-time database systems," Journal of Real-Time Systems, vol. 4, Sept. 1992.
[7] H. T. Kung and J. Robinson, "On optimistic methods for concurrency control," ACM Trans. Database Syst., vol. 6, no. 2, June 1981.
[8] T. Harder, "Observations on optimistic concurrency control schemes," Information Systems, vol. 9, no. 2, 1984.
[9] J. Lee and S. H. Son, "Using dynamic adjustment of serialization order for real-time database systems," Proc. 14th Real-Time Systems Symposium, Raleigh-Durham, NC, Dec. 1993.
[10] R. Agrawal, M. Carey and M. Livny, "Concurrency control performance modeling: Alternatives and implications," ACM Trans. Database Syst., Dec. 1987.
[11] P. Yu and D. Dias, "Analysis of hybrid concurrency control schemes for a high data contention environment," IEEE Trans. Software Eng., vol. 18, no. 2, Feb. 1992.
[12] N. Kaur et al., "Concurrency control for multilevel secure database," International Journal of Network Security, vol. 9, no. 1, July 2009.
[13] Md. Anisur and Md. Hossain, "A comprehensive concurrency control techniques for real-time database systems," Global Journal of Computer Science and Technology, vol. 13, issue 2, 2013.