On Real-Time Optimistic Concurrency Control

International Journal of Engineering Trends and Technology (IJETT) – Volume 16 Number 9 – Oct 2014
Faiz Baothman
Department of Computer Science
College of Computers and Information Technology
Taif University, Taif, Saudi Arabia
f.baothman@tu.edu.sa
Muzammil H Mohammed
Department of Information Technology
College of Computers and Information Technology
Taif University, Taif, Saudi Arabia
m.muzammil@tu.edu.sa
Abstract: The performance of a database transaction processing system can be profoundly affected by the concurrency control mechanism employed, since it is necessary to preserve database integrity in a multi-user environment. In many applications, a database management system may have to operate under real-time constraints, where it must satisfy timing constraints in addition to maintaining database integrity. In this paper we investigate the effect of system resource availability on the performance of optimistic concurrency control in a firm real-time database system, the performance gain with a memory-resident database, and the effect of a virtual run policy.
Keywords: Real-Time Databases, Scheduling,
Concurrency control, Transaction Processing.
I. INTRODUCTION
The task of a concurrency control (CC) mechanism is to ensure the consistency of the database while allowing a set of transactions to execute concurrently [1]. A real-time database system (RTDBS) is a database system in which transactions have explicit timing constraints such as deadlines, the primary performance criterion is timeliness, and the scheduling of transactions is driven by priority rather than fairness considerations [2, 3, 12]. CC is one of the main issues in the study of RTDBS. In a hard RTDBS, missing deadlines may result in a catastrophe. In a soft RTDBS, transactions have deadlines, but there is still some (decreasing) value in allowing them to complete after their deadlines. When this value drops to zero as soon as a transaction misses its deadline, but missing a deadline has no catastrophic consequences, the system is referred to as a firm RTDBS [4, 5, 6, 13].
Real-time database systems have grown larger and become more critical. For example, they are used in stock trading, telephone switching systems, virtual environment systems, network management, automated factory management, and command and control systems.
II. OPTIMISTIC CONCURRENCY CONTROL
Optimistic concurrency control (OCC) schemes [7] are designed to eliminate locking overhead. They are optimistic in the sense that they rest on the explicit assumption that conflicts among transactions are rare events; thus, they rely for efficiency on the hope that conflicts will not occur. Since locks are not used, such schemes are deadlock free. In OCC, conflict detection and resolution are both done at certification time. When a transaction completes its execution, it requests the CC manager to validate all its accessed data objects (in contrast to 2PL's pessimistic method of detecting conflicts before a transaction accesses a data object). The basic idea of optimistic CC schemes is that the execution of a transaction consists of three phases: read, validation and write, as shown in Fig. 1. During the read phase, a read access is first directed to the transaction's private workspace. If the data object is not found in this buffer, the database has to be accessed, that is, the system buffer or the database stored on disk. Updated or modified data objects are stored in the transaction's private workspace. After completing its read phase, the transaction enters the validation phase, where the CC manager has to check whether or not the transaction intending to commit is in conflict with any of the transactions operating in parallel. If so, some conflict resolution policy has to be applied. If no conflict is detected, the transaction is prepared to commit and enters its write phase, where it writes all its updates to the database, i.e., makes all its modifications generally visible to other concurrent transactions. The key component among these three phases is the validation phase, where a transaction's destiny is decided, based on the following principles to ensure serializability.
If a transaction Ti is serialized before transaction Tj, the following two rules must be observed [8]:
Rule 1 (No overwriting): The writes of Ti should not overwrite the writes of Tj, and vice versa.
Rule 2 (No read dependency): The writes of Ti should not affect the read phase of Tj.
Rule 1 is automatically ensured in most OCC schemes because I/O operations in the write phase are required to be done sequentially in a critical section. Rule 2 can be enforced in one of two ways [9]: backward validation and forward validation. In backward validation, the validating transaction either commits or aborts depending on whether or not it has conflicts with transactions that have already committed. Thus, there is no way to take transactions' timing constraints into account, and the only way to resolve a conflict, if one exists, is to restart the validating transaction; backward validation is therefore not suitable for real-time database systems. In forward validation, a transaction is validated against concurrently running transactions, and either the validating transaction or the active conflicting transactions can be aborted and restarted to resolve the conflict. Thus, forward validation provides flexibility for conflict resolution, because it may sometimes be preferable not to commit the validating transaction, depending on the timing characteristics or priorities of the validating transaction and the active conflicting transactions. In addition, forward validation generally detects and resolves data conflicts earlier than backward validation, and hence it wastes fewer resources and less time.
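To make the forward-validation check concrete, the following C sketch tests a validating transaction's write set against the read set of an active transaction. The types and names are illustrative only; they are not taken from the model described in this paper.

#include <stdbool.h>
#include <stddef.h>

#define MAX_OBJECTS 64

typedef struct {
    int read_set[MAX_OBJECTS];
    int write_set[MAX_OBJECTS];
    size_t n_read, n_write;
} Transaction;

/* True if any object ID appears in both sets (simple linear scan). */
static bool sets_intersect(const int *a, size_t na, const int *b, size_t nb)
{
    for (size_t i = 0; i < na; i++)
        for (size_t j = 0; j < nb; j++)
            if (a[i] == b[j])
                return true;
    return false;
}

/* Forward validation test: the validating transaction conflicts with an
   active (read-phase) transaction if its write set intersects that
   transaction's read set.  How the conflict is resolved is left to the
   policy layer (see the schemes in the next section). */
bool conflicts_with(const Transaction *validating, const Transaction *active)
{
    return sets_intersect(validating->write_set, validating->n_write,
                          active->read_set, active->n_read);
}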
This work was motivated by a desire to investigate the performance of optimistic concurrency control and some of its variants in a firm real-time database system (RTDBS) under different resource-related assumptions. Selecting a concurrency control scheme for an RTDBS is strongly resource dependent, and as hardware costs continue to fall dramatically, the dominant factor in the near future will not be cost but rather the attempt to increase the return obtained from the available resources.
Fig. 1. The three phases of a transaction: read, validation and write, in time order.
III. ADAPTATION OF OCC FOR RTDBS
Optimistic concurrency control (OCC) is non-blocking and deadlock free. These properties make OCC attractive for real-time transaction processing systems. To adapt optimistic concurrency control schemes to a real-time database environment, the issue is how to incorporate the priorities or timing constraints of transactions into the conflict resolution mechanism of the optimistic concurrency control. The validating transaction may conflict with a set of transactions, some of which may have higher priority and others lower priority than the validating transaction. We describe below the four optimism-based schemes used in the experiments.
A. OCC-Forward Validation (OCC-FV)
In this scheme, the transaction that reaches its validation phase is allowed to commit, and all the active conflicting transactions that are still in their read phases are aborted. This scheme does not take transactions' timing constraints into account; it favours the validating transaction in order to save the progress it has already made, since the validating transaction will definitely complete if it is not restarted.
if (Tv conflicts with TR1, ..., TRi)
    then abort all conflicting transactions

Fig. 2. Conflict resolution of OCC-FV, where TR1, ..., TRi are transactions in their read phases and Tv is the transaction in its validation phase.
B. OCC-High Priority100 (OCC-HP100)
The validating transaction is aborted and
restarted if all conflicting read phase transactions
have higher priorities than the validating one;
otherwise it commits and all the conflicting
transactions are restarted.
if (Tv conflicts with TR1, ..., TRi) then
    if (all conflicting transactions have higher priority than Tv)
        then abort Tv
        else abort all conflicting transactions

Fig. 3. Conflict resolution of OCC-HP100
C. OCC-High Priority50 (OCC-HP50)
In this scheme, when a transaction reaches
its validation phase, its priority is checked against
those conflicting transactions in their read phases. If
more than 50 percent of the conflicting transactions
have higher priority, the validating transaction is
aborted and all conflicting transactions are allowed to
continue; otherwise the validating transaction
commits and all conflicting transactions are restarted.
if (Tv conflicts with TR1, ..., TRi) then
    if (more than 50% of the conflicting transactions have higher priority)
        then abort Tv
        else abort all conflicting transactions

Fig. 4. Conflict resolution of OCC-HP50
D. OCC-High Priority (OCC-HP)
This is an optimistic protocol which uses priority-driven aborts for conflict resolution. When a transaction reaches its validation phase, it is aborted if one or more conflicting transactions have higher priority than the validating one; otherwise it commits and all the conflicting transactions are restarted immediately. This protocol uses transaction priority (timing constraints) in such a way that the validating transaction sacrifices itself for the sake of conflicting transactions with higher priority.
if (Tv conflicts with one or more of TR1, ..., TRi) then
    if (Tv has higher priority than all conflicting transactions)
        then abort all conflicting transactions
        else abort Tv

Fig. 5. Conflict resolution of OCC-HP
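The four conflict-resolution policies above differ only in when the validating transaction is sacrificed, so they can be sketched as a single decision routine. The following C fragment is illustrative only: it assumes integer priorities where a larger value means higher priority, and none of the names are taken from the simulator.

#include <stddef.h>

typedef enum { OCC_FV, OCC_HP100, OCC_HP50, OCC_HP } Policy;
typedef enum { COMMIT_VALIDATOR,   /* and restart all conflicting transactions */
               ABORT_VALIDATOR     /* and let the conflicting transactions run */
} Decision;

typedef struct {
    int priority;   /* larger value = higher priority (e.g. earlier deadline) */
} Txn;

Decision resolve_conflict(Policy p, const Txn *validating,
                          const Txn *conflict[], size_t n_conflict)
{
    size_t higher = 0;             /* conflicting transactions with higher priority */
    for (size_t i = 0; i < n_conflict; i++)
        if (conflict[i]->priority > validating->priority)
            higher++;

    switch (p) {
    case OCC_FV:      /* always commit the validating transaction          */
        return COMMIT_VALIDATOR;
    case OCC_HP100:   /* abort it only if ALL conflicting txns are higher  */
        return (n_conflict > 0 && higher == n_conflict) ? ABORT_VALIDATOR
                                                        : COMMIT_VALIDATOR;
    case OCC_HP50:    /* abort it if MORE THAN 50% of them are higher      */
        return (2 * higher > n_conflict) ? ABORT_VALIDATOR : COMMIT_VALIDATOR;
    case OCC_HP:      /* abort it if ANY conflicting txn is higher         */
        return (higher > 0) ? ABORT_VALIDATOR : COMMIT_VALIDATOR;
    }
    return COMMIT_VALIDATOR;
}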
IV. PERFORMANCE MODEL
The simulation model is of a single-site RTDBS, both disk resident and memory resident, operating on a shared-memory multiprocessor. CPUs share a single queue, and the service discipline used for this queue is priority scheduling without pre-emption. Each disk has its own queue and is also scheduled with priority scheduling [9]. In this model, the execution of a transaction consists of multiple instances of alternating data access requests and data operation steps until all of its data operations complete or it is aborted for some reason. For optimistic concurrency control, the first CC request is granted immediately and all object accesses are then performed with no intervening CC requests; only after the last object access is finished does a transaction return to the CC manager [10]. When a transaction completes its data access requests, it asks the concurrency control manager to validate them. If it is validated, it enters the write phase with its priority raised to the maximum so that it can complete as fast as possible. Whenever a transaction passes through concurrency control for a data access request, or whenever it is restarted, it undergoes the deadline test. If it has missed its deadline, it is terminated and permanently discarded from the system. When the CC decides to validate a transaction and restart the active conflicting transactions (a transaction in its read phase is considered active), or vice versa, the restarted transaction enters the CC queue and then makes all of its data accesses and operations again from the beginning, for the same read and write sets (a real restart), provided it has not missed its deadline.
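The transaction lifecycle just described can be summarised by the following C sketch; the helper functions are trivial stand-ins for the simulator's components, and all names are illustrative rather than taken from our program.

#include <stdbool.h>
#include <stdio.h>

typedef struct { double deadline, now; int restarts; } Txn;

/* Trivial stand-ins for the simulator's components. */
static bool missed_deadline(const Txn *t) { return t->now > t->deadline; }
static void run_read_phase(Txn *t)        { t->now += 1.0; }          /* data accesses      */
static bool validate(const Txn *t)        { (void)t; return true; }   /* forward validation */
static void write_phase(Txn *t)           { t->now += 0.1; }          /* at maximum priority */

/* Returns true if the transaction commits, false if it is discarded
   (firm deadline: a tardy transaction is permanently dropped). */
bool execute_transaction(Txn *t)
{
    for (;;) {
        if (missed_deadline(t)) return false;   /* deadline test on each CC pass */
        run_read_phase(t);
        if (missed_deadline(t)) return false;
        if (validate(t)) {                      /* no conflict: commit           */
            write_phase(t);
            return true;
        }
        t->restarts++;                          /* real restart: redo read phase */
    }
}

int main(void)
{
    Txn t = { .deadline = 10.0, .now = 0.0, .restarts = 0 };
    printf("committed: %d, restarts: %d\n", (int)execute_transaction(&t), t.restarts);
    return 0;
}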
The database is modelled as a set of pages, each of which can contain a single data object. The database size is fixed at 200 pages to investigate performance under high data contention, that is, to create a situation in which conflicts are more frequent. The small database also allows us to study
the effect of hot spots, in which a small part of the
database is accessed frequently by most of the
transactions. A transaction consists of a mixed
sequence of read and write operations. We assume
that a write operation is always preceded by a read,
i.e., the write set of a transaction is always a subset of
its read set.
The use of the database buffer pool is simulated probabilistically. When a transaction attempts to read a data item, the system determines whether the page is in memory or on disk using the probability DISK ACCESS PROB. If the page is determined to be in memory, the transaction can continue processing without disk access. Otherwise, an I/O service request is created and placed in the input queue of the appropriate disk. The database is partitioned equally over the disks, and we use the function

D = ⌈(i × Number of disks) / DB size⌉

to map an object i to the disk where it is stored.
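The following C fragment illustrates the buffer check and the disk mapping as we read them; the ceiling form of the mapping and the value used for DISK ACCESS PROB are assumptions made for illustration, not values prescribed by the model.

#include <stdlib.h>

#define DB_SIZE          200    /* pages in the database                      */
#define NUM_DISKS          4    /* the limited-resource configuration         */
#define DISK_ACCESS_PROB 0.5    /* assumed value; the model treats this as a parameter */

/* Uniform random number in [0, 1). */
static double u01(void) { return (double)rand() / ((double)RAND_MAX + 1.0); }

/* True if the page must be fetched from disk (i.e. a buffer miss). */
int needs_disk_access(void) { return u01() < DISK_ACCESS_PROB; }

/* Map object i (1..DB_SIZE) to the disk (1..NUM_DISKS) that stores it:
   D = ceil(i * NUM_DISKS / DB_SIZE), computed in integer arithmetic. */
int disk_of_object(int i) { return (i * NUM_DISKS + DB_SIZE - 1) / DB_SIZE; }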
Transactions arrive in a Poisson stream, i.e., their inter-arrival times are exponentially distributed. The mean arrival time parameter specifies the mean inter-arrival time between transactions. The number of data objects accessed by a transaction is determined by a normal distribution with a mean transaction length of 10, and the actual data items are chosen randomly from among all of the data objects in the database. A data item that is read is updated with the probability Update probability. We also assume that the cost of executing a concurrency control operation is included in the variable that states how much CPU time is needed per data object that a transaction accesses.
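A sketch of such a workload generator in C is given below. The standard deviation of the transaction-length distribution is an assumed value, since the text specifies only the mean of 10, and the function names are illustrative.

#include <math.h>
#include <stdlib.h>

#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif

#define DB_SIZE          200
#define MEAN_TXN_LENGTH 10.0
#define LENGTH_STDDEV    2.0    /* assumed; the text gives only the mean */
#define UPDATE_PROB     0.25

/* Uniform random number in (0, 1). */
static double u01(void) { return ((double)rand() + 1.0) / ((double)RAND_MAX + 2.0); }

/* Exponential inter-arrival time with the given mean (Poisson stream). */
double next_interarrival(double mean) { return -mean * log(u01()); }

/* Normal variate via the Box-Muller transform (transaction length). */
double normal(double mean, double sd)
{
    return mean + sd * sqrt(-2.0 * log(u01())) * cos(2.0 * M_PI * u01());
}

/* Fill read/write sets; every write is preceded by a read, so the write
   set is always a subset of the read set.  Returns the read-set size.  */
int generate_access_sets(int read_set[], int write_set[], int *n_write)
{
    int len = (int)(normal(MEAN_TXN_LENGTH, LENGTH_STDDEV) + 0.5);
    if (len < 1) len = 1;
    *n_write = 0;
    for (int k = 0; k < len; k++) {
        read_set[k] = 1 + rand() % DB_SIZE;              /* random data object */
        if (u01() < UPDATE_PROB)
            write_set[(*n_write)++] = read_set[k];       /* read, then update  */
    }
    return len;
}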
The assignment of deadlines to transactions is controlled by the following parameters: a minimum slack factor of 2 and a maximum slack factor of 8, which set a lower and upper bound, respectively, on a transaction's slack time, and AT and ET, which denote the arrival time and execution time, respectively. A deadline is assigned by choosing a slack time uniformly from the range specified by the bounds. The execution time used for a transaction is not an actual execution time but a time estimated from the values of the parameters mean transaction length, mean CPU computation time and mean disk time. In this system, the priorities of transactions are assigned by the Earliest Deadline First policy, which uses only deadline information to decide transaction priority.
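Our reading of the deadline-assignment rule, deadline = AT + SF × ET with the slack factor SF drawn uniformly from [2, 8], together with the EDF priority rule, is sketched below in C; the formula is an interpretation of the text rather than a quotation from it.

#include <stdlib.h>

#define MIN_SLACK_FACTOR 2.0
#define MAX_SLACK_FACTOR 8.0

static double u01(void) { return (double)rand() / ((double)RAND_MAX + 1.0); }

/* AT = arrival time, ET = estimated execution time.
   deadline = AT + SF * ET, SF uniform in [MIN_SLACK_FACTOR, MAX_SLACK_FACTOR]. */
double assign_deadline(double AT, double ET)
{
    double sf = MIN_SLACK_FACTOR + (MAX_SLACK_FACTOR - MIN_SLACK_FACTOR) * u01();
    return AT + sf * ET;
}

/* Earliest Deadline First: the earlier deadline gets the higher priority
   (returns a negative value if a should be served before b). */
int edf_compare(double deadline_a, double deadline_b)
{
    return (deadline_a < deadline_b) ? -1 : (deadline_a > deadline_b ? 1 : 0);
}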
Our program to simulate the RTDBS was written in C. For each of the following experiments, the simulation was run with the same parameters for at least 20 different random-number seeds to generate each data point, and each run continued until 2000 transactions were executed. The statistical data reported in this paper have 90% confidence intervals whose end points are within ±10% of the reported mean values plotted in the graphs.
The primary goal of an RTDBS is to meet the timing constraints of its activities; therefore, the main performance metric used is the percentage of transactions that miss their deadlines, referred to as the Miss Percentage. It is calculated as follows:

Miss Percentage = 100 × (number of deadline-missing transactions / total number of transactions processed).
We also show the average number of restarts per transaction, which gives the average number of times a transaction has to restart before completing or before missing its deadline and being permanently discarded from the system. It is computed as the ratio of the number of transaction-restart events to the number of processed transactions.
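For completeness, the two metrics can be computed as follows (a direct transcription of the definitions above into C):

/* The two metrics, transcribed directly from the definitions above. */
double miss_percentage(long missed, long processed)
{
    return processed > 0 ? 100.0 * (double)missed / (double)processed : 0.0;
}

double average_restarts(long restart_events, long processed)
{
    return processed > 0 ? (double)restart_events / (double)processed : 0.0;
}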
V. EXPERIMENTS AND RESULTS
We investigate the performance of the above four optimistic concurrency control schemes to show the impact of system resource availability on their performance, since the conflict resolution method used by the concurrency control mechanism has a direct effect on the utilisation of system resources.
A. Limited System Resources (LSR) versus
Unlimited System Resources (USR)
In this experiment we evaluate performance under the condition of limited system resources, where we fix the number of CPUs and disks to 2 and 4, respectively. The results show the miss-percentage behaviour of the four schemes under different levels of system workload, where the workload is controlled by the arrival rate of transactions. The update probability is set to 0.25 in this experiment. It is clear that for very low arrival rates there is not much difference among the four protocols. However, as the arrival rate increases, OCC-FV, OCC-HP100 and OCC-HP50 do better than OCC-HP, and OCC-FV does even slightly better than OCC-HP100 and OCC-HP50. The performance difference becomes clearer as the update probability increases to 0.5.
The schemes OCC-FV, OCC-HP100 and OCC-HP50 outperform OCC-HP, since they avoid (in the case of OCC-FV) or try to avoid wasting the work done by validating transactions, in contrast to OCC-HP, where a validating transaction is aborted for the sake of even one conflicting higher-priority transaction still in its read phase, which may itself later be aborted. OCC-FV gives slightly better performance than OCC-HP100 and OCC-HP50 because every transaction
that reaches its validation phase is allowed to commit. There is also a slight gain for OCC-HP100 over OCC-HP50, because OCC-HP100 places a stricter condition on aborting the validating transaction than OCC-HP50 does. Thus, the results obtained are biased in favour of the schemes that save, or try to save, the validating transactions. Since the system in this experiment operates with limited resources, it has a high level of resource contention. Therefore, the average number of restarts starts decreasing once resource contention dominates data contention in discarding deadline-missing transactions, and at a certain workload point, when the system is saturated, its value becomes almost constant and roughly zero. This is because transactions miss their timing constraints while waiting in the system resource queues for their turn to be served. OCC-HP incurs the highest number of restarts, since it restarts the validating (near-completion) transaction for the sake of one or more conflicting higher-priority transactions, which may themselves later be restarted. This also explains its inferior performance relative to the other schemes: a small number of restarts leads to better performance with limited system resources, because resource waste is avoided and resources remain available for useful work.
Since, in an RTDBS, meeting the timing constraints of real-time transactions is more important than cost considerations, and to eliminate the effect of system resource contention on the performance of the concurrency control schemes, we also simulate an unlimited-system-resources situation in which there is always a free CPU or disk when one is needed, i.e., we eliminate queueing for the system resources (CPUs and disks). The results show the miss-percentage behaviour of the four schemes; the performance differences here are due solely to their different conflict resolution mechanisms, since system resource contention has no effect on performance in this experiment.
Again, with a small update probability and a low workload level, there is not much difference among the schemes, but as the arrival rate or the update probability increases, OCC-HP performs worst, for the same reasons explained above. It nevertheless improves significantly compared to the limited-system-resources experiment, because the wasted system resources that previously led to high resource contention are now tolerated thanks to the unlimited availability of resources. Its performance still degrades as the number of arriving transactions increases, because situations in which the validating transaction conflicts with at least one higher-priority transaction become more frequent, which in turn increases the number of transactions missing their deadlines.
The results also show that, under low data contention (small update probability) and a low system workload level, OCC-HP50 gives the best performance: in such a situation, restarting the validating transaction only when more than 50% of the conflicting transactions have higher priority strikes a balance between increasing the number of transactions meeting their deadlines and saving the progress made by validating transactions. As the workload level or the update probability increases, however, OCC-HP50 becomes inferior to OCC-FV and OCC-HP100, because the chance that a restarted validating transaction faces the same fate again (another restart) is higher. Since there is no resource contention, the average number of restarts of all four schemes increases as the system workload increases.

The performance gain with no system resource contention is considerable compared to that with limited system resources.
B. Memory Resident Database (MRDB) versus
Disk Resident Database (DRDB)
In this experiment we study the performance of the four schemes while assuming the database is memory resident and then disk resident, to show the impact of I/O operations on performance. Studying memory-resident database (MRDB) systems is important since many existing real-time systems already hold all their data in memory; memory prices are dropping drastically, memory sizes are growing, and memory residence is becoming less of a restriction. It is clear that the performance gain is significant as the impact of I/O is reduced (as with DISK ACCESS PROB = 0.5) or eliminated, as in memory-resident databases, since the writes that maintain up-to-date copies of the data objects on disk occur after a transaction commits and have no effect on transaction tardiness, and no transaction reads from disk. Similar results are obtained for the other schemes.
To isolate the effect of system resource contention under the memory-resident database assumption, we also perform this experiment with unlimited system resources. The results show excellent performance, especially for OCC-FV, OCC-HP100 and OCC-HP50, since they allow the maximum number of transactions to meet their timing constraints, which is the primary goal of an RTDBS and more important than cost considerations, especially in situations where a large negative value is imparted to the system if a deadline is missed. The results also show that OCC-HP50 with a small update probability gives the best performance, as explained above. In addition, since there is no system resource contention, the average number of restarts of the schemes increases as the workload level increases.
C. Virtual Run Policy (VRP)
In the previous experiments we did not consider the effect of buffering on rerun transactions; the buffer hit ratios of rerun and first-run transactions were taken to be the same. With a sufficiently large buffer and a high retention effect, data blocks referenced by aborted transactions continue to be retained in memory and are available for access during the rerun [11].
In this experiment, when the CC decides to restart the active conflicting transactions, any of them that are in their first run are not aborted immediately; instead they enter a virtual run mode and continue their read phases to bring the required data objects into the buffer, assuming a sufficiently large buffer with a high retention effect, so that data blocks referenced by aborted transactions remain in memory and are available for access during reruns. When a virtual-run transaction completes its read phase, it is aborted and resubmitted to the system to start its real second run. There is no point in allowing a restarted rerun transaction to complete its read phase in virtual mode, since its data items are already in memory.
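The restart decision under the virtual run policy can be sketched as follows in C; the state names are illustrative and not taken from the simulator.

#include <stdbool.h>

typedef enum { FIRST_RUN, VIRTUAL_RUN, RERUN } RunMode;

typedef struct { RunMode mode; bool marked_for_restart; } Txn;

/* Called when the CC decides this active transaction must be restarted. */
void on_conflict_restart(Txn *t)
{
    if (t->mode == FIRST_RUN) {
        t->mode = VIRTUAL_RUN;          /* keep reading to warm the buffer          */
        t->marked_for_restart = true;   /* resubmitted after the read phase ends    */
    } else {
        t->mode = RERUN;                /* data already buffered: restart at once   */
    }
}

/* Called when a virtual-run transaction finishes its read phase. */
void on_virtual_read_phase_done(Txn *t)
{
    if (t->mode == VIRTUAL_RUN) {
        t->mode = RERUN;                /* abort and start the real second run      */
        t->marked_for_restart = false;
    }
}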
The results show that the OCC-FV scheme with the virtual run policy does better at low system workload levels when the system operates with limited resources (two system resource units, where each resource unit has one CPU and two disks), but as the number of arriving transactions increases, its performance is somewhat degraded. This is because the first-run transactions marked for restart continue their read phases in virtual run mode and compete for system resources to finish fetching their database requirements into the buffer, which in turn increases the already high system resource contention; accordingly, the average number of restarts of OCC-FV with the virtual run policy begins decreasing sooner than that of plain OCC-FV. We also operated the system with five system resource units and, as expected, the performance improves below the 20 transactions/sec workload point, but, as before, when the workload increases further, the performance degrades. The sensitivity of the schemes to system resource availability is clear, as is the improvement in the performance of the schemes with the virtual run policy as the number of resource units increases: with sufficient system resources, the extra load imposed by the first-run transactions marked for restart can be tolerated, and the virtual run policy with a high-retention buffer helps the schemes achieve better performance. Similar results are obtained for the other schemes.
The results show good performance, especially for the schemes OCC-FV, OCC-HP100 and OCC-HP50, under the unlimited-system-resources assumption with the virtual run policy; this performance is comparable to that of memory-resident databases with unlimited system resources. Also, since there is no resource contention in this experiment, the average number of restarts increases as the number of arriving transactions increases.
VI. CONCLUSION
We have investigated the performance of four optimistic concurrency control schemes, OCC-FV, OCC-HP100, OCC-HP50 and OCC-HP, under alternative assumptions about database system resources. We showed that, under the policy that discards tardy transactions (i.e., transactions that miss their deadlines) from the system, OCC-FV, OCC-HP100 and OCC-HP50 outperform the other optimistic scheme, OCC-HP, in that they incur a lower miss percentage, and there is a slight performance gain for OCC-FV over OCC-HP100 and OCC-HP50 due to its policy of always saving the work done by the validating transaction.
To isolate the effect of system resource contention on the performance of the schemes, we assumed unlimited system resources, and, as expected, there is a significant performance gain, especially for OCC-FV, OCC-HP100 and OCC-HP50. We also showed the impact of I/O operations on performance and the improvement obtained by making the database memory resident. The excellent performance of the above three schemes was obtained under the assumption of unlimited system resources with the virtual run policy or with memory-resident databases. This assumption is reasonable since the primary goal of an RTDBS is to maximise the number of transactions meeting their timing constraints, which is more important than cost considerations, particularly in critical situations.
Finally, the specific conclusion drawn regarding the resource-related performance of the schemes is that, as the effect of system resource contention is isolated (as in the USR experiment) and the impact of I/O operations is reduced (as in the VRP experiment) or eliminated (as in the MRDB experiment), the performance gain is very significant, especially for those schemes that save, or try to save, the validating (near-completion) transactions.
REFERENCES

[1] P. Bernstein and N. Goodman, "Concurrency control in distributed database systems," Computing Surveys, vol. 13, no. 2, June 1981.
[2] R. K. Abbott and H. Garcia-Molina, "Scheduling real-time transactions: a performance evaluation," ACM Trans. Database Syst., vol. 17, no. 3, Sept. 1992.
[3] S. H. Son and S. Park, "A priority-based scheduling algorithm for real-time databases," Journal of Information Science and Engineering, Nov. 1995.
[4] K. Ramamritham, "Real-time databases," International Journal of Distributed and Parallel Databases, vol. 1, no. 1, 1993.
[5] J. Huang, J. Stankovic, D. Towsley and K. Ramamritham, "Real-time transaction processing: design, implementation and performance evaluation," COINS Technical Report, Department of Computer Science, University of Massachusetts at Amherst, May 1990.
[6] J. Haritsa, M. Carey and M. Livny, "Data access scheduling in firm real-time database systems," Journal of Real-Time Systems, vol. 4, Sept. 1992.
[7] H. T. Kung and J. Robinson, "On optimistic methods for concurrency control," ACM Trans. Database Syst., vol. 6, no. 2, June 1981.
[8] T. Harder, "Observations on optimistic concurrency control schemes," Information Systems, vol. 9, no. 2, 1984.
[9] J. Lee and S. H. Son, "Using dynamic adjustment of serialization order for real-time database systems," Proc. of the 14th Real-Time Systems Symposium, Raleigh-Durham, NC, Dec. 1993.
[10] R. Agrawal, M. Carey and M. Livny, "Concurrency control performance modeling: alternatives and implications," ACM Trans. Database Syst., Dec. 1987.
[11] P. Yu and D. Dias, "Analysis of hybrid concurrency control schemes for a high data contention environment," IEEE Trans. Software Eng., vol. 18, no. 2, Feb. 1992.
[12] N. Kaur et al., "Concurrency control for multilevel secure databases," International Journal of Network Security, vol. 9, no. 1, July 2009.
[13] Md Anisur and Md Hossain, "A comprehensive concurrency control techniques for real-time database systems," Global Journal of Computer Science and Technology, vol. 13, issue 2, 2013.