Data Consistency in Intermittently
Connected Distributed Systems
Evaggelia Pitoura, Member, IEEE Computer Society, and Bharat Bhargava, Fellow, IEEE
Abstract-Mobile computing introduces a new form of distributed computation in which communication is most often intermittent, low-bandwidth, or expensive, thus providing only weak connectivity. In this paper, we present a replication scheme tailored for such environments. Bounded inconsistency is defined by allowing controlled deviation among copies located at weakly connected sites. A dual database interface is proposed that, in addition to read and write operations with the usual semantics, supports weak read and write operations. In contrast to the usual read and write operations that read consistent values and perform permanent updates, weak operations access only local and potentially inconsistent copies and perform updates that are only conditionally committed. Exploiting weak operations supports disconnected operation, since mobile clients can employ them to continue to operate even while disconnected. The extended database interface coupled with bounded inconsistency offers a flexible mechanism for adapting replica consistency to the networking conditions by appropriately balancing the use of weak and normal operations. Adjusting the degree of divergence among copies provides additional support for adaptivity. We present transaction-oriented correctness criteria for the proposed schemes, introduce corresponding serializability-based methods, and outline protocols for their implementation. Then, some practical examples of their applicability are provided. The performance of the scheme is evaluated for a range of networking conditions and varying percentages of weak transactions by using an analytical model developed for this purpose.

Index Terms-Mobile computing, concurrency control, replication, consistency, disconnected operation, transaction management, adaptability.
1 INTRODUCTION
Advances in telecommunications and in the development of portable computers have provided for wireless communications that permit users to actively participate in distributed computing even while relocating from one support environment to another. The resulting distributed environment is subject to restrictions imposed by the nature of the networking environment, which provides varying, intermittent, and weak connectivity.

In particular, mobile clients encounter wide variations in connectivity, ranging from high-bandwidth, low-latency communications through wired networks to total lack of connectivity [7], [13], [27]. Between these two extremes, connectivity is frequently provided by wireless networks characterized by low bandwidth, excessive latency, or high cost. To overcome availability and latency barriers and reduce cost and power consumption, mobile clients most often deliberately avoid use of the network and thus operate switching between connected and disconnected modes of operation. To support such behavior, disconnected operation, that is, the ability to operate disconnected, is essential for mobile clients [13], [14], [30], [25]. In addition to disconnected operation, operation that exploits weak connectivity, that is, connectivity provided by intermittent, low-bandwidth, or expensive networks, is also desirable [20], [12]. Besides mobile computing, weak and intermittent
E. Pitoura is with the Department of Computer Science, University of Ioannina, GR 45110 Ioannina, Greece. E-mail: pitoura@cs.uoi.gr.
B. Bhargava is with the Department of Computer Sciences, Purdue University, West Lafayette, IN 47907. E-mail: bb@cs.purdue.edu.
connectivity also applies to computing using portable
laptops. In this paradigm, clients operate disconnected
most of the time and occasionally connect through a wired
telephone line or upon returning to their working
environment.
Private or corporate databases will be stored at mobile as
well as static hosts and mobile users will query and update
these databases over wired and wireless networks. These
databases, for reasons of reliability, performance, and cost
will be distributed and replicated over many sites. In this
paper, we propose a replication scheme that supports weak
connectivity and disconnected operation by balancing
network availability against consistency guarantees.
In the proposed scheme, data located at strongly connected sites are grouped together to form clusters. Mutual consistency is required for copies located at the same cluster, while degrees of inconsistency are tolerated for copies at different clusters. The interface offered by the database management system is enhanced with operations providing weaker consistency guarantees. Such weak operations allow access to locally, i.e., in a cluster, available data. Weak reads access bounded inconsistent copies and weak writes make conditional updates. The usual operations, called strict in this paper, are also supported. They offer access to consistent data and perform permanent updates.

The scheme supports disconnected operation since users can operate even when disconnected by using only weak operations. In cases of weak connectivity, a balanced use of both weak and strict operations provides for better bandwidth utilization, latency, and cost. In cases of strong connectivity, using only strict operations makes the scheme reduce to the usual one-copy semantics. Additional support
for adaptability is possible by tuning the degree of
inconsistency among copies based on the networking
conditions.
In a sense, weak operations offer a form of application-aware adaptation [21]. Application-aware adaptation characterizes the design space between two extreme ways of providing adaptability. At one extreme, adaptivity is entirely the responsibility of the application, that is, there is no system support or any standard way of providing adaptivity. At the other extreme, adaptivity is subsumed by the system, here the database management system. Since, in general, the system is not aware of the application semantics, it cannot provide a single adequate form of adaptation. Weak and strict operations lie at an intermediate point between these two extremes, serving as middleware between a database system and an application. They are tools offered by the database system to applications. The application can at its discretion use weak or strict transactions based on its semantics. The implementation, consistency control, and the underlying transactional support are the job of the database management system.
The remainder of this paper is organized as follows. In
Section 2, we introduce the replication model along with an
outline of a possible implementation that is based on
distinguishing data copies into core and quasi. In Sections 3
and 4, we define correctness criteria, prove corresponding
serializability-based theorems, and present protocols for
maintaining weak consistency under the concurrent execution of weak and strict transactions and for reconciling
divergent copies, respectively. Examples of how the scheme can be used are outlined in Section 5. In Section 6,
we develop an analytical model to evaluate the performance of the scheme and the interplay among its various
parameters. The model is used to demonstrate how the
percentage of weak transactions can be effectively tuned to
attain the desired performance. The performance parameters considered are the system throughput, the number
of messages, and the response time. The study is performed
for a range of networking conditions, that is for different
values of bandwidth and for varying disconnection intervals. In Section 7, we provide an estimation of the
reconciliation cost. This estimation can be used for instance
to determine an appropriate frequency for the reconciliation
events. In Section 8, we compare our work with related
research, and we conclude the paper in Section 9 by
summarizing.
2 THE CONSISTENCY MODEL
The sites of a distributed system are grouped together in sets called physical clusters (or p-clusters) so that sites that are strongly connected with each other belong to the same p-cluster, while sites that are weakly connected with each other belong to different p-clusters. Strong connectivity refers to connectivity achieved through high-bandwidth and low-latency communications. Weak connectivity includes connectivity that is intermittent or low bandwidth. The goal is to support autonomous operation at each physical cluster, thus eliminating the need for communication among p-clusters, since such intercluster communication may be expensive, prohibitively slow, and occasionally unavailable. To this end, weak transactions are defined as transactions that access copies at a single p-cluster. At the same time, the usual atomic, consistent, durable, and isolated distributed transactions, called strict, are also supported.

2.1 The Extended Database Operation Interface
To increase availability and reduce intercluster communication, direct access to locally, e.g., in a p-cluster, available copies is achieved through weak read (WR) and weak write (WW) operations. Weak operations are local at a p-cluster, i.e., they access copies that reside at a single p-cluster. We say that a copy or item is locally available at a p-cluster if there exists a copy of that item at the p-cluster. We call the standard read and write operations strict read (SR) and strict write (SW) operations. To implement this dual database interface, we distinguish the copies of each data item as core and quasi. Core copies are copies that have permanent values, while quasi copies are copies that have only conditionally committed values. When connectivity is restored, the values of core and quasi copies of each data item are reconciled to attain a system-wide consistent value.

To process the operations of a transaction, the database management system translates operations on data items into operations on copies of these data items. This procedure is formalized by a translation function h. Function h maps each read operation on a data item x into a number of read operations on copies of x and returns one value (e.g., the most up-to-date value) as the value read by the operation. That is, we assume that h, when applied to a read operation, returns one value rather than a set of values. In particular, h maps each SR[x] operation into a number of read operations on core copies of x and returns one of these values as the value read by the operation. Depending on how each weak read operation is translated, we define two types of translation functions: a best-effort translation function that maps each WR[x] operation into a number of read operations on locally available core or quasi copies of x and returns the most up-to-date such value, and a conservative translation function that maps each weak read operation into a number of read operations only on locally available quasi copies and returns the most up-to-date such value.

Based on the time of propagation of updates of core copies to quasi copies, we define two types of translation functions: an eventual translation function that maps an SW[x] into writes of core copies only, and an immediate translation function that in addition updates the quasi copies that reside at the same p-clusters as the core copies written. For an immediate h, the conservative and best-effort variations have the same result. Each WW[x] operation is translated by h into a number of write operations on local quasi copies of x. Table 1 summarizes the semantics of operations.

How many and which core or quasi copies are actually read or written when a database operation is issued on a data item depends on the coherency algorithm used, e.g., quorum consensus or ROWA [3].
TABLE 1
Variations of the Translation Function

Weak Read (WR): Reads local copies and returns as the value read the most recent one.
  Variations - Best-effort h: Reads both core and quasi local copies.
               Conservative h: Reads only local quasi copies.
Strict Read (SR): Reads core copies and returns as the value read the most recent one.
Weak Write (WW): Writes local quasi copies.
Strict Write (SW):
  Variations - Eventual h: Writes only core copies.
               Immediate h: Writes both core and quasi copies at the corresponding p-clusters.
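To make the dual interface concrete, the sketch below shows one way the translation variations of Table 1 could be organized in code. It is a minimal illustration under simplifying assumptions (one value and timestamp per copy, no quorums, no locking), and the names ClusterStore, weak_read, strict_write, and so on are invented for the example rather than taken from the paper.

from dataclasses import dataclass, field

@dataclass
class Copy:
    value: object
    timestamp: int      # last update time, used to pick the most recent value
    kind: str           # "core" or "quasi"

@dataclass
class ClusterStore:
    # copies of data items locally available at one physical cluster
    copies: dict = field(default_factory=dict)   # item -> list[Copy]

    def _latest(self, candidates):
        return max(candidates, key=lambda c: c.timestamp).value if candidates else None

    def weak_read(self, item, variation="best-effort"):
        # WR[x]: access only locally available copies.
        # Best-effort h reads core and quasi copies; conservative h reads quasi copies only.
        local = self.copies.get(item, [])
        if variation == "conservative":
            local = [c for c in local if c.kind == "quasi"]
        return self._latest(local)

    def weak_write(self, item, value, ts):
        # WW[x]: conditionally update the local quasi copies only
        for c in self.copies.get(item, []):
            if c.kind == "quasi":
                c.value, c.timestamp = value, ts

def strict_read(clusters, item):
    # SR[x]: read core copies (wherever they reside) and return the most recent value
    core = [c for cl in clusters for c in cl.copies.get(item, []) if c.kind == "core"]
    return max(core, key=lambda c: c.timestamp).value if core else None

def strict_write(clusters, item, value, ts, immediate=False):
    # SW[x]: an eventual h writes core copies only; an immediate h also refreshes
    # the quasi copies at the p-clusters whose core copies were written
    for cl in clusters:
        wrote_core = False
        for c in cl.copies.get(item, []):
            if c.kind == "core":
                c.value, c.timestamp, wrote_core = value, ts, True
        if immediate and wrote_core:
            for c in cl.copies.get(item, []):
                if c.kind == "quasi":
                    c.value, c.timestamp = value, ts

How many copies each operation actually touches would, as the text notes, be dictated by the coherency algorithm (e.g., a quorum scheme); the sketch simply touches every matching copy.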
Users interact with the database system through transactions, that is, through the execution of programs that include database operations:

Definition 1. A transaction (T) is a partial order (OP, <), where OP is the set of operations executed by the transaction and < represents their execution order. The operations include the following data operations: weak (WR) or strict reads (SR) and weak (WW) or strict writes (SW), as well as abort (A) and commit (C) operations. The partial order must specify the order of conflicting data operations and contains exactly one abort or commit operation, which is the last in the order. Two weak or strict data operations conflict if they access the same copy of a data item and at least one of them is a weak or strict write operation.

Two types of transactions are supported, weak and strict. A strict transaction (ST) is a transaction where OP does not include any weak operations. Strict transactions are atomic, consistent, isolated, and durable. A weak transaction (WT) is a transaction where OP does not include any strict operations. Weak transactions access data copies that reside at the same physical cluster and thus are executed locally at this p-cluster. Weak transactions are locally committed at the p-cluster at which they are executed. After local commitment, their updates are visible only to weak transactions in the same p-cluster; other transactions are not aware of these updates. Local commitment of weak transactions is conditional in the sense that their updates might become permanent only after reconciliation, when local transactions may become globally committed. Thus, there are two commit events associated with each weak transaction: a local conditional commit in its associated cluster and an implicit global commit at reconciliation.

Other types of transactions that combine weak and strict operations can be envisioned; their semantics, however, become hard to define. Instead, weak and strict transactions can be seen as transaction units of some advanced transaction model. In this regard, upon its submission, each user transaction is decomposed into a number of weak and strict subtransaction units according to semantics and the degree of consistency required by the application.

2.2 Data Correctness
As usual, a database is a set of data items and a database state is defined as a mapping of every data item to a value of its domain. Data items are related by a number of restrictions called integrity constraints that express relationships among their values. A database state is consistent if the integrity constraints are satisfied [22]. In this paper, we consider integrity constraints to be arithmetic expressions that have data items as variables. Consistency maintenance in traditional distributed environments relies on the assumption that normally all sites are connected. This assumption, however, is no longer valid in mobile computing, since the distributed sites are only intermittently connected. Similar network connectivity conditions also hold in widely distributed systems as well as in computing with portable laptops.

To take into account intermittent connectivity, instead of requiring maintenance of all integrity constraints of a distributed database, we introduce logical clusters as units of consistency. A logical cluster, l-cluster, is the set of all quasi copies that reside at the same p-cluster. In addition, all core copies constitute a single system-wide logical cluster independently of the site at which they reside physically. We relax consistency in the sense that integrity constraints are ensured only for data copies that belong to the same logical cluster. An intracluster integrity constraint is an integrity constraint that can be fully evaluated using data copies of a single logical cluster. All other integrity constraints are called intercluster integrity constraints.

Definition 2. A mobile database state is consistent if all intracluster integrity constraints are satisfied and all intercluster integrity constraints are bounded-inconsistent.

In this paper, we focus only on replication intercluster integrity constraints. For such integrity constraints, bounded inconsistency means that all copies in the same logical cluster have the same value, while among copies at different logical clusters there is bounded divergence [32], [1].
TABLE 2
Divergence Among Copies

d is the:
- maximum number of transactions that operate on quasi copies
- range of acceptable values that a data item can take
- maximum number of copies per data item that diverge from a pre-assigned primary copy of the item, i.e., the core copies
- maximum number of data items that have divergent copies
- maximum number of updates per data item that are not reflected at all copies of the item
Bounded divergence is quantified by a positive integer d, called the degree of divergence; possible definitions of d are listed in Table 2. A replication constraint for x thus bounded is called d-consistent. The degree of divergence among copies can be tuned based on the strength of connection among physical clusters, by keeping the divergence small in instances of high bandwidth availability and allowing for greater deviation in instances of low bandwidth availability.
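As a small illustration of one of these definitions of d (the number of updates per item not yet reflected at all copies), the sketch below refuses further weak writes on an item once d unreconciled updates have accumulated at a cluster. It is a hypothetical fragment, not the paper's protocol, and the class name DivergenceBound is invented.

class DivergenceBound:
    def __init__(self, d):
        self.d = d
        self.pending = {}                 # item -> unreconciled weak writes at this cluster

    def try_weak_write(self, item):
        # refuse the write once the divergence bound would be exceeded;
        # the caller must then reconcile or fall back to a strict write
        if self.pending.get(item, 0) >= self.d:
            return False
        self.pending[item] = self.pending.get(item, 0) + 1
        return True

    def reconciled(self, item):
        # called after the reconciliation events of Section 4 merge the copies
        self.pending.pop(item, None)

bound = DivergenceBound(d=3)
assert all(bound.try_weak_write("x") for _ in range(3))
assert not bound.try_weak_write("x")      # fourth divergent update is rejected
bound.reconciled("x")
assert bound.try_weak_write("x")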
Immediate Translation and Consistency. To handle integrity constraints besides replication, in the case of an immediate translation function, h should be defined such that the integrity constraints between quasi copies in the same logical cluster are not violated. The following example is illustrative.

Example 1. For simplicity, consider only one physical cluster. Assume two data items x and y, related by the integrity constraint I: x > 0 implies y > 0, and a consistent database state x_c = -1, x_q = -1, y_c = 2, and y_q = -4, where the subscripts c and q denote core and quasi copies, respectively.

Consider the transaction program:

x = 10
if y < 0
then y = 10

If the above program is executed as a strict transaction SW(x) SR(y) C, we get the database state x_c = 10, x_q = 10, y_c = 2, and y_q = -4, in which the integrity constraint between the quasi copies of x and y is violated. Note that the quasi copies were updated because we considered an immediate translation function.

The problem arises from the fact that quasi copies are updated to the current value of the core copy without taking into consideration integrity constraints among them. Similar problems occur when refreshing individual copies of a cache [1]. Possible solutions include: 1) Each time a quasi copy is updated at a physical cluster as a result of a strict write, the quasi copies of all data in this cluster related to it by some integrity constraint are also updated either after or prior to the execution of the transaction. This update is done following a reconciliation procedure for merging core and quasi copies (as in Section 4). In the above example, the core and quasi copies of x and y should have been reconciled prior to the execution of the transaction, producing for instance the database state x_c = -1, x_q = -1, y_c = 2, and y_q = 2. Then, the execution of the transaction would result in the database state x_c = 10, x_q = 10, y_c = 2, and y_q = 2, which is consistent. 2) If a strict transaction updates a quasi copy at a physical cluster, its read operations are also mapped into reads of quasi copies at this cluster. In cases of incompatible reads, again, a reconciliation procedure is initiated, having a result similar to the one above. 3) Updating quasi copies is postponed by deferring any updates of quasi copies that result from writes of the corresponding core copies. A log of weak writes resulting from strict writes is kept. In this scenario, the execution of the transaction results in the database state x_c = 10, x_q = -1, y_c = 2, and y_q = -4, which is consistent. The first two approaches may force an immediate reconciliation among copies, while the third approach defers this reconciliation and is preferable in cases of low connectivity among physical clusters.
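A minimal sketch of the third option: quasi-copy refreshes caused by strict writes are not applied immediately but appended to a per-cluster log and replayed as part of a later reconciliation. The class and function names are invented for the illustration.

class DeferredQuasiUpdates:
    def __init__(self):
        self.log = []                          # (item, value) pairs awaiting replay

    def on_strict_write(self, item, value):
        # the core copies are updated right away elsewhere;
        # the corresponding quasi-copy refresh is only queued here
        self.log.append((item, value))

    def replay(self, apply_quasi_update):
        # executed during the reconciliation procedure of Section 4
        for item, value in self.log:
            apply_quasi_update(item, value)
        self.log.clear()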
3 WEAK CONNECTIVITY OPERATION
In this section, we provide serializability-based criteria, graph-based tests, and a locking protocol for correct executions that exploit weak connectivity. When o_j is an operation, the subscript j denotes that o belongs to transaction T_j, while the subscript on a data copy identifies the physical cluster at which the copy is located. Without loss of generality, we assume that there is only one quasi copy per physical cluster. This assumption can be easily lifted, but with significant complication in notation. Since all quasi copies in a physical cluster have the same value, this single copy can be regarded as their representative. Read and write operations on copies are denoted by read and write, respectively.

A complete intracluster schedule (IAS) is an observation of an interleaved execution of transactions in a given physical cluster configuration that includes (locally) committed weak transactions and (globally) committed strict transactions. Formally,
Definition 3 (Intracluster schedule). Let T = {T_0, T_1, ..., T_n} be a set of transactions. A (complete) intracluster schedule, IAS, over T is a pair (OP, <_IAS) in which <_IAS is a partial ordering relation such that:

1. OP = h(U_{i=1..n} T_i) for some translation function h.
2. For each T_i and all operations op_k, op_l in T_i, if op_k <_i op_l, then every operation in h(op_k) is related by <_IAS to every operation in h(op_l).
3. All pairs of conflicting operations are related by <_IAS, where two operations conflict if they access the same copy and one of them is a write operation.
4. For all read operations read_j[x_i], there is at least one write_k[x_i] such that write_k[x_i] <_IAS read_j[x_i].
5. If SW_j[x] <_j SR_j[x] and read_j(x_i) ∈ h(SR_j[x]), then write_j(x_i) ∈ h(SW_j[x]).
6. If write_j[x_i] ∈ h(SW_j[x]) for some strict transaction T_j, then write_j[y_i] ∈ h(SW_j[y]) for all y written by T_j for which there is a y_i at physical cluster Cl_i, where y_i is a quasi copy when h is conservative and any, quasi or core, copy when h is best effort.

Condition 1 states that the transaction managers translate each operation on a data item into appropriate operations on data copies. Condition 2 states that the intracluster schedule preserves the ordering <_i stipulated by each transaction T_i, and Condition 3 that it also records the execution order of conflicting operations. Condition 4 states that a transaction cannot read a copy unless it has been previously initialized. Condition 5 states that, if a transaction writes a data item x before it reads x, then it must write to the same copy of x that it subsequently reads. Finally, Condition 6 indicates that, for a strict transaction, if a write is translated to a write on a core copy at a physical cluster Cl_i, then all other writes of this transaction must also write any corresponding copies at this cluster. This condition is necessary for ensuring that weak transactions do not see partial results of a strict transaction.

A read operation on a data item x reads-x-from a transaction T_i if it reads (i.e., returns as the value read) a copy of x written by T_i and no other transaction writes this copy in between. We say that a transaction T_i has the same reads-from relationship in schedule S_1 as in schedule S_2 if, for any data item x, if T_i reads-x-from T_j in S_1, then it reads-x-from T_j in S_2. Given a schedule S, the projection of S on strict transactions is the schedule obtained from S by deleting all weak operations, and the projection of S on a physical cluster Cl_i is the schedule obtained from S by deleting all operations that access copies not in Cl_i. Two schedules are conflict equivalent if they are defined over the same set of transactions, have the same set of operations, and order conflicting operations of committed transactions the same way [3]. A one-copy schedule is the single-copy interpretation of an (intracluster) schedule where all operations on data copies are represented as operations on the corresponding data item.

3.1 Correctness Criterion
A correct concurrent execution of weak and strict transactions must maintain bounded inconsistency. First, we consider a weak form of correctness, in which the only requirement for weak transactions is that they read consistent data. The requirement for strict transactions is stronger, because they must produce a system-wide consistent database state. In particular, the execution of strict transactions must hide the existence of multiple core copies per item, i.e., it must be view-equivalent to a one-copy schedule [3]. A replicated-copy schedule S is view equivalent to a one-copy schedule S_1C if 1) S and S_1C have the same reads-from relationship for all data items, and 2) for each final write W_i(x) in S_1C, W_i(x_j) is a final write in S for some copy x_j of x. These requirements for the execution of strict and weak transactions are expressed in the following definition:

Definition 4 (IAS weak correctness). An intracluster schedule S_IAS is weakly correct iff all the following three conditions are satisfied:

1. All transactions in S_IAS have a consistent view, i.e., all integrity constraints that can be evaluated using the data read are valid;
2. There is a one-copy serial schedule S such that: a) it is defined on the same set of strict transactions as S_IAS, b) strict transactions in S have the same reads-from relationship as in S_IAS, and c) the set of final writes in S is the same as the set of final writes on core copies in S_IAS; and
3. Bounded divergence among copies at different logical clusters is maintained.
Next, we discuss how to enforce the first two conditions. Protocols for bounding the divergence among copies at different logical clusters are outlined at the end of this section. A schedule is one-copy serializable if it is equivalent to a serial one-copy schedule. The following theorem defines correctness in terms of equivalence to serial executions.

Theorem 1. Given that bounded divergence among copies at different logical clusters is maintained, if the projection of an intracluster schedule S on strict transactions is one-copy serializable and each of its projections on a physical cluster is conflict-equivalent to a serial schedule, then S is weakly correct.
Proof. Condition 1 of Definition 4 is guaranteed for strict transactions from the requirement of one-copy serializability, since strict transactions get the same view as in a one-copy serial schedule and read only core copies. For weak transactions at a physical cluster, the first condition is provided from the requirement of serializability of the projection of the schedule on this cluster, given that the projection of each (weak or strict) transaction on the cluster satisfies all intracluster integrity constraints when executed alone. Thus, it suffices to prove that such projections maintain the intracluster integrity constraints. This trivially holds for weak transactions, since they are local at a physical cluster. The condition also holds for strict transactions since, if a strict transaction maintains consistency of all database integrity constraints, then its projection on any cluster also maintains the consistency of intracluster integrity constraints as a
consequence of Condition 6 of Definition 3. Finally, one-copy serializability of the projection on strict transactions suffices to guarantee Conditions 2b and 2c, since strict transactions read only core copies and weak transactions do not write core copies, respectively.

Note that intercluster integrity constraints other than replication constraints among quasi copies of data items at different clusters may be violated. Weak transactions, however, are unaffected by such violations, since they read only local data. Although the above correctness criterion suffices to ensure that each weak transaction gets a consistent view, it does not suffice to ensure that weak transactions at different physical clusters get the same view, even in the absence of intercluster integrity constraints. The following example is illustrative.
Example 2. Assume two physical clusters Cl_1 = {x, y} and Cl_2 = {w, z, l} that have both quasi and core copies of the corresponding data items, and the following two strict transactions ST_1 = SW_1[x] SW_1[w] C_1 and ST_2 = SW_2[y] SW_2[z] SR_2[x] C_2. In addition, at cluster Cl_1 we have the weak transaction WT_3 = WR_3[x] WR_3[y] C_3, and at cluster Cl_2 the weak transactions WT_4 = WR_4[z] WW_4[l] C_4 and WT_5 = WR_5[w] WR_5[l] C_5. For simplicity, we do not show the transaction that initializes all data copies. We consider an immediate and best effort translation function h. For notational simplicity, we do not use any special notation for the core and quasi copies, since which copies are read is inferred by the translation function.

Assume that the execution of the above transactions produces the following schedule, which is weakly correct:

S = WR_5[w] SW_1[x] WR_3[x] SW_1[w] C_1 SW_2[y] SW_2[z] SR_2[x] C_2 WR_3[y] C_3 WR_4[z] WW_4[l] C_4 WR_5[l] C_5

The projection of S on strict transactions is SW_1[x] SW_1[w] C_1 SW_2[y] SW_2[z] SR_2[x] C_2, which is equivalent to the 1SR schedule SW_1[x] SW_1[w] C_1 SW_2[y] SW_2[z] SR_2[x] C_2. The projection of S on Cl_1, SW_1[x] WR_3[x] C_1 SW_2[y] SR_2[x] WR_3[y] C_3, is serializable as ST_1 -> ST_2 -> WT_3. The projection of S on Cl_2, WR_5[w] SW_1[w] C_1 SW_2[z] C_2 WR_4[z] WW_4[l] C_4 WR_5[l] C_5, is serializable as ST_2 -> WT_4 -> WT_5 -> ST_1.

Thus, weak correctness does not guarantee that there is a serial schedule equivalent to the intracluster schedule as a whole, that is, including all weak and strict transactions. The following is a stronger correctness criterion that ensures that all weak transactions get the same consistent view. Obviously, strong correctness implies weak correctness.

Definition 5 (IAS strong correctness). An intracluster schedule S is strongly correct iff there is a serial schedule S_S such that:

1. S_S is conflict-equivalent with S;
2. There is a one-copy schedule S_1C such that a) strict transactions in S_S have the same reads-from relationship as in S_1C and b) the set of final writes on core copies in S_S is the same as in S_1C; and
3. Bounded divergence among copies at different logical clusters is maintained.

Lemma 1. Given that bounded divergence among copies at different logical clusters is maintained, if an intracluster schedule S is conflict-equivalent to a serial schedule S_S and its projection on strict transactions is view equivalent to a one-copy serial schedule S_1C such that the order of transactions in S_S is consistent with the order of transactions in S_1C, then S is strongly correct.

Proof. We need to prove that in S_1C strict transactions have the same reads-from relationship and final writes as in S_S, which is straightforward since strict transactions only read data produced by strict transactions and core copies are written only by strict transactions.

Since weak transactions do not conflict with weak transactions at other clusters directly, the following is an equivalent statement of the above lemma:

Corollary 1. Given that bounded divergence among copies at different logical clusters is maintained, if the projection of an intracluster schedule S on strict transactions is view equivalent to a one-copy serial schedule S_1C, and each of its projections on a physical cluster Cl_i is conflict-equivalent to a serial schedule S_Si such that the order of transactions in S_Si is consistent with the order of transactions in S_1C, then S is strongly correct.

If weak IAS correctness is used as the correctness criterion, then the transaction managers at each physical cluster must only synchronize projections on their cluster. Global control is required only for synchronizing strict transactions. Therefore, no control messages are necessary between transaction managers at different clusters for synchronizing weak transactions. The proposed scheme is flexible, in that any coherency control method that guarantees one-copy serializability (e.g., quorum consensus or primary copy) can be used for synchronizing core copies. The scheme reduces to one-copy serializability when only strict transactions are used.

3.2 The Serialization Graph
To determine whether an IAS schedule is correct, we use a modified serialization graph that we call the intracluster serialization graph (IASG) of the IAS schedule. To construct the IASG, first a replicated data serialization graph (SG) is built that includes all strict transactions. An SG [3] is a serialization graph augmented with additional edges to take into account the fact that operations on different copies of the same data item may also create conflicts. Acyclicity of the SG implies one-copy serializability of the corresponding schedule. Then, the SG is augmented to include weak transactions as well as edges that represent conflicts between weak transactions in the same cluster and weak and strict transactions. An edge is called a dependency edge if it represents the fact that a transaction reads a value
produced by another transaction. An edge is called a precedence edge if it represents the fact that a transaction reads a value that was later changed by another transaction. It is easy to see that in the IASG there are no edges between weak transactions at different clusters, since weak transactions at different clusters read different copies of a data item. In addition:

Property 1. Let WT_i be a weak transaction at cluster Cl_j and ST a strict transaction. The IASG induced by an IAS may include only the following edges between them: a dependency edge from ST to WT_i, and a precedence edge from WT_i to ST.

Proof. The proof is straightforward since the only conflicts between weak and strict transactions are due to strict writes and weak reads of the same copy of a data item.
Theorem 2. Let S_IAS be an intracluster schedule. If S_IAS has an acyclic IASG, then S_IAS is strongly correct.

Proof. When a graph is acyclic, each of its subgraphs is acyclic; thus, the underlying SG on which the IASG was built is acyclic. Acyclicity of the SG implies one-copy serializability of strict transactions, since strict transactions only read values produced by strict transactions. Let T_1, T_2, ..., T_n be all transactions in S_IAS. Thus, T_1, T_2, ..., T_n are the nodes of the IASG. Since the IASG is acyclic, it can be topologically sorted. Let T_i1, T_i2, ..., T_in be a topological sort of the IASG; then, by a straightforward application of the serializability theorem [3], S_IAS is conflict equivalent to the serial schedule S_S = T_i1, T_i2, ..., T_in. This order is consistent with the partial order induced by a topological sorting of the SG, allowing S_1C to be the serial schedule corresponding to this sorting. Thus, the order of transactions in S_S is consistent with the order of transactions in S_1C.
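As a rough illustration of the test Theorem 2 suggests, the sketch below checks whether a serialization graph is acyclic with Kahn's topological sort; how the conflict edges are collected is abstracted away, and the edge list shown reuses the per-cluster serialization orders of Example 2, so all names here are illustrative.

from collections import defaultdict, deque

def is_acyclic(nodes, edges):
    # Kahn's algorithm: the graph is acyclic iff every node can be output,
    # i.e., the (IA)SG can be topologically sorted into a serial order
    indegree = {n: 0 for n in nodes}
    successors = defaultdict(list)
    for u, v in edges:
        successors[u].append(v)
        indegree[v] += 1
    queue = deque(n for n in nodes if indegree[n] == 0)
    visited = 0
    while queue:
        u = queue.popleft()
        visited += 1
        for v in successors[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)
    return visited == len(nodes)

nodes = {"ST1", "ST2", "WT3", "WT4", "WT5"}
edges = [("ST1", "ST2"), ("ST2", "WT3"),                   # orders induced at Cl_1
         ("ST2", "WT4"), ("WT4", "WT5"), ("WT5", "ST1")]   # orders induced at Cl_2
print(is_acyclic(nodes, edges))   # False: the schedule of Example 2 is not strongly correct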
3.3 Protocols
Serializability. We distinguish between coherency and concurrency control protocols. Coherency control ensures that all copies of a data item have the same value. In the proposed scheme, we must maintain this property globally for core and locally for quasi copies. Concurrency control ensures the maintenance of the other integrity constraints, here the intracluster integrity constraints. For coherency control, we assume a generic quorum-based scheme [3]. Each strict transaction reads q_r core copies and writes q_w core copies per strict read and write operation. The values of q_r and q_w for a data item x are such that q_r + q_w > n_x, where n_x is the number of available core copies of x. For concurrency control, we use strict two-phase locking, where each transaction releases its locks upon commitment [3]. Weak transactions release their locks upon local commitment and strict transactions upon global commitment. There are four lock modes (WR, WW, SR, SW) corresponding to the four data operations. Before the execution of each operation, the corresponding lock is requested. A lock is granted only if the data copy is not locked in an incompatible lock mode. Fig. 1 depicts the compatibility of locks for the various types of translation functions and is presented to demonstrate the interference between operations on items. Differences in compatibility stem from the fact that the operations access different kinds of copies. The basic overhead on the performance of weak transactions imposed by these protocols is caused by other weak transactions at the same cluster. This overhead is small, since weak transactions do not access the slow network. Strict transactions block a weak transaction only when they access the same quasi copies. This interference is limited and can be controlled, e.g., by letting, in cases of disconnections, strict transactions access only core copies and weak transactions access only quasi copies.

Fig. 1. Lock compatibility matrices. An X entry indicates that the lock modes are compatible. (a) Eventual and conservative h. (b) Eventual and best effort h. (c) Immediate and conservative h. (d) Immediate and best effort h.
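Since the matrices of Fig. 1 are not reproduced here, the fragment below shows how such a compatibility table could be consulted by the lock manager; the entries given are one plausible reading of the eventual, best-effort case, derived from which kinds of copies each operation touches, and should not be taken as a transcription of the figure.

# Requested lock vs. held lock: True means compatible (the lock can be granted).
COMPATIBLE = {
    ("WR", "WR"): True,  ("WR", "WW"): False, ("WR", "SR"): True,  ("WR", "SW"): False,
    ("WW", "WR"): False, ("WW", "WW"): False, ("WW", "SR"): True,  ("WW", "SW"): True,
    ("SR", "WR"): True,  ("SR", "WW"): True,  ("SR", "SR"): True,  ("SR", "SW"): False,
    ("SW", "WR"): False, ("SW", "WW"): True,  ("SW", "SR"): False, ("SW", "SW"): False,
}

def can_grant(requested, held_modes):
    # grant the requested mode on a copy only if it is compatible with every mode already held
    return all(COMPATIBLE[(requested, held)] for held in held_modes)

print(can_grant("WR", ["SR"]))   # True: two reads never conflict
print(can_grant("WW", ["SW"]))   # True under an eventual h: SW touches core copies only
print(can_grant("WR", ["SW"]))   # False: a best-effort WR may read the core copy being written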
Bounding divergence among copies. At each p-cluster, the degree for each data item expresses the divergence of the local quasi copy from the value of the core copy. This difference may result either from globally uncommitted weak writes or from updates of core copies that have not yet been reported at the cluster. As a consequence, the degree may be bounded either by limiting the number of weak writes pending global commitment or by controlling the h function. In Table 3, we outline ways of maintaining d-consistency for different ways of defining d.
4 A CONSISTENCY RESTORATION SCHEMA
After the execution of a number of weak and strict transactions, for each data item, all its core copies have the same value, while its quasi copies may have as many different values as the number of physical clusters. Approaches to reconciling the various values of a data item so that a single value is selected vary from purely syntactic to purely semantic ones [5]. Syntactic approaches use serializability-based criteria, while semantic approaches use either the semantics of transactions or the semantics of data items. We adopt a purely syntactic and thus application-independent approach. The exact point when reconciliation is initiated depends on the application requirements and the distributed system characteristics. For instance, reconciliation may be forced to keep inconsistency inside the required limits. Alternatively, it may be initiated periodically or on demand upon the occurrence of specific events, such as the restoration of network connectivity, for instance when a palmtop is plugged back into the stationary network or a mobile host enters a region with good connectivity.

TABLE 3
Maintaining Bounded Inconsistency

If d is the maximum number of transactions that operate on quasi copies: appropriately bound the number of weak transactions at each physical cluster; in the case of a dynamic cluster reconfiguration, the distribution of weak transactions at each cluster must be re-adjusted.
If d is a range of acceptable values a data item can take: allow only weak writes with values inside the acceptable range.
If d is the maximum number of divergent copies per data item: for each data item, bound the number of physical clusters that can have quasi copies.
If d is the maximum number of data items that have divergent copies:
If d is the maximum number of updates per data item not reflected at all copies: define h so that each strict write modifies the quasi copies at each physical cluster at least after d updates; note that this cannot be ensured for disconnected clusters, since there is no way of notifying them of remote updates.
4.1 Correctness Criterion
Our approach to reconciliation is based on the following rule: a weak transaction becomes globally committed if the inclusion of its write operations in the schedule does not violate the one-copy serializability of strict transactions. That is, we assume that weak transactions at different clusters do not interfere with each other even after reconciliation, that is, weak operations of transactions at different clusters never conflict. A (complete) intercluster schedule, IES, models execution after reconciliation, where strict transactions become aware of weak writes, i.e., weak transactions become globally committed. Thus, in addition to the conflicts reported in the intracluster schedule, the intercluster schedule reports all relevant conflicts between weak and strict operations. In particular:
Definition 6 (Intercluster schedule). An intercluster schedule (IES) S_IES based on an intracluster schedule S_IAS = (OP, <_IAS) is a pair (OP', <_IES) where:

1. OP' = OP, and for any op_i and op_j ∈ OP', if op_i <_IAS op_j in S_IAS, then op_i <_IES op_j in S_IES. In addition:
2. For each pair of weak write op_i = WW_i[x] and strict read op_j = SR_j[x] operations, either for all pairs of operations cop_i ∈ h(op_i) and cop_j ∈ h(op_j), cop_i <_IES cop_j, or cop_j <_IES cop_i;
3. For each pair of weak write op_i = WW_i[x] and strict write op_j = SW_j[x] operations, either for all pairs of operations cop_i ∈ h(op_i) and cop_j ∈ h(op_j), cop_i <_IES cop_j, or cop_j <_IES cop_i.
We extend the reads-from relationship for strict transactions so that weak writes are taken into account. A strict read operation on a data item x reads-x-from a transaction T_i in an IES schedule if it reads a copy of x, T_i has written this or any quasi copy of x, and no other transaction wrote this or any quasi copy of x in between. A weak write is acceptable as long as the extended reads-from relationship for strict transactions is not affected, that is, strict transactions still read values produced by strict transactions. In addition, correctness of the underlying IAS schedule implies one-copy serializability of strict transactions and consistency of weak transactions.
Fig. 2 The reconciliation steps.
Definition 7 (IES correctness). An intercluster schedule is correct iff:

1. It is based on a correct IAS schedule S_IAS, and
2. The reads-from relationship for strict transactions is the same as their reads-from relationship in S_IAS.
4.2 The Serialization Graph
To determine correct IES schedules, we define a modified serialization graph that we call the intercluster serialization graph (IESG). To construct the IESG, we augment the serialization graph IASG of the underlying intracluster schedule. To force conflicts among weak and strict transactions that access different copies of the same data item, we induce:

- first, a write order as follows: if T_i weak writes and T_k strict writes any copy of an item x, then either T_i -> T_k or T_k -> T_i; and
- then, a strict read order as follows: if a strict transaction ST_j reads-x-from ST_i in S_IAS and a weak transaction WT follows ST_i, we add an edge ST_j -> WT.
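A rough sketch of how these two induced orders could be layered on top of the IASG edges is shown below; conflict detection and the choice of direction for the write order (which in general gives rise to a polygraph, as Section 4.3 notes) are left abstract, and every name in the fragment is invented for the illustration.

def add_iesg_edges(iasg_edges, weak_writers, strict_writers, strict_reads_from, follows):
    # weak_writers / strict_writers: item -> set of weak / strict transactions writing any copy of it
    # strict_reads_from: (strict reader, item) -> strict transaction it reads the item from in S_IAS
    # follows(a, b): True if transaction a is placed after transaction b in the chosen write order
    edges = set(iasg_edges)
    # write order: every weak writer of an item is ordered against every strict writer of it
    for item, weak_set in weak_writers.items():
        for wt in weak_set:
            for st in strict_writers.get(item, set()):
                edges.add((st, wt) if follows(wt, st) else (wt, st))
    # strict read order: if ST_j reads x from ST_i and WT follows ST_i, add ST_j -> WT
    for (st_j, item), st_i in strict_reads_from.items():
        for wt in weak_writers.get(item, set()):
            if follows(wt, st_i):
                edges.add((st_j, wt))
    return edges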
Theorem 3. Let S_IES be an IES schedule based on an IAS schedule S_IAS. If S_IES has an acyclic IESG, then S_IES is correct.

Proof. Clearly, if the IESG is acyclic, the corresponding graph for the IAS is acyclic (since to get the IESG we only add edges to the IASG), and thus the IAS schedule is correct. We shall show that, if the graph is acyclic, then the reads-from relationship for strict transactions in the intercluster schedule S_IES is the same as in the underlying intracluster schedule S_IAS. Assume that ST_j reads-x-from ST_i in S_IAS. Then ST_i -> ST_j. Assume, for the purposes of contradiction, that ST_j reads-x-from a weak transaction WT in S_IES. Then WT writes x in S_IES, and since ST_i also writes x, either a) ST_i -> WT, or b) WT -> ST_i. In case a, from the definition of the IESG, we get ST_j -> WT, which is a contradiction since ST_j reads-x-from WT. In case b, WT -> ST_i, that is, WT precedes ST_i which precedes ST_j, which again contradicts the assumption that ST_j reads-x-from WT.
4.3 Protocol
To get a correct schedule, we need to break potential cycles in the IES graph. Since, to construct the IESG, we start from an acyclic graph and add edges between a weak and a strict transaction, there is always at least one weak transaction in each cycle. We roll back such weak transactions. Undoing a transaction T may result in cascading aborts of transactions that have read the values written by T, that is, transactions that are related to T through a dependency edge. Since weak transactions write only quasi copies in a single physical cluster and since only weak transactions in the same cluster can read these quasi copies, we get the following lemma:

Lemma 2. Only weak transactions in the same physical cluster read values written by weak transactions in that cluster.

The above lemma ensures that, when a weak transaction is aborted to resolve conflicts in an intercluster schedule, only weak transactions in the same p-cluster are affected. In practice, fewer transactions ever need to be aborted. In particular, we need to abort only weak transactions whose output depends on the exact values of the data items they read. We call these transactions exact. Most weak transactions are not exact, since by definition weak transactions are transactions that read local d-consistent data. Thus, even if the value they read was produced by a transaction that was later aborted, this value was inside an acceptable range of inconsistency, and this is probably sufficient to guarantee their correctness.
Detecting cycles in the IESG can be hard. The difficulties arise from the fact that, between transactions that wrote a data item, an edge can have either direction, thus resulting in polygraphs [22]. Polynomial tests for acyclicity are possible if we make the assumption that transactions read a data item before writing it. Then, to get the IES graph from the IAS graph, we need only induce a read order as follows: if a strict transaction ST reads an item that was written by a weak transaction WT, we add a precedence edge ST -> WT.
Fig. 2 outlines the reconciliation steps.
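Roughly, the reconciliation of Fig. 2 can be pictured as follows: build the IESG over the globally committed strict transactions and the locally committed weak candidates, and, while a cycle remains, roll back a weak transaction on it together with its dependent weak transactions in the same cluster (Lemma 2). The helper functions build_iesg, find_cycle, and dependents_in_cluster below are assumed, not taken from the paper, and the whole fragment is only a sketch of the syntactic approach described above.

def reconcile(strict_txns, weak_txns_by_cluster, build_iesg, find_cycle, dependents_in_cluster):
    # weak transactions that are still candidates for global commitment
    candidates = {t for txns in weak_txns_by_cluster.values() for t in txns}
    graph = build_iesg(strict_txns, candidates)
    cycle = find_cycle(graph)                  # returns a set of transactions on a cycle, or None
    while cycle:
        victim = next(t for t in cycle if t in candidates)   # every cycle contains a weak transaction
        aborted = {victim} | dependents_in_cluster(victim, candidates)
        candidates -= aborted                  # cascading aborts stay inside one p-cluster (Lemma 2)
        graph = build_iesg(strict_txns, candidates)
        cycle = find_cycle(graph)
    return candidates                          # weak transactions whose writes become permanent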
5 DISCUSSION
In the proposed hybrid scheme, weak and strict transactions coexist. Weak transactions let users process local data, thus avoiding the overhead of long network accesses. Strict transactions need access to the network to guarantee permanence of their updates. Weak reads provide users with the choice of reading an approximately accurate value of a datum, in particular in cases of total or partial disconnections. This value is appropriate for a variety of applications that do not require exact values. Such applications include gathering information for statistical purposes or making high-level decisions and reasoning in expert systems that can tolerate bounded uncertainty in input data.

Weak writes allow users to update local data without confirming these updates immediately. Update validation is delayed until the physical clusters are connected. Delayed updates can be performed during periods of low network activity to reduce demand at the peaks. Furthermore, grouping together weak updates and transmitting them as a block rather than one at a time can improve bandwidth usage. For example, a salesperson can locally update many data items until these updates are finally confirmed, when the machine is plugged back into the network at the end of the day. However, since weak writes may not be finally accepted, they must be used only when compensating transactions are available, or when the likelihood of conflicts is very low. For example, users can employ weak transactions to update mostly private data and strict transactions to update frequently used, heavily shared data.
The cluster configuration is dynamic. Physical clusters may be explicitly created or merged upon a forthcoming disconnection or connection of the associated mobile clients. To accommodate migrating locality, a mobile host may join a different p-cluster upon entering a new support environment. Besides defining clusters based on the physical location of data, other definitions are also possible. Clusters may be defined based on the semantics of data or applications. Information about access patterns, for instance in the form of a user's profile that includes data describing the user's typical behavior, may be utilized in determining clusters. Some examples follow.
Example 1 (Cooperative environment). Consider the case of users working on a common project using mobile hosts. Groups are formed that consist of users who work on similar topics of the project. Physical clusters correspond to data used by people in the same group who need to maintain consistency among their interactions. We consider data that are most frequently accessed by a group as data belonging to this group. At each physical cluster (group), the copies of data items belonging to the group are core copies, while the copies of data items belonging to other groups are quasi. A data item may belong to more than one group, if more than one group frequently accesses it. In this case, core copies of that data item exist in all such physical clusters. In each physical cluster, operations on items that do not belong to the group are weak, while operations on data that belong to the group are strict. Weak updates on a data item are accepted only when they do not conflict with updates by the owners of that data item.
Example 2 (Caching). Clustering can be used to model
caching in a client/server architecture. In such a setting,
a mobile host acts as a client interacting with a server at a
fixed host. Data are cached at the client for performance
and availability. The cached data are considered quasi copies. The data at the fixed host are core copies.
Transactions initiated by the server are always strict.
Transactions initiated by the client that invoke updates
are always weak, while read-only client transactions can
be strict when strict consistency is required and weak
otherwise. At reconciliation, weak writes are accepted
only if they do not conflict with strict transactions at
the server. The frequency of reconciliation depends on
the user consistency requirements and on networking
conditions.
Example 3 (Caching Location Data). In mobile computing, data representing the location of a mobile user are fast-changing. Such data are frequently accessed to locate a
host. Thus, location data must be replicated at many sites
to reduce the overhead of searching. Most of the location
copies should be considered quasi. Only a few core
copies are always updated to reflect changes in location.
6 QUANTITATIVE EVALUATION OF WEAK CONSISTENCY
To quantify the improvement in performance attained by sacrificing strict consistency in weakly connected environments and to understand the interplay among the various parameters, we have developed an analytical model. The analysis follows an iteration-based methodology for coupling standard hardware resource and data contention as in [39]. Data contention is the result of concurrency and coherency control. Resources include the network and the processing units. We generalize previous results to take into account a) nonuniform access of data, which takes into consideration hotspots and the changing locality, b) weak and strict transaction types, and c) various forms of data access, as indicated by the compatibility matrix of Fig. 1. An innovative feature of the analysis is the employment of a vacation system to model disconnections of the wireless medium. The performance parameters under consideration are the system throughput, the number of messages sent, and the response time of weak and strict transactions. The study is performed for a range of networking conditions, that is, for different values of bandwidth and varying disconnection intervals.
6.1 Performance Model
We assume a cluster configuration with n physical clusters and a Poisson arrival rate for both queries and updates. Let λ_q and λ_u, respectively, be the average arrival rates of queries and updates on data items initiated at each physical cluster. We assume fixed-length transactions with N operations on data items, N_q = [λ_q/(λ_q + λ_u)]N of which are queries and N_u = [λ_u/(λ_q + λ_u)]N of which are updates. Thus, the transaction rate, i.e., the rate of transactions initiated at each p-cluster, is λ_T = λ_q/N_q.
Let c be the consistency factor of the application under consideration, that is, c is the fraction of the arrived operations that are strict. To model hotspots, we divide the data at each p-cluster into hot and cold data sets. Let D be the number of data items per p-cluster, D_c of which are cold and D_h hot. To capture locality, we assume that a fraction o of transactions exhibit locality, that is, they access data from the hot set with probability h and data from the cold set with probability 1 - h. The remaining transactions access hot and cold data uniformly. Due to mobility, a transaction may move to a different physical cluster and thus the data it accesses may no longer belong to the hot data of the new cluster. This can be modeled by letting o diminish. The replication scheme takes advantage of locality by assuming that the probability that a hot data item has a core copy at a p-cluster is l and that a cold data item has a core copy is l', where normally l' < l. Let p_1 be the probability that an operation at a cluster accesses a data item for which there is a core copy at the cluster:

p_1 = o[lh + (1 - h)l'] + (1 - o)[(l'D_c)/D + (lD_h)/D].

For simplicity, we assume that there is one quasi copy of each data item at each p-cluster. Let q_r be the read and q_w the write quorum, and N_S the mean number of operations on data copies per strict transaction. The transaction model consists of n_L + 2 states, where n_L is the random variable of the items accessed by the transaction and N_L its mean. Without loss of generality, we assume that N_L is equal to the number of operations. The transaction has an initial setup phase, state 0. Then, it progresses to states 1, 2, ..., n_L in that order. If successful, at the end of state n_L, the transaction enters the commit phase at state n_L + 1. The transaction response time T can be expressed as

T_trans = T_INPL + Σ_{j=1}^{n_lw} T_{W,j} + T_R + t_commit,    (A)

where n_lw is the number of lock waits during the run of the transaction, T_{W,j} is the waiting time for the jth lock contention, T_R is the sum of the execution times in states 1, ..., n_L excluding lock waiting times, T_INPL is the duration of the execution time in state 0, and t_commit is the commit time to reflect the updates in the database.

6.1.1 Resource Contention Analysis
We model clusters as M/G/1 systems. The average service time for the various types of requests, all exponentially distributed, can be determined from the following parameters: the processing time, t_q, of a query on a data copy, the time t_u to install an update on a data copy, and the overhead time, t_b, to propagate an update or query to another cluster. In each M/G/1 server, all requests are processed with the same priority on a first-come, first-served basis. Clusters become disconnected and reconnected. To capture disconnections, we model each connection among two clusters as an M/M/1 system with vacations. A vacation system is a system in which the server becomes unavailable for occasional intervals of time. If W is the available bandwidth between two clusters and if we assume exponentially distributed packet lengths for messages with average size m, then the service rate s_r is equal to W/m. Let t_r be the network transmission time.

Number of Messages. The total number of messages transmitted per second among clusters is:

M = 2nc[λ_q(p_1(q_r - 1) + (1 - p_1)q_r) + λ_u(p_1(q_w - 1) + (1 - p_1)q_w)].

The first term corresponds to query traffic; the second, to update traffic.

Execution Time. For simplicity, we ignore the communication overhead inside a cluster, assuming either that each cluster consists of a single node or that the communication among the nodes inside a cluster is relatively fast. Without taking into account data contention, the average response time for a weak read on a data item is R^w_r = w + t_q and for a weak update R^w_u = w + t_u, where w is the average wait time at each cluster. Let δ_r be 0 if q_r = 1 and 1 otherwise, and δ_w be 0 if q_w = 1 and 1 otherwise. Then, for a strict read on a data item,

R^s_r = p_1[w + t_q + (q_r - 1)t_b + δ_r(2t_r + t_q + w)] + (1 - p_1)(q_r t_b + 2t_r + t_q + w),

and for a strict write,

R^s_u = p_1[w + t_u + (q_w - 1)t_b + δ_w(2t_r + t_u + w)] + (1 - p_1)(q_w t_b + 2t_r + t_u + w).

The computation of w is given in the Appendix.

Average Transmission Time. The average transmission time t_r equals the service time plus the wait time w_r at each network link, t_r = 1/s_r + w_r. The arrival rate λ_r at each link is Poisson with mean M/(n(n - 1)). The computation of w_r is given in the Appendix.

Throughput. The transaction throughput, i.e., input rate, is bounded by: a) the processing time at each cluster (since λ ≤ 1/E[x], where λ is the arrival rate of all requests at each cluster and E[x] is the mean service time); b) the available bandwidth (since λ_r ≤ s_r); and c) the disconnection intervals (since λ_r ≤ 1/E[v], where E[v] is the mean disconnection).
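The sketch below restates the reconstructed formulas of this subsection in executable form; the symbols follow the text, while all numeric inputs (including the hot/cold item counts and the wait time w) are assumptions chosen only for illustration.

```python
# Illustrative sketch of the resource-contention formulas above (not the paper's code).
# Symbols: p_1, q_r, q_w, t_q, t_u, t_b, t_r, w as defined in the text.

def p1(o, h, l, l_cold, D_c, D_h):
    """Probability that an operation finds a core copy at its own cluster."""
    D = D_c + D_h
    return o * (l * h + (1 - h) * l_cold) + (1 - o) * ((l_cold * D_c) / D + (l * D_h) / D)

def messages_per_sec(n, c, lam_q, lam_u, p_1, q_r, q_w):
    """Total inter-cluster messages per second, M (query term plus update term)."""
    query_term = lam_q * (p_1 * (q_r - 1) + (1 - p_1) * q_r)
    update_term = lam_u * (p_1 * (q_w - 1) + (1 - p_1) * q_w)
    return 2 * n * c * (query_term + update_term)

def strict_read_response(p_1, w, t_q, t_b, t_r, q_r):
    """Average response time of a strict read on one data item."""
    delta_r = 0 if q_r == 1 else 1
    local = w + t_q + (q_r - 1) * t_b + delta_r * (2 * t_r + t_q + w)   # core copy is local
    remote = q_r * t_b + 2 * t_r + t_q + w                              # no local core copy
    return p_1 * local + (1 - p_1) * remote

# Example (all values assumed for illustration):
p = p1(o=0.9, h=0.8, l=0.8, l_cold=0.2, D_c=200, D_h=10)
print(p)
print(messages_per_sec(n=5, c=0.5, lam_q=12, lam_u=3, p_1=p, q_r=1, q_w=5))
print(strict_read_response(p_1=p, w=0.01, t_q=0.005, t_b=0.00007, t_r=0.02, q_r=1))
```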
6.1.2 Data Contention Analysis
We assume an eventual and best effort h. In the following, op stands for one of WR, WW, SR, SW. Using (A), the response time for weak and strict transactions is:

R_weak,trans = R^res_weak + N_q P_WR R_WR + N_u P_WW R_WW + t_commit,
R_strict,trans = T_INPL + R^res_strict + N_q P_R R_SR + N_u P_W R_SW + t_commit,

where R^res_weak and R^res_strict are the response times due to resource contention (given in the Appendix), P_op is the probability that a transaction contends for an op operation on a data copy, and R_op is the average time spent waiting to get an op lock given that lock contention occurs. P_R and P_W are, respectively, the probability that at least one operation on a data copy per strict read or write conflicts. Specifically, P_R = 1 - (1 - P_SR)^q_r and P_W = 1 - (1 - P_SW)^q_w. An outline of the estimation of P_op and R_op is given in the Appendix.
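A minimal sketch of how these per-transaction response times are assembled, assuming the reconstructed structure above; the contention probabilities P_SR and P_SW, the R_op waiting terms, and the commit time are treated as given inputs, since their estimation is outlined in the Appendix.

```python
# Sketch only: composes per-operation response times and lock-wait terms into
# per-transaction response times, mirroring the reconstructed formulas above.

def conflict_per_strict_op(P_SR, P_SW, q_r, q_w):
    """P_R, P_W: probability that at least one copy-level operation conflicts."""
    P_R = 1 - (1 - P_SR) ** q_r
    P_W = 1 - (1 - P_SW) ** q_w
    return P_R, P_W

def weak_response(N_q, N_u, R_wr, R_wu, P_WR, P_WW, R_WR, R_WW, t_commit):
    # R^res_weak = N_q*R_wr + N_u*R_wu, plus expected lock waits and commit time
    return N_q * R_wr + N_u * R_wu + N_q * P_WR * R_WR + N_u * P_WW * R_WW + t_commit

def strict_response(T_inpl, N_q, N_u, R_sr, R_su, P_R, P_W, R_SR, R_SW, t_commit):
    # T_INPL + R^res_strict (expanded) + expected lock waits + commit time
    return T_inpl + N_q * R_sr + N_u * R_su + N_q * P_R * R_SR + N_u * P_W * R_SW + t_commit

# Example usage (all numbers assumed):
P_R, P_W = conflict_per_strict_op(P_SR=0.01, P_SW=0.02, q_r=1, q_w=5)
print(weak_response(8, 2, 0.015, 0.03, 0.01, 0.02, 0.1, 0.1, 0.05))
print(strict_response(0.01, 8, 2, 0.03, 0.06, P_R, P_W, 0.1, 0.1, 0.05))
```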
TABLE 4
Input Parameters

number of physical clusters (n): 5
query arrival rate (λ_q): 12 queries/sec
update arrival rate (λ_u): 3 updates/sec
consistency factor (c): ranges from 0 to 1
read quorum (q_r): ranges from 1 to n
write quorum (q_w): ranges from 1 to n
fraction of local transactions accessing hot data (o): ranges from 0 to 1
probability that a local transaction accesses hot data (h): ranges from 0 to 1
probability a hot data item has a core copy at a given cluster (l): ranges from 0 to 1
probability a cold data item has a core copy at a given cluster (l'): ranges from 0 to 1
processing time for an update (t_u): 0.02 sec
processing time for a query (t_q): 0.005 sec
propagation overhead (t_b): 0.00007 sec
vacation interval (v): ranges
available bandwidth (W): ranges
average size of a message (m): —
number of cold data items per p-cluster (D_c): 200
number of hot data items per p-cluster (D_h): 10
average number of operations per transaction (N): —
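For convenience, the baseline settings can be collected in a small configuration record for use with the sketches in this section; entries marked as assumed could not be read reliably from the table and are illustrative only.

```python
# Baseline model parameters from Table 4. The last two entries are assumed
# (their printed values are not legible), and the alignment of the data-item
# counts in the scanned table is uncertain; treat them as illustrative.

TABLE4 = {
    "n_clusters": 5,
    "lambda_q": 12.0,        # queries/sec
    "lambda_u": 3.0,         # updates/sec
    "t_u": 0.02,             # sec, install an update on a copy
    "t_q": 0.005,            # sec, process a query on a copy
    "t_b": 0.00007,          # sec, propagation overhead
    "D_cold": 200,           # cold data items per p-cluster (as read)
    "D_hot": 10,             # hot data items per p-cluster (as read)
    "msg_size_bits": 1000,   # assumed
    "ops_per_txn": 10,       # assumed
}
# c, the quorums, o, h, l, l', the vacation interval, and the bandwidth are
# varied per experiment, as the table indicates.
```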
Fig. 3. Maximum allowable input rate of updates for various values of the consistency factor. (a) Limits imposed by the processing rate at each cluster (λ ≤ 1/E[x]). (b) Limits imposed by bandwidth restrictions (λ_r ≤ s_r).
6.2 Performance Evaluation
The following performance results show how the percentage of weak and strict transactions can be effectively tuned, based on the prevailing networking conditions such as the available bandwidth and the duration of disconnections, to attain the desired throughput and latency. Table 4 depicts some realistic values for the input parameters. The bandwidth depends on the type of technology used: a typical value is 1 Mbps for infrared, 2 Mbps for packet radio, and 9-14 Kbps for cellular phone [7].
6.2.1 System Throughput
Fig. 3a, Fig. 3b, Fig. 4a, and Fig. 4b show how the maximum transaction input rate, or system throughput, is bounded by the processing time, the available bandwidth, and the disconnection intervals, respectively. We assume that queries are four times more common than updates, λ_q = 4λ_u. As shown in Fig. 3a, the allowable input rate when all transactions are weak (c = 0) is almost double the rate when all transactions are strict (c = 1). This is the result of the increase in the workload with c, caused by the fact that strict operations on data items may be translated into more than one operation on data copies. The percentage of weak transactions can be effectively tuned to attain the desired throughput based on the networking conditions, such as the duration of disconnections and the available bandwidth.
Fig. 4. Maximum allowable input rate for updates for various values of the consistency factor. Limits imposed by disconnections and their duration (λ_r ≤ 1/E[v]). (a) Disconnections lasting from 1/5 to 1 second and (b) longer disconnections lasting from 15 to 75 minutes.
Fig. 5. Number of messages. (a) For various values of c. (b) With locality. (c) For different replication of hot core copies. Unless stated otherwise: o = 0.6, l = 0.9, l' = 0.4, h = 0.9, and c = 0.7.
As indicated in Fig. 3b, to get, for instance, the same throughput with 200 bps as with 1,000 bps and c = 1, we must lower the consistency factor below 0.1. The duration of disconnections may vary from seconds, when they are caused by handoffs [19], to minutes, for instance when they are voluntary. Fig. 4 depicts the effect of the duration of a disconnection on the system throughput for short (Fig. 4a) and long disconnections (Fig. 4b). For long disconnections, only a very small percentage of strict transactions can be initiated at disconnected sites (Fig. 4b). To keep the throughput comparable to that for shorter disconnections (Fig. 4a), the consistency factor must drop by around three orders of magnitude.
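The tuning described here can be made concrete with a short sketch that solves the bandwidth bound λ_r ≤ s_r for the largest admissible consistency factor; the message size and the other inputs are assumptions for the example, not values from the paper.

```python
# Illustrative sketch: largest consistency factor c that still satisfies the
# bandwidth bound lambda_r <= s_r = W/m, using the message-rate formula of
# Section 6.1.1. All numeric inputs are assumed for the example.

def max_consistency_factor(W_bps, m_bits, n, lam_q, lam_u, p_1, q_r, q_w):
    s_r = W_bps / m_bits                       # link service rate (messages/sec)
    per_op = (lam_q * (p_1 * (q_r - 1) + (1 - p_1) * q_r)
              + lam_u * (p_1 * (q_w - 1) + (1 - p_1) * q_w))
    # lambda_r = M / (n (n - 1)) = 2 c per_op / (n - 1) <= s_r
    c_max = s_r * (n - 1) / (2 * per_op)
    return min(1.0, c_max)

# e.g., 200 bps vs. 2 Mbps links, 1,000-bit messages (assumed), ROWA quorums:
for W in (200, 2_000_000):
    print(W, max_consistency_factor(W, 1000, n=5, lam_q=12, lam_u=3,
                                    p_1=0.8, q_r=1, q_w=5))
```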
6.2.2 Communication Cost
We estimate the communication cost by the number of messages sent. The number of messages depends on the following parameters of the replication scheme: 1) the consistency factor c, 2) the data distribution, l for hot and l' for cold data, 3) the locality factor o, and 4) the quorums, q_r and q_w, of the coherency scheme. We assume a ROWA scheme (q_r = 1, q_w = n), if not stated otherwise. As shown in Fig. 5a, the number of messages increases linearly with the consistency factor. As expected, the number of messages decreases with the percentage of transactions that access hot data, since then local copies are more frequently available. To balance the increase in the communication cost caused by diminishing locality, there may be a need to appropriately decrease the consistency factor (Fig. 5b). The number of messages decreases when the replication factor of hot core copies increases (Fig. 5c). The decrease is more evident since most operations are queries and the coherency scheme is ROWA, thus for most operations no messages are sent. The decrease is more rapid when transactions exhibit locality, that is, when they access hot data more frequently. On the contrary, the number of messages increases with the replication factor of cold core copies because of the additional writes caused by coherency control (Fig. 6a). Finally, the relationship between the read quorum and the number of messages depends on the relative number of queries and updates (Fig. 6b).
6.2.3 Transaction Response Time
The response time for weak and strict transactions is depicted in Fig. 7 for various values of c. The larger response times are for 200 bps bandwidth, while the faster response times are the result of higher network availability, set at 2 Mbps. The values for the other input parameters are as indicated in Table 4.
Fig. 6. Number of messages. (a) For different replication of cold core copies. (b) For different values of the read quorum. Unless stated otherwise: o = 0.6, l = 0.9, l' = 0.4, h = 0.9, and c = 0.7.
The additional parameters are set as follows: 1) the locality parameters are o = 0.9 and h = 0.8, 2) the data replication parameters are l' = 0.2 and l = 0.8, 3) the disconnection parameters are p = 0.1 and vacation intervals that are exponentially distributed with E[v] = 1/5 sec, to model disconnection intervals that correspond to short involuntary disconnections such as those caused by handoffs [19], and 4) the coherency control scheme is ROWA. The latency of weak transactions is about 50 times lower than that of strict transactions. However, there is a trade-off involved in using weak transactions, since weak updates may be aborted later. The time to propagate updates during reconciliation is not counted. As c increases, the response time for both weak and strict transactions increases, since more conflicts occur. The increase is more dramatic for smaller values of bandwidth. Fig. 8a and Fig. 8b show the response time distribution, respectively, for strict and weak transactions and 2 Mbps bandwidth. For strict transactions, the most important overhead is due to network transmission. All times increase as c increases. For weak transactions, the increase in the response time is the result of longer waits for acquiring locks, since weak transactions that want to read up-to-date data conflict with strict transactions that write them.

Fig. 7. Comparison of the response times for weak and strict transactions for various values of the consistency factor.
7 RECONCILIATION COST
We provide an estimation of the cost of restoring consistency in terms of the number of weak transactions that need to be rolled back. We focus on conflicts among strict and weak transactions, for which we have outlined a reconciliation protocol, and do not consider conflicts among weak transactions at different clusters. A similar analysis is applicable to this case also.
A weak transaction WT is rolled back if its writes conflict with a read operation of a strict transaction ST that follows it in the IASG. Let P1 be the probability that a weak transaction WT writes a data item read by a strict transaction ST, and P2 the probability that ST follows WT in the serialization graph. Then, P = P1 P2 is the probability that a weak transaction is rolled back. Assume that reconciliation occurs after N_r transactions, of which k = cN_r are strict and k' = (1 - c)N_r are weak. For simplicity, we assume a uniform access distribution.
Although it is reasonable to assume that granule access requests from different transactions are independent, independence cannot hold within a transaction if a transaction's granule accesses are distinct. However, if the probability of accessing any particular granule is small, e.g., when the number of granules is large and the access distribution is uniform, this approximation should be very accurate. Then, P1 ≈ 1 - (1 - N_u/D)^N_q.
Let p_KL be the probability that, in the IASG, there is an edge from a given transaction of type K to a given transaction of type L, where the type of a transaction is either weak or strict. Let p'_KL(m, m') be the probability that, in an IASG with m strict and m' weak transactions, there is an edge from a given transaction of type K to any transaction of type L. The formulas for p_KL and p'_KL(m, m') are given in the Appendix. Let p(m, m', i) be the probability that there is an acyclic path of length i, i.e., a path with i + 1 distinct nodes, from a given weak transaction to a given strict transaction in an IASG with m strict and m' weak transactions.
Fig. 8. (a) Response time distribution for strict transactions. (b) Response time distribution for weak transactions.
Fig. 9. Probability of abort for 3,000 data items.
Then,

P2 = 1 - Π_{i=1}^{k+k'-1} [1 - p(k, k', i)].
The values of p(k, k', i) can be computed from the following recursive relations:

p(m, m', 1) = p'_WS(m, m')   for all m > 0, m' > 0,
p(m, m', 0) = 0              for all m > 0, m' > 0,
p(m, 0, i) = 0               for all m ≥ 0, i > 0, and
p(0, m', i) = 0              for all m' ≥ 0, i > 0,
p(k, k', i + 1) = 1 - [1 - p'_WW(k, k') p(k, k' - 1, i)][1 - ...][1 - ...],

where the first term is the probability of a path whose first edge is between weak transactions, the second of a path whose first edge is between a weak and a strict transaction and includes at least one more weak transaction, and the last of a path whose first edge is between a weak and a strict transaction and does not include any other weak transactions. Thus, the actual number of weak transactions that need to be undone or compensated for because their writes cannot become permanent is N_abort = P k'. We also need to roll back all exact weak transactions that read a value written by a transaction that is aborted. Let e be the percentage of weak transactions that are exact; then

N_abort2 = e[1 - (1 - N_u/D)^N_q] k' N_abort.
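The estimate can be sketched as follows; P2 is treated as an input because its recursive computation depends on the edge probabilities given in the Appendix, and the numeric arguments are illustrative only.

```python
# Sketch of the reconciliation-cost estimate above (not the paper's code).
# P2 is supplied as an input; all example values are assumptions.

def reconciliation_aborts(c, N_r, N_q, N_u, D, P2, exact_fraction):
    k_weak = (1 - c) * N_r                  # weak transactions at reconciliation
    P1 = 1 - (1 - N_u / D) ** N_q           # weak writes hit a strict read set
    P = P1 * P2                             # a given weak transaction is rolled back
    n_abort = P * k_weak                    # weak transactions undone/compensated
    # exact weak transactions that read a value written by an aborted transaction
    n_abort2 = exact_fraction * (1 - (1 - N_u / D) ** N_q) * k_weak * n_abort
    return n_abort, n_abort2

print(reconciliation_aborts(c=0.5, N_r=60, N_q=8, N_u=2, D=3000,
                            P2=0.05, exact_fraction=0.1))
```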
Fig. 9 depicts the probability that a weak transaction cannot be accepted because of a conflict with a strict transaction, for reconciliation events occurring after a varying number of transactions and for different values of the consistency factor. Fig. 10 shows the same probability for varying database sizes. More accurate estimations can be achieved for specific applications for which the access patterns of transactions are known. These results can be used to determine an appropriate reconciliation point that balances the cost of initiating reconciliations against the number of weak transactions that need to be aborted. For instance, for a given c = 0.5, to keep the probability below a threshold of, say, 0.00003, reconciliation events must take place as often as every 85 transactions (Fig. 9b).
Fig. 10. Probability of abort for 60 transactions.
8 RELATED WORK
One-copy serializability [3] hides from the user the fact that there can be multiple copies of a data item and ensures strict consistency. Whereas one-copy serializability may be an acceptable criterion for strict transactions, it is too restrictive for applications that tolerate bounded inconsistency and also causes unbearable overheads in cases of weak connectivity. The weak transaction model described in this paper was first introduced in [26], while preliminary performance results were presented in [24].

8.1 Network Partitioning
The partitioning of a database into clusters resembles the network partition problem [5], where site or link failures fragment a network of database sites into isolated subnetworks called partitions. Clustering is conceptually different from partitioning in that it is electively done to increase performance. Whereas all partitions are isolated, clusters may be weakly connected. Thus, clients may operate as disconnected even while remaining physically connected. Strategies for network partition face similar competing goals of availability and correctness as clustering. These strategies range from optimistic, where any transaction is allowed to be executed in any partition, to pessimistic, where transactions in a partition are restricted by making worst-case assumptions about what transactions at other partitions are doing. Our model offers a hybrid approach. Strict transactions may be performed only if one-copy serializability is ensured (in a pessimistic manner). Weak transactions may be performed locally (in an optimistic manner). To merge updates performed by weak transactions, we adopt a purely syntactic approach.

8.2 Read-only Transactions
Read-only transactions do not modify the database state, thus their execution cannot lead to inconsistent database states. In our scheme, read-only transactions with weaker consistency requirements are considered a special case of weak transactions that have no write operations.

Two requirements for read-only transactions were introduced in [8]: consistency and currency requirements. Consistency requirements specify the degree of consistency needed by a read-only transaction. In this framework, a read-only transaction may have: a) no consistency requirements; b) weak consistency requirements, if it requires a consistent view (that is, if all integrity constraints that can be fully evaluated with the data read by the transaction must be true); or c) strong consistency requirements, if the schedule of all update transactions together with all other strong consistency queries must be consistent. While our strict read-only transactions always have strong consistency requirements, weak read-only transactions can be tailored to have any of the above degrees based on the criterion used for IAS correctness. Weak read-only transactions may have: no consistency requirements, if ignored from the IAS schedule; weak consistency, if part of a weakly correct IAS schedule; and strong consistency, if part of a strongly correct schedule.

Currency requirements specify what update transactions should be reflected by the data read. In terms of currency requirements, strict read-only transactions read the most up-to-date data item available (i.e., committed). Weak read-only transactions may read older versions of data, depending on the definition of the d-degree.

Epsilon-serializability (ESR) [28], [29] allows temporary and bounded inconsistencies in copies to be seen by queries during the period among the asynchronous updates of the various copies of a data item. Read-only transactions in this framework are similar to weak read-only transactions with no consistency requirements. ESR bounds inconsistency directly by bounding the number of updates. In [38], a generalization of ESR was proposed for high-level, type-specific operations on abstract data types. In contrast, our approach deals with low-level read and write operations.

In an N-ignorant system, a transaction need not see the results of at most N prior transactions that it would have seen if the execution had been serial [15]. Strict transactions are 0-ignorant and weak transactions are 0-ignorant of other weak transactions at the same cluster. Weak transactions
are ignorant of strict and weak transactions at other clusters.
The techniques of supporting N-ignorance can be incorporated in the proposed model to define d as the ignorance
factor N of weak transactions.
8.3 Mobile Database Systems
The effect of mobility on replication schemes is discussed in [2]. The need for the management of cached copies to be tuned according to the available bandwidth and the currency requirements of the applications is stressed. In this respect, d-degree consistency and weak transactions realize both of the above requirements. The restrictive nature of one-copy serializability for mobile applications is also pointed out in [16], and a more relaxed criterion is proposed. This criterion, although sufficient for aggregate data, is not appropriate for general applications and distinguishable data. Furthermore, the criterion does not support any form of adaptability to the current network conditions.
The Bayou system [35], [6], [23] is a platform of replicated, highly available, variable-consistency, mobile databases on which to build collaborative applications. A read-any/write-any weakly consistent replication scheme is employed. Each Bayou database has one distinguished server, the primary, which is responsible for committing writes. The other, secondary, servers tentatively accept writes and propagate them towards the primary. Each server maintains two views of the database: a copy that only reflects committed data and another full copy that also reflects tentative writes currently known to the server. Applications may choose between committed and tentative data. Tentative data are similar to our quasi data, and committed data similar to core data. Correctness is defined in terms of sessions, rather than on serializability as in the proposed model. A session is an abstraction for the sequence of reads and writes of an application. Four types of guarantees can be requested per session: a) read your writes, b) monotonic reads (successive reads reflect a nondecreasing set of writes), c) writes follow reads (writes are propagated after reads on which they depend), and d) monotonic writes (writes are propagated after writes that logically precede them). To reconcile copies, Bayou adopts an application-based approach, as opposed to the syntactic-based procedure used here. The detection mechanism is based on dependency checks, and the per-write conflict resolution method is based on client-provided merge procedures [36].
In the two-tier replication scheme [9], replicated data have two versions at mobile nodes: master and tentative versions. A master version records the most recent value received while the site was connected. A tentative version records local updates. There are two types of transactions, somewhat analogous to our weak and strict transactions: tentative and base transactions. A tentative transaction works on local tentative data and produces tentative data. A base transaction works only on master data and produces master data. Base transactions involve only connected sites. Upon reconnection, tentative transactions are reprocessed as base transactions. If they fail to meet some application-specific acceptance criteria, they are aborted and a message is returned to the mobile node. Our scheme extends two-tier replication in that weak connectivity is supported by employing a combination of weak and strict transactions. We have provided the foundations for the correctness of the proposed scheme as well as an evaluation of its performance.
8.4 Mobile File Systems
Coda [14] treats disconnections as network partitions and follows an optimistic strategy. An elaborate reconciliation algorithm is used for merging file updates after the sites are connected to the fixed network. No degrees of consistency are defined and no transaction support is provided. Isolation-only transactions (IOTs) [17], [18] extend Coda with a new transaction service. IOTs are sequences of file accesses that, unlike traditional transactions, have only the isolation property. IOTs do not guarantee failure atomicity and only conditionally guarantee permanence. IOTs are similar to weak transactions.

Methods for refining the consistency semantics of cached files to allow a mobile client to select a mode appropriate for the current networking conditions are discussed in [10]. The proposed techniques are delayed writes, optimistic replication, and failing instead of fetching data in cases of cache misses.
The idea of using different kinds of operations to access data is also adopted in [32], [33], where a weak read operation was added to a file service interface. The semantics of the file operations are different in that no weak write is provided, and, since there is no transaction support, the correctness criterion is not based on one-copy serializability.
9 SUMMARY
To overcome bandwidth, cost, and latency barriers, clients of mobile information systems switch between connected and disconnected modes of operation. In this paper, we propose a replication scheme appropriate for such operation. Data located at strongly connected sites are grouped in clusters. Bounded inconsistency is defined by requiring mutual consistency among copies located at the same cluster and controlled deviation among copies at different clusters. The database interface is extended with weak operations. Weak operations query local, potentially inconsistent copies and perform tentative updates. The usual operations, called strict in this framework in contradistinction to weak, are also supported. Strict operations access consistent data and perform permanent updates.

Clients may operate disconnected by employing only weak operations. To accommodate weak connectivity, a mobile client selects an appropriate combination of weak and strict transactions based on the consistency requirements of its applications and on the prevailing networking conditions. Adjusting the degree of divergence provides additional support for adaptability. The idea of providing weak operations can be applied to other types of intercluster integrity constraints besides replication. Such constraints can be vertical and horizontal partitions or arithmetic constraints [31]. Another way of defining the semantics of weak operations is by exploiting the semantics of data. In [37], data are fragmented and later merged based on their object semantics.
APPENDIX

A. COMPUTATION OF RESOURCE WAITING TIMES
Processor waiting time. At each cluster there are the following types of requests. Queries are initiated at a rate of λ_q. From the locally initiated queries, λ_1 = (1 - c)λ_q are weak and are serviced locally with an average service time θ_1 = t_q. Then, from the remaining cλ_q strict queries, λ_2 = p_1 cλ_q have service time θ_2 = (q_r - 1)t_b + t_q and the rest, λ_3 = (1 - p_1)cλ_q, have service time θ_3 = q_r t_b. Queries are also propagated from other clusters at a rate λ_4 = [p_1(q_r - 1) + (1 - p_1)q_r]cλ_q and have service time θ_4 = t_q. Analogous formulas hold for the arrival rates and service times of updates. The combined flow of requests forms a Poisson process with arrival rate

λ = Σ_{i=1}^{8} λ_i.

The service time of the combined flow, T, is no longer exponentially distributed, but its mean and second moment can be computed from the above rates and service times. Then, the wait time, by using the Pollaczek-Khinchin (P-K) formula [4], is:

w = λE[T²] / (2(1 - λE[T])).

Note that the above analysis, as well as the following analysis on network links, is a worst case. In practice, when a locking method is used for concurrency control, a number of transactions is waiting to acquire locks and not competing for system resources. Thus, the rate of arrival of operations at the resource queues and the waiting time at each queue may be less than the values assumed in this section.

Transmission waiting time. We consider a nonexhaustive vacation system where, after the end of each service, the server takes a vacation with probability 1 - p or continues service with probability p. This is called a queue system with Bernoulli scheduling [34]. In this case:

w_r = [λ_r E[s²] + (1 - p)(2(1/s_r)E[v] + E[v²])] / (2{1 - λ_r/s_r - (1 - p)λ_r E[v]}),

where E[s²] is the second moment of the service time and v is the vacation interval, that is, the duration of a disconnection.
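As a concrete illustration, the sketch below evaluates the P-K wait and the Bernoulli-vacation wait as reconstructed above; the numeric inputs are assumptions, and exponential service and vacation times are used so that each second moment is twice the squared mean.

```python
# Sketch of the two waiting-time formulas above (inputs are illustrative assumptions).

def pk_wait(lam, ET, ET2):
    """M/G/1 mean waiting time (Pollaczek-Khinchin): lam*E[T^2] / (2*(1 - lam*E[T]))."""
    rho = lam * ET
    assert rho < 1.0, "queue must be stable"
    return lam * ET2 / (2.0 * (1.0 - rho))

def vacation_wait(lam_r, s_r, ES2, p, EV, EV2):
    """Mean wait at a link modeled as a queue with Bernoulli-scheduled vacations."""
    num = lam_r * ES2 + (1 - p) * (2 * (1.0 / s_r) * EV + EV2)
    den = 2.0 * (1.0 - lam_r / s_r - (1 - p) * lam_r * EV)
    assert den > 0.0, "link must be stable, vacations included"
    return num / den

# Exponential service with mean 1/s_r has E[S^2] = 2/s_r**2; likewise for vacations.
s_r, EV = 0.2, 5.0
print(pk_wait(lam=10.0, ET=0.01, ET2=2 * 0.01**2))
print(vacation_wait(lam_r=0.01, s_r=s_r, ES2=2 / s_r**2, p=0.1, EV=EV, EV2=2 * EV**2))
```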
B. DATA CONTENTION ANALYSIS
From the resource contention analysis, the resource-contention components used in Section 6.1.2 are

R^res_weak = N_q R^w_r + N_u R^w_u and R^res_strict = N_q R^s_r + N_u R^s_u.

We divide the state i of each weak transaction into two substates, a lock state i_1 and an execution state i_2. In substate i_1 the transaction holds i - 1 locks and is waiting for the ith lock. In substate i_2 it holds i locks and is executing. Similarly, we divide each state of a strict transaction into three substates, i_0, i_1, and i_2. Let q̄ = (N_q/N)q_r + (N_u/N)q_w. In substate i_0, a transaction is at its initiating cluster, holds (i - 1)q̄ locks, and sends messages to other clusters. In substate i_1, the transaction holds (i - 1)q̄ locks and is waiting for the ith set of locks. In substate i_2 it holds (i - 1)q̄ + q_r ((i - 1)q̄ + q_w) locks and is executing. The probability that a transaction enters substate i_1 upon leaving state i - 1 or i_0 is P_WR, P_WW, P_SR, and P_SW, respectively, for WR, WW, SR, and SW lock requests. The mean time s_op spent at substate i_2 is computed from the resource contention analysis; for instance, s_WR = w + t_q. Let c_op, for a strict op, be the mean time spent at state i_0; for instance, c_SR = w + (1 - p_1)(q_r t_b + t_r) + p_1((q_r - 1)t_b + δ_r t_r). The time spent at state i_1 is R_op, and the unconditional mean time spent in substate i_1 is b_op; for instance, b_WR = P_WR R_WR.

Let d^h_op (d^c_op) be the mean number of hot (cold) copies written by an op operation and l_op the mean number of op operations per copy. Given a mean lock holding time of T_W (T_S) for weak (strict) transactions and assuming that the lock request times form a Poisson process, the probability of contention on a lock request for a copy equals the lock utilization. Let P_op1/op2 stand for the probability that an op1-lock request conflicts with an op2-lock request; these probabilities are expressed in terms of the d_op, the l_op, and the lock holding times T_W and T_S. Let C_W (C_S) be the sum of the mean lock holding times over all N copies accessed by a weak (strict) transaction, where T_c is the mean time to commit. Then, T_W = C_W/N; similar formulas hold for C_S and T_S.

Let N^W_i (N^S_i) be the mean number of weak (strict) transactions per cluster in substate i_1, and P_op1/op2,i the conditional probability that an op1-lock request contends with a transaction in substate i that holds an incompatible op2-lock, given that lock contention occurs. Now, we can approximate R_op from these quantities.
The factors f_op in this approximation express the mean remaining times at the corresponding substate and depend on the distribution of times at each substate. Finally, s_{i,W} (s_{i,S}) is the mean time for a weak (strict) transaction from acquiring the ith lock until the end of commit.
C. RECONCILIATION
The probabilities of edges in the serialization graphs are given below:

p'_SS(m, m') = 1 - (1 - p_SS)^(m-1),

where p_c = 1/n² is the probability that two given transactions are initiated at the same cluster.
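A tiny sketch of the edge-probability shape given above, assuming the pairwise probability p_SS is known:

```python
# Probability that a given strict transaction has an IASG edge to any of the
# other m - 1 strict transactions, from the pairwise probability p_pair.

def p_edge_to_any(p_pair: float, m: int) -> float:
    return 1.0 - (1.0 - p_pair) ** (m - 1)

n = 5
p_c = 1.0 / n**2                 # two given transactions start at the same cluster
print(p_edge_to_any(p_pair=0.01, m=30))
```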
ACKNOWLEDGMENTS
Bharat Bhargava was supported, in part, by the U.S. National Science Foundation Grant No. 9405931.
REFERENCES
[1] R. Alonso, D. Barbara, and H. Garcia-Molina, "Data Caching Issues in an Information Retrieval System," ACM Trans. Database Systems, vol. 15, no. 3, pp. 359-384, Sept. 1990.
[2] D. Barbara and H. Garcia-Molina, "Replicated Data Management in Mobile Environments: Anything New under the Sun?" Proc. IFIP Conf. Applications in Parallel and Distributed Computing, Apr. 1994.
[3] P.A. Bernstein, V. Hadzilacos, and N. Goodman, Concurrency Control and Recovery in Database Systems. Addison-Wesley, 1987.
[4] D. Bertsekas and R. Gallager, Data Networks. Prentice Hall, 1987.
[5] S.B. Davidson, H. Garcia-Molina, and D. Skeen, "Consistency in Partitioned Networks," ACM Computing Surveys, vol. 17, no. 3, pp. 341-370, Sept. 1985.
[6] A. Demers, K. Petersen, M. Spreitzer, D. Terry, M. Theimer, and B. Welch, "The Bayou Architecture: Support for Data Sharing among Mobile Users," Proc. IEEE Workshop Mobile Computing Systems and Applications, pp. 2-7, Dec. 1994.
[7] G.H. Forman and J. Zahorjan, "The Challenges of Mobile Computing," Computer, vol. 27, no. 6, pp. 38-47, June 1994.
[8] H. Garcia-Molina and G. Wiederhold, "Read-Only Transactions in a Distributed Database," ACM Trans. Database Systems, vol. 7, no. 2, pp. 209-234, June 1982.
[9] J. Gray, P. Helland, P. O'Neil, and D. Shasha, "The Dangers of Replication and a Solution," Proc. ACM SIGMOD Conf., pp. 173-182, 1996.
[10] P. Honeyman and L.B. Huston, "Communication and Consistency in Mobile File Systems," IEEE Personal Comm., vol. 2, no. 6, Dec. 1995.
[11] Y. Huang, P. Sistla, and O. Wolfson, "Data Replication for Mobile Computers," Proc. 1994 ACM SIGMOD Conf., pp. 13-24, May 1994.
[12] L.B. Huston and P. Honeyman, "Partially Connected Operation," Computing Systems, vol. 4, no. 8, Fall 1995.
[13] T. Imielinski and B.R. Badrinath, "Wireless Mobile Computing: Challenges in Data Management," Comm. ACM, vol. 37, no. 10, Oct. 1994.
[14] J.J. Kistler and M. Satyanarayanan, "Disconnected Operation in the Coda File System," ACM Trans. Computer Systems, vol. 10, no. 1, pp. 3-25, Feb. 1992.
[15] N. Krishnakumar and A.J. Bernstein, "Bounded Ignorance: A Technique for Increasing Concurrency in a Replicated System," ACM Trans. Database Systems, vol. 19, no. 4, pp. 586-625, Dec. 1994.
[16] N. Krishnakumar and R. Jain, "Protocols for Maintaining Inventory Databases and User Profiles in Mobile Sales Applications," Proc. Mobidata Workshop, Oct. 1994.
[17] Q. Lu and M. Satyanarayanan, "Improving Data Consistency in Mobile Computing Using Isolation-Only Transactions," Proc. Fifth Workshop Hot Topics in Operating Systems, May 1995.
[18] Q. Lu and M. Satyanarayanan, "Resource Conservation in a Mobile Transaction System," IEEE Trans. Computers, vol. 46, no. 3, pp. 299-311, Mar. 1997.
[19] K. Miller, "Cellular Essentials for Wireless Data Transmission," Data Comm., vol. 23, no. 5, pp. 61-67, Mar. 1994.
[20] L.B. Mummert, M.R. Ebling, and M. Satyanarayanan, "Exploiting Weak Connectivity for Mobile File Access," Proc. 15th ACM Symp. Operating Systems Principles, Dec. 1995.
[21] B.D. Noble, M. Price, and M. Satyanarayanan, "A Programming Interface for Application-Aware Adaptation in Mobile Computing," Computing Systems, vol. 8, no. 4, Winter 1996.
[22] C. Papadimitriou, The Theory of Database Concurrency Control. Computer Science Press, 1986.
[23] K. Petersen, M. Spreitzer, D.B. Terry, M. Theimer, and A.J. Demers, "Flexible Update Propagation for Weakly Consistent Replication," Proc. 16th ACM Symp. Operating Systems Principles, pp. 288-301, 1997.
[24] E. Pitoura, "A Replication Schema to Support Weak Connectivity in Mobile Information Systems," Proc. Seventh Int'l Conf. Database and Expert Systems Applications (DEXA '96), pp. 510-520, Sept. 1996.
[25] E. Pitoura and B. Bhargava, "Building Information Systems for Mobile Environments," Proc. Third Int'l Conf. Information and Knowledge Management, pp. 371-378, Nov. 1994.
[26] E. Pitoura and B. Bhargava, "Maintaining Consistency of Data in Mobile Distributed Environments," Proc. 15th IEEE Int'l Conf. Distributed Computing Systems, pp. 404-413, May 1995.
[27] E. Pitoura and G. Samaras, Data Management for Mobile Computing. Kluwer, 1998.
[28] C. Pu and A. Leff, "Replica Control in Distributed Systems: An Asynchronous Approach," Proc. ACM SIGMOD, pp. 377-386, 1991.
[29] K. Ramamritham and C. Pu, "A Formal Characterization of Epsilon Serializability," IEEE Trans. Knowledge and Data Eng., vol. 7, no. 6, pp. 997-1007, 1995.
[30] M. Satyanarayanan, J.J. Kistler, L.B. Mummert, M.R. Ebling, P. Kumar, and Q. Lu, "Experience with Disconnected Operation in a Mobile Computing Environment," Proc. 1993 Usenix Symp. Mobile and Location-Independent Computing, Aug. 1993.
[31] A. Sheth and M. Rusinkiewicz, "Management of Interdependent Data: Specifying Dependency and Consistency Requirements," Proc. Workshop Management of Replicated Data, pp. 133-136, Nov. 1990.
[32] C.D. Tait and D. Duchamp, "Service Interface and Replica Management Algorithm for Mobile File System Clients," Proc. First Int'l Conf. Parallel and Distributed Information Systems, pp. 190-197, 1991.
[33] C.D. Tait and D. Duchamp, "An Efficient Variable-Consistency Replicated File Service," Proc. Usenix File Systems Workshop, pp. 111-126, May 1992.
[34] H. Takagi, Queueing Analysis, Vol. 1: Vacation and Priority Systems. North-Holland, 1991.
[35] D. Terry, A. Demers, K. Petersen, M. Spreitzer, M. Theimer, and B. Welch, "Session Guarantees for Weakly Consistent Replicated Data," Proc. Int'l Conf. Parallel and Distributed Information Systems, pp. 140-149, Sept. 1994.
[36] D.B. Terry, M.M. Theimer, K. Petersen, A.J. Demers, M.J. Spreitzer, and C.H. Hauser, "Managing Update Conflicts in Bayou, a Weakly Connected Replicated Storage System," Proc. 15th ACM Symp. Operating Systems Principles, Dec. 1995.
[37] G. Walborn and P.K. Chrysanthis, "Supporting Semantics-Based Transaction Processing in Mobile Database Applications," Proc. 14th Symp. Reliable Distributed Systems, Sept. 1995.
[38] M.H. Wong and D. Agrawal, "Tolerating Bounded Inconsistency for Increasing Concurrency in Database Systems," Proc. 11th ACM Symp. Principles of Database Systems (PODS), pp. 236-245, 1992.
[39] P.S. Yu, D.M. Dias, and S.S. Lavenberg, "On the Analytical Modeling of Database Concurrency Control," J. ACM, vol. 40, no. 4, pp. 831-872, Sept. 1993.
Evaggelia Pitoura received her diploma from the Department of Computer Science and Engineering of the University of Patras, Greece, in 1990, and her MSc and PhD degrees in computer science from Purdue University in 1993 and 1995, respectively. Since September 1995, she has been with the Department of Computer Science at the University of Ioannina, Greece. Her research interests include database issues in mobile computing, heterogeneous databases, and distributed systems. Her publications in the above areas include several journal and conference proceedings articles, plus a recently published book on mobile computing. She is a member of the IEEE Computer Society.
Bharat Bhargava is currently a professor in the Department of Computer Science at Purdue University. His research involves both theoretical and experimental studies in distributed systems. His research group has implemented a robust and adaptable distributed database system, called RAID, to conduct experiments in replication control, checkpointing, and communications. He has conducted experiments in large-scale distributed systems, communications, and in implementing object support on top of the relational model. He has recently developed an adaptable video conferencing system using the NV system from Xerox PARC. He is conducting experiments with research issues in large-scale communication networks to support emerging applications such as digital libraries and multimedia databases. He chaired the IEEE Symposium on Reliable and Distributed Systems that was held at Purdue in 1998, and he is on the editorial board of three international journals. At the 1988 IEEE Data Engineering Conference, he and John Riedl received the best paper award for their work on "A Model for Adaptable Systems for Transaction Processing." He is a fellow of the IEEE and the Institute of Electronics and Telecommunication Engineers. He has been awarded the Gold Core member distinction by the IEEE Computer Society for his distinguished service. He received the Outstanding Instructor Award from the Purdue chapter of the ACM.