IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL. 11, NO. 6, NOVEMBER/DECEMBER 1999

Data Consistency in Intermittently Connected Distributed Systems

Evaggelia Pitoura, Member, IEEE Computer Society, and Bharat Bhargava, Fellow, IEEE

E. Pitoura is with the Department of Computer Science, University of Ioannina, GR 45110 Ioannina, Greece. E-mail: pitoura@cs.uoi.gr. B. Bhargava is with the Department of Computer Sciences, Purdue University, West Lafayette, IN 47907. E-mail: bb@cs.purdue.edu.

Abstract-Mobile computing introduces a new form of distributed computation in which communication is most often intermittent, low-bandwidth, or expensive, thus providing only weak connectivity. In this paper, we present a replication scheme tailored for such environments. Bounded inconsistency is defined by allowing controlled deviation among copies located at weakly connected sites. A dual database interface is proposed that, in addition to read and write operations with the usual semantics, supports weak read and write operations. In contrast to the usual read and write operations that read consistent values and perform permanent updates, weak operations access only local and potentially inconsistent copies and perform updates that are only conditionally committed. Exploiting weak operations supports disconnected operation, since mobile clients can employ them to continue to operate even while disconnected. The extended database interface coupled with bounded inconsistency offers a flexible mechanism for adapting replica consistency to the networking conditions by appropriately balancing the use of weak and normal operations. Adjusting the degree of divergence among copies provides additional support for adaptivity. We present transaction-oriented correctness criteria for the proposed schemes, introduce corresponding serializability-based methods, and outline protocols for their implementation. Then, some practical examples of their applicability are provided. The performance of the scheme is evaluated for a range of networking conditions and varying percentages of weak transactions by using an analytical model developed for this purpose.

Index Terms-Mobile computing, concurrency control, replication, consistency, disconnected operation, transaction management, adaptability.

1 INTRODUCTION

Advances in telecommunications and in the development of portable computers have provided for wireless communications that permit users to actively participate in distributed computing even while relocating from one support environment to another. The resulting distributed environment is subject to restrictions imposed by the nature of the networking environment, which provides varying, intermittent, and weak connectivity. In particular, mobile clients encounter wide variations in connectivity, ranging from high-bandwidth, low-latency communications through wired networks to total lack of connectivity [7], [13], [27]. Between these two extremes, connectivity is frequently provided by wireless networks characterized by low bandwidth, excessive latency, or high cost. To overcome availability and latency barriers and to reduce cost and power consumption, mobile clients most often deliberately avoid use of the network and thus switch between connected and disconnected modes of operation. To support such behavior, disconnected operation, that is, the ability to operate while disconnected, is essential for mobile clients [13], [14], [30], [25]. In addition to disconnected operation, operation that exploits weak connectivity, that is, connectivity provided by intermittent, low-bandwidth, or expensive networks, is also desirable [20], [12]. Besides mobile computing, weak and intermittent
connectivity also applies to computing using portable laptops. In this paradigm, clients operate disconnected most of the time and occasionally connect through a wired telephone line or upon returning to their working environment. Private or corporate databases will be stored at mobile as well as static hosts, and mobile users will query and update these databases over wired and wireless networks. These databases, for reasons of reliability, performance, and cost, will be distributed and replicated over many sites.

In this paper, we propose a replication scheme that supports weak connectivity and disconnected operation by balancing network availability against consistency guarantees. In the proposed scheme, data located at strongly connected sites are grouped together to form clusters. Mutual consistency is required for copies located at the same cluster, while degrees of inconsistency are tolerated for copies at different clusters. The interface offered by the database management system is enhanced with operations providing weaker consistency guarantees. Such weak operations allow access to locally, i.e., in a cluster, available data. Weak reads access bounded inconsistent copies and weak writes make conditional updates. The usual operations, called strict in this paper, are also supported. They offer access to consistent data and perform permanent updates. The scheme supports disconnected operation, since users can operate even when disconnected by using only weak operations. In cases of weak connectivity, a balanced use of both weak and strict operations provides for better bandwidth utilization, latency, and cost. In cases of strong connectivity, using only strict operations makes the scheme reduce to the usual one-copy semantics. Additional support for adaptability is possible by tuning the degree of inconsistency among copies based on the networking conditions.

In a sense, weak operations offer a form of application-aware adaptation [21]. Application-aware adaptation characterizes the design space between two extreme ways of providing adaptability. At one extreme, adaptivity is entirely the responsibility of the application, that is, there is no system support or any standard way of providing adaptivity. At the other extreme, adaptivity is subsumed by the system, here the database management system. Since, in general, the system is not aware of the application semantics, it cannot provide a single adequate form of adaptation. Weak and strict operations lie at an intermediate point between these two extremes, serving as middleware between a database system and an application. They are tools offered by the database system to applications. The application can at its discretion use weak or strict transactions based on its semantics. The implementation, consistency control, and the underlying transactional support are the job of the database management system.
The remainder of this paper is organized as follows. In Section 2, we introduce the replication model, along with an outline of a possible implementation that is based on distinguishing data copies into core and quasi. In Sections 3 and 4, we define correctness criteria, prove corresponding serializability-based theorems, and present protocols for maintaining weak consistency under the concurrent execution of weak and strict transactions and for reconciling divergent copies, respectively. Examples of how the scheme can be used are outlined in Section 5. In Section 6, we develop an analytical model to evaluate the performance of the scheme and the interplay among its various parameters. The model is used to demonstrate how the percentage of weak transactions can be effectively tuned to attain the desired performance. The performance parameters considered are the system throughput, the number of messages, and the response time. The study is performed for a range of networking conditions, that is, for different values of bandwidth and for varying disconnection intervals. In Section 7, we provide an estimation of the reconciliation cost. This estimation can be used, for instance, to determine an appropriate frequency for the reconciliation events. In Section 8, we compare our work with related research, and we conclude the paper in Section 9 by summarizing.

2 THE CONSISTENCY MODEL

The sites of a distributed system are grouped together in sets called physical clusters (or p-clusters) so that sites that are strongly connected with each other belong to the same p-cluster, while sites that are weakly connected with each other belong to different p-clusters. Strong connectivity refers to connectivity achieved through high-bandwidth and low-latency communications. Weak connectivity includes connectivity that is intermittent or low bandwidth. The goal is to support autonomous operation at each physical cluster, thus eliminating the need for communication among p-clusters, since such intercluster communication may be expensive, prohibitively slow, and occasionally unavailable. To this end, weak transactions are defined to access copies at a single p-cluster. At the same time, the usual atomic, consistent, durable, and isolated distributed transactions, called strict, are also supported.

2.1 The Extended Database Operation Interface

To increase availability and reduce intercluster communication, direct access to locally (i.e., in a p-cluster) available copies is provided through weak read (WR) and weak write (WW) operations. Weak operations are local at a p-cluster, i.e., they access copies that reside at a single p-cluster. We say that a copy or item is locally available at a p-cluster if there exists a copy of that item at the p-cluster. We call the standard read and write operations strict read (SR) and strict write (SW) operations.

To implement this dual database interface, we distinguish the copies of each data item into core and quasi. Core copies are copies that have permanent values, while quasi copies are copies that have only conditionally committed values. When connectivity is restored, the values of core and quasi copies of each data item are reconciled to attain a system-wide consistent value.

To process the operations of a transaction, the database management system translates operations on data items into operations on copies of these data items. This procedure is formalized by a translation function h. Function h maps each read operation on a data item x into a number of read operations on copies of x and returns one value (e.g., the most up-to-date value) as the value read by the operation. That is, we assume that h, when applied to a read operation, returns one value rather than a set of values. In particular, h maps each SR[x] operation into a number of read operations on core copies of x and returns one of these values as the value read by the operation.
Depending on how each weak read operation is translated, we define two types of translation functions: a best-effort translation function that maps each WR[x] operation into a number of read operations on locally available core or quasi copies of x and returns the most up-to-date such value, and a conservative translation function that maps each weak read operation into a number of read operations only on locally available quasi copies and returns the most up-to-date such value. Based on the time of propagation of updates of core copies to quasi copies, we define two further types of translation functions: an eventual translation function that maps an SW[x] into writes of core copies only, and an immediate translation function that in addition updates the quasi copies that reside at the same p-clusters as the core copies written. For an immediate h, the conservative and best-effort variants have the same result. Each WW[x] operation is translated into a number of write operations on local quasi copies of x. Table 1 summarizes the semantics of operations.

TABLE 1
Variations of the Translation Function

Weak Read (WR): Reads local copies and returns as the value read the most recent one. Variations: best-effort h reads both core and quasi local copies; conservative h reads only local quasi copies.
Strict Read (SR): Reads core copies and returns as the value read the most recent one.
Weak Write (WW): Writes local quasi copies.
Strict Write (SW): Variations: eventual h writes only core copies; immediate h writes both core and quasi copies at the corresponding p-clusters.

How many and which core or quasi copies are actually read or written when a database operation is issued on a data item depends on the coherency algorithm used, e.g., quorum consensus or ROWA [3].

Users interact with the database system through transactions, that is, through the execution of programs that include database operations:

Definition 1. A transaction T is a partial order (OP, <_T), where OP is the set of operations executed by the transaction and <_T represents their execution order. The operations include the following data operations: weak (WR) or strict reads (SR) and weak (WW) or strict writes (SW), as well as abort (A) and commit (C) operations. The partial order must specify the order of conflicting data operations and contains exactly one abort or commit operation, which is the last in the order. Two weak or strict data operations conflict if they access the same copy of a data item and at least one of them is a weak or strict write operation.

Two types of transactions are supported, weak and strict. A strict transaction (ST) is a transaction where OP does not include any weak operations. Strict transactions are atomic, consistent, isolated, and durable. A weak transaction (WT) is a transaction where OP does not include any strict operations. Weak transactions access data copies that reside at the same physical cluster and thus are executed locally at this p-cluster. Weak transactions are locally committed at the p-cluster at which they are executed. After local commitment, their updates are visible only to weak transactions in the same p-cluster; other transactions are not aware of these updates. Local commitment of weak transactions is conditional in the sense that their updates might become permanent only after reconciliation, when local transactions may become globally committed. Thus, there are two commit events associated with each weak transaction: a local conditional commit in its associated cluster and an implicit global commit at reconciliation. Other types of transactions that combine weak and strict operations can be envisioned; their semantics, however, become hard to define. Instead, weak and strict transactions can be seen as transaction units of some advanced transaction model. In this regard, upon its submission, each user transaction is decomposed into a number of weak and strict subtransaction units according to the semantics and the degree of consistency required by the application.
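To make the dual operation interface concrete, the following sketch models the translation function h of Table 1 in Python. The Copy record, the Translator class, and all method names are our own illustrative assumptions, not part of the paper's system; the sketch only shows how best-effort versus conservative weak reads and eventual versus immediate strict writes select core and quasi copies.

```python
from dataclasses import dataclass

@dataclass
class Copy:
    item: str          # data item this copy replicates
    cluster: int       # p-cluster where the copy resides
    kind: str          # "core" or "quasi"
    value: object = None
    version: int = 0   # used to pick the most up-to-date value

class Translator:
    """Illustrative translation function h (a sketch, not the paper's implementation)."""

    def __init__(self, copies, weak_mode="best_effort", strict_mode="eventual"):
        self.copies = copies              # list of Copy objects
        self.weak_mode = weak_mode        # "best_effort" or "conservative"
        self.strict_mode = strict_mode    # "eventual" or "immediate"

    def _select(self, item, kinds, cluster=None):
        return [c for c in self.copies if c.item == item and c.kind in kinds
                and (cluster is None or c.cluster == cluster)]

    def weak_read(self, item, cluster):
        # WR: read only locally available copies; best-effort h may also use core copies
        kinds = ("core", "quasi") if self.weak_mode == "best_effort" else ("quasi",)
        local = self._select(item, kinds, cluster)
        return max(local, key=lambda c: c.version).value if local else None

    def strict_read(self, item):
        # SR: read core copies and return the most recent value
        cores = self._select(item, ("core",))
        return max(cores, key=lambda c: c.version).value if cores else None

    def weak_write(self, item, cluster, value):
        # WW: update only the local quasi copies (conditionally committed)
        for c in self._select(item, ("quasi",), cluster):
            c.value, c.version = value, c.version + 1

    def strict_write(self, item, value):
        # SW: eventual h writes core copies only; immediate h also refreshes the
        # quasi copies residing in the p-clusters of the core copies written
        cores = self._select(item, ("core",))
        targets = list(cores)
        if self.strict_mode == "immediate":
            for cl in {c.cluster for c in cores}:
                targets += self._select(item, ("quasi",), cl)
        for c in targets:
            c.value, c.version = value, c.version + 1
```

With strict_mode set to "immediate", a weak read returns the same value under both weak modes, mirroring the remark above that the conservative and best-effort variants coincide for an immediate h.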
2.2 Data Correctness

As usual, a database is a set of data items, and a database state is defined as a mapping of every data item to a value of its domain. Data items are related by a number of restrictions called integrity constraints that express relationships among their values. A database state is consistent if the integrity constraints are satisfied [22]. In this paper, we consider integrity constraints to be arithmetic expressions that have data items as variables.

Consistency maintenance in traditional distributed environments relies on the assumption that normally all sites are connected. This assumption, however, is no longer valid in mobile computing, since the distributed sites are only intermittently connected. Similar network connectivity conditions also hold in widely distributed systems as well as in computing with portable laptops. To take intermittent connectivity into account, instead of requiring maintenance of all integrity constraints of a distributed database, we introduce logical clusters as units of consistency. A logical cluster, l-cluster, is the set of all quasi copies that reside at the same p-cluster. In addition, all core copies constitute a single, system-wide logical cluster, independently of the site at which they reside physically. We relax consistency in the sense that integrity constraints are ensured only for data copies that belong to the same logical cluster. An intracluster integrity constraint is an integrity constraint that can be fully evaluated using data copies of a single logical cluster. All other integrity constraints are called intercluster integrity constraints.

Definition 2. A mobile database state is consistent if all intracluster integrity constraints are satisfied and all intercluster integrity constraints are bounded-inconsistent.

In this paper, we focus only on replication intercluster integrity constraints. For such integrity constraints, bounded inconsistency means that all copies in the same logical cluster have the same value, while among copies at different logical clusters there is bounded divergence [32], [1].
Bounded divergence is quantified by a positive integer d, called the degree of divergence; possible definitions of d are listed in Table 2. A replication constraint for x thus bounded is called d-consistent. The degree of divergence among copies can be tuned based on the strength of the connection among physical clusters, by keeping the divergence small in instances of high bandwidth availability and allowing for greater deviation in instances of low bandwidth availability.

TABLE 2
Divergence Among Copies

- The maximum number of transactions that operate on quasi copies
- A range of acceptable values that a data item can take
- The maximum number of copies per data item that diverge from a pre-assigned primary copy of the item, i.e., the core copies
- The maximum number of data items that have divergent copies
- The maximum number of updates per data item that are not reflected at all copies of the item

Immediate Translation and Consistency. To handle integrity constraints besides replication, in the case of an immediate translation function h, h should be defined such that the integrity constraints between quasi copies in the same logical cluster are not violated. The following example is illustrative.

Example 1. For simplicity, consider only one physical cluster. Assume two data items x and y, related by the integrity constraint I: x > 0 => y > 0, and a consistent database state x_c = -1, x_q = -1, y_c = 2, and y_q = -4, where the subscripts c and q denote core and quasi copies, respectively. Consider the transaction program:

x = 10
if y < 0 then y = 10

If the above program is executed as a strict transaction SW(x) SR(y) C, we get the database state x_c = 10, x_q = 10, y_c = 2, and y_q = -4, in which the integrity constraint between the quasi copies of x and y is violated. Note that the quasi copies were updated because we considered an immediate translation function. The problem arises from the fact that quasi copies are updated to the current value of the core copy without taking into consideration the integrity constraints among them. Similar problems occur when refreshing individual copies of a cache [1]. Possible solutions include:

1) Each time a quasi copy is updated at a physical cluster as a result of a strict write, the quasi copies of all data in this cluster related to it by some integrity constraint are also updated, either after or prior to the execution of the transaction. This update is done following a reconciliation procedure for merging core and quasi copies (as in Section 4). In the above example, the core and quasi copies of x and y should have been reconciled prior to the execution of the transaction, producing for instance the database state x_c = -1, x_q = -1, y_c = 2, and y_q = 2. Then, the execution of the transaction would result in the database state x_c = 10, x_q = 10, y_c = 2, and y_q = 2, which is consistent.

2) If a strict transaction updates a quasi copy at a physical cluster, its read operations are also mapped into reads of quasi copies at this cluster. In cases of incompatible reads, again, a reconciliation procedure is initiated, having a result similar to the one above.

3) Updating quasi copies is postponed by deferring any updates of quasi copies that result from writes of the corresponding core copies. A log of weak writes resulting from strict writes is kept. In this scenario, the execution of the transaction results in the database state x_c = 10, x_q = -1, y_c = 2, and y_q = -4, which is consistent.
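Example 1 can be replayed directly; the short sketch below (in Python, with variable names of our choosing) shows how the immediate propagation of the strict write breaks the constraint x > 0 => y > 0 between the quasi copies, while the deferred propagation of solution 3 leaves them consistent.

```python
# Intracluster constraint among the quasi copies: x_q > 0 implies y_q > 0.
def constraint_holds(x, y):
    return (not x > 0) or (y > 0)

# Consistent initial state of Example 1.
x_c, x_q, y_c, y_q = -1, -1, 2, -4
assert constraint_holds(x_q, y_q)

# Strict transaction SW(x) SR(y) C with an immediate translation function:
# the write of x also refreshes the co-located quasi copy, without checking
# the constraints that relate the quasi copies.
x_c, x_q = 10, 10
if y_c < 0:            # SR(y) reads the core copy, y_c = 2, so y is not written
    y_c = 10
assert not constraint_holds(x_q, y_q)     # x_q = 10, y_q = -4: constraint violated

# Solution 3: defer the induced quasi-copy updates and log them instead.
x_c, x_q, y_c, y_q = -1, -1, 2, -4        # back to the initial state
deferred = []                              # log of weak writes caused by strict writes
x_c = 10
deferred.append(("x", 10))                 # applied later, by the reconciliation procedure
assert constraint_holds(x_q, y_q)          # quasi copies stay consistent (x_q = -1)
```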
The first two approaches may force an immediate reconciliation among copies, while the third approach defers this reconciliation and is preferable in cases of low connectivity among physical clusters.

3 WEAK CONNECTIVITY OPERATION

In this section, we provide serializability-based criteria, graph-based tests, and a locking protocol for correct executions that exploit weak connectivity. When o_j is an operation, the subscript j denotes that o belongs to transaction j, while the subscript on a data copy identifies the physical cluster at which the copy is located. Without loss of generality, we assume that there is only one quasi copy per physical cluster. This assumption can be easily lifted, but with significant complication in notation. Since all quasi copies in a physical cluster have the same value, this single copy can be regarded as their representative. Read and write operations on copies are denoted by read and write, respectively. A complete intracluster schedule (IAS) is an observation of an interleaved execution of transactions in a given physical cluster configuration that includes (locally) committed weak transactions and (globally) committed strict transactions. Formally,

Definition 3 (Intracluster schedule). Let T = {T_1, T_2, ..., T_n} be a set of transactions. A (complete) intracluster schedule, IAS, over T is a pair (OP, <_H) in which <_H is a partial ordering relation such that:

1. OP = h(U_{i=1}^{n} T_i) for some translation function h.
2. For each T_j and all operations op_k, op_l in T_j, if op_k <_j op_l, then every operation in h(op_k) is related by <_H to every operation in h(op_l).
3. All pairs of conflicting operations are related by <_H, where two operations conflict if they access the same copy and one of them is a write operation.
4. For each read operation read_j[x_i], there is at least one write_k[x_i] such that write_k[x_i] <_H read_j[x_i].
5. If SW_j[x] <_j SR_j[x] and read_j[x_i] is in h(SR_j[x]), then write_j[x_i] is in h(SW_j[x]).
6. If write_j[x_i] is in h(SW_j[x]) for some strict transaction T_j, then write_j[y_i] is in h(SW_j[y]) for all y written by T_j for which there is a copy y_i at physical cluster Cl_i, where x_i is a quasi copy when h is conservative and any (quasi or core) copy when h is best-effort.
Condition 1 states that the transaction managers translate each operation on a data item into appropriate operations on data copies. Condition 2 states that the intracluster schedule preserves the ordering <_j stipulated by each transaction T_j, and Condition 3 that it also records the execution order of conflicting operations. Condition 4 states that a transaction cannot read a copy unless it has been previously initialized. Condition 5 states that, if a transaction writes a data item x before it reads x, then it must write to the same copy of x that it subsequently reads. Finally, Condition 6 indicates that, for a strict transaction, if a write is translated into a write on a core copy at a physical cluster Cl_i, then all other writes of this transaction must also write any corresponding copies at this cluster. This condition is necessary for ensuring that weak transactions do not see partial results of a strict transaction.

A read operation on a data item x reads-x-from a transaction T_i if it reads (i.e., returns as the value read) a copy of x written by T_i and no other transaction writes this copy in between. We say that a transaction T_j has the same reads-from relationship in schedule S_1 as in schedule S_2 if, for any data item x, whenever T_j reads-x-from T_i in S_1, then it reads-x-from T_i in S_2. Given a schedule S, the projection of S on strict transactions is the schedule obtained from S by deleting all weak operations, and the projection of S on a physical cluster Cl_i is the schedule obtained from S by deleting all operations that access copies not in Cl_i. Two schedules are conflict-equivalent if they are defined over the same set of transactions, have the same set of operations, and order conflicting operations of committed transactions in the same way [3]. A one-copy schedule is the single-copy interpretation of an (intracluster) schedule, where all operations on data copies are represented as operations on the corresponding data item.
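The projections and the reads-from relationship just defined are simple to compute when a schedule is represented as a sequence of operations on copies. The sketch below is a minimal illustration under that simplifying assumption (the paper works with partial orders); the Op record and function names are ours.

```python
from collections import namedtuple

# kind is one of "SR", "SW", "WR", "WW", "C"; cluster identifies the copy accessed.
Op = namedtuple("Op", "txn kind item cluster")

def strict_projection(schedule, strict_txns):
    """Projection on strict transactions: delete all weak operations."""
    return [op for op in schedule if op.txn in strict_txns]

def cluster_projection(schedule, cluster):
    """Projection on a p-cluster: delete operations on copies outside the cluster."""
    return [op for op in schedule if op.kind == "C" or op.cluster == cluster]

def reads_from(schedule, read_op):
    """Transaction whose write on the same copy the given read returns, if any."""
    writer = None
    for op in schedule:
        if op is read_op:
            return writer
        if op.kind in ("SW", "WW") and (op.item, op.cluster) == (read_op.item, read_op.cluster):
            writer = op.txn
    return writer
```

These helpers suffice to compare the reads-from relationships of two schedules, which is how the correctness criteria of the next subsection are phrased.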
3.1 Correctness Criterion

A correct concurrent execution of weak and strict transactions must maintain bounded inconsistency. First, we consider a weak form of correctness, in which the only requirement for weak transactions is that they read consistent data. The requirement for strict transactions is stronger, because they must produce a system-wide consistent database state. In particular, the execution of strict transactions must hide the existence of multiple core copies per item, i.e., it must be view-equivalent to a one-copy schedule [3]. A replicated-copy schedule S is view-equivalent to a one-copy schedule S_1C if 1) S and S_1C have the same reads-from relationship for all data items, and 2) for each final write W_i(x) in S_1C, W_i(x_j) is a final write in S for some copy x_j of x. These requirements for the execution of strict and weak transactions are expressed in the following definition:

Definition 4 (IAS weak correctness). An intracluster schedule S_IAS is weakly correct iff all the following three conditions are satisfied:

1. All transactions in S_IAS have a consistent view, i.e., all integrity constraints that can be evaluated using the data read are valid;
2. There is a one-copy serial schedule S such that: a) it is defined on the same set of strict transactions as S_IAS, b) strict transactions in S have the same reads-from relationship as in S_IAS, and c) the set of final writes in S is the same as the set of final writes on core copies in S_IAS; and
3. Bounded divergence among copies at different logical clusters is maintained.

Next, we discuss how to enforce the first two conditions. Protocols for bounding the divergence among copies at different logical clusters are outlined at the end of this section. A schedule is one-copy serializable if it is equivalent to a serial one-copy schedule. The following theorem defines correctness in terms of equivalence to serial executions.

Theorem 1. Given that bounded divergence among copies at different logical clusters is maintained, if the projection of an intracluster schedule S on strict transactions is one-copy serializable and each of its projections on a physical cluster is conflict-equivalent to a serial schedule, then S is weakly correct.

Proof. Condition 1 of Definition 4 is guaranteed for strict transactions by the requirement of one-copy serializability, since strict transactions get the same view as in a one-copy serial schedule and read only core copies. For weak transactions at a physical cluster, the first condition is provided by the requirement of serializability of the projection of the schedule on this cluster, given that the projection of each (weak or strict) transaction on the cluster satisfies all intracluster integrity constraints when executed alone. Thus, it suffices to prove that such projections maintain the intracluster integrity constraints. This trivially holds for weak transactions, since they are local at a physical cluster. The condition also holds for strict transactions since, if a strict transaction maintains consistency of all database integrity constraints, then its projection on any cluster also maintains the consistency of intracluster integrity constraints as a consequence of Condition 6 of Definition 3. Finally, one-copy serializability of the projection on strict transactions suffices to guarantee Conditions 2b and 2c, since strict transactions read only core copies and weak transactions do not write core copies, respectively.

Note that intercluster integrity constraints other than replication constraints among quasi copies of data items at different clusters may be violated. Weak transactions, however, are unaffected by such violations, since they read only local data. Although the above correctness criterion suffices to ensure that each weak transaction gets a consistent view, it does not suffice to ensure that weak transactions at different physical clusters get the same view, even in the absence of intercluster integrity constraints. The following example is illustrative.

Example 2. Assume two physical clusters Cl_1 = {x, y} and Cl_2 = {w, z, l} that have both quasi and core copies of the corresponding data items, and the following two strict transactions ST_1 = SW_1[x] SW_1[w] C_1 and ST_2 = SW_2[y] SW_2[z] SR_2[x] C_2. In addition, at cluster Cl_1 we have the weak transaction WT_3 = WR_3[x] WR_3[y] C_3, and at cluster Cl_2 the weak transactions WT_4 = WR_4[z] WW_4[l] C_4 and WT_5 = WR_5[w] WR_5[l] C_5.
For different logical clusters is maintained, if the projection of nn simplicity, we do not show the transaction that intracluster schedule S on strict trnnsactions is view initializes all data copies. We consider an immediate equivalent to a one-copy serial schedule Sic, and each of its and best effort translation function 6. For notational projections on a physical cluster Cli is conflict-equivalent to simplicity, we do not use any special notation for the a serial schedule S., such that the order of trnnsactions in Ss2 core and quasi copies, since which copies are read is is consistent with the order of transactions in Sic, S is inferred by the translation function. strongly correct. Assume that the execution of the above transactions produces the following schedule which is weakly correct: If weak IAS correctness is used as the correctness criterion, then the transaction managers at each physical [w]sw1[z] WR,[z]SW, [U]]Cl sw2 [y]SW2 [z] cluster must only synchronize projections on their cluster. Global control is required only for synchronizing strict SE2 [z]cz W'% [$ C:,WRa [z]WWa [1]ca [1]C, transactions. Therefore, no control messages are necessary The projection of S on strict transactions is: ,SW,[x] between transaction managers at different clusters for S w ~ [ ? uCI ] SWz[y] SWz[z]CZ,which is equivalent to the synchronizing weak transactions. The proposed scheme is 1SR schedule: SWI[Z] SWl[w]Cl SWZ[Y] S W z [ z ]CZ flexible, in that any coherency control method that The projection of S on C ~ ISWl[z] : WR3[2]CI SWz[?/] guarantees one-copy serializability (e.g., quorum conscnsus SRz[z]WRs[y]C3 is serializable as ST1 STz i WTI. or primary copy) can be used for synchronizing core copies. The projection of S on Ch: W&,[w]SWI[VJ] CI SW2[z] The scheme reduces to one-copy serializability when only CZ W & [ z ] WW4[1] Ca WRs[l]C, is serializable as strict transactions are used. STz -> W'Z,+ WT, + ST,. 3.2 The Serialization Graph Thus, weak correctness does not guarantee that there To determine whether an IAS schedule is correct, we use a is a serial schedule equivalent to the intracluster schedule modified serialization graph, that we call the intrncluster as a whole, that is, including all weak and strict serialization graph (IASG) of the IAS schedule. To construct transactions. The following is a stronger correctness the IASG, first a replicated data serialization graph (SG) is criterion that ensures that all weak transactions get the built that includes all strict transactions. An SG [3] is a same consistent view. Obviously, strong correctness serialization graph augmented with additional edges to take into account the fact that operations on different copies implies weak correctness. of the same data item may also create conflicts. Acyclicity of Definition 5 (IAS strong correctness). An intracluster schednle the SG implies one-copy serializability of the corresponding S is strongly correct $there is a serial schedule S,s such that schedule. Then, the SG is augmented to include weak transactions as well as edges that represent conflicts between weak transactions in the same cluster and weak 1. Ss is conflict-equivalent with S; 2. There is a one-copy schedule SE such that a) strict and strict transactions. An edge is called a dependency edge if transnctions in SS have the same reads-from it represents the fact that a transaction reads a value s=wn, w& - IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL. 11, NO. 6. NOVEMBERIDECEMBER 1999 902 produced by another transaction. 
It is easy to see that in the IASG there are no edges between weak transactions at different clusters, since weak transactions at different clusters read different copies of a data item. In addition:

Property 1. Let WT be a weak transaction at cluster Cl_i and ST a strict transaction. The IASG induced by an IAS may include only the following edges between them: a dependency edge from ST to WT and a precedence edge from WT to ST.

Proof. The proof is straightforward, since the only conflicts between weak and strict transactions are due to strict writes and weak reads of the same copy of a data item.

Theorem 2. Let S_IAS be an intracluster schedule. If S_IAS has an acyclic IASG, then S_IAS is strongly correct.

Proof. When a graph is acyclic, each of its subgraphs is acyclic; thus, the underlying SG on which the IASG was built is acyclic. Acyclicity of the SG implies one-copy serializability of strict transactions, since strict transactions only read values produced by strict transactions. Let T_1, T_2, ..., T_n be all transactions in S_IAS; thus, T_1, T_2, ..., T_n are the nodes of the IASG. Since the IASG is acyclic, it can be topologically sorted. Let T_i1, T_i2, ..., T_in be a topological sort of the nodes of the IASG; then, by a straightforward application of the serializability theorem [3], S_IAS is conflict-equivalent to the serial schedule S_S = T_i1, T_i2, ..., T_in. This order is consistent with the partial order induced by a topological sorting of the SG, allowing S_1C to be the serial schedule corresponding to this sorting. Thus, the order of transactions in S_S is consistent with the order of transactions in S_1C.

3.3 Protocols

Serializability. We distinguish between coherency and concurrency control protocols. Coherency control ensures that all copies of a data item have the same value. In the proposed scheme, we must maintain this property globally for core copies and locally for quasi copies. Concurrency control ensures the maintenance of the other integrity constraints, here the intracluster integrity constraints. For coherency control, we assume a generic quorum-based scheme [3]. Each strict transaction reads q_r core copies and writes q_w core copies per strict read and write operation. The values of q_r and q_w for a data item x are such that q_r + q_w exceeds the number of available core copies of x. For concurrency control, we use strict two-phase locking, where each transaction releases its locks upon commitment [3]. Weak transactions release their locks upon local commitment and strict transactions upon global commitment. There are four lock modes (WR, WW, SR, SW), corresponding to the four data operations. Before the execution of each operation, the corresponding lock is requested. A lock is granted only if the data copy is not locked in an incompatible lock mode.

Fig. 1. Lock compatibility matrices. An X entry indicates that the lock modes are compatible. (a) Eventual and conservative h. (b) Eventual and best-effort h. (c) Immediate and conservative h. (d) Immediate and best-effort h.

Fig. 1 depicts the compatibility of locks for the various types of translation functions and is presented to demonstrate the interference between operations on items. Differences in compatibility stem from the fact that the operations access different kinds of copies.
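The lock compatibility rules can be derived from which kinds of copies each mode touches under a given translation function. The sketch below encodes the eventual, conservative case under that reading (our own derivation, offered as an assumption since the matrices of Fig. 1 are not reproduced here), together with a bare-bones strict two-phase lock table.

```python
# Kinds of copies touched by each lock mode under an eventual, conservative h
# (derived from Table 1; an assumption made for illustration).
TOUCHES = {
    "WR": {("quasi", "read")},
    "WW": {("quasi", "write")},
    "SR": {("core", "read")},
    "SW": {("core", "write")},   # eventual h: strict writes touch core copies only
}

def compatible(held, requested):
    """Lock modes conflict iff they touch the same kind of copy and one writes it."""
    for kind1, access1 in TOUCHES[held]:
        for kind2, access2 in TOUCHES[requested]:
            if kind1 == kind2 and "write" in (access1, access2):
                return False
    return True

class LockTable:
    """Strict two-phase locking on copies: locks are held until (local or global)
    commit, which the caller signals by calling release_all."""

    def __init__(self):
        self.held = {}                           # copy id -> list of (txn, mode)

    def acquire(self, txn, copy_id, mode):
        for other, held_mode in self.held.get(copy_id, []):
            if other != txn and not compatible(held_mode, mode):
                return False                     # caller must block (or abort)
        self.held.setdefault(copy_id, []).append((txn, mode))
        return True

    def release_all(self, txn):
        for copy_id in self.held:
            self.held[copy_id] = [e for e in self.held[copy_id] if e[0] != txn]
```

Under this encoding, a weak read never blocks behind a strict write, since they touch different kinds of copies, which is the intuition behind the limited interference noted below; switching to an immediate h would add ("quasi", "write") to TOUCHES["SW"] and make the WR/SW pair incompatible, as the differences among the matrices of Fig. 1 suggest.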
The basic overhead imposed on the performance of weak transactions by these protocols is caused by other weak transactions at the same cluster. This overhead is small, since weak transactions do not access the slow network. Strict transactions block a weak transaction only when they access the same quasi copies. This interference is limited and can be controlled, e.g., by letting, in cases of disconnections, strict transactions access only core copies and weak transactions access only quasi copies.

Bounding divergence among copies. At each p-cluster, the degree for each data item expresses the divergence of the local quasi copy from the value of the core copy. This difference may result either from globally uncommitted weak writes or from updates of core copies that have not yet been reported at the cluster. As a consequence, the degree may be bounded either by limiting the number of weak writes pending global commitment or by controlling the h function. In Table 3, we outline ways of maintaining d-consistency for different ways of defining d.

TABLE 3
Maintaining Bounded Inconsistency

- The maximum number of transactions that operate on quasi copies: appropriately bound the number of weak transactions at each physical cluster. In the case of a dynamic cluster reconfiguration, the distribution of weak transactions at each cluster must be re-adjusted.
- A range of acceptable values a data item can take: allow only weak writes with values inside the acceptable range.
- The maximum number of divergent copies per data item: for each data item, bound the number of physical clusters that can have quasi copies.
- The maximum number of updates per data item not reflected at all copies: define h so that each strict write modifies the quasi copies at each physical cluster at least after d updates. Note that this cannot be ensured for disconnected clusters, since there is no way of notifying them of remote updates.
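As a small illustration of the first bounding method in Table 3, the sketch below counts, per cluster and per item, the weak writes that are still pending global commitment and refuses new ones once the degree d is reached. The class and method names are illustrative assumptions; a real implementation would integrate this check with the lock manager and the reconciliation procedure of Section 4.

```python
class DivergenceBound:
    """Bound the divergence degree d by limiting weak writes pending global commitment."""

    def __init__(self, d):
        self.d = d                     # degree of divergence; may be raised when
        self.pending = {}              # connectivity is poor and lowered when it improves

    def may_weak_write(self, cluster, item):
        return self.pending.get((cluster, item), 0) < self.d

    def record_weak_write(self, cluster, item):
        if not self.may_weak_write(cluster, item):
            raise RuntimeError("degree of divergence reached; reconcile before writing")
        self.pending[(cluster, item)] = self.pending.get((cluster, item), 0) + 1

    def reconciled(self, cluster, item):
        # after reconciliation the pending weak writes are either globally
        # committed or rolled back, so the local divergence count is reset
        self.pending[(cluster, item)] = 0
```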
4 A CONSISTENCY RESTORATION SCHEMA

After the execution of a number of weak and strict transactions, for each data item all its core copies have the same value, while its quasi copies may have as many different values as the number of physical clusters. Approaches to reconciling the various values of a data item so that a single value is selected vary from purely syntactic to purely semantic ones [5]. Syntactic approaches use serializability-based criteria, while semantic approaches use either the semantics of transactions or the semantics of data items. We adopt a purely syntactic, and thus application-independent, approach.

The exact point when reconciliation is initiated depends on the application requirements and the distributed system characteristics. For instance, reconciliation may be forced to keep inconsistency inside the required limits. Alternatively, it may be initiated periodically or on demand upon the occurrence of specific events, such as the restoration of network connectivity, for instance, when a palmtop is plugged back into the stationary network or a mobile host enters a region with good connectivity.

4.1 Correctness Criterion

Our approach to reconciliation is based on the following rule: a weak transaction becomes globally committed if the inclusion of its write operations in the schedule does not violate the one-copy serializability of strict transactions. That is, we assume that weak transactions at different clusters do not interfere with each other even after reconciliation, that is, weak operations of transactions at different clusters never conflict.

A (complete) intercluster schedule, IES, models execution after reconciliation, where strict transactions become aware of weak writes, i.e., weak transactions become globally committed. Thus, in addition to the conflicts reported in the intracluster schedule, the intercluster schedule reports all relevant conflicts between weak and strict operations. In particular:

Definition 6 (Intercluster schedule). An intercluster schedule (IES) S_IES based on an intracluster schedule S_IAS = (OP, <_H) is a pair (OP', <_E) where:

1. OP' = OP, and for any op_i and op_j in OP', if op_i <_H op_j in S_IAS, then op_i <_E op_j in S_IES. In addition:
2. For each pair of weak write op_i = WW_i[x] and strict read op_j = SR_j[x] operations, either for all pairs of operations cop_i in h(op_i) and cop_j in h(op_j), cop_i <_E cop_j, or cop_j <_E cop_i; and
3. For each pair of weak write op_i = WW_i[x] and strict write op_j = SW_j[x] operations, either for all pairs of operations cop_i in h(op_i) and cop_j in h(op_j), cop_i <_E cop_j, or cop_j <_E cop_i.

We extend the reads-from relationship for strict transactions so that weak writes are taken into account. A strict read operation on a data item x reads-x-from a transaction T_i in an IES schedule if it reads a copy of x, T_i has written this or any quasi copy of x, and no other transaction wrote this or any quasi copy of x in between. A weak write is acceptable as long as the extended reads-from relationship for strict transactions is not affected, that is, strict transactions still read values produced by strict transactions. In addition, correctness of the underlying IAS schedule implies one-copy serializability of strict transactions and consistency of weak transactions.

Definition 7 (IES correctness). An intercluster schedule is correct iff:

1. It is based on a correct IAS schedule S_IAS, and
2. The reads-from relationship for strict transactions is the same as their reads-from relationship in S_IAS.

4.2 The Serialization Graph

To determine correct IES schedules, we define a modified serialization graph that we call the intercluster serialization graph (IESG). To construct the IESG, we augment the serialization graph IASG of the underlying intracluster schedule. To force conflicts among weak and strict transactions that access different copies of the same data item, we induce, first, a write order as follows: if T_i weak writes and T_k strict writes any copy of an item x, then either T_i -> T_k or T_k -> T_i; and then, a strict read order as follows: if a strict transaction ST_j reads-x-from ST_i in S_IAS and a weak transaction WT follows ST_i, we add an edge ST_j -> WT.

Theorem 3. Let S_IES be an IES schedule based on an IAS schedule S_IAS. If S_IES has an acyclic IESG, then S_IES is correct.

Proof. Clearly, if the IESG is acyclic, the corresponding graph for the IAS is acyclic (since to get the IESG we only add edges to the IASG), and thus the IAS schedule is correct. We shall show that if the graph is acyclic, then the reads-from relationship for strict transactions in the intercluster schedule S_IES is the same as in the underlying intracluster schedule S_IAS. Assume that ST_j reads-x-from ST_i in S_IAS. Then ST_i -> ST_j. Assume, for the purposes of contradiction, that ST_j reads-x-from a weak transaction WT in S_IES. Then WT writes x in S_IES, and since ST_i also writes x, either a) ST_i -> WT or b) WT -> ST_i. In case a, from the definition of the IESG, we get ST_j -> WT, which is a contradiction, since ST_j reads-x-from WT. In case b, WT -> ST_i, that is, WT precedes ST_i, which precedes ST_j, which again contradicts the assumption that ST_j reads-x-from WT.
4.3 Protocol

To get a correct schedule, we need to break potential cycles in the IES graph. Since, to construct the IESG, we start from an acyclic graph and add edges between a weak and a strict transaction, there is always at least one weak transaction in each cycle. We roll back such weak transactions. Undoing a transaction T may result in cascading aborts of transactions that have read the values written by T, that is, transactions that are related to T through a dependency edge. Since weak transactions write only quasi copies in a single physical cluster, and since only weak transactions in the same cluster can read these quasi copies, we get the following lemma:

Lemma 2. Only weak transactions in the same physical cluster read values written by weak transactions in that cluster.

The above lemma ensures that when a weak transaction is aborted to resolve conflicts in an intercluster schedule, only weak transactions in the same p-cluster are affected. In practice, fewer transactions ever need to be aborted. In particular, we need to abort only weak transactions whose output depends on the exact values of the data items they read. We call these transactions exact. Most weak transactions are not exact, since by definition weak transactions are transactions that read local d-consistent data. Thus, even if the value they read was produced by a transaction that was later aborted, this value was inside an acceptable range of inconsistency, and this is probably sufficient to guarantee their correctness.

Detecting cycles in the IESG can be hard. The difficulties arise from the fact that between transactions that wrote a data item an edge can have either direction, thus resulting in polygraphs [22]. Polynomial tests for acyclicity are possible if we make the assumption that transactions read a data item before writing it. Then, to get the IES graph from the IAS graph, we need only induce a read order as follows: if a strict transaction ST reads an item that was written by a weak transaction WT, we add a precedence edge WT -> ST.

Fig. 2. The reconciliation steps.

Fig. 2 outlines the reconciliation steps.
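The reconciliation steps of Fig. 2 can be sketched as graph manipulation: merge the induced write-order and read-order edges into the IASG and, while a cycle remains, roll back a weak transaction on it (every cycle contains one, as argued above). The graph encoding, the victim-selection rule, and the function names below are our own choices, offered as a sketch rather than as the paper's algorithm.

```python
def find_cycle(edges, nodes):
    """Return the nodes of one cycle, or None if the graph is acyclic (DFS)."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {n: WHITE for n in nodes}
    parent = {}

    def dfs(u):
        color[u] = GRAY
        for v in edges.get(u, ()):
            if color.get(v, WHITE) == GRAY:      # back edge closes a cycle
                cycle, node = [v], u
                while node != v:
                    cycle.append(node)
                    node = parent[node]
                return cycle
            if color.get(v, WHITE) == WHITE:
                parent[v] = u
                found = dfs(v)
                if found:
                    return found
        color[u] = BLACK
        return None

    for n in nodes:
        if color[n] == WHITE:
            cycle = dfs(n)
            if cycle:
                return cycle
    return None


def reconcile(iasg_edges, induced_edges, weak_txns, nodes):
    """Return the weak transactions to roll back; the rest become globally committed."""
    edges = {n: set(iasg_edges.get(n, ())) | set(induced_edges.get(n, ())) for n in nodes}
    aborted = set()
    while True:
        remaining = [n for n in nodes if n not in aborted]
        cycle = find_cycle(edges, remaining)
        if cycle is None:
            return aborted
        victim = next(t for t in cycle if t in weak_txns)   # a weak txn always exists
        aborted.add(victim)
        edges.pop(victim, None)                             # drop its edges entirely
        for targets in edges.values():
            targets.discard(victim)
```

By Lemma 2, any cascading abort triggered by a victim stays within its own p-cluster, and in practice only the exact weak transactions that read from a victim need to cascade.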
5 DISCUSSION

In the proposed hybrid scheme, weak and strict transactions coexist. Weak transactions let users process local data, thus avoiding the overhead of long network accesses. Strict transactions need access to the network to guarantee the permanence of their updates. Weak reads provide users with the choice of reading an approximately accurate value of a datum, in particular in cases of total or partial disconnection. This value is appropriate for a variety of applications that do not require exact values. Such applications include gathering information for statistical purposes or making high-level decisions and reasoning in expert systems that can tolerate bounded uncertainty in input data. Weak writes allow users to update local data without confirming these updates immediately. Update validation is delayed until the physical clusters are connected. Delayed updates can be performed during periods of low network activity to reduce demand at the peaks. Furthermore, grouping weak updates together and transmitting them as a block, rather than one at a time, can improve bandwidth usage. For example, a salesperson can locally update many data items, till these updates are finally confirmed when the machine is plugged back into the network at the end of the day. However, since weak writes may not be finally accepted, they must be used only when compensating transactions are available or when the likelihood of conflicts is very low. For example, users can employ weak transactions to update mostly private data and strict transactions to update frequently used, heavily shared data.

The cluster configuration is dynamic. Physical clusters may be explicitly created or merged upon a forthcoming disconnection or connection of the associated mobile clients. To accommodate migrating locality, a mobile host may join a different p-cluster upon entering a new support environment. Besides defining clusters based on the physical location of data, other definitions are also possible. Clusters may be defined based on the semantics of data or applications. Information about access patterns, for instance in the form of a user's profile that includes data describing the user's typical behavior, may be utilized in determining clusters. Some examples follow.

Example 1 (Cooperative environment). Consider the case of users working on a common project using mobile hosts. Groups are formed that consist of users who work on similar topics of the project. Physical clusters correspond to data used by people in the same group, who need to maintain consistency among their interactions. We consider data that are most frequently accessed by a group as data belonging to this group. At each physical cluster (group), the copies of data items belonging to the group are core copies, while the copies of data items belonging to other groups are quasi. A data item may belong to more than one group if more than one group frequently accesses it. In this case, core copies of that data item exist in all such physical clusters. In each physical cluster, operations on items that do not belong to the group are weak, while operations on data that belong to the group are strict. Weak updates on a data item are accepted only when they do not conflict with updates by the owners of that data item.

Example 2 (Caching). Clustering can be used to model caching in a client/server architecture. In such a setting, a mobile host acts as a client interacting with a server at a fixed host. Data are cached at the client for performance and availability. The cached data are considered quasi copies. The data at the fixed host are core copies. Transactions initiated by the server are always strict. Transactions initiated by the client that invoke updates are always weak, while read-only client transactions can be strict when strict consistency is required and weak otherwise. At reconciliation, weak writes are accepted only if they do not conflict with strict transactions at the server. The frequency of reconciliation depends on the user consistency requirements and on networking conditions.
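For the caching scenario of Example 2, a minimal client/server sketch is given below. The version check used to accept or reject weak writes at reconciliation is a simplification standing in for the serializability-based test of Section 4, and all class and method names are our own.

```python
class Server:
    """Fixed host holding the core copies."""

    def __init__(self, data):
        self.data = {k: (v, 0) for k, v in data.items()}   # item -> (value, version)

    def read(self, item):
        return self.data[item]

    def strict_write(self, item, value):
        v, ver = self.data[item]
        self.data[item] = (value, ver + 1)

    def try_install(self, item, value, base_version):
        # accept a weak write only if no strict write intervened since the client
        # cached the item (a stand-in for the conflict test of Section 4)
        _, ver = self.data[item]
        if ver != base_version:
            return False
        self.data[item] = (value, ver + 1)
        return True


class MobileClient:
    """Mobile host whose cache holds quasi copies of server items."""

    def __init__(self, server, items):
        self.server = server
        self.cache = {i: list(server.read(i)) + [False] for i in items}  # value, version, dirty

    def weak_read(self, item):
        return self.cache[item][0]            # locally available, possibly stale

    def weak_write(self, item, value):
        self.cache[item][0] = value           # conditionally committed only
        self.cache[item][2] = True

    def strict_read(self, item):
        return self.server.read(item)[0]      # requires connectivity; reads the core copy

    def reconcile(self):
        for item, entry in self.cache.items():
            if entry[2] and not self.server.try_install(item, entry[0], entry[1]):
                pass                          # weak write rejected; compensate or retry
            entry[0], entry[1] = self.server.read(item)    # refresh the quasi copy
            entry[2] = False
```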
Example 3 (Caching location data). In mobile computing, data representing the location of a mobile user are fast-changing. Such data are frequently accessed to locate a host. Thus, location data must be replicated at many sites to reduce the overhead of searching. Most of the location copies should be considered quasi. Only a few core copies are always updated to reflect changes in location.

6 QUANTITATIVE EVALUATION OF WEAK CONSISTENCY

To quantify the improvement in performance attained by sacrificing strict consistency in weakly connected environments and to understand the interplay among the various parameters, we have developed an analytical model. The analysis follows an iteration-based methodology for coupling standard hardware resource and data contention, as in [39]. Data contention is the result of concurrency and coherency control. Resources include the network and the processing units. We generalize previous results to take into account a) nonuniform access of data, which takes into consideration hotspots and the changing locality, b) weak and strict transaction types, and c) various forms of data access, as indicated by the compatibility matrices of Fig. 1. An innovative feature of the analysis is the employment of a vacation system to model disconnections of the wireless medium. The performance parameters under consideration are the system throughput, the number of messages sent, and the response time of weak and strict transactions. The study is performed for a range of networking conditions, that is, for different values of bandwidth and varying disconnection intervals.

6.1 Performance Model

We assume a cluster configuration with n physical clusters and a Poisson arrival rate for both queries and updates. Let λ_q and λ_u, respectively, be the average arrival rates of queries and updates on data items initiated at each physical cluster. We assume fixed-length transactions with N operations on data items, N_q = [λ_q/(λ_q + λ_u)]N of which are queries and N_u = [λ_u/(λ_q + λ_u)]N are updates. Thus, the transaction rate, i.e., the rate of transactions initiated at each p-cluster, is λ_N = λ_q/N_q.
For simplicity, we ignore thc communication overhead inside a cluster, assuming either that each cluster consists of a single node or that the communication among the nodes inside a clustcr is relatively fast. Without taking into account data contention, the average response time for a weak read on a data item is R; = iii t q and for a weak update f?;; = w t,,, where 11) is the average wait time at each cluster. I.ct 6, bc 0 if q7 = I and 1 otherwise, and I],,, be 0 if (I,,, = I and 1 otherwise. Then for a strict read on a data item + + + 0)[(1'D,)/D ( W , ) / U ] + + + (qv - l)ti, -1- h,.(26, 1, W ) ] R; = pl[UJiFor simplicity, we assume that there is one quasi copy of (1 - Pl)(Q.tl, + 2t, t, w ) each data item at each p-cluster. Let qr be the read and q,,> the write quorum and iVS bc the mean number of and for a strict write operations on data copics pcr strict transaction. The Rf, = p i [ t o -1- t,, -1- (qt,>- 1 ) t b G;,,(2tr t,, w)] transaction model consists of nI. 2 states, where ii,, is (1 - p i ) ( y , " t a + 2 t , + t , , + , ~ ) . the random variable of items acccssed by the transaction and NI, its mcan. Without loss of generality, we assume that The computation of iii is given in the Appendix. N L is equal to the number of operations. Thc transaction has Average Transmission Time. The average transmission an initial setup phase, state 0. Then, it progress to states time t, equals thc scrvicc time plus the wait time to1.at each 1 , 2 , .. . ,nil in that order. If successful, at the end of state n,~, network link, t,. = l/s, tu1.The arrival rate A, at each link the transaction enters into the commit phase at state nr,+l. is Poisson with mean M / ( n ( n - 1)).The computation of w7. The transaction response time 'r can he expressed a s is given in the Appendix. Throughput. The transaction throughput, i.e., input rate, lLI/, is bounded by: a) thc processing time at each cluster (since TLvniis = TINPI, f TI< ywJ tmmn,ii, ,i&l X 5 E[z], where X is the arrival rate of all requests at where n,,,,is the number of lock waits during the run of thc each cluster and I < [ z ] is the mean service time); b) the transaction, T,,,, is the waiting time for the j t h lock available bandwidth (since A, 5 I,?); and c) the disconneccontention, 'rli is the sum of the execution times in states tion intervals (since A, 5 E [ v ] ,where E[?,]is the mean ,ni, excluding lock waiting times, rr,\rI,p is tlic duration of a disconncction). + + + + + + + + + + + (4 execution time in state 0, and t,,,,,,,,ii is the commit time to reflect the updates in the database. 6.1.1 Resource Contention Analysis Wc modcl clusters as M/G/1 systems. Thc avcrage service time for the various types of requests, all exponentially distributed, can be determined from the following parameters: the processing time, (, of a query on a data copy, the time L,, to install an update on a data copy, and the overhead time, 4, to propagate an update or query to another clustcr. In each M/G/1 scrvcr, all requests arc processed with the same priority on a first-come, firstserved basis. Clustcrs bccomc disconncctcd and recoiinected. To capture disconnections, we model each connection among two clusters as an M / M / 1 system with vacations. A vacation system is a system in which the scrver becomes unavailable for occasional intervals of time. If W is the available bandwidth bctween two clusters and if we assume exponcntially distributed packet lengths for 6.1.2 Data Contention Analysis We assume an eventual and best effort h. 
6.1.1 Resource Contention Analysis
We model clusters as M/G/1 systems. The average service times for the various types of requests, all exponentially distributed, can be determined from the following parameters: the processing time t_q of a query on a data copy, the time t_u to install an update on a data copy, and the overhead time t_b to propagate an update or query to another cluster. In each M/G/1 server, all requests are processed with the same priority on a first-come, first-served basis.

Clusters become disconnected and reconnected. To capture disconnections, we model each connection between two clusters as an M/M/1 system with vacations. A vacation system is a system in which the server becomes unavailable for occasional intervals of time. If W is the available bandwidth between two clusters and if we assume exponentially distributed packet lengths for messages with average size m, then the service rate s_t is equal to W/m.

6.1.2 Data Contention Analysis
We assume a lock-based concurrency control scheme. In the following, op stands for one of WR, WW, SR, or SW. Using (A), the response times for strict and weak transactions are

R_strict = T_INPT + N_q(P_SR R_SR + R_sr) + N_u(P_SW R_SW + R_sw) + t_commit,

R_weak = T_INPT + N_q(P_WR R_WR + R_wr) + N_u(P_WW R_WW + R_wu) + t_commit,

where P_op is the probability that a transaction contends for an op operation on a data copy, and R_op is the average time spent waiting to get an op lock given that lock contention occurs. P_s and P_w are, respectively, the probabilities that at least one operation on a data copy per strict read or write conflicts; specifically, P_s = 1 − (1 − P_SR)^q_r and P_w = 1 − (1 − P_SW)^q_w. An outline of the estimation of P_op and R_op is given in the Appendix.
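Expressed in code, the two transaction-level estimates look as follows; the per-operation execution times and the lock-contention quantities P_op and R_op are inputs, obtained from the resource analysis above and from the Appendix, respectively.

    def strict_response_time(T_setup, t_commit, N_q, N_u,
                             exec_sr, exec_sw,     # R_sr, R_sw from the resource analysis
                             P_SR, P_SW,           # lock contention probabilities
                             wait_SR, wait_SW):    # mean waits given contention
        return (T_setup
                + N_q * (P_SR * wait_SR + exec_sr)
                + N_u * (P_SW * wait_SW + exec_sw)
                + t_commit)

    def weak_response_time(T_setup, t_commit, N_q, N_u,
                           exec_wr, exec_wu, P_WR, P_WW, wait_WR, wait_WW):
        return (T_setup
                + N_q * (P_WR * wait_WR + exec_wr)
                + N_u * (P_WW * wait_WW + exec_wu)
                + t_commit)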
TABLE 4
Input Parameters

Description                                                          Value
number of physical clusters (n)                                      5
query arrival rate (λ_q)                                             12 queries/sec
update arrival rate (λ_u)                                            3 updates/sec
consistency factor (c)                                               ranges from 0 to 1
read quorum (q_r)                                                    ranges from 1 to n
write quorum (q_w)                                                   ranges from 1 to n
fraction of transactions accessing hot data locally (o)              ranges from 0 to 1
probability that a local transaction accesses hot data (h)           ranges from 0 to 1
probability that a hot data item has a core copy at a cluster (l)    ranges from 0 to 1
probability that a cold data item has a core copy at a cluster (l')  ranges from 0 to 1
processing time for an update (t_u)                                  0.02 sec
processing time for a query (t_q)                                    0.005 sec
propagation overhead (t_b)                                           0.00007 sec
vacation interval (E[v])                                             ranges
available bandwidth (W)                                              ranges
average size of a message (m)                                        --
number of cold data items per p-cluster (D_c)                        200
number of hot data items per p-cluster (D_h)                         10
average number of operations per transaction (N)                     --

Fig. 3. Maximum allowable input rate of updates for various values of the consistency factor. (a) Limits imposed by the processing rate at each cluster (λ ≤ 1/E[x]). (b) Limits imposed by bandwidth restrictions (λ_t ≤ s_t).

6.2 Performance Evaluation
The following performance results show how the percentage of weak and strict transactions can be effectively tuned, based on the prevailing networking conditions such as the available bandwidth and the duration of disconnections, to attain the desired throughput and latency. Table 4 depicts some realistic values for the input parameters. The bandwidth depends on the type of technology used; a typical value is 1 Mbps for infrared, 2 Mbps for packet radio, and 9-14 Kbps for cellular telephony [7].

6.2.1 System Throughput
Fig. 3a, Fig. 3b, Fig. 4a, and Fig. 4b show how the maximum transaction input rate, or system throughput, is bounded by the processing time, the available bandwidth, and the disconnection intervals, respectively. We assume that queries are four times more common than updates (λ_q = 4λ_u). As shown in Fig. 3a, the allowable input rate when all transactions are weak (c = 0) is almost double the rate when all transactions are strict (c = 1). This is the result of the increase in the workload with c, caused by the fact that strict operations on data items may be translated into more than one operation on data copies. The percentage of weak transactions can be effectively tuned to attain the desired throughput based on the networking conditions, such as the duration of disconnections and the available bandwidth. As indicated in Fig. 3b, to get, for instance, the same throughput with 200 bps as with 1,000 bps and c = 1, we must lower the consistency factor below 0.1.

The duration of disconnections may vary from seconds, when they are caused by handoffs [19], to minutes, for instance, when they are voluntary. Fig. 4 depicts the effect of the duration of a disconnection on the system throughput for short (Fig. 4a) and long disconnections (Fig. 4b). For long disconnections, only a very small percentage of strict transactions can be initiated at disconnected sites (Fig. 4b). To keep the throughput comparable to that for shorter disconnections (Fig. 4a), the consistency factor must drop by around three orders of magnitude.

Fig. 4. Maximum allowable input rate of updates for various values of the consistency factor. Limits imposed by disconnections and their duration (λ_t ≤ 1/E[v]). (a) Disconnections lasting from 1/5 to 1 second and (b) longer disconnections lasting from 15 to 75 minutes.
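The bandwidth and disconnection bounds above can be combined into a rough feasibility check. The sketch below estimates the largest per-cluster transaction rate that the intercluster links can sustain for a given consistency factor; it uses only the network-side bounds (the processing-time bound would be obtained analogously from the M/G/1 analysis), and all numeric values are illustrative.

    def max_transaction_rate(c, n, N_q, N_u, q_r, q_w, p1, W, m, E_v):
        # Messages per second generated when each cluster runs one transaction per second.
        lam_q, lam_u = float(N_q), float(N_u)
        M_per_rate = 2.0 * n * c * (
            lam_q * (p1 * (q_r - 1) + (1 - p1) * q_r) +
            lam_u * (p1 * (q_w - 1) + (1 - p1) * q_w))
        if M_per_rate == 0.0:
            return float("inf")                  # c = 0: no intercluster traffic at all
        per_link = M_per_rate / (n * (n - 1))    # lambda_t per unit transaction rate
        capacity = min(W / m, 1.0 / E_v)         # lambda_t <= s_t and lambda_t <= 1/E[v]
        return capacity / per_link

    # Halving c roughly doubles the sustainable rate, mirroring the trend of Fig. 3 and Fig. 4.
    for c in (1.0, 0.5, 0.1):
        print(c, max_transaction_rate(c, n=5, N_q=8, N_u=2, q_r=1, q_w=5,
                                      p1=0.8, W=9600.0, m=200.0, E_v=0.2))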
6.2.2 Communication Cost
We estimate the communication cost by the number of messages sent. The number of messages depends on the following parameters of the replication scheme: 1) the consistency factor c; 2) the data distribution, l for hot and l' for cold data; 3) the locality factor o; and 4) the quorums q_r and q_w of the coherency scheme. We assume a ROWA scheme (q_r = 1, q_w = n), if not stated otherwise. As shown in Fig. 5a, the number of messages increases linearly with the consistency factor. As expected, the number of messages decreases with the percentage of transactions that access hot data, since local copies are then more frequently available. To balance the increase in the communication cost caused by diminishing locality, there may be a need to appropriately decrease the consistency factor (Fig. 5b). The number of messages decreases when the replication factor of hot core copies increases (Fig. 5c). The decrease is more evident because most operations are queries and the coherency scheme is ROWA; thus, for most operations no messages are sent. The decrease is more rapid when transactions exhibit locality, that is, when they access hot data more frequently. On the contrary, the number of messages increases with the replication factor of cold core copies because of the additional writes caused by coherency control (Fig. 6a). Finally, the relationship between the read quorum and the number of messages depends on the relative number of queries and updates (Fig. 6b).

Fig. 5. Number of messages. (a) For various values of c. (b) With locality. (c) For different replication of hot core copies. Unless stated otherwise: o = 0.6, l = 0.9, l' = 0.4, h = 0.9, and c = 0.7.

Fig. 6. Number of messages. (a) For different replication of cold core copies. (b) For different values of the read quorum. Unless stated otherwise: o = 0.6, l = 0.9, l' = 0.4, h = 0.9, and c = 0.7.

6.2.3 Transaction Response Time
The response times for weak and strict transactions are depicted in Fig. 7 for various values of c. The larger response times correspond to a 200 bps bandwidth, while the faster response times are the result of higher network availability, set at 2 Mbps. The values of the other input parameters are as indicated in Table 4. The additional parameters are set as follows: 1) the locality parameters are o = 0.9 and h = 0.8; 2) the data replication parameters are l' = 0.2 and l = 0.8; 3) the disconnection parameters are p = 0.1 and vacation intervals exponentially distributed with E[v] = 1/5 sec, to model disconnection intervals that correspond to short involuntary disconnections such as those caused by handoffs [19]; and 4) the coherency control scheme is ROWA.

Fig. 7. Comparison of the response times for weak and strict transactions for various values of the consistency factor.

The latency of strict transactions is about 50 times greater than that of weak transactions. There is, however, a trade-off involved in using weak transactions, since weak updates may be aborted later; the time to propagate updates during reconciliation is not counted here. As c increases, the response time for both weak and strict transactions increases, since more conflicts occur. The increase is more dramatic for smaller values of bandwidth. Fig. 8a and Fig. 8b show the response time distribution, respectively, for strict and weak transactions at 2 Mbps bandwidth. For strict transactions, the most important overhead is due to network transmission. All times increase as c increases. For weak transactions, the increase in the response time is the result of longer waits for acquiring locks, since weak transactions that want to read up-to-date data conflict with strict transactions that write them.

Fig. 8. (a) Response time distribution for strict transactions. (b) Response time distribution for weak transactions.

7 RECONCILIATION COST

We provide an estimation of the cost of restoring consistency in terms of the number of weak transactions that need to be rolled back. We focus on conflicts between strict and weak transactions, for which we have outlined a reconciliation protocol, and do not consider conflicts among weak transactions at different clusters; a similar analysis is applicable to that case as well. A weak transaction WT is rolled back if its writes conflict with a read operation of a strict transaction ST that follows it in the IASG. Let P1 be the probability that a weak transaction WT writes a data item read by a strict transaction ST, and let P2 be the probability that ST follows WT in the serialization graph. Then, P = P1 P2 is the probability that a weak transaction is rolled back.

Assume that reconciliation occurs after N_r transactions, of which k = cN_r are strict and k' = (1 − c)N_r are weak. For simplicity, we assume a uniform access distribution. Although it is reasonable to assume that granule access requests from different transactions are independent, independence cannot hold within a transaction if a transaction's granule accesses are distinct. However, if the probability of accessing any particular granule is small, e.g., when the number of granules is large and the access distribution is uniform, this approximation is very accurate. Then,

P1 ≈ 1 − (1 − N_u/D)^N_q.
Let p_KL be the probability that, in the IASG, there is an edge from a given transaction of type K to a given transaction of type L, where the type of a transaction is either weak or strict. Let p'_KL(m, m') be the probability that, in an IASG with m strict and m' weak transactions, there is an edge from a given transaction of type K to any transaction of type L. The formulas for p_KL and p'_KL(m, m') are given in the Appendix. Let p(m, m', i) be the probability that there is an acyclic path of length i, i.e., a path with i + 1 distinct nodes, from a given weak transaction to a given strict transaction in an IASG with m strict and m' weak transactions. Then,

P2 = 1 − Π_{i=1..k+k'−1} [1 − p(k, k', i)].

The values of p(k, k', i) can be computed from the following recursive relations:

p(m, m', 1) = p_WS          for all m > 0, m' > 0,
p(m, m', 0) = 0             for all m > 0, m' > 0,
p(m, 0, i) = 0              for all m ≥ 0, i > 0,
p(0, m', i) = 0             for all m' ≥ 0, i > 0,
p(k, k', i + 1) = 1 − [1 − p'_WW(k, k') p(k, k' − 1, i)] [1 − ...] [1 − ...],

where the first factor corresponds to a path whose first edge is between weak transactions, the second to a path whose first edge is between a weak and a strict transaction and that includes at least one more weak transaction, and the last to a path whose first edge is between a weak and a strict transaction and that does not include any other weak transactions.

Thus, the actual number of weak transactions that need to be undone or compensated for, because their writes cannot become permanent, is N_abort = P k'. We also need to roll back all exact weak transactions that read a value written by a transaction that is aborted. Let e be the percentage of weak transactions that are exact; then, the number of such transactions is approximately e[1 − (1 − N_u/D)^N_q] k' N_abort.

Fig. 9. Probability of abort for 3,000 data items.

Fig. 10. Probability of abort for 60 transactions.

Fig. 9 depicts the probability that a weak transaction cannot be accepted because of a conflict with a strict transaction, for reconciliation events occurring after a varying number of transactions and for different values of the consistency factor. Fig. 10 shows the same probability for varying database sizes. More accurate estimations can be achieved for specific applications for which the access patterns of transactions are known. These results can be used to determine an appropriate reconciliation point that balances the cost of initiating reconciliations against the number of weak transactions that need to be aborted. For instance, for a given c = 0.5, to keep the probability below a threshold of, say, 0.00003, reconciliation events must take place as often as every 85 transactions (Fig. 9b).
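To show how these estimates would be evaluated in practice, the following Python sketch computes P1, P2, and N_abort. Since the recursive computation of p(k, k', i) depends on the edge-probability formulas of the Appendix, the path probabilities are supplied by a caller-provided function here; the toy function used in the example is purely illustrative and is not the recursion derived above.

    def p1_write_read_conflict(N_u, N_q, D):
        # P1 ~ 1 - (1 - N_u/D)^{N_q}
        return 1.0 - (1.0 - N_u / float(D)) ** N_q

    def p2_strict_follows_weak(k, k_prime, path_prob):
        # P2 = 1 - prod_{i=1..k+k'-1} (1 - p(k, k', i))
        prod = 1.0
        for i in range(1, k + k_prime):
            prod *= 1.0 - path_prob(k, k_prime, i)
        return 1.0 - prod

    def expected_weak_aborts(N_r, c, N_u, N_q, D, path_prob):
        k = int(c * N_r)                 # strict transactions since the last reconciliation
        k_prime = N_r - k                # weak transactions
        P = p1_write_read_conflict(N_u, N_q, D) * p2_strict_follows_weak(k, k_prime, path_prob)
        return P * k_prime               # N_abort = P * k'

    toy_path_prob = lambda k, k_prime, i: 0.01 * (0.5 ** (i - 1))   # placeholder only
    print(expected_weak_aborts(N_r=60, c=0.5, N_u=2, N_q=8, D=3000, path_prob=toy_path_prob))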
8 RELATED WORK

One-copy serializability [3] hides from the user the fact that there can be multiple copies of a data item and ensures strict consistency. Whereas one-copy serializability may be an acceptable criterion for strict transactions, it is too restrictive for applications that tolerate bounded inconsistency, and it also causes unbearable overheads in cases of weak connectivity. The weak transaction model described in this paper was first introduced in [26], while preliminary performance results were presented in [24].

8.1 Network Partitioning
The partitioning of a database into clusters resembles the network partition problem [5], where site or link failures fragment a network of database sites into isolated subnetworks called partitions. Clustering is conceptually different from partitioning in that it is electively done to increase performance. Whereas all partitions are isolated, clusters may be weakly connected. Thus, clients may operate as if disconnected even while remaining physically connected. Strategies for network partitioning face competing goals of availability and correctness similar to those of clustering. These strategies range from optimistic, where any transaction is allowed to be executed in any partition, to pessimistic, where transactions in a partition are restricted by making worst-case assumptions about what transactions at other partitions are doing. Our model offers a hybrid approach: strict transactions may be performed only if one-copy serializability is ensured (in a pessimistic manner), while weak transactions may be performed locally (in an optimistic manner). To merge the updates performed by weak transactions, we adopt a purely syntactic approach.

8.2 Read-Only Transactions
Read-only transactions do not modify the database state; thus, their execution cannot lead to inconsistent database states. In our scheme, read-only transactions with weaker consistency requirements are considered a special case of weak transactions that have no write operations. Two requirements for read-only transactions were introduced in [8]: consistency and currency requirements. Consistency requirements specify the degree of consistency needed by a read-only transaction. In this framework, a read-only transaction may have: a) no consistency requirements; b) weak consistency requirements, if it requires a consistent view (that is, if all integrity constraints that can be fully evaluated with the data read by the transaction must be true); or c) strong consistency requirements, if the schedule of all update transactions together with all other strong consistency queries must be consistent. While our strict read-only transactions always have strong consistency requirements, weak read-only transactions can be tailored to have any of the above degrees, based on the criterion used for IAS correctness. Weak read-only transactions may have: no consistency requirements, if ignored from the IAS schedule; weak consistency, if part of a weakly correct IAS schedule; and strong consistency, if part of a strongly correct schedule. Currency requirements specify which update transactions should be reflected by the data read. In terms of currency requirements, strict read-only transactions read the most up-to-date data item available (i.e., committed). Weak read-only transactions may read older versions of data, depending on the definition of the d-degree.

Epsilon-serializability (ESR) [28], [29] allows temporary and bounded inconsistencies in copies to be seen by queries during the periods among the asynchronous updates of the various copies of a data item. Read-only transactions in this framework are similar to weak read-only transactions with no consistency requirements. ESR bounds inconsistency directly by bounding the number of updates. In [38], a generalization of ESR was proposed for high-level, type-specific operations on abstract data types. In contrast, our approach deals with low-level read and write operations.

In an N-ignorant system, a transaction need not see the results of at most N prior transactions that it would have seen if the execution had been serial [15]. Strict transactions are 0-ignorant, and weak transactions are 0-ignorant of other weak transactions at the same cluster. Weak transactions are ignorant of strict and weak transactions at other clusters.
The techniques for supporting N-ignorance can be incorporated in the proposed model to define d as the ignorance factor N of weak transactions.

8.3 Mobile Database Systems
The effect of mobility on replication schemes is discussed in [2], where the need for the management of cached copies to be tuned according to the available bandwidth and the currency requirements of the applications is stressed. In this respect, d-degree consistency and weak transactions realize both of the above requirements. The restrictive nature of one-copy serializability for mobile applications is also pointed out in [16], and a more relaxed criterion is proposed. This criterion, although sufficient for aggregate data, is not appropriate for general applications and distinguishable data. Furthermore, the criterion does not support any form of adaptability to the current network conditions.

The Bayou system [35], [6], [23] is a platform of replicated, highly available, variable-consistency, mobile databases on which to build collaborative applications. A read-any/write-any weakly consistent replication scheme is employed. Each Bayou database has one distinguished server, the primary, which is responsible for committing writes. The other, secondary, servers tentatively accept writes and propagate them toward the primary. Each server maintains two views of the database: a copy that only reflects committed data and another full copy that also reflects the tentative writes currently known to the server. Applications may choose between committed and tentative data. Tentative data are similar to our quasi data, and committed data are similar to core data. Correctness is defined in terms of session guarantees, rather than serializability as in the proposed model. A session is an abstraction for the sequence of reads and writes of an application. Four types of guarantees can be requested per session: a) read your writes, b) monotonic reads (successive reads reflect a nondecreasing set of writes), c) writes follow reads (writes are propagated after the reads on which they depend), and d) monotonic writes (writes are propagated after writes that logically precede them). To reconcile copies, Bayou adopts an application-based approach, as opposed to the syntactic procedure used here. The detection mechanism is based on dependency checks, and the per-write conflict resolution method is based on client-provided merge procedures [36].

In the two-tier replication scheme [9], replicated data have two versions at mobile nodes: master and tentative versions. A master version records the most recent value received while the site was connected. A tentative version records local updates. There are two types of transactions, somewhat analogous to our weak and strict transactions: tentative and base transactions. A tentative transaction works on local tentative data and produces tentative data. A base transaction works only on master data and produces master data. Base transactions involve only connected sites. Upon reconnection, tentative transactions are reprocessed as base transactions. If they fail to meet some application-specific acceptance criteria, they are aborted and a message is returned to the mobile node. Our scheme extends two-tier replication in that weak connectivity is supported by employing a combination of weak and strict transactions. We have also provided the foundations for the correctness of the proposed scheme, as well as an evaluation of its performance.

8.4 Mobile File Systems
Coda [14] treats disconnections as network partitions and follows an optimistic strategy.
An elaborate reconciliation algorithm is used for merging file updates after the sites are connected to the fixed network. No degrees of consistency are defined, and no transaction support is provided. Isolation-only transactions (IOTs) [17], [18] extend Coda with a new transaction service. IOTs are sequences of file accesses that, unlike traditional transactions, have only the isolation property. IOTs do not guarantee failure atomicity and only conditionally guarantee permanence. IOTs are similar to weak transactions. Methods for refining the consistency semantics of cached files to allow a mobile client to select a mode appropriate for the current networking conditions are discussed in [10]. The proposed techniques are delayed writes, optimistic replication, and failing instead of fetching data in cases of cache misses. The idea of using different kinds of operations to access data is also adopted in [32], [33], where a weak read operation was added to a file service interface. The semantics of the file operations are different in that no weak write is provided and, since there is no transaction support, the correctness criterion is not based on one-copy serializability.

9 SUMMARY

To overcome bandwidth, cost, and latency barriers, clients of mobile information systems switch between connected and disconnected modes of operation. In this paper, we propose a replication scheme appropriate for such operation. Data located at strongly connected sites are grouped in clusters. Bounded inconsistency is defined by requiring mutual consistency among copies located at the same cluster and controlled deviation among copies at different clusters. The database interface is extended with weak operations. Weak operations query local, potentially inconsistent copies and perform tentative updates. The usual operations, called strict in this framework in contradistinction to weak, are also supported. Strict operations access consistent data and perform permanent updates. Clients may operate disconnected by employing only weak operations. To accommodate weak connectivity, a mobile client selects an appropriate combination of weak and strict transactions based on the consistency requirements of its applications and on the prevailing networking conditions. Adjusting the degree of divergence provides additional support for adaptability.

The idea of providing weak operations can be applied to other types of intercluster integrity constraints besides replication. Such constraints can be vertical and horizontal partitions or arithmetic constraints [31]. Another way of defining the semantics of weak operations is by exploiting the semantics of data. In [37], data are fragmented and later merged based on their object semantics.

APPENDIX

A. COMPUTATION OF RESOURCE WAITING TIMES

Processor waiting time. At each cluster, there are the following types of requests. Queries are initiated at a rate of λ_q. From the locally initiated queries, λ_1 = (1 − c)λ_q are weak and are serviced locally with an average service time s_1 = t_q. From the remaining cλ_q strict queries, λ_2 = p_1 cλ_q have service time s_2 = (q_r − 1)t_b + t_q, and the rest, λ_3 = (1 − p_1)cλ_q, have service time s_3 = q_r t_b. Queries are also propagated from other clusters at a rate λ_4 = [p_1(q_r − 1) + (1 − p_1)q_r]cλ_q and have service time s_4 = t_q. Analogous formulas hold for the arrival rates and service times of updates.
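In code, the query-side bookkeeping is a small table of (arrival rate, service time) pairs; a sketch follows, with the update-side classes built analogously using t_u and q_w.

    def query_request_classes(lam_q, c, p1, q_r, t_q, t_b):
        # (arrival rate, mean service time) for the four query classes at a cluster
        weak_local    = ((1.0 - c) * lam_q,                  t_q)
        strict_local  = (p1 * c * lam_q,                     (q_r - 1) * t_b + t_q)
        strict_remote = ((1.0 - p1) * c * lam_q,             q_r * t_b)
        propagated    = ((p1 * (q_r - 1) + (1.0 - p1) * q_r) * c * lam_q, t_q)
        return [weak_local, strict_local, strict_remote, propagated]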
The combined flow of requests forms a Poisson process with arrival rate

λ = Σ_{i=1..8} λ_i.

The service time T of the combined flow is no longer exponentially distributed, but its mean and second moment can be computed from the above rates and service times. Then, the wait time, by the Pollaczek-Khinchin (P-K) formula [4], is

w = λE[T²] / (2(1 − λE[T])).

Note that the above analysis, as well as the following analysis of the network links, is a worst case. In practice, when a locking method is used for concurrency control, a number of transactions are waiting to acquire locks and are not competing for system resources. Thus, the rate of arrival of operations at the resource queues and the waiting time at each queue may be smaller than the values assumed in this section.

Transmission waiting time. We consider a nonexhaustive vacation system where, after the end of each service, the server takes a vacation with probability 1 − p or continues service with probability p. This is called a queueing system with Bernoulli scheduling [34]. In this case,

w_t = [λ_t s^(2) E[v] + (1 − p)(2(1/s_t)E[v] + E[v²])] / (2E[v](1 − ρ − (1 − p)λ_t E[v])),

where s^(2) is the second moment of the service time, ρ = λ_t/s_t, and v is the vacation interval, that is, the duration of a disconnection.
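A sketch of both waiting-time computations follows. The first function applies the P-K formula to the mixture of exponential request classes defined above; the second evaluates the vacation expression for a link, taking the moments of the disconnection interval as inputs.

    def mg1_wait(classes):
        # classes: list of (arrival_rate, mean_service_time); within a class the service
        # time is exponential, so its second moment is 2 * mean^2.
        lam = sum(rate for rate, _ in classes)
        ET  = sum(rate * s for rate, s in classes) / lam
        ET2 = sum(rate * 2.0 * s * s for rate, s in classes) / lam
        rho = lam * ET
        if rho >= 1.0:
            raise ValueError("cluster is saturated")
        return lam * ET2 / (2.0 * (1.0 - rho))      # Pollaczek-Khinchin mean wait

    def link_wait_with_vacations(lam_t, s_t, p, E_v, E_v2):
        # Bernoulli-scheduled vacations, per the expression above; E_v, E_v2 are the
        # first two moments of the disconnection interval.
        E_s, E_s2 = 1.0 / s_t, 2.0 / (s_t * s_t)    # exponential packet service times
        rho = lam_t * E_s
        denom = 2.0 * E_v * (1.0 - rho - (1.0 - p) * lam_t * E_v)
        if denom <= 0.0:
            raise ValueError("link is saturated")
        return (lam_t * E_s2 * E_v + (1.0 - p) * (2.0 * E_s * E_v + E_v2)) / denom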
B. DATA CONTENTION ANALYSIS

From the resource contention analysis, the sum of the execution times, excluding lock waits, is N_q R_wr + N_u R_wu for a weak transaction and N_q R_sr + N_u R_sw for a strict transaction.

We divide the state i of each weak transaction into two substates: a lock state i_1 and an execution state i_2. In substate i_1, the transaction holds i − 1 locks and is waiting for the ith lock. In substate i_2, it holds i locks and is executing. Similarly, we divide each state of a strict transaction into three substates, i_0, i_1, and i_2. Let q_c = (N_q/N)q_r + (N_u/N)q_w. In substate i_0, a transaction is at its initiating cluster, holds (i − 1)q_c locks, and sends messages to other clusters. In substate i_1, the transaction holds (i − 1)q_c locks and is waiting for the ith set of locks. In substate i_2, it holds (i − 1)q_c + q_r (respectively, (i − 1)q_c + q_w) locks and is executing. The probability that a transaction enters substate i_1 upon leaving state i − 1 or i_0 is P_WR, P_WW, P_SR, or P_SW, respectively, for WR, WW, SR, and SW lock requests. The mean time s_op spent at substate i_2 is computed from the resource contention analysis; for instance, s_WR = w + t_q. Let c_op, for a strict op, be the mean time spent at state i_0; for instance, c_SR = w + (1 − p_1)(q_r t_b + t_r) + p_1((q_r − 1)t_b + δ_r t_r). The time spent at state i_1 is R_op, and the unconditional mean time spent in substate i_1 is b_op; for instance, b_WR = P_WR R_WR.

Let d^h_op (d^c_op) be the mean number of hot (cold) copies accessed by an op operation, and u^h_op (u^c_op) the mean number of op operations per copy. Given a mean lock holding time of T_W (T_S) for weak (strict) transactions, and assuming that the lock request times form a Poisson process, the probability of contention on a lock request for a copy equals the lock utilization. Let P_op1/op2 stand for the probability that an op1-lock request conflicts with an op2-lock request; for instance, P_WR/WW = d^h_WW u^h_WW T_W + d^c_WW u^c_WW T_W. P_op1 is then obtained by summing the contributions of the incompatible lock types over hot and cold copies.

Let C_W (C_S) be the sum of the mean lock holding times over all N copies accessed by a weak (strict) transaction, and let t_c be the mean time to commit. Then T_W = C_W/N; similar formulas hold for C_S and T_S.

Let N^W_i (N^S_i) be the mean number of weak (strict) transactions per cluster in substate i, and CP_op1,op2,i the conditional probability that an op1-lock request contends with a transaction in substate i that holds an incompatible op2-lock, given that lock contention occurs. R_op1 can then be approximated by summing, over the substates i and the incompatible lock types op2, the product of CP_op1,op2,i and factors that express the mean remaining times at the corresponding substate and depend on the distribution of times at each substate. Finally, s^W_i (s^S_i) is the mean time for a weak (strict) transaction from acquiring the ith lock till the end of commit.

C. RECONCILIATION

The probabilities of edges in the serialization graphs are given below; for instance,

p'_SS(m, m') = 1 − (1 − p_SS)^(m−1),

where p_c = 1/n² is the probability that two given transactions are initiated at the same cluster.

ACKNOWLEDGMENTS

Bharat Bhargava was supported, in part, by U.S. National Science Foundation Grant No. 9405931.

REFERENCES
[1] R. Alonso, D. Barbara, and H. Garcia-Molina, "Data Caching Issues in an Information Retrieval System," ACM Trans. Database Systems, vol. 15, no. 3, pp. 359-384, Sept. 1990.
[2] D. Barbara and H. Garcia-Molina, "Replicated Data Management in Mobile Environments: Anything New under the Sun?" Proc. IFIP Conf. Applications in Parallel and Distributed Computing, Apr. 1994.
[3] P.A. Bernstein, V. Hadzilacos, and N. Goodman, Concurrency Control and Recovery in Database Systems. Addison-Wesley, 1987.
[4] D. Bertsekas and R. Gallager, Data Networks. Prentice Hall, 1987.
[5] S.B. Davidson, H. Garcia-Molina, and D. Skeen, "Consistency in Partitioned Networks," ACM Computing Surveys, vol. 17, no. 3, pp. 341-370, Sept. 1985.
[6] A. Demers, K. Petersen, M. Spreitzer, D. Terry, M. Theimer, and B. Welch, "The Bayou Architecture: Support for Data Sharing among Mobile Users," Proc. IEEE Workshop Mobile Computing Systems and Applications, pp. 2-7, Dec. 1994.
[7] G.H. Forman and J. Zahorjan, "The Challenges of Mobile Computing," Computer, vol. 27, no. 6, pp. 38-47, June 1994.
[8] H. Garcia-Molina and G. Wiederhold, "Read-Only Transactions in a Distributed Database," ACM Trans. Database Systems, vol. 7, no. 2, pp. 209-234, June 1982.
[9] J. Gray, P. Helland, P. O'Neil, and D. Shasha, "The Dangers of Replication and a Solution," Proc. ACM SIGMOD Conf., pp. 173-182, 1996.
[10] P. Honeyman and L.B. Huston, "Communication and Consistency in Mobile File Systems," IEEE Personal Comm., vol. 2, no. 6, Dec. 1995.
[11] Y. Huang, P. Sistla, and O. Wolfson, "Data Replication for Mobile Computers," Proc. 1994 ACM SIGMOD Conf., pp. 13-24, May 1994.
[12] L.B. Huston and P. Honeyman, "Partially Connected Operation," Computing Systems, vol. 8, no. 4, Fall 1995.
[13] T. Imielinski and B.R. Badrinath, "Wireless Mobile Computing: Challenges in Data Management," Comm. ACM, vol. 37, no. 10, Oct. 1994.
[14] J.J. Kistler and M. Satyanarayanan, "Disconnected Operation in the Coda File System," ACM Trans. Computer Systems, vol. 10, no. 1, pp. 3-25, Feb. 1992.
[15] N. Krishnakumar and A.J. Bernstein, "Bounded Ignorance: A Technique for Increasing Concurrency in a Replicated System," ACM Trans. Database Systems, vol. 19, no. 4, pp. 586-625, Dec. 1994.
[16] N. Krishnakumar and R. Jain, "Protocols for Maintaining Inventory Databases and User Profiles in Mobile Sales Applications," Proc. Mobidata Workshop, Oct. 1994.
[17] Q. Lu and M. Satyanarayanan, "Improving Data Consistency in Mobile Computing Using Isolation-Only Transactions," Proc. Fifth Workshop Hot Topics in Operating Systems, May 1995.
[18] Q. Lu and M. Satyanarayanan, "Resource Conservation in a Mobile Transaction System," IEEE Trans. Computers, vol. 46, no. 3, pp. 299-311, Mar. 1997.
[19] K. Miller, "Cellular Essentials for Wireless Data Transmission," Data Comm., vol. 23, no. 5, pp. 61-67, Mar. 1994.
[20] L.B. Mummert, M.R. Ebling, and M. Satyanarayanan, "Exploiting Weak Connectivity for Mobile File Access," Proc. 15th ACM Symp. Operating Systems Principles, Dec. 1995.
[21] B.D. Noble, M. Price, and M. Satyanarayanan, "A Programming Interface for Application-Aware Adaptation in Mobile Computing," Computing Systems, vol. 8, no. 4, Winter 1996.
[22] C. Papadimitriou, The Theory of Database Concurrency Control. Computer Science Press, 1986.
[23] K. Petersen, M. Spreitzer, D.B. Terry, M. Theimer, and A.J. Demers, "Flexible Update Propagation for Weakly Consistent Replication," Proc. 16th ACM Symp. Operating Systems Principles, pp. 288-301, 1997.
[24] E. Pitoura, "A Replication Schema to Support Weak Connectivity in Mobile Information Systems," Proc. Seventh Int'l Conf. Database and Expert Systems Applications (DEXA '96), pp. 510-520, Sept. 1996.
[25] E. Pitoura and B. Bhargava, "Building Information Systems for Mobile Environments," Proc. Third Int'l Conf. Information and Knowledge Management, pp. 371-378, Nov. 1994.
[26] E. Pitoura and B. Bhargava, "Maintaining Consistency of Data in Mobile Distributed Environments," Proc. 15th IEEE Int'l Conf. Distributed Computing Systems, pp. 404-413, May 1995.
[27] E. Pitoura and G. Samaras, Data Management for Mobile Computing. Kluwer, 1998.
[28] C. Pu and A. Leff, "Replica Control in Distributed Systems: An Asynchronous Approach," Proc. ACM SIGMOD Conf., pp. 377-386, 1991.
[29] K. Ramamritham and C. Pu, "A Formal Characterization of Epsilon Serializability," IEEE Trans. Knowledge and Data Eng., vol. 7, no. 6, pp. 997-1007, 1995.
[30] M. Satyanarayanan, J.J. Kistler, L.B. Mummert, M.R. Ebling, P. Kumar, and Q. Lu, "Experience with Disconnected Operation in a Mobile Computing Environment," Proc. 1993 Usenix Symp. Mobile and Location-Independent Computing, Aug. 1993.
[31] A. Sheth and M. Rusinkiewicz, "Management of Interdependent Data: Specifying Dependency and Consistency Requirements," Proc. Workshop Management of Replicated Data, pp. 133-136, Nov. 1990.
[32] C.D. Tait and D. Duchamp, "Service Interface and Replica Management Algorithm for Mobile File System Clients," Proc. First Int'l Conf. Parallel and Distributed Information Systems, pp. 190-197, 1991.
[33] C.D. Tait and D. Duchamp, "An Efficient Variable-Consistency Replicated File Service," Proc. Usenix File Systems Workshop, pp. 111-126, May 1992.
[34] H. Takagi, Queueing Analysis, Vol. 1: Vacation and Priority Systems. North-Holland, 1991.
[35] D. Terry, A. Demers, K. Petersen, M. Spreitzer, M. Theimer, and B. Welch, "Session Guarantees for Weakly Consistent Replicated Data," Proc. Int'l Conf. Parallel and Distributed Information Systems, pp. 140-149, Sept. 1994.
[36] D.B. Terry, M.M. Theimer, K. Petersen, A.J. Demers, M.J. Spreitzer, and C.H. Hauser, "Managing Update Conflicts in Bayou, a Weakly Connected Replicated Storage System," Proc. 15th ACM Symp. Operating Systems Principles, Dec. 1995.
Chrysanthis, "Supporting Scrnentics-Bmcd Transaction Processing in Mobile D n t a h s e Applications," Proc. 14th Symp. Reliable Disfribirfed Systcins, Scpt. 1995. M.H. Wung and D. Agmwal, "Tolerating Bounded Inconsistency for Increasing Coiicurrency in Database Systcms," Proc. l l t k ACM Symp. l'rinciples of Dntnbnsc S!/stems (PODS), pp. 236-245, 1992. P.S Yu. I1.M. Dias. and S.S Lavcnbere. "On the Analvtical Modeling of Database Concurrency Cok&" I. ACM vd. 40, no. 4, pp. 831-872, Sept. 1993. 915 Evaggelia Pitoura received her diploma from the Department of Computer Science and Engineering of the University of Patras, Greece, in 1990; and her MSc and PhD degrees in computer science from Purdue University in 1993 and 1995, respectively. Since September 1995, she has been with the Department of Computer Science at the University of loannina, Greece. Her research interests include database issues in mobile computing, heterogeneous databases, and distributed systems. Her publications in the above areas include several journal and conference proceedings alticles, plus a recently published book on mobile computing. She is a member of the IEEE Computer Society Bharat Bhargava is currently a professor in the Department of Computer Science at Purdue University. His research involves both theoretical and experimental studies in distributed systems. His research group has implemented a robust and adaptable distributed database system, called RAID, to conduct experiments in replication control, checkpointing, and communications. He has conducted experiments in large-scale distributed systems, communicain implementing object support on top of the relational model. He has recently developed an adaptable video conferencing system using the NV system from Xerox PARC. He is conducting experiments with research issues in large-scale communication networks to support emerging applications such as digital libraries and multimedia databases. He chaired the IEEE Symposium on Reliable and Distributed Systems that was held at Purdue in 1998. and he is on the editorial board of three international journals. At the 1988 IEEE Data Engineering Conference, he and John Riedl received the best paper award for their work on "A Model lor Adaptable Systems for Transaction Processing." He is a fellow of the IEEE and Institute of Electronics and Telecommunication Engineers. He has been awarded the Gold Core member distinction by the IEEE Computer Society for his distinguished service. He received the Outstanding instructor award from the Purdue chapter of the ACM.