Consistency and Replication
Ömer Faruk SARAÇ
115112005

Outline
• Introduction
  – Reasons for Replication
  – Replication as Scaling Technique
• Data-Centric Consistency Models
  – Continuous Consistency
  – Consistent Ordering of Operations
• Client-Centric Consistency Models
  – Eventual Consistency
  – Monotonic Reads
  – Monotonic Writes
  – Read Your Writes
  – Writes Follow Reads
• Replica Management
  – Replica Server Placement
  – Content Replication and Placement
  – Content Distribution
• Consistency Protocols

Reasons for Replication
• Enhance reliability
• Improve performance
  – Scaling in numbers
  – Scaling in geographical area
  – Caching

Replication as Scaling Technique
• Placing replicas (data) close to clients
• Network bandwidth issue
• Cache
• How to keep replicas consistent?
• Loosen constraints

Data-Centric Consistency
• Data store
  – Shared data
  – Shared memory
  – Shared database
  – Distributed file system
• Contracts between processes (clients) and the data store (replicas)

Continuous Consistency
• Inconsistencies
  – Deviation in numerical values
  – Deviation in staleness
  – Deviation with respect to ordering of updates
• Conit
  – Consistency unit
  – Vector clock representation
• Granularity of a conit
  – Too small: hard for the system to manage
  – Too big: bundles unrelated data
• Libraries for applications

Consistent Ordering of Operations
• Sequential Consistency
  – The result of any execution is the same as if the (read and write) operations by all processes on the data store were executed in some sequential order and the operations of
    each individual process appear in this sequence in the order specified by its program.
• Valid execution sequences
• Output signature

Consistent Ordering of Operations
• Causal Consistency
  – Writes that are potentially causally related must be seen by all processes in the same order. Concurrent writes may be seen in a different order on different machines.
• Dependency graph
• Weaker than sequential consistency

Consistent Ordering of Operations
• Grouping Operations
  – Hardware based
  – Shared-memory multiprocessor systems
  – Synchronization parameters
  – Critical section
  – Acquire-release sync variable
• Entry consistency
  – Associate a lock with each data item

Client-Centric Consistency
• Special class of distributed systems
• Simultaneous updates are absent or easily resolved
• Weak consistency models
• Many inconsistencies can be hidden in a relatively cheap way

Eventual Consistency
• Few processes perform updates
• Most processes only read data from the data store
• Examples
  – DNS
  – Web cache servers
• Eventually all replicas will be consistent
• Mobile clients issue

Monotonic Reads
• If a process reads the value of a data item x, then any successive read operation on x by that process will always return that same value or a more recent value.
• But no guarantees on concurrent access by different clients.
• Mailbox example

Monotonic Writes
• A write operation by a process on a data item x is completed before any successive write operation on x by the same process.
• Resembles data-centric FIFO consistency
  – Correct order of write operations
• Software library example

Read Your Writes
• The effect of a write operation by a process on data item x will always be seen by a successive read operation on x by the same process.
• Examples
  – Web page caches
  – Password management

Writes Follow Reads
• A write operation by a process on a data item x following a previous read operation on x by the same process is guaranteed to take place on the same or a more recent value of x than the one that was read.
• Network newsgroup example

Replica Management
• How, where, when
• Placement, activation, deployment, migration
• Server placement
  – Best place to locate a server (data store)
• Content placement
  – Best replica server on which to place the data

Replica Server Placement
• Simple issue: money
• Choose the best K out of N possible locations, where K < N
  – Network latency
  – Bandwidth
  – Physical distance
  – Ignore client location, use network topology
• Defining regions
  – Dense cells
  – Cell: a two-dimensional rectangle
  – Cells too small: a cluster is split across cells, giving too many replicas
  – Cells too large: several clusters share one cell, giving too few replicas

Content Replication and Placement
• Permanent replicas
  – Generally in small numbers
  – Initial set of replicas
  – Distribution of web site example
  – Mirror servers
  – Forward requests
    to one of the servers, round-robin
  – Distributed database servers on different machines

Content Replication and Placement
• Server-initiated replicas
  – Dynamic replication
  – Monitor incoming requests
  – Install a number of temporary replicas
  – Web hosting services example
  – Where to put which content
  – Statistical count of incoming requests for a given resource
  – Backup, consistency issues

Content Replication and Placement
• Client-initiated replicas
  – Client cache
  – Local storage facility
  – Managing the cache is left entirely to the client, in principle
  – Improve access times to data
  – Data is kept for a limited amount of time
  – Placement: local cache or cache server
  – Really needed? File servers, improvements in network and system resources

Content Distribution
• State versus Operations
  – Propagate only a notification of an update: low bandwidth, effective
  – Transfer data from one copy to another: whole data, logs, log packages
  – Propagate the update operation to other copies: no data, but processing time and complexity

Content Distribution
• Pull versus Push Protocols
  – Push: server based, server initiated
  – Pull: client based, client sends a request
  – Hybrid model: leases

Content Distribution
• Unicasting versus Multicasting
  – Unicasting: send a separate message to every client
  – Multicasting: send one message to the entire system
  – Unicasting fits pull-based protocols
  – Multicasting fits push-based protocols

Consistency Protocols
• Continuous Consistency
• Primary-based Protocols
• Replicated-write Protocols
• Cache-coherence Protocols
• Implementing Client-centric Consistency
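
Example: implementing client-centric consistency (sketch)
The slides list "Implementing Client-centric Consistency" without detail. As an illustration only, the Monotonic Reads guarantee defined earlier could be enforced on the client side roughly as below. This is a minimal sketch, not a protocol taken from the slides: the names Replica and Session, the integer version numbers, and the policy of skipping stale replicas are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical copy of the data store: item name -> (version, value)."""
    name: str
    items: dict = field(default_factory=dict)

    def write(self, key, version, value):
        # Keep only the most recent version of each item.
        current_version, _ = self.items.get(key, (0, None))
        if version > current_version:
            self.items[key] = (version, value)

    def read(self, key):
        # Unknown items are reported as version 0 with no value.
        return self.items.get(key, (0, None))


@dataclass
class Session:
    """Hypothetical client-side state: newest version read so far per item."""
    last_seen: dict = field(default_factory=dict)

    def read(self, key, replicas):
        # Try replicas in the given order and skip any replica that is
        # staler than what this client has already observed.
        for replica in replicas:
            version, value = replica.read(key)
            if version >= self.last_seen.get(key, 0):
                self.last_seen[key] = version
                return value, replica.name
        raise RuntimeError("no replica is recent enough for this session")
```

For instance, if replica r2 already holds version 2 of x while r1 still holds version 1, a session that has read x from r2 will skip r1 on its next read and be served by r2 again, so the client never sees an older value. Different sessions keep separate state, which matches the slide's remark that nothing is guaranteed across different clients.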