COMP 655: Distributed/Operating Systems
Summer 2011
Dr. Chunbo Chu
Week 7: Consistency
4/13/2015 Distributed Systems - COMP 655

Consistency and Replication
• The problems we are trying to solve
• Types of consistency
• Approaches to propagation

Transparency in a Distributed System
Transparency   Description
Access         Hide differences in data representation and how a resource is accessed
Location       Hide where a resource is located
Migration      Hide that a resource may move to another location
Relocation     Hide that a resource may be moved to another location while in use
Replication    Hide that a resource is replicated
Concurrency    Hide that a resource may be shared by several competitive users
Failure        Hide the failure and recovery of a resource

What problems does replication solve?
• Some capacity and performance problems
  – Keep replicas on both sides of a bottleneck
  – Keep replicas on both sides of a connection with long delays
• Two kinds of transparency:
  – Replication provides some location transparency
  – Replication provides some failure transparency (aka fault tolerance)
    Continue to work if one copy goes down
    Continue to work if the network goes down

What problems does replication cause?
• Consistency
  – To maintain concurrency transparency, the system has to keep replicas updated
• Complexity
  – To maintain replication transparency, the system has to be able to locate and select appropriate replicas
• Overhead can take back capacity and performance gains

If you remember only two things …
1. There are many types of consistency, known as “consistency models”
2. As the consistency model gets stronger:
   • The system gets easier to use.
   • The system gets harder to implement.
   • The system gets slower and consumes more resources.

Consistency and Replication
• The problems we are trying to solve
• Types of consistency
• Approaches to propagation

Consistency Models
• A consistency model is a contract between the software and the memory
  – it states that the memory will work correctly, but only if the software obeys certain rules
• The issue is how we can state rules that are not too restrictive but allow fast execution in most common cases
• These models represent a more general view of sharing data than what we have seen so far!
Conventions we will use:
• W(x)a means “a write to x with value a”
• R(y)b means “a read from y that returned value b”

Data-centric Consistency Models
The general organization of a logical data store, physically distributed and replicated across multiple processes. Each process interacts with its local copy, which must be kept ‘consistent’ with the other copies.

Types of (data-centric) consistency

Strict Consistency
• Strict consistency is the strictest model
  – a read returns the most recently written value (changes are instantaneous)
  – not well-defined unless the execution of commands is serialized centrally
  – otherwise the effects of a slow write may not have propagated to the site of the read
  – this is what uniprocessors support: a = 1; a = 2; print(a); always produces “2”
  – to exercise our notation:
    P1: W(x)1
    P2: R(x)0 R(x)1
  – is this strictly consistent?

Strict consistency
Yes:
P1: D=H
P2:      D?H
No:
P1: D=H
P2:      D?L  D?H
D: DVD players, H: high, L: low, set: =, get: ?
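
The strict-consistency question above can be checked mechanically if we pretend every operation carries a timestamp from a perfect global clock. That clock is exactly the idealization strict consistency relies on, and it is an assumption of this sketch, as are the names `Op` and `is_strictly_consistent` and the initial value 0.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    time: int      # position on an idealized global clock (assumption)
    process: str   # e.g. "P1"
    kind: str      # "W" for write, "R" for read
    item: str      # data item name, e.g. "x"
    value: object  # value written, or value the read returned


def is_strictly_consistent(history, initial=0):
    """Return True if every read returns the most recent write in global time."""
    latest = {}  # item -> value of the most recent write so far
    for op in sorted(history, key=lambda o: o.time):
        if op.kind == "W":
            latest[op.item] = op.value
        elif op.value != latest.get(op.item, initial):
            return False  # the read returned something other than the latest write
    return True


# P1: W(x)1 at t=1; P2: R(x)0 at t=2 and R(x)1 at t=3 (timestamps are illustrative).
stale = [Op(1, "P1", "W", "x", 1), Op(2, "P2", "R", "x", 0), Op(3, "P2", "R", "x", 1)]
# Same operations, but R(x)0 happens before the write.
fresh = [Op(0, "P2", "R", "x", 0), Op(1, "P1", "W", "x", 1), Op(3, "P2", "R", "x", 1)]

print(is_strictly_consistent(stale))  # False
print(is_strictly_consistent(fresh))  # True
```

Whether the slide's P1/P2 example is strictly consistent therefore depends on when R(x)0 happens relative to W(x)1, which is why the model needs a global notion of time.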

Sequential Consistency
• Sequential consistency (serializability): the results are the same as if operations from different processors are interleaved, but operations of a single processor appear in the order specified by the program
• Example of sequentially consistent execution:
  P1: W(x)1
  P2: R(x)0 R(x)1
• Sequential consistency is inefficient: we want to weaken the model further

Sequential consistency 1
P1: D=H
P2: D=L
P3: D?L D?L
P4: D?L D?L
D: DVD players, H: high, L: low, set: =, get: ?

Sequential consistency 2
P1: D=H
P2: D=L
P3: D?L D?H
P4: D?L D?H
D: DVD players, H: high, L: low, set: =, get: ?

Sequential consistency - not
P1: D=H
P2: D=L
P3: D?L D?H
P4: D?H D?L
D: DVD players, H: high, L: low, set: =, get: ?

Causal Consistency
• Causal consistency: writes that are potentially causally related must be seen by all processors in the same order. Concurrent writes may be seen in a different order on different machines
  – causally related writes: the write comes after a read that returned the value of the other write
• Examples (which one is causally consistent, if any?)
  P1: W(x)1 W(x)3
  P2: R(x)1 W(x)2
  P3: R(x)1 R(x)3 R(x)2
  P4: R(x)1 R(x)2 R(x)3

  P1: W(x)1
  P2: R(x)1 W(x)2
  P3: R(x)2 R(x)1
  P4: R(x)1 R(x)2
• Implementation needs to keep dependencies

Causal consistency
P1: D=L      D=M
P2:     D?L  D=H
P3: D?L D?M D?H
P4: D?L D?H D?M
Potential causal relationships: D=L and D=M; D=L and D=H
No causal relationship: D=M and D=H
Not sequentially consistent
D: DVD players, H: high, M: medium, L: low, set: =, get: ?

Causal consistency - not
P1: D=L
P2: D?L D=H
P3: D?L D?H
P4: D?H D?L
Potential causal relationship: D=L and D=H
D: DVD players, H: high, L: low, set: =, get: ?
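
The causal-consistency examples can be tested with a small script. This is a simplified sketch, not a full checker: it assumes a single data item, unique written values, and the per-process reading of the flattened examples shown in the data below. A write is treated as causally after every value its process has already read or written, and the only violation it detects is a process reading a causally later value before a causally earlier one. The function names are illustrative.

```python
from itertools import product


def causal_order(history):
    """Return the set of (earlier_value, later_value) pairs of causally related writes.

    history maps a process name to its ordered list of (kind, value) operations.
    """
    before = set()
    for ops in history.values():
        seen = []  # values this process has read or written so far
        for kind, value in ops:
            if kind == "W":
                before.update((earlier, value) for earlier in seen if earlier != value)
            if value not in seen:
                seen.append(value)
    # Transitive closure: a -> b and b -> c imply a -> c.
    changed = True
    while changed:
        changed = False
        for (a, b), (c, d) in product(list(before), repeat=2):
            if b == c and a != d and (a, d) not in before:
                before.add((a, d))
                changed = True
    return before


def is_causally_consistent(history):
    """Return False if some process reads causally ordered writes in reverse order."""
    before = causal_order(history)
    for ops in history.values():
        reads = [value for kind, value in ops if kind == "R"]
        for i, seen_first in enumerate(reads):
            for seen_next in reads[i + 1:]:
                if (seen_next, seen_first) in before:
                    return False
    return True


# First W/R example: W(x)2 and W(x)3 are concurrent, so P3 and P4 may disagree.
wr_example = {
    "P1": [("W", 1), ("W", 3)],
    "P2": [("R", 1), ("W", 2)],
    "P3": [("R", 1), ("R", 3), ("R", 2)],
    "P4": [("R", 1), ("R", 2), ("R", 3)],
}
# "Causal consistency - not": P2 read L before writing H, yet P4 sees H first.
dvd_not = {
    "P1": [("W", "L")],
    "P2": [("R", "L"), ("W", "H")],
    "P3": [("R", "L"), ("R", "H")],
    "P4": [("R", "H"), ("R", "L")],
}

print(is_causally_consistent(wr_example))  # True
print(is_causally_consistent(dvd_not))     # False
```

Keeping the `before` relation is a toy version of the dependency tracking that the "Implementation needs to keep dependencies" bullet refers to.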

Causal consistency - ok
P1: D=L
P2: D=M
P3: D?L D?M
P4: D?M D?L
No causal relationship: D=L and D=M
D: DVD players, M: medium, L: low, set: =, get: ?

Exercise
Consider the following sequence of operations:
P1: W(x)1 W(x)3
P2: W(x)2
P3: R(x)3 R(x)2
P4: R(x)2 R(x)3
Is this execution causally consistent? Add or modify an event to change the answer.

Types of (data-centric) consistency

Client-centric consistency
A mobile user may access different replicas of a distributed database at different times. This type of behavior implies the need for a view of consistency that provides guarantees for a single client regarding accesses to the data store.

Client-centric consistency models
Model                 The idea
Monotonic reads       Each read by a process returns the same value as the previous read, or a more recent value
Monotonic writes      Each write by a process must complete before the next write of the data item by the process
Read your writes      Each write by a process will be visible in any subsequent read by that process
Writes follow reads   Each write by a process after a read will be done at all replicas on a value that is at least as recent as the value read

Monotonic Reads
L1 and L2 are two locations. In both examples, the process moves from L1 to L2.
A data store provides monotonic read consistency if, when a process reads the value of a data item x, any successive read operations on x by that process will always return the same value or a more recent value.
Example error: successive accesses to email have ‘disappearing messages’
a) A monotonic-read consistent data store: the earlier write has been propagated to L2
b) A data store that does not provide monotonic reads: no propagation guarantees
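
One common way to enforce monotonic reads is to have the client remember the newest version it has read and refuse answers from replicas that are behind. The sketch below is only an illustration under stated assumptions: per-item version numbers, and the hypothetical names `Replica`, `Session`, and `StaleReplicaError`. The slides do not prescribe any of them.

```python
class Replica:
    """One copy of the data store; each item carries a version number (assumption)."""

    def __init__(self, name):
        self.name = name
        self.data = {}  # item -> (version, value)

    def write(self, item, version, value):
        self.data[item] = (version, value)

    def read(self, item):
        return self.data.get(item, (0, None))


class StaleReplicaError(Exception):
    """Raised when a replica cannot satisfy the session's monotonic-read guarantee."""


class Session:
    """Client-side session that remembers the newest version it has read per item."""

    def __init__(self):
        self.last_read = {}  # item -> highest version returned so far

    def read(self, replica, item):
        version, value = replica.read(item)
        if version < self.last_read.get(item, 0):
            raise StaleReplicaError(f"{replica.name} is behind for {item!r}")
        self.last_read[item] = version
        return value


l1, l2 = Replica("L1"), Replica("L2")
l1.write("inbox", 1, ["msg1"])
l1.write("inbox", 2, ["msg1", "msg2"])
l2.write("inbox", 1, ["msg1"])  # L2 has only received the first update

session = Session()
print(session.read(l1, "inbox"))  # ['msg1', 'msg2']
try:
    session.read(l2, "inbox")     # would make msg2 "disappear"
except StaleReplicaError as err:
    print("rejected:", err)
```

Rejecting the read is the simplest policy. A real store might instead wait for L2 to catch up or redirect the client to an up-to-date replica.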

Monotonic Writes
In both examples, the process performs a write at L1, moves, and performs a write at L2.
A write operation by a process on a data item x is completed before any successive write operation on x by the same process. Implies a copy must be up to date before performing a write on it.
Example error: Library updated in wrong order.
a) A monotonic-write consistent data store.
b) A data store that does not provide monotonic-write consistency.

Read Your Writes
In both examples, the process performs a write at L1, moves, and performs a read at L2.
The effect of a write operation by a process on data item x will always be seen by a successive read operation on x by the same process.
Example error: deleted email messages re-appear.
(a) A data store that provides read-your-writes consistency.
(b) A data store that does not.

Writes Follow Reads
In both examples, the process performs a read at L1, moves, and performs a write at L2.
A write operation by a process on a data item x following a previous read operation on x by the same process is guaranteed to take place on the same or a more recent value of x that was read.
Example error: Newsgroup displays responses to articles before original article has propagated there
(a) A writes-follow-reads consistent data store
(b) A data store that does not provide writes-follow-reads consistency

Consistency and Replication
• The problems we are trying to solve
• Types of consistency
• Approaches to propagation

Types of replicas
Any experience with server-initiated replicas?

What to propagate?
• Notifications
• Updated data
• Update operations

Who initiates propagation?
• Server (push-based protocol)
• Client (pull-based protocol)
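
The difference between the two protocols comes down to which side triggers the transfer. The following is a minimal in-process sketch, assuming the server propagates updated data (rather than notifications or update operations) and using hypothetical class names. A real system would exchange these messages over a network.

```python
class Server:
    """Holds the primary copy and a version counter."""

    def __init__(self):
        self.version = 0
        self.value = None
        self.subscribers = []  # replicas that receive pushed updates

    def subscribe(self, replica):
        self.subscribers.append(replica)

    def update(self, value):
        self.version += 1
        self.value = value
        for replica in self.subscribers:  # push: the server initiates propagation
            replica.receive(self.version, self.value)


class PushReplica:
    """Passive copy that is kept current by the server."""

    def __init__(self):
        self.version, self.value = 0, None

    def receive(self, version, value):
        self.version, self.value = version, value

    def read(self):
        return self.value


class PullReplica:
    """Copy that asks the server for changes only when it is read."""

    def __init__(self, server):
        self.server = server
        self.version, self.value = 0, None

    def read(self):
        # pull: the client side initiates propagation at read time
        if self.server.version > self.version:
            self.version, self.value = self.server.version, self.server.value
        return self.value


server = Server()
pushed = PushReplica()
server.subscribe(pushed)
pulled = PullReplica(server)

server.update("v1")
print(pushed.read())  # v1 (already delivered by the server)
print(pulled.read())  # v1 (fetched on demand)
```

Push keeps replicas fresh at the cost of work on every update, and the server must track its subscribers. Pull does work only when a replica is read, so its copy can sit stale between reads.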