Causal Consistency Without Dependency Check Messages
Willy Zwaenepoel

INTRODUCTION

Geo-replicated data stores
• Geo-replicated data centers
• Full replication between data centers
• Data in each data center partitioned
[Figure: five geo-replicated data centers]

The Consistency Dilemma
Strong Consistency
• synchronous replication
• all replicas share the same consistent view
• sacrifice availability
Causal Consistency
• asynchronous replication
• all replicas eventually converge
• sacrifice consistency, but …
• … replication respects causality
Eventual Consistency
• asynchronous replication
• all replicas eventually converge
• sacrifice consistency

The cost of causal consistency
[Figure: throughput (Kops/sec, 0-700) vs. number of partitions (1-6) for eventual and causal consistency]

Can we close the throughput gap?
• The answer is: yes, but there is a price

STATE OF THE ART: WHY THE GAP?

How is causality enforced?
• Each update has associated dependencies
• What are dependencies?
– metadata to establish causality relations between operations
– used only for data replication
• Internal dependencies
– previous updates of the same client session
• External dependencies
– read updates of other client sessions

Internal dependencies
[Figure: Alice at the US data center, Bob at the Europe data center; example of 2 users performing operations at different data centers on the same partition]

External dependencies
[Figure: Alice, Bob, and Charlie at the US and Europe data centers; example of 3 users performing operations at data centers on the same partition]

How dependencies are tracked & checked
• In current implementations
– COPS [SOSP ’11], ChainReaction [Eurosys ’13], Eiger [NSDI ’13], Orbe [SOCC ’13]
• DepCheck(A) – “Do you have A installed yet?”
[Figure: a client issues Read(B); partitions 0 … N of the US and Europe data centers exchange DepCheck(A) and DepCheck(B) messages before the replicated updates become visible]
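To make this workflow concrete, here is a minimal Go sketch of conventional dependency tracking and checking. It is not the actual COPS, ChainReaction, Eiger, or Orbe code: the Dep, Update, and Partition types, the DepCheck method, and the key-to-partition mapping are illustrative assumptions. It shows the receiving data center holding back a replicated update until every listed dependency is confirmed installed at the partition that owns it.

```go
package main

import "fmt"

// Dep identifies a causally preceding update: a key and its version.
type Dep struct {
	Key     string
	Version uint64
}

// Update is what the origin data center ships to a remote replica.
type Update struct {
	Key     string
	Value   string
	Version uint64
	Deps    []Dep // internal + external dependencies gathered by the client session
}

// Partition is one shard of the receiving data center.
type Partition struct {
	installed map[Dep]bool
}

// DepCheck answers "do you have this update installed yet?".
// In a real system this is a remote message to another partition.
func (p *Partition) DepCheck(d Dep) bool {
	return p.installed[d]
}

// applyReplicated makes a replicated update visible only after every
// dependency is confirmed by the partition that owns it: one dependency
// check message per dependency.
func applyReplicated(u Update, local *Partition, all []*Partition) {
	for _, d := range u.Deps {
		owner := all[partitionFor(d.Key, len(all))]
		for !owner.DepCheck(d) {
			// A real system would block or retry until the dependency
			// has been installed; the loop is only illustrative.
		}
	}
	local.installed[Dep{Key: u.Key, Version: u.Version}] = true
	fmt.Printf("update %s@%d visible after %d dependency check(s)\n",
		u.Key, u.Version, len(u.Deps))
}

// partitionFor stands in for the data store's key-to-partition mapping.
func partitionFor(key string, n int) int {
	h := 0
	for _, c := range key {
		h = (h*31 + int(c)) % n
	}
	return h
}

func main() {
	all := []*Partition{
		{installed: map[Dep]bool{}},
		{installed: map[Dep]bool{}},
	}
	// Dependency A has already been installed at its partition.
	a := Dep{Key: "A", Version: 1}
	all[partitionFor("A", len(all))].installed[a] = true

	// Update B depends on A; the DepCheck succeeds, so B becomes visible.
	b := Update{Key: "B", Value: "v", Version: 1, Deps: []Dep{a}}
	applyReplicated(b, all[partitionFor("B", len(all))], all)
}
```

One DepCheck message is needed per dependency attached to an update, which is exactly the per-update messaging cost that the following slides quantify.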
Encoding of dependencies
• COPS [SOSP ’11], ChainReaction [Eurosys ’13], Eiger [NSDI ’13]
– “direct” dependencies
– Worst case: O( reads before a write )
• Orbe [SOCC ’13]
– dependency matrix
– Worst case: O( partitions )

The main issues
• Metadata size is considerable
– for both storage and communication
• Remote dependency checks are expensive
– multiple partitions are queried for each update

The cost of dependency check messages
[Figure: throughput (Kops/sec, 0-700) vs. number of partitions (1-6) for eventual consistency, causal consistency, and causal consistency without dependency check messages]

CAUSAL CONSISTENCY WITH 0/1 DEPENDENCY CHECK MESSAGES

Getting rid of external dependencies
• Partitions serve only fully replicated updates
– Replication Confirmation messages are broadcast periodically
• External dependencies are removed
– replication information implies dependency installation
• Internal dependencies are minimized
– we only track the previous write
– requires at most one remote check: zero if the write is local, one if it is remote

The new replication workflow
[Figure: Alice and Bob operating at the US, Europe, and Asia data centers; Replication Confirmation messages are broadcast periodically; example of 2 users performing operations at different data centers on the same partition]

Reading your own writes
• Clients need not wait for the replication confirmation
– they can see their own updates immediately
– other clients’ updates are visible once they are fully replicated
• Multiple logical update spaces
– Alice’s update space (visible to Alice)
– Bob’s update space (visible to Bob)
– …
– Replication update space (not yet visible)
– Global update space (fully visible)

The cost of causal consistency
[Figure: throughput (Kops/sec, 0-700) vs. number of partitions (1-6) for eventual consistency, 01-msg causal consistency, and conventional causal consistency]

The price paid: update visibility increased
• With the new implementation:
– ~ max( network latency from origin to furthest replica + network latency from furthest replica to destination + replication confirmation broadcast interval )
• With the conventional implementation:
– ~ network latency from origin to destination

CAUSAL CONSISTENCY WITHOUT DEPENDENCY CHECK MESSAGES ?!

Is it possible?
• Only make an update visible when one can locally determine that no causally preceding update will become visible later at another partition

How to do that?
• Encode causality by means of a Lamport clock
• Each partition maintains its own Lamport clock
• Each update is timestamped with the Lamport clock
• Update visible once update.timestamp ≤ min( Lamport clocks )
• Periodically compute the minimum (see the first sketch at the end of this section)

0-msg causal consistency: throughput
[Figure: throughput (Kops/sec, 0-900) vs. number of partitions (1-16) for eventual consistency, 0-msg causal consistency, and conventional causal consistency]

The price paid: update visibility increased
• With the new implementation:
– ~ max( network latency from origin to furthest replica + network latency from furthest replica to destination + interval of the minimum computation )
• With the conventional implementation:
– ~ network latency from origin to destination

How to deal with a “stagnant” Lamport clock?
• The Lamport clock stagnates if a partition receives no updates
• Combine
– the Lamport clock
– a loosely synchronized physical clock
– (easy to do; see the second sketch at the end of this section)

More on loosely synchronized physical clocks
• Periodically broadcast the clock
• Reduces update visibility latency to
– network latency from furthest replica to destination + maximum clock skew + clock broadcast interval
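To make the visibility rule from the "How to do that?" slide concrete, here is a minimal Go sketch. The type and function names (TimestampedUpdate, receiveReplicated, makeVisible) are assumptions for illustration, not the system's actual code: each partition keeps a Lamport clock, replicated updates are buffered, and a periodically computed minimum over the partitions' clocks decides which buffered updates may become visible, with no dependency check messages.

```go
package main

import "fmt"

// TimestampedUpdate carries a Lamport timestamp assigned at its origin partition.
type TimestampedUpdate struct {
	Key       string
	Value     string
	Timestamp uint64
}

// Partition holds its Lamport clock plus buffered (pending) and visible updates.
type Partition struct {
	clock   uint64
	pending []TimestampedUpdate // replicated but not yet visible
	visible []TimestampedUpdate
}

// localWrite timestamps a local update and advances the Lamport clock.
func (p *Partition) localWrite(key, value string) TimestampedUpdate {
	p.clock++
	u := TimestampedUpdate{Key: key, Value: value, Timestamp: p.clock}
	p.pending = append(p.pending, u)
	return u
}

// receiveReplicated buffers a remote update and moves the Lamport clock
// past its timestamp, as Lamport clocks require on message receipt.
func (p *Partition) receiveReplicated(u TimestampedUpdate) {
	if u.Timestamp > p.clock {
		p.clock = u.Timestamp
	}
	p.pending = append(p.pending, u)
}

// minClock is the periodic minimum computation over all partitions' clocks.
func minClock(ps []*Partition) uint64 {
	m := ps[0].clock
	for _, p := range ps[1:] {
		if p.clock < m {
			m = p.clock
		}
	}
	return m
}

// makeVisible applies the slide's rule: an update becomes visible once
// update.Timestamp <= min(Lamport clocks), computed periodically.
func makeVisible(ps []*Partition) {
	stable := minClock(ps)
	for _, p := range ps {
		var still []TimestampedUpdate
		for _, u := range p.pending {
			if u.Timestamp <= stable {
				p.visible = append(p.visible, u)
			} else {
				still = append(still, u)
			}
		}
		p.pending = still
	}
}

func main() {
	// Two partitions of one data center receive updates replicated from elsewhere.
	dc := []*Partition{{clock: 5}, {clock: 3}}
	dc[0].receiveReplicated(TimestampedUpdate{Key: "x", Value: "1", Timestamp: 4})
	dc[1].receiveReplicated(TimestampedUpdate{Key: "y", Value: "2", Timestamp: 7})

	makeVisible(dc)
	// min(Lamport clocks) = 5, so x@4 becomes visible while y@7 stays pending.
	fmt.Println("visible at partition 0:", dc[0].visible)
	fmt.Println("pending at partition 1:", dc[1].pending)
}
```

The visibility check is purely local: no partition asks any other partition whether a dependency has been installed.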
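And here is one possible way, again a hedged Go sketch rather than the system's actual implementation, to combine the Lamport clock with a loosely synchronized physical clock so that an idle partition does not hold the minimum (and hence update visibility) back. The HybridClock type and the physicalNow helper are assumptions for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// HybridClock never runs behind either the local physical clock or any
// timestamp it has observed, so it keeps advancing even without updates.
type HybridClock struct {
	latest uint64 // last value handed out or observed
}

// physicalNow stands in for the loosely synchronized physical clock
// (e.g. NTP-disciplined wall time in microseconds).
func physicalNow() uint64 {
	return uint64(time.Now().UnixMicro())
}

// Now returns the current clock value without assigning a timestamp.
func (c *HybridClock) Now() uint64 {
	if p := physicalNow(); p > c.latest {
		c.latest = p
	}
	return c.latest
}

// Tick assigns a timestamp to a local update: strictly larger than anything
// seen so far, and at least the physical time.
func (c *HybridClock) Tick() uint64 {
	c.latest = maxU64(c.latest+1, physicalNow())
	return c.latest
}

// Observe merges a remote timestamp, as a Lamport clock does on receive.
func (c *HybridClock) Observe(remote uint64) {
	if remote > c.latest {
		c.latest = remote
	}
}

func maxU64(a, b uint64) uint64 {
	if a > b {
		return a
	}
	return b
}

func main() {
	var idle, busy HybridClock
	t := busy.Tick() // busy partition timestamps an update
	idle.Observe(t)  // idle partition only observes it
	time.Sleep(2 * time.Millisecond)
	// Even without further updates, the idle partition's clock keeps moving,
	// so the periodic minimum (and hence update visibility) is not blocked.
	fmt.Println("idle partition's clock still advanced:", idle.Now() > t)
}
```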
Can we close the throughput gap?
• The answer is: yes, but there is a price
• The price is increased update visibility

Conclusion: Throughput, messages, latency

Method                   Throughput           # of dep. check messages                        Update visibility
Conventional             < Evt. consistency   O( reads since last write ) or O( partitions )  D
01-msg                   ~ Evt. consistency   0 or 1                                          ~ 2 Dmax
0-msg                    ~ Evt. consistency   0                                               ~ 2 Dmax
0-msg + physical clock   ~ Evt. consistency   0                                               ~ Dmax

(D: network latency from origin to destination; Dmax: network latency to the furthest replica)