Logic and Lattices for Distributed Programming

Neil Conway
UC Berkeley
Joint work with:
Peter Alvaro, Peter Bailis,
David Maier, Bill Marczak,
Joe Hellerstein, Sriram Srinivasan
Basho Chats #004
June 27, 2012
Programming
[Figure: Input x → Output f(x)]
Distributed Programming
[Figure: Input x → Network Behavior → three separate Outputs f(x)]
Dealing with Disorder
Introduce order
– Paxos, Zookeeper, Two-Phase Commit, …
– “Strong Consistency”
Tolerate disorder
– Correct behavior in the face of many possible network orders
– Typical goal: replicas converge to same final state
• “Eventual Consistency”
Eventual Consistency
Popular
Hard to program
Help developers build reliable programs on top of eventual consistency
This Talk
1. Theory
– CRDTs, Lattices, and CALM
2. Practice
– Programming with Lattices
– Case Study: KVS
[Figure: two replicas of the Students set, each holding {Alice, Bob}. Client0 reads {Alice, Bob} and writes {Alice, Bob, Carol} at one replica; Client1 reads {Alice, Bob} and writes {Alice, Bob, Dave} at the other.]
How to resolve?
Problem: Replicas perceive different event orders
Goal: Same final state at all replicas
Solution: Commutative operations (“merge functions”)
Merge = Set Union
[Figure: the Students replicas seen by Client0 and Client1 both hold {Alice, Bob, Carol, Dave}]
Commutative Operations
• Used by Dynamo, Riak, Bayou, etc.
• Formalized as CRDTs: Convergent and
Commutative Replicated Data Types
– Shapiro et al., INRIA (2009-2012)
– Based on join semilattices
– Commutative, associative, idempotent
• Practical libraries: Statebox, Knockbox
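The three properties listed above can be checked directly for the Students example. The following is a minimal Ruby sketch, assuming set union as the merge function; the `merge` helper and the variable names are illustrative and do not come from Statebox, Knockbox, or any other library.

```ruby
require 'set'

# Merge function for the Students example: plain set union.
def merge(a, b)
  a | b
end

base  = Set["Alice", "Bob"]
carol = Set["Alice", "Bob", "Carol"]   # Client0's write
dave  = Set["Alice", "Bob", "Dave"]    # Client1's write

# Commutative: the order in which two writes arrive does not matter.
commutative = merge(carol, dave) == merge(dave, carol)

# Associative: the grouping of merges does not matter.
associative = merge(merge(base, carol), dave) == merge(base, merge(carol, dave))

# Idempotent: re-delivering the same write changes nothing.
idempotent = merge(carol, carol) == carol

final = merge(merge(base, carol), dave)
puts final.to_a.sort.inspect
```

Because all three properties hold, a replica can apply writes in any order, in any grouping, and with duplicates, and still reach the same final set.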
[Figure: lattice values “growing” over Time]
• Set (Union): “Growth” means larger sets
• Integer (Max): “Growth” means larger numbers
• Boolean (Or): “Growth” means false → true
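To make the notion of “growth” concrete, here is a small Ruby sketch. It assumes each lattice can be described by a merge function plus an ordering test; the `LATTICES` table and the `grows?` helper are hypothetical names used only for illustration.

```ruby
require 'set'

# Each lattice is described by a merge function and an ordering test (leq).
# These names are illustrative, not part of any library.
LATTICES = {
  set:  { merge: ->(a, b) { a | b },      leq: ->(a, b) { a.subset?(b) } },
  max:  { merge: ->(a, b) { [a, b].max }, leq: ->(a, b) { a <= b } },
  bool: { merge: ->(a, b) { a || b },     leq: ->(a, b) { !a || b } }
}

# Folding updates with merge can only move a value "up" in its lattice:
# the merged value is at least as large as both of its inputs.
def grows?(lattice, start, updates)
  current = start
  updates.all? do |u|
    merged = lattice[:merge].call(current, u)
    ok = lattice[:leq].call(current, merged) && lattice[:leq].call(u, merged)
    current = merged
    ok
  end
end

set_ok  = grows?(LATTICES[:set],  Set[], [Set["Alice"], Set["Bob"], Set["Alice"]])
max_ok  = grows?(LATTICES[:max],  0,     [3, 1, 7, 2])
bool_ok = grows?(LATTICES[:bool], false, [false, true, false])
```

Note that a stale update (a smaller number, a repeated set element, a late `false`) never moves the state backwards.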
[Figure: Students and Teams are each replicated. Client0 reads Students = {Alice, Bob, Carol, Dave} and Teams = {<Alice, Bob>}, then writes Teams = {<Alice, Bob>, <Carol, Dave>}. Client1 issues Remove: {Dave} on Students at the other replica, leaving {Alice, Bob, Carol}. After Replica Synchronization, Teams = {<Alice, Bob>, <Carol, Dave>} at both replicas.]
[Figure: the same scenario under a different event order. Client1's Remove: {Dave} is synchronized first, so Client0 reads Students = {Alice, Bob, Carol} and Teams = {<Alice, Bob>}, and Teams stays {<Alice, Bob>} at both replicas.]
Replica Synchronization: Nondeterministic Outcome!
Possible Solution:
Wrap both replicated values
in a single complex CRDT
Goal:
Compose larger application
using “safe” mappings
between simple lattices
[Figure: lattice values over Time, connected by monotone functions]
Set (merge = Union)
→ size(): monotone function from set → max
Integer (merge = Max)
→ >= 5: monotone function from max → boolean
Boolean (merge = Or)
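The chain set → max → boolean can be sketched in plain Ruby. This is only an illustration under simple assumptions: `size_of`, `at_least`, and the voter IDs are made-up names, and the threshold of 5 is taken from the figure.

```ruby
require 'set'

QUORUM = 5  # threshold from the figure (">= 5")

size_of  = ->(set) { set.size }      # monotone: set (union) -> integer (max)
at_least = ->(n)   { n >= QUORUM }   # monotone: integer (max) -> boolean (or)

votes   = Set[]
history = []
%w[a b a c d e e f].each do |voter|
  votes |= Set[voter]                # merge = union; duplicates are absorbed
  count = size_of.(votes)
  history << [count, at_least.(count)]
end

counts = history.map(&:first)
flags  = history.map(&:last)
puts counts.inspect
puts flags.inspect
```

As the input set grows, the count never decreases and the boolean never flips from true back to false, which is what makes the composition safe.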
Monotonicity in Practice
“The more you know,
the more you know”
Never retract
previous outputs
(“mistake-free”)
Typical patterns:
• immutable data
• accumulate knowledge over time
• threshold tests (“if” w/o “else”)
Monotonicity and Determinism
Agents strictly learn more
knowledge over time
Monotone: different learning
order, same final outcome
Result:
Program is deterministic!
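The claim “different learning order, same final outcome” can be tested by brute force on a small input. The sketch below is an assumption-laden illustration: `monotone` stands for any program built from merges and threshold tests, and `last_writer` is a hypothetical non-monotone program added only for contrast.

```ruby
require 'set'

# Messages a replica might receive, in some network-chosen order.
messages = [Set["Alice"], Set["Bob", "Carol"], Set["Alice", "Dave"]]

# Monotone program: union all messages, then apply a threshold test.
monotone = lambda do |deliveries|
  state = deliveries.reduce(Set[]) { |acc, m| acc | m }
  [state.to_a.sort, state.size >= 4]
end

# Non-monotone program for contrast: keep only the most recent message.
last_writer = ->(deliveries) { deliveries.last.to_a.sort }

# Run both programs over every possible delivery order.
monotone_outcomes    = messages.permutation.map { |p| monotone.(p) }.uniq
last_writer_outcomes = messages.permutation.map { |p| last_writer.(p) }.uniq

puts monotone_outcomes.size
puts last_writer_outcomes.size
```

The monotone program yields a single outcome across all delivery orders, while the overwrite-based program yields one outcome per possible last message.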
A program is confluent if it produces
the same results regardless of
network nondeterminism
[Figure: Input x → Network Behavior → Output f(x); a confluent program yields the same Output f(x) for every network behavior]
Consistency As Logical Monotonicity (CALM)
CALM Analysis
1. All monotone programs are confluent
2. Simple syntactic test for monotonicity
Result: Simple static analysis for eventual consistency
Handling Non-Monotonicity
… is not the focus of this talk
Basic choices:
1. Nodes agree on an event order using a coordination protocol (e.g., Paxos)
2. Allow non-deterministic outcomes
• If needed, compensate and apologize
Putting It Into Practice
What we’d like:
• Collection of agents
• No shared state (→ message passing)
• Computation over arbitrary lattices
Bloom
Organization  | Collection of agents
Communication | Message passing
State         | Relations (sets)
Computation   | Relational rules over sets (Datalog, SQL)
              | Bloom                                     | BloomL
Organization  | Collection of agents                      | Collection of agents
Communication | Message passing                           | Message passing
State         | Relations (sets)                          | Lattices
Computation   | Relational rules over sets (Datalog, SQL) | Functions over lattices
Quorum Vote in BloomL

    require 'bud'

    QUORUM_SIZE = 5
    RESULT_ADDR = "example.org"

    # Annotated Ruby class
    class QuorumVote
      include Bud

      # Program state
      state do
        # Communication interfaces
        channel :vote_chn, [:@addr, :voter_id]
        channel :result_chn, [:@addr]
        # Lattice state declarations
        lset  :votes
        lmax  :vote_cnt
        lbool :got_quorum
      end

      # Program logic
      bloom do
        # Accumulate votes into set (<= is the merge function for the set lattice)
        votes      <= vote_chn {|v| v.voter_id}
        # Map set → max
        vote_cnt   <= votes.size
        # Map max → bool
        got_quorum <= vote_cnt.gt_eq(QUORUM_SIZE)
        # Threshold test on bool
        result_chn <~ got_quorum.when_true { [RESULT_ADDR] }
      end
    end
Builtin Lattices
Name  | Description                     | ⊥         | a ⊔ b    | Sample Monotone Functions
lbool | Threshold test                  | false     | a ∨ b    | when_true() → v
lmax  | Increasing number               | −∞        | max(a,b) | gt(n) → lbool; +(n) → lmax; -(n) → lmax
lmin  | Decreasing number               | ∞         | min(a,b) | lt(n) → lbool
lset  | Set of values                   | ∅         | a ∪ b    | intersect(lset) → lset; product(lset) → lset; contains?(v) → lbool; size() → lmax
lpset | Non-negative set                | ∅         | a ∪ b    | sum() → lmax
lbag  | Multiset of values              | ∅         | a ∪ b    | mult(v) → lmax; +(lbag) → lbag
lmap  | Map from keys to lattice values | empty map |          | at(v) → any-lat; intersect(lmap) → lmap
Case Study
Goal:
Provably eventually consistent
key-value store (KVS)
Assumption:
Map keys to lattice values
(i.e., values do not decrease)
Solution:
Use a map lattice
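A map lattice merges two maps key by key, delegating to the merge function of the nested values. The following Ruby sketch assumes the nested values are sets merged by union; `merge_maps` and the keys `"k1"` and `"k2"` are hypothetical names chosen for illustration.

```ruby
require 'set'

# Key-wise merge for a map whose values are themselves lattice values.
# `merge_value` is the merge function of the nested lattice (an assumption
# of this sketch; here the nested values are sets merged by union).
def merge_maps(a, b, &merge_value)
  a.merge(b) { |_key, left, right| merge_value.call(left, right) }
end

union = ->(x, y) { x | y }

replica1 = { "k1" => Set["book"] }
replica2 = { "k1" => Set["pen"], "k2" => Set["lamp"] }

# Synchronize in both directions; the result should not depend on direction.
r1_then_r2 = merge_maps(replica1, replica2, &union)
r2_then_r1 = merge_maps(replica2, replica1, &union)

puts r1_then_r2.transform_values { |v| v.to_a.sort }.inspect
```

Adding a new key and growing the value under an existing key are both “upward” moves, so replicas that exchange their maps converge regardless of who sends first.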
[Figure sequence: the state of Replica 1 and Replica 2 over Time]
• Nested lattice value
• Add new K/V pair
• “Grow” value in extant K/V pair
• Replica Synchronization
Goal:
Provably eventually consistent KVS
that stores arbitrary values
Solution:
Assign a version to each
key-value pair
Each replica stores increasing
versions, not increasing values
Object Versions in Dynamo/Riak
1. Each KV pair has a vector clock version
2. Given two versions of a KV pair, prefer
the one with the strictly greater version
3. If versions are incomparable, invoke user-defined merge function
Vector Clock:
Map from node IDs → logical clocks
Solution:
Use a map lattice
Logical Clock:
Increasing counter
Solution:
Use an increasing-int lattice
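A vector clock built this way is a map lattice whose values are increasing integers. The Ruby sketch below is one possible reading, assuming a plain Hash from node ID to counter with missing entries treated as 0; the helper names (`vc_merge`, `vc_leq?`, `vc_compare`) and node IDs are illustrative only.

```ruby
# A vector clock as a Hash from node ID to logical clock (an integer).
# Missing entries are treated as 0. Function names are illustrative.

# Merge: per-node maximum (map lattice over increasing-int lattices).
def vc_merge(a, b)
  a.merge(b) { |_node, x, y| [x, y].max }
end

# Partial order: a <= b iff every entry of a is <= the matching entry of b.
def vc_leq?(a, b)
  a.all? { |node, clock| clock <= b.fetch(node, 0) }
end

def vc_compare(a, b)
  if vc_leq?(a, b) && vc_leq?(b, a) then :equal
  elsif vc_leq?(a, b) then :less
  elsif vc_leq?(b, a) then :greater
  else :concurrent
  end
end

v1 = { "n1" => 2, "n2" => 1 }
v2 = { "n1" => 2, "n2" => 3 }
v3 = { "n1" => 3, "n2" => 1 }

puts vc_compare(v1, v2)
puts vc_compare(v2, v3)
puts vc_merge(v2, v3).inspect
```

The `:concurrent` case is the “incomparable versions” situation from step 3, where a user-defined merge of the values is required.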
Version-Value Pairs
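    # Pair = <fst, snd>: fst is the version, snd is the value
    Pair = Struct.new(:fst, :snd) do
      def merge(o)
        if fst > o.fst then self      # our version is strictly greater
        elsif fst < o.fst then o      # their version is strictly greater
        else                          # equal or incomparable: merge both parts
          Pair.new(fst.merge(o.fst), snd.merge(o.snd))
        end
      end
    end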
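To see the pair merge in action, the sketch below supplies hypothetical `Version` and `SetValue` classes for the two components. `Version` is assumed to be a vector clock with a partial order, and `SetValue` stands in for whatever lattice the user chooses for values; neither class comes from Bloom or Riak.

```ruby
require 'set'

# Vector-clock version: Hash from node ID to counter, merged by per-node max.
class Version
  attr_reader :clocks

  def initialize(clocks)
    @clocks = clocks
  end

  def <=(other)
    clocks.all? { |node, n| n <= other.clocks.fetch(node, 0) }
  end

  def <(other)
    self <= other && !(other <= self)
  end

  def >(other)
    other < self
  end

  def merge(other)
    Version.new(clocks.merge(other.clocks) { |_k, a, b| [a, b].max })
  end
end

# Value lattice chosen by the user; a set merged by union in this sketch.
class SetValue
  attr_reader :items

  def initialize(items)
    @items = Set.new(items)
  end

  def merge(other)
    SetValue.new(items | other.items)
  end
end

Pair = Struct.new(:fst, :snd) do
  def merge(o)
    if fst > o.fst then self
    elsif fst < o.fst then o
    else Pair.new(fst.merge(o.fst), snd.merge(o.snd))
    end
  end
end

old = Pair.new(Version.new({ "r1" => 1 }), SetValue.new(["x"]))
r1  = Pair.new(Version.new({ "r1" => 2 }), SetValue.new(["y"]))
r2  = Pair.new(Version.new({ "r1" => 1, "r2" => 1 }), SetValue.new(["z"]))

replaced   = old.merge(r1)  # r1 is strictly newer, so it wins outright
concurrent = r1.merge(r2)   # incomparable versions: merge clocks and values
```

A strictly newer version replaces the old value even if the new value is not “larger”, while concurrent writes fall through to the value lattice's own merge.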
[Figure sequence: the state of Replica 1 and Replica 2 over Time]
• Version increase; NOT value increase
• R1’s version replaces R2’s version
• New version @ R2
• Concurrent writes!
• Merge VC (automatically), value merge via user’s lattice (as in Dynamo)
Lattice Composition in KVS
Key-Value Store: lmap (key → <Version, Value> Pair)
<Version, Value> Pair: lpair
Version (Vector Clock): lmap (node ID → lmax)
Value (User-Defined Merge Function): user-lattice
Conclusion
Dealing with EC    | Many event orders → order-independent (disorderly) programs
Lattices           | Disorderly state
Monotone Functions | Disorderly computation
Monotone Bloom     | Lattices + monotone functions for safe distributed programming
Questions Welcome
Please try Bloom!
http://www.bloom-lang.org
Or:
gem install bud
Backup Slides
Lattices
$\langle S, \sqcup, \bot \rangle$ is a bounded join semi-lattice iff:
– $S$ is a partially ordered set
– $\sqcup$ is a binary operator (“least upper bound”)
• For all $x, y \in S$, $x \sqcup y = z$ where $x \le_S z$, $y \le_S z$, and there is no $z' \ne z$ in $S$ such that $x \le_S z'$, $y \le_S z'$, and $z' \le_S z$.
• Associative, commutative, and idempotent
– $\bot$ is the “least” element in $S$ ($\forall x \in S: \bot \sqcup x = x$)
Example: increasing integers
– $S = \mathbb{Z} \cup \{-\infty\}$, $\sqcup = \max$, $\bot = -\infty$
Monotone Functions
$f : S \to T$ is a monotone function iff
$\forall a, b \in S : a \le_S b \Rightarrow f(a) \le_T f(b)$
Example: size(Set) → Increasing-Int
size({A, B}) = 2
size({A, B, C}) = 3
From Datalog → Lattices
                     | Datalog (Bloom)                          | BloomL
State                | Relations                                | Lattices
Example Values       | [[“red”, 1], [“green”, 2]]               | set: [“red”, “green”]; map: {“red” => 1, “green” => 2}; counter: 5; condition: false
Computation          | Rules over relations                     | Functions over lattices
Monotone Computation | Monotone rules                           | Monotone functions
Program Semantics    | Fixpoint of rules (stratified semantics) | Fixpoint of functions (stratified semantics)
Bloom Operational Model
[Figure: Local Updates, System Events, and Inbound Network messages feed the Bloom Rules, which run to a Fixpoint (atomic, local, deterministic); the result is a State Update plus Outbound Network messages]
Quorum Vote in Bloom

    require 'bud'

    QUORUM_SIZE = 5
    RESULT_ADDR = "example.org"

    class QuorumVote
      include Bud

      state do
        # Communication
        channel :vote_chn, [:@addr, :voter_id]
        channel :result_chn, [:@addr]
        # Persistent storage
        table :votes, [:voter_id]
        # Transient storage
        scratch :cnt, [] => [:cnt]
      end

      bloom do
        # Accumulate votes
        votes      <= vote_chn {|v| [v.voter_id]}
        # Not (set) monotonic!
        cnt        <= votes.group(nil, count(:voter_id))
        # Send message when quorum reached (compare the tuple's cnt field)
        result_chn <~ cnt {|c| [RESULT_ADDR] if c.cnt >= QUORUM_SIZE}
      end
    end
Current Status
Writeups:
• BloomL: UCB Tech Report
• Bloom/CALM: CIDR’11, website
Lattice Runtime:
• Available as a git branch
• To be merged soon-ish
Examples, Case Studies:
• KVS
• Shopping carts
• Causal delivery
Under development:
• MDCC, concurrent editing