Edelweiss: Automatic Storage Reclamation for Distributed Programming
Neil Conway, Peter Alvaro, Emily Andrews, Joseph M. Hellerstein
University of California, Berkeley

Mutable shared state
– Frequent source of bugs
– Hard to scale

Event Logging
• Accumulate & exchange sets of immutable events
  – No mutation/deletion
• To delete: add new event “Event X should be ignored”
• Current state: query over event log

Example: Key-Value Store

Mutable State
    tbl = Hash.new
    Insert(k, v): tbl[k] = v          (update-in-place)
    Delete(k):    tbl.delete(k)       (deletion)
    View():       tbl

Event Logging
    i_log = Set.new
    d_log = Set.new
    Insert(k, v): i_log << [k,v]      (set union)
    Delete(k):    d_log << k
    View():       i_log.notin(d_log, :k => :k)      (compute “live” keys)

Benefits of Event Logging
1. Concurrency
2. Replication
3. Undo/redo
4. Point-in-time query, audit trails
(Sometimes: performance!)

Example Applications
• Multi-version concurrency control (MVCC)
• Write-ahead logging (WAL)
• Stream processing
• Log-structured file systems
Also: CRDTs, tombstones, purely functional data structures, accounting ledgers.

Observation: Logs consume unbounded storage
Solution (and challenge): Discard log entries that are “no longer useful” (garbage collection)

Traditional Approach
“No longer useful” defined by application semantics
– No framework support
– Every system requires custom GC logic
– Reinvented many times
  • >25 papers propose ~same scheme!

Engineering Challenges
1. Difficult to implement correctly
   – Too aggressive: destroy live data
   – Too conservative: storage leak
2. Ongoing maintenance burden
   – GC scheme and application code must be updated together

Our Approach
1. New language: Edelweiss
   – Based on Datalog
   – No constructs for deletion or mutation!
2. Automatically generate safe, application-specific distributed GC protocols
3. Present several in-depth case studies
   – Reliable unicast/broadcast, key-value store, causal consistency, atomic registers

Base Data (“Event Logs”) → Query → Derived Data (“Live View”)
A log entry is useful iff it might contribute to the view.
The queries define how log entries contribute to the view.
Goal: Find log entries that will never contribute to the view in the future.

Semantics of Base Data
• Accumulate and broadcast to other nodes
• Datalog: monotonic
  – Set union: grows over time
• CALM Theorem [CIDR’11]: event log guaranteed to be eventually consistent

Semantics of Derived Data
Grows and shrinks over time
– e.g., KVS keys added and removed
Hence, not monotonic

Common Pattern
Live View = set difference between growing sets

    Key-Value Store       Insertions that haven’t been deleted
    Reliable Broadcast    Outbound messages that haven’t been acknowledged
    Causal Consistency    Writes that haven’t been replaced by a causally
                          later write to the same key

Semantics of Set Difference
X = Y – Z
– Z grows: X shrinks
– If t appears in Z, t will never again appear in X
– “Anti-monotone with respect to Z”

    i_log = Set.new
    d_log = Set.new
    Insert(k, v): i_log << [k,v]
    Delete(k):    d_log << k
    View():       i_log.notin(d_log, :k => :k)

Can reclaim from i_log upon match in d_log

Other Analysis Techniques
• Reclaim from negative notin input
  – Often called “tombstones”
  – E.g., how to reclaim from d_log in the KVS
• Reclaim from join input tables
• Disseminate GC metadata automatically
• Exploit user knowledge for better GC
  – Punctuations [Tucker & Maier ‘03]

Whole Program Analysis
• For each query q, find condition when input t will never contribute to q’s output
  – “Reclamation condition” (RC)
• For each tuple t, find the conjunction of the RCs for t over all queries
  – When all consumers no longer need t: safe to reclaim

Edelweiss input program (“positive” program: no deletion or state mutation)
→ Source-to-source rewriter (compute RCs, add deletion rules)
→ Datalog output program (input program + deletion rules)
→ Datalog evaluator

Comparison of Program Size
[Figure: bar chart. Y-axis: Number of Rules (0 to 70). Series: Edelweiss, Bloom. Programs: kvs, causal kvs, atomic register (write xacts), atomic register (read xacts), unicast, broadcast (fixed), broadcast (epochs), causal broadcast, request-response. Callout: “Only 19 rules!”]

Takeaways
No storage management code!
– Similar to malloc/free vs. GC
Programs are concise and declarative
– Developer: just compute current view
– Log entries removed automatically
Reclamation logic and application code always in sync

Conclusions
• Event logging: powerful design pattern
  – Problem: need for hand-written distributed storage reclamation code
• Datalog: natural fit for event logging
• Storage reclamation as a compiler rewrite?
Results:
– Automatic, safe GC synthesis!
– High-level, declarative programs
  • No storage management code
  • Focus on solving domain problem

Thank You!

Future Work: Checkpoints
• Closely related to simple event logging
  – Summarize many log entries with a single “checkpoint” record
  – View = last checkpoint + Query(ΔLogs)
• General goal: reclaim space by structural transformation, not just discarding data

Future Work: Theory
• Current analysis is somewhat ad hoc
• If program does not reclaim storage, two possibilities:
  1. Program is “not reclaimable” in principle
     • (Possible program bug!)
  2. Our analysis is not complete
     • (Possible analysis bug!)
How to characterize the class of “not reclaimable” programs?

Reclaiming KVS Deletions
• Good question
• X.notin(Y): how to reclaim from Y?
  1. Y is a dense ordered set; compress it.
  2. Prove that each Y tuple matches exactly one X tuple

    i_log = Set.new
    d_log = Set.new
    Insert(k, v): i_log << [k,v]
    Delete(k):    d_log << k          (k is a key of i_log)
    View():       i_log.notin(d_log, :k => :k)
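
Sketch: reclaiming i_log in plain Ruby

The key-value store on the slides is written in Bloom-style notation, so it does not run as plain Ruby (`notin` is not a method of Ruby's `Set`). The sketch below is a minimal single-node stand-in, not the deletion rules Edelweiss generates: the class name `LoggedKVS` and the method `reclaim!` are hypothetical, and the anti-join is written out by hand. It illustrates the rule “can reclaim from i_log upon match in d_log”: since d_log only grows, an insertion whose key is already in d_log can never reappear in the view, so dropping it leaves the view unchanged.

```ruby
require 'set'

# Hypothetical plain-Ruby stand-in for the event-logged key-value store.
# Bloom's `notin` is emulated with an explicit anti-join on the key.
class LoggedKVS
  attr_reader :i_log, :d_log

  def initialize
    @i_log = Set.new  # insertion events: [key, value] pairs
    @d_log = Set.new  # deletion events: keys (kept as tombstones)
  end

  def insert(k, v)
    @i_log << [k, v]
    self
  end

  def delete(k)
    @d_log << k
    self
  end

  # Live view: insertions whose key has no matching deletion event.
  def view
    @i_log.reject { |(k, _v)| @d_log.include?(k) }.to_set
  end

  # Reclamation (assumed policy for this sketch): drop every insertion
  # whose key already appears in d_log. d_log itself is not reclaimed here.
  def reclaim!
    @i_log.delete_if { |(k, _v)| @d_log.include?(k) }
    self
  end
end

kvs = LoggedKVS.new
kvs.insert(:a, 1)
kvs.insert(:b, 2)
kvs.delete(:a)

before = kvs.view
kvs.reclaim!
after = kvs.view

puts before == after   # the view is the same before and after reclamation
puts kvs.i_log.size    # only the insertion for :b remains in the log
```

The sketch keeps d_log intact on purpose: the tombstone for `:a` is what keeps a late-arriving insertion for `:a` out of the view. Reclaiming from d_log needs one of the extra arguments listed on the “Reclaiming KVS Deletions” slide.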