Failure detector talk

On the Weakest Failure Detector Ever
Petr Kouznetsov (Max Planck Institute for SWS)
Joint work with: Rachid Guerraoui (EPFL), Nancy Lynch (MIT), Maurice Herlihy (Brown), Calvin Newport (MIT)
© 2007 P. Kouznetsov

Big picture
Choosing a model:
- Optimistic model: the system is very efficient but likely to fail
- Conservative model: the system is very robust but inefficient (or impossible to implement)
What is the right model?

Synchrony assumptions
- Asynchronous read-write shared memory model: no bounds on relative processing speed
- Very appealing in practice!
- Too conservative: most problems are not solvable [FLP85, LA87; HS,SZ,BG93] (solvable in synchronous systems though)

So what do we need exactly?
What is the minimal amount of synchrony that circumvents some asynchronous impossibility?
- “minimal amount of synchrony”? The weakest failure detector

Model
Asynchronous read-write shared-memory system with failure detectors
[Figure: processes p, q, r, each with a failure detector module FD]

Comparing failure detectors
Failure detector D is weaker than failure detector D’ if there exists an algorithm that emulates D using D’
[Figure: processes p, q, r, each using D’ to emulate D]

The weakest non-trivial failure detector
A failure detector X that is:
- non-trivial: circumvents some asynchronous impossibility
- weaker than any non-trivial failure detector
The “easiest” non-trivial problem?

A Very Weak Failure Detector
- Y outputs a non-empty set of process ids
- Eventually, the same set U is output at every correct process: U is not the current set of correct processes
Example:
- Π={p,q,r}, C={p,q}
- Y outputs {p},{q},{p,r},{q,r},{p,q,r}

Y is non-trivial
Theorem 1. Y solves (N-1)-set agreement
Every process in P1,…,PN proposes a value and must decide on some proposed value so that:
- At most N-1 distinct values are decided
(!) not solvable in asynchronous systems [HS93,BG93,SZ93]
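The definition of Y and the (N-1)-set agreement specification above can be made concrete with a small sketch. This is only an illustration under stated assumptions, not code from the talk: the function names (`nonempty_subsets`, `valid_stable_outputs`, `satisfies_set_agreement`) are hypothetical, and Y is read literally as "any non-empty set U that differs from the set C of correct processes".

```python
from itertools import combinations


def nonempty_subsets(processes):
    """Return every non-empty subset of `processes` as a frozenset."""
    items = sorted(processes)
    return [
        frozenset(combo)
        for size in range(1, len(items) + 1)
        for combo in combinations(items, size)
    ]


def valid_stable_outputs(processes, correct):
    """Sets that Y may eventually output forever (assumed reading of the slide).

    A stable output U is allowed when it is non-empty and is not
    exactly the set of correct processes.
    """
    correct = frozenset(correct)
    return [u for u in nonempty_subsets(processes) if u != correct]


def satisfies_set_agreement(proposals, decisions, k):
    """Check one finished run against k-set agreement.

    proposals, decisions: dicts mapping a process id to a value.
    Validity: every decided value was proposed.
    Agreement: at most k distinct values are decided.
    """
    proposed = set(proposals.values())
    decided = set(decisions.values())
    return decided <= proposed and len(decided) <= k


if __name__ == "__main__":
    for u in valid_stable_outputs({"p", "q", "r"}, {"p", "q"}):
        print(sorted(u))
```

For Π={p,q,r} and C={p,q}, this reading never returns {p,q}. It returns the five sets listed in the example and also {r}, which likewise differs from C.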
Set agreement is almost solvable
- If N-1 or fewer distinct values are proposed (e.g., if N-1 or fewer processes participate): k-convergence [YNG98]
- Y should handle the case when N values are around

Citizens and gladiators
- Split the system into Gladiators (the stable output of Y) and Citizens (all the rest)
- Gladiators eliminate at least one value using (G-1)-convergence or adopt a value from Citizens
[Figure: Gladiators (Y) and Citizens (Π-Y)]

Correctness
Eventually, Gladiators are not the set of correct processes
⇨ At least one Gladiator is faulty, or at least one Citizen is correct
⇨ Gladiators commit on G-1 values or adopt a value from a Citizen
⇨ At least one process gives up its value
⇨ At most N-1 values survive!

Y is minimal
Theorem 2. Y is weaker than any stable non-trivial failure detector D
D is stable if, eventually, the same value is permanently output at every correct process (e.g., P, ◇P, Ω, Ωk)

Minimality proof: toy example
Consider a “faithful” failure detector D that solves a wait-free impossible problem P: in every execution E, D outputs the same value v that depends only on correct(E)
Claim 1. For all v, there is a non-empty set of processes C such that v cannot be output by D when C is the set of correct processes
Suppose not: v is valid for any C => D can be replaced with a “dummy” that always outputs v, a contradiction!

Minimality proof: general case
Consider any non-trivial stable D
Claim 2. For all v, there exists an infinite execution E in which v cannot be the only value output by D
Reduction:
- As long as D is stable on v: use E(v) to extract Y

Conclusions
- Y is the weakest non-trivial stable failure detector (can be generalized to the f-resilient case: Yf)
- (N-1)-set agreement is the easiest non-trivial problem?
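The first implication in the Correctness argument (if the Gladiators are not exactly the correct processes, then some Gladiator is faulty or some Citizen is correct) can be checked by brute force on a small system. The sketch below is an illustration only. The helper names (`split`, `witnesses`, `first_implication_holds`) are hypothetical, and it models nothing beyond the sets involved: no shared memory, no convergence objects.

```python
from itertools import combinations


def nonempty_subsets(processes):
    """Return every non-empty subset of `processes` as a frozenset."""
    items = sorted(processes)
    return [
        frozenset(combo)
        for size in range(1, len(items) + 1)
        for combo in combinations(items, size)
    ]


def split(processes, stable_output):
    """Gladiators are the stable output of Y; Citizens are all the rest."""
    gladiators = frozenset(stable_output)
    citizens = frozenset(processes) - gladiators
    return gladiators, citizens


def witnesses(processes, correct, stable_output):
    """Return (faulty gladiators, correct citizens) for one scenario."""
    correct = frozenset(correct)
    gladiators, citizens = split(processes, stable_output)
    return gladiators - correct, citizens & correct


def first_implication_holds(processes):
    """Check every pair (C, U) with U != C for a faulty gladiator
    or a correct citizen."""
    for correct in nonempty_subsets(processes):
        for stable_output in nonempty_subsets(processes):
            if stable_output == correct:
                continue  # Y never stabilizes on the correct set
            faulty_g, correct_c = witnesses(processes, correct, stable_output)
            if not faulty_g and not correct_c:
                return False
    return True


if __name__ == "__main__":
    print(first_implication_holds({"p", "q", "r"}))
    print(witnesses({"p", "q", "r"}, {"p", "q"}, {"p", "r"}))
```

With C={p,q} and stable output {p,r}, the faulty Gladiator is r and the correct Citizen is q, so both kinds of witness exist in that scenario.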
Future
- Establishing the “weakest ever” result in the most general class of failure detectors (not Y!)
- Y is not the weakest: an unstable “composition” of Ωn and Y is even weaker! [Chen et al., Zielinski, …]

Thank you!

k-convergence [YNG98]
Processes propose values and commit on or adopt one of the proposed values:
- If a process commits, then at most k values are committed or adopted
- If k or fewer values are proposed, every process commits
(!) wait-free solvable for any k
(!!) (N-1)-convergence almost solves (N-1)-agreement!
But termination is an issue in case all N values are around, and that’s where Y is of use!

Minimality proof: general case
Consider any non-trivial stable D
Claim 2. For all v, there exists an infinite execution E in which v cannot be the only value output by D
Reduction:
- As long as D is stable: locate a faulty process in a finite prefix of E (including all steps of faulty(E)), or output correct(E)
- Y is extracted!

Generalization to f-resilience
f-resilient impossible problems: can be solved when fewer than f processes fail but cannot when f fail
- Yf outputs a set of size ≥N-f
- Eventually, the same set U is permanently output at every correct process
Yf is the weakest stable failure detector to circumvent an f-resilient impossibility

Big picture
Addressing the WFD question contributes to:
- Understanding complexity and computability bounds of distributed abstractions
- Establishing a clean classification of problems in distributed computing
“WFD ever” corresponds to the easiest non-trivial problem in distributed computing
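The k-convergence properties listed on the backup slide can be written as a checker over one finished run. This is a hedged sketch of the specification as the slide states it, not an implementation of the wait-free algorithm from [YNG98]. The function name `check_k_convergence` and the `("commit" | "adopt", value)` encoding of outcomes are assumptions made for illustration.

```python
def check_k_convergence(proposals, outcomes, k):
    """Check one run against the k-convergence properties on the slide.

    proposals: dict mapping a process id to its proposed value.
    outcomes:  dict mapping a process id to ("commit", v) or ("adopt", v).
    Returns a dict mapping each property name to True or False.
    """
    proposed = set(proposals.values())
    returned = {value for _, value in outcomes.values()}
    kinds = [kind for kind, _ in outcomes.values()]

    someone_commits = "commit" in kinds
    everyone_commits = all(kind == "commit" for kind in kinds)

    return {
        # Every committed or adopted value is one of the proposed values.
        "validity": returned <= proposed,
        # If a process commits, at most k values are committed or adopted.
        "agreement": (not someone_commits) or len(returned) <= k,
        # If k or fewer values are proposed, every process commits.
        "convergence": len(proposed) > k or everyone_commits,
    }


if __name__ == "__main__":
    # N = 3 processes, k = N - 1 = 2, all three values distinct:
    # nobody is forced to commit, which is the case Y has to handle.
    proposals = {"P1": 1, "P2": 2, "P3": 3}
    outcomes = {"P1": ("adopt", 1), "P2": ("adopt", 2), "P3": ("adopt", 3)}
    print(check_k_convergence(proposals, outcomes, k=2))
```

A run in which all N values are proposed and every process only adopts satisfies all three properties. That is why (N-1)-convergence alone does not guarantee a decision, and why the slide brings in Y for that case.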
