Set 9: Fault Tolerant Consensus
CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS
CSCE 668 Fall 2011, Prof. Jennifer Welch

Processor Failures in Message Passing
  Crash: at some point the processor stops taking steps.
    At the processor's final step, it might succeed in sending only a subset of the messages it is supposed to send.
  Byzantine: the processor changes state arbitrarily and sends messages with arbitrary content.

Consensus Problem
  Every processor has an input.
  Termination: Eventually every nonfaulty processor must decide on a value.
    The decision is irrevocable!
  Agreement: All decisions by nonfaulty processors must be the same.
  Validity: If all inputs are the same, then the decision of a nonfaulty processor must equal the common input.

Examples of Consensus
  Binary inputs:
    input vector 1,1,1,1,1: decision must be 1
    input vector 0,0,0,0,0: decision must be 0
    input vector 1,0,0,1,0: decision can be either 0 or 1
  Multi-valued inputs:
    input vector 1,2,3,2,1: decision can be 1 or 2 or 3

Overview of Consensus Results
  Synchronous system
  At most f faulty processors
  Tight bounds for message passing:

                                 crash failures    Byzantine failures
    number of rounds             f+1               f+1
    total number of processors   f+1               3f+1
    message size                 polynomial        polynomial

Overview of Consensus Results
  Impossible in the asynchronous case,
  even if we only want to tolerate a single crash failure.
  True both for message passing and for shared read-write memory.

Modeling Crash Failures
  Modify failure-free definitions of admissible execution to accommodate crash failures:
  All but a set of at most f processors (the faulty ones) take an infinite number of steps.
  In the synchronous case: once a faulty processor fails to take a step in a round, it takes no more steps.
  In a faulty processor's last step, an arbitrary subset of the processor's outgoing messages makes it into the channels.

Modeling Byzantine Failures
  Modify failure-free definitions of admissible execution to accommodate Byzantine failures:
  A set of at most f processors (the faulty ones) can send messages with arbitrary content and change state arbitrarily (i.e., not according to their transition functions).

Consensus Algorithm for Crash Failures
  Code for each processor:

    v := my input
    at each round 1 through f+1:
      if I have not yet sent v then send v to all
      wait to receive messages for this round
      v := minimum among all received values and current value of v
      if this is round f+1 then decide on v

Execution of Algorithm
                                                    Relation to Formal Model
  round 1:      send my input                       in channels initially
                receive round 1 msgs                deliver events
                compute value for v                 compute events
  round 2:      send v (if this is a new value)     due to previous compute events
                receive round 2 msgs                deliver events
                compute value for v                 compute events
  …
  round f + 1:  send v (if this is a new value)     due to previous compute events
                receive round f + 1 msgs            deliver events
                compute value for v                 compute events
                decide v                            part of compute events

Correctness of Crash Consensus Algorithm
  Termination: By the code, every processor finishes in round f+1.
  Validity: Holds since processors do not introduce spurious messages: if all inputs are the same, then that is the only value ever in circulation.

Correctness of Crash Consensus Algorithm
  Agreement: Suppose in contradiction p_j decides on a smaller value, x, than does p_i.
  Then x was hidden from p_i by a chain of faulty processors:

    q_1 --(round 1)--> q_2 --(round 2)--> … --> q_f --(round f)--> q_{f+1} --(round f+1)--> p_j   (but not p_i)

  There are f + 1 faulty processors in this chain, a contradiction.
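The crash-tolerant algorithm above can be tried out on small cases with a simulator. The following is a minimal sketch, not part of the original slides: the function name `crash_consensus`, the `crash_plan` encoding, and the optional `rounds` parameter are all assumptions introduced for illustration. A crash is modeled as in the slides: in its final round a faulty processor delivers its message only to a chosen subset of recipients and then stops.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical encoding: faulty processor id -> (round in which it crashes,
# set of recipients that still receive its message in that final round).
CrashPlan = Dict[int, Tuple[int, Set[int]]]


def crash_consensus(inputs: List[int], f: int, crash_plan: CrashPlan,
                    rounds: Optional[int] = None) -> Dict[int, int]:
    """Simulate the synchronous minimum-flooding algorithm.

    Returns the decisions of the processors that never appear in crash_plan.
    `rounds` defaults to f + 1; other values are for experimentation only.
    """
    n = len(inputs)
    rounds = f + 1 if rounds is None else rounds
    v = list(inputs)                      # current value of each processor
    sent = [set() for _ in range(n)]      # values each processor already sent
    crashed: Set[int] = set()

    for r in range(1, rounds + 1):
        inbox: List[List[int]] = [[] for _ in range(n)]
        # Send phase: "if I have not yet sent v then send v to all".
        for p in range(n):
            if p in crashed:
                continue
            crashing = p in crash_plan and crash_plan[p][0] == r
            if v[p] not in sent[p]:
                recipients = crash_plan[p][1] if crashing else range(n)
                for q in recipients:
                    inbox[q].append(v[p])
                sent[p].add(v[p])
            if crashing:
                crashed.add(p)            # takes no further steps
        # Compute phase: v := min of received values and current v.
        for p in range(n):
            if p not in crashed:
                v[p] = min([v[p]] + inbox[p])

    return {p: v[p] for p in range(n) if p not in crash_plan}


if __name__ == "__main__":
    # One crash per round: p0 tells only p1 in round 1, p1 tells only p2 in round 2.
    plan = {0: (1, {1}), 1: (2, {2})}
    print(crash_consensus([0, 5, 5, 5], f=2, crash_plan=plan))            # {2: 0, 3: 0}
    print(crash_consensus([0, 5, 5, 5], f=2, crash_plan=plan, rounds=2))  # {2: 0, 3: 5}
```

With the full f + 1 = 3 rounds, the hidden value 0 reaches every nonfaulty processor. Cutting the same run off after f = 2 rounds leaves p2 and p3 with different values, which is the kind of hiding chain the agreement argument rules out.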

Performance of Crash Consensus Algorithm
  Number of processors n > f
  f + 1 rounds
  At most n^2 · |V| messages, each of size log|V| bits, where V is the input set.

Lower Bound on Rounds
  Assumptions:
    n > f + 1
    every processor is supposed to send a message to every other processor in every round
    input set is {0,1}

Failure-Sparse Executions
  Bad behavior for the crash algorithm was when there was one crash per round.
  This is bad in general.
  A failure-sparse execution has at most one crash per round.
  We will deal exclusively with failure-sparse executions in this proof.

Valence of a Configuration
  The valence of a configuration C is the set of all values decided by a nonfaulty processor in some configuration reachable from C by an admissible (failure-sparse) execution.
  Bivalent: set contains 0 and 1.
  Univalent: set contains only one value (0-valent or 1-valent).

Valence of a Configuration
  [Figure: tree of configurations C, D, E, F, G, each labeled with its valence, with the decisions shown at the leaves. Legend: 0/1 = bivalent, 1 = 1-valent, 0 = 0-valent.]

Statement of Round Lower Bound
  Theorem (5.3): Any crash-resilient consensus algorithm requires at least f + 1 rounds in the worst case.
  Proof Strategy:
    round 1: show there is a bivalent initial config.
    rounds 2 to f - 1: show we can keep things bivalent through round f - 1
    round f: show we can keep a nonfaulty (n.f.) processor from deciding in round f

Existence of Bivalent Initial Config.
  Suppose in contradiction all initial configurations are univalent.

    inputs     valency
    000…00     0    (by validity condition)
    000…01     ?
    000…11     ?
    …
    001…11     0
    011…11     1
    111…11     1    (by validity condition)

  Somewhere along this sequence, two adjacent configurations that differ in a single input must switch from 0-valent to 1-valent (shown here at 001…11 and 011…11).

Existence of Bivalent Initial Config.
  Let I_0 be a 0-valent initial config and I_1 be a 1-valent initial config such that they differ only in p_i's input.
  From I_0: p_i fails initially, no other failures. By termination, eventually the rest decide. All but p_i decide 0.
  From I_1: p_i fails initially, no other failures. This execution looks the same as the one above to all the processors except p_i, so all but p_i decide 0, contradicting the assumption that I_1 is 1-valent.

Keeping Things Bivalent
  Let α' be a (failure-sparse) (k-1)-round execution ending in a bivalent config., for k-1 < f-1.
  Show there is a one-round failure-sparse (f-s) extension of α' ending in a bivalent config. (so the extension has k < f rounds).
  Suppose in contradiction every one-round (f-s) extension of α' is univalent.

Keeping Things Bivalent
  [Figure: α' (rounds 1 to k-1, bivalent) followed by alternative round-k extensions:]
    failure-free round k                              1-val
    …
    p_i fails to send to q_1,…,q_j                    1-val
    p_i fails to send to q_1,…,q_{j+1}                0-val
    …
    p_i crashes, failing to send to q_1,…,q_m         0-val
  Now focus in on the two adjacent extensions where the valence switches (q_1,…,q_j versus q_1,…,q_{j+1}).

Keeping Things Bivalent
  [Figure: the two round-k extensions of α' (rounds 1 to k-1):]
    p_i fails to send to q_1,…,q_j        1-val
    p_i fails to send to q_1,…,q_{j+1}    0-val
  Only q_{j+1} can tell the difference.
  Extend both: q_{j+1} fails in round k+1; no other failures.
  In the first extension the n.f. processors decide 1. The second looks the same to them, so they decide 1 there as well, contradicting its being 0-valent.

Cannot Decide in Round f
  We've shown there is an (f-1)-round (failure-sparse) execution, call it α, ending in a bivalent configuration.
  Extending this execution to f rounds might not preserve bivalence.
  However, we can keep a processor from explicitly deciding in round f, thus requiring at least one more round (f+1).

Cannot Decide in Round f
  Case 1: There is a 1-round (f-s) extension of α ending in a bivalent config. Then we are done.
  Case 2: All 1-round (f-s) extensions of α end in univalent configs.

Cannot Decide in Round f
  [Figure: α (rounds 1 to f-1, bivalent) followed by three round-f extensions:]
    (a) failure free: p_i sends to p_j and p_k                            1-val
    (b) p_i fails to send to n.f. p_j, sends to another n.f. p_k
    (c) p_i fails to send to n.f. p_j; p_i might send to p_k              0-val
  (a) and (b) look the same to p_k, so in (b) p_k is either undecided or decided 1.
  (b) and (c) look the same to p_j, so in (b) p_j is either undecided or decided 0.
  By agreement, p_j and p_k cannot both have decided in (b), so some nonfaulty processor is still undecided at the end of round f.
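
The notions of valence and of a bivalent initial configuration used in the lower-bound proof can be explored by brute force on a tiny instance. The sketch below is an illustration added here, not part of the original slides. It computes valence for one specific algorithm only (the minimum-flooding crash algorithm run for f + 1 rounds) by enumerating every failure-sparse crash pattern. The names `run`, `failure_sparse_plans`, and `valence`, and the crash-plan encoding, are hypothetical.

```python
from itertools import combinations, permutations, product


def run(inputs, f, plan):
    """Run minimum flooding for f+1 rounds.

    plan maps a faulty processor to (crash_round, recipients of its last message).
    Returns the set of values decided by processors not in plan.
    """
    n = len(inputs)
    v, sent, crashed = list(inputs), [set() for _ in inputs], set()
    for r in range(1, f + 2):
        inbox = [[] for _ in range(n)]
        for p in range(n):
            if p in crashed:
                continue
            dying = p in plan and plan[p][0] == r
            if v[p] not in sent[p]:               # send v only if it is new
                for q in (plan[p][1] if dying else range(n)):
                    inbox[q].append(v[p])
                sent[p].add(v[p])
            if dying:
                crashed.add(p)
        for p in range(n):
            if p not in crashed:
                v[p] = min([v[p]] + inbox[p])
    return {v[p] for p in range(n) if p not in plan}


def subsets(items):
    """All subsets of items, as sets."""
    items = list(items)
    return [set(c) for k in range(len(items) + 1) for c in combinations(items, k)]


def failure_sparse_plans(n, f):
    """All crash plans with at most f crashes and at most one crash per round."""
    plans = [{}]
    for k in range(1, f + 1):
        for procs in permutations(range(n), k):               # who crashes
            for rounds in combinations(range(1, f + 2), k):   # distinct rounds
                options = [subsets(set(range(n)) - {p}) for p in procs]
                for recips in product(*options):
                    plans.append({p: (r, s)
                                  for p, r, s in zip(procs, rounds, recips)})
    return plans


def valence(inputs, f):
    """Values decided by nonfaulty processors over all failure-sparse runs."""
    decided = set()
    for plan in failure_sparse_plans(len(inputs), f):
        decided |= run(inputs, f, plan)
    return decided


if __name__ == "__main__":
    for inputs in product((0, 1), repeat=3):
        print(inputs, sorted(valence(inputs, f=1)))
```

For n = 3 and f = 1, the all-0 and all-1 configurations come out univalent, as validity requires, while a configuration such as (0, 1, 1) is bivalent: the lone 0 either reaches the others or is lost when its holder crashes in round 1. This only illustrates the definitions for one algorithm; the theorem itself concerns every crash-resilient consensus algorithm.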