Leader Election

Leader Election: the idea
We study Leader Election in rings.

Why rings?
• historical reasons
  – original motivation: regenerate lost token in token ring networks
• illustrates techniques and principles
• good for lower bounds and impossibility results

Outline
• Specification of Leader Election
• YAIR
• Leader election in asynchronous rings:
  – An $O(n^2)$ algorithm
  – An $O(n \log n)$ algorithm
  – The revenge of the lower bound!
• Leader election in synchronous rings
  – Breaking the $\Omega(n \log n)$ barrier

Message passing: Model
• n processors p_0, …, p_{n-1}
• connected by bi-directional communication channels
• topology represented by undirected graph
[Figure: processors p_0 … p_4 drawn as an undirected graph; some links may be missing]

Processors
Each p_i is a state machine
• state set Q_i (could be infinite)
• distinguished initial states
p_i’s state includes
• outbuf_i[l]: set of messages sent on l-th channel and not yet delivered (outbuf_i not accessible)
• inbuf_i[l]: set of messages delivered on l-th channel and not yet processed (inbuf_i initially empty)

State Transitions
A state transition:
• input: accessible state of p_i (doesn’t depend on outbuf_i)
• consumes all messages in inbuf_i
• outputs at most one message per channel

Terminology
Definition: A configuration is a vector C = (q_0, …, q_{n-1})
• each q_i is a state of p_i
• the set of outbuf_i are the messages in transit
In an initial configuration each q_i is an initial state of p_i.

Definition: An event is
• a computation event comp(i), or
• a delivery event del(i,j,m)

Definition: An execution is an infinite sequence C_0, f_0, C_1, f_1, … where
• C_0 is an initial configuration
• each C_i is a configuration
• each f_i is an event

Definition: A schedule for the above execution is the sequence of events f_0, f_1, …

Safety and Liveness
Safety property: “nothing bad happens”
• holds in every finite execution prefix
  – Windows™ never crashes
  – if one general attacks, both do
  – a program never terminates with a wrong answer
Liveness property: “something good eventually happens”
• no partial execution is
irremediable
  – Windows™ always reboots
  – both generals eventually attack
  – a program eventually terminates
Admissible executions satisfy safety and liveness properties for a particular system type.

A really cool theorem
Every property is the conjunction of a safety property and a liveness property (Alpern and Schneider).

Asynchronous Message-Passing Systems
In an execution C_0, f_0, C_1, f_1, C_2, …
if f_k = del(i,j,m):
• in C_k
  – m is in outbuf_i[l], where l is p_i’s label for channel {p_i, p_j}
• in C_{k+1}
  – m is removed from outbuf_i[l]
  – m is added to inbuf_j[h], where h is p_j’s label for channel {p_i, p_j}
if f_k = comp(i):
• p_i changes state according to its transition function
• inbuf_i, as it stood in C_k, is emptied
• messages may be added to outbuf_i in C_{k+1}
Admissible if:
• Every processor takes an infinite number of computation steps
• Every message sent is eventually delivered

Synchronous Message-Passing Systems
In an execution C_0, f_0, C_1, f_1, C_2, …
• all asynchronous constraints, plus
• execution partitioned into disjoint rounds
• each round: one delivery event for every message in every outbuf, followed by one computation event for every processor
Remarks
• not realistic, but
  – good for algorithm design
  – good for lower bounds

Complexity
TIME
• each processor’s state set includes terminated states
• termination:
  – all processors in terminated states
  – no messages in transit
• Synchronous: count number of rounds until termination
• Asynchronous: set unit of time as maximum message delay
MESSAGES
• Count maximum total number of messages

The Problem
• Final states of processes are partitioned in two classes: elected and non-elected
• Once a process enters a final state, it stays in that state
• In every admissible execution, exactly one process (the leader) enters an elected state. All remaining processes enter a non-elected state

Lots of variations...
• The ring can be unidirectional or bidirectional
• The number n of processors may be known or unknown
• Processors can be identical or can be somehow distinguished
• Communication may be synchronous or asynchronous

Uni- vs.
Bidirectional
In unidirectional rings, messages can only be sent in a clockwise direction.

Can processors be distinguished?
If no: anonymous algorithms
• Processors have no UID
• Formally: identical automata
• Can distinguish between left and right

Can processors be distinguished?
If yes:
• processors have unique IDs
• chosen from some large totally ordered space of IDs (e.g. $\mathbb{N}^+$)
• no constraint on which IDs are used (e.g. integers may not be consecutive)
• IDs can be manipulated either only by certain operations (e.g. comparison)
• or by unrestricted operations

Is n known?
If no: uniform algorithms
• Algorithm cannot use information about ring size

Communication: Asynchronous vs. Synchronous
Asynchronous:
• no upper bound on message delivery time
• no centralized clock
• no bound on relative speed of processes
Synchronous:
• communication in rounds
• In a round a process:
  – delivers all pending messages
  – takes an execution step (which may involve sending one or more messages)
If there are no failures, every message sent is eventually delivered.

An Impossibility Result
Theorem: There is no deterministic solution to the leader election problem for a synchronous, non-uniform, anonymous bidirectional ring.
Proof: Suppose that a solution exists for a system A of n > 1 processes. Each process of A starts in the same state.
Lemma: The states of all processors at the end of each round of the execution of A are the same.
Proof: By induction on the number of rounds k
• Base case: k = 0. Easy, since processes start in the same state.
• Inductive step: assume the Lemma holds for k = t-1
  – processors are in identical states up to round k = t-1
  – so they send the same messages to their left and right neighbors
  – every processor receives identical messages on its left and right channels
  – all processors apply the same transition function to identical states in round t
  – all processors have identical states at the end of round t
Then, if one enters the leader state, all do!

Observations
• What are the implications for asynchronous rings?
• What are the implications for uniform rings?

The LCR Algorithm
LeLann (1977), Chang and Roberts (1979)
• unidirectional
• asynchronous
• non-anonymous: every process has a UID
• uniform (does not depend on n)

upon receiving no message
    send uid_i to left (clockwise)
upon receiving m from right
    case
        m.uid > uid_i :
            send m to left
        m.uid < uid_i :
            discard m
        m.uid = uid_i :
            leader := i
            send <terminate, i> to left
            terminate
    endcase
upon receiving <terminate, j> from right neighbor
    leader := j
    send <terminate, j> to left
    terminate

Correctness
• messages from the process with the highest ID are never discarded
• therefore the correct leader is elected
• no other processor ID can traverse the entire ring
• therefore no one else is elected

Complexity
Message complexity: $O(n^2)$
Time complexity: $O(n)$
This bound is tight…
[Figure: ring of processes with UIDs 0, 1, 2, …, n-2, n-1]
Can we do better?
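Before moving on, the LCR rules and the tightness of the quadratic bound can be checked with a small simulation. This is a minimal sketch, not the authors' code: the name `lcr_elect`, the convention that process i sends to process (i + 1) % n, and the FIFO delivery order are all assumptions made for illustration, and `<terminate>` messages are not counted.

```python
from collections import deque


def lcr_elect(uids):
    """Simulate LCR on a unidirectional ring (hypothetical helper).

    uids[i] is the UID of process i; process i sends to process (i + 1) % n
    (an assumed orientation). Returns (leader_uid, election_messages);
    <terminate> messages are not counted.
    """
    n = len(uids)
    # Every process starts by sending its own UID to its neighbor.
    in_transit = deque(((i + 1) % n, uid) for i, uid in enumerate(uids))
    messages = n
    leader = None
    while in_transit:
        dest, uid = in_transit.popleft()
        if uid > uids[dest]:
            # Larger UID: forward it one more hop.
            in_transit.append(((dest + 1) % n, uid))
            messages += 1
        elif uid == uids[dest]:
            # The token made it all the way around: dest is the leader.
            leader = uid
        # Smaller UID: the message is discarded.
    return leader, messages


if __name__ == "__main__":
    n = 5
    worst = list(range(n - 1, -1, -1))  # UIDs decrease along the ring
    best = list(range(n))               # UIDs increase along the ring
    print(lcr_elect(worst))
    print(lcr_elect(best))
```

With UIDs decreasing in the direction of travel, the token carrying UID k is sent k + 1 times, for n(n + 1)/2 messages in total; with UIDs increasing, only the maximum travels far and the total is 2n - 1.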
The HS algorithm
Hirschberg and Sinclair (1980)
• Ring is bidirectional
• Each process p_i operates in phases
• In each phase l, p_i sends out “tokens” containing uid_i in both directions
• Tokens are intended to travel distance $2^l$ and return to p_i
• However, tokens may not make it back
[Figure: token ranges around a process for phases 0, 1 and 2]
• Token continues outbound only if its UID is greater than the UIDs of the processes on its path
• Otherwise it is discarded
• All processes always forward tokens moving inbound
If p_i receives its own token while it is going outbound, p_i is the leader.

The Protocol
Init: asleep := true; phase := 0
upon receiving no message
    if asleep then
        asleep := false
        send <uid_i,out,1> to left and right
upon receiving <uid_j,out,h> from left
    case
        uid_j > uid_i and h > 1 :
            send <uid_j,out,h-1> to right
        uid_j > uid_i and h = 1 :
            send <uid_j,in,1> to left
        uid_j = uid_i :
            leader := i
    endcase
upon receiving <uid_j,out,h> from right
    case
        uid_j > uid_i and h > 1 :
            send <uid_j,out,h-1> to left
        uid_j > uid_i and h = 1 :
            send <uid_j,in,1> to right
        uid_j = uid_i :
            leader := i
    endcase
upon receiving <uid_j,in,1> from right, with uid_j ≠ uid_i
    send <uid_j,in,1> to left
upon receiving <uid_j,in,1> from left, with uid_j ≠ uid_i
    send <uid_j,in,1> to right
upon receiving <uid_i,in,1> from left and right
    phase := phase + 1
    send <uid_i,out,2^phase> to left and right

Correctness
Same as LCR:
• messages from the process with the highest ID are never discarded
• therefore the correct leader is elected
• no other processor ID can traverse the entire ring
• therefore no one else is elected

Communication Complexity
• Every processor sends a token in phase 0: at most 4n messages
• For phase l > 0,
  – the only processors to send tokens are those who “won” in phase l-1
  – there is at most one winner for every $2^{l-1}+1$ processors
  – winners entering phase l > 0: at most $\lfloor n / (2^{l-1}+1) \rfloor$
  – tokens travel distance $2^l$
  – total number of messages sent in phase l is bounded by $4 \cdot 2^l \cdot \lfloor n / (2^{l-1}+1) \rfloor \le 8n$
• Total number of phases: at most $1 + \lceil \log n \rceil$
• No.
of messages bounded by $8n(1 + \lceil \log n \rceil)$, which is $O(n \log n)$

Time Complexity
• Time for each phase l: $2 \cdot 2^l = 2^{l+1}$
• Final phase takes n (tokens only traveling outbound)
• Next to last phase is $l = \lceil \log n \rceil - 1$
• Total time complexity excluding last phase: at most $2 \cdot 2^{\lceil \log n \rceil}$
Time complexity is at most 3n (if n is a power of 2) to 5n (otherwise)

The revenge of the lower bound
So far we have seen:
• a simple $O(n^2)$ algorithm
• a more clever $O(n \log n)$ algorithm
• focus on message complexity
Facts:
• $\Omega(n \log n)$ lower bound in asynchronous networks
• $\Omega(n \log n)$ lower bound in synchronous networks when using only comparisons

Outline
• Specification of Leader Election
• YAIR
• Leader election in asynchronous rings:
  – An $O(n^2)$ algorithm
  – An $O(n \log n)$ algorithm
  – The revenge of the lower bound!
• Leader election in synchronous rings
  – Breaking the $\Omega(n \log n)$ barrier
• The rise and fall of randomization

Leader Election with fewer than $O(n \log n)$ messages
• Synchronous rings
• UIDs are positive integers
• Can be manipulated using arbitrary arithmetic operations

TimeSlice
• n is known to all processors
• unidirectional communication
• $O(n)$ messages
VariableSpeeds
• n is not known to all processors
• unidirectional communication
• $O(n)$ messages
What about Time complexity?

What is special about synchronous rings?
• Can convey information by not sending a message
  “when your phone doesn’t ring, it’s me”

TimeSlice
Runs in phases
• each phase consists of n rounds
• in phase $i \ge 0$
  – if no one has been elected yet
  – the processor with id i
  – declares itself the leader
  – sends a token with its UID around the ring
Message complexity: n
Time complexity: $n \cdot \mathrm{UID}_{\min}$

VariableSpeeds
• Each process p_i initiates a token
• Different tokens travel at different speeds:
  – for the token carrying UID_v, 1 message every $2^{\mathrm{UID}_v}$ rounds
  – (each process waits $2^{\mathrm{UID}_v}$ rounds after receiving the token before sending it out)
• Each process keeps track of the smallest UID seen
• Discard token with UID greater than the smallest UID seen

Complexity Analysis
• By the time UID_min goes around the ring, the second smallest UID has gone at most half way, the third smallest at most a fourth of the way, etc.
• Forwarding the token carrying UID_min has caused more messages than all the other tokens combined
• Message complexity bounded by 2n
• Time complexity: $n \cdot 2^{\mathrm{UID}_{\min}}$

Variable start times
Processors can start the protocol at different times
• processors that wake up spontaneously (participants) send a token with their UID around the ring
• processors that wake up on receiving a UID (relays) do not initiate their own token

A message life cycle
• A message is in phase one
  – until it is received by an awake processor
  – forwarded immediately
• A message is in phase two
  – once received by an awake processor
  – forwarded after $2^{\mathrm{UID}_i} - 1$ rounds

The New Algorithm
When a participant receives a message from p_i:
• if UID_i is larger than the minimal UID seen (including its own), swallow it
• otherwise, delay it for $2^{\mathrm{UID}_i} - 1$ rounds before forwarding (UID_i is then the minimal UID seen)
When a relay receives a message from p_i:
• if UID_i is larger than the minimal UID seen (not including its own), swallow it
• otherwise, delay it for $2^{\mathrm{UID}_i} - 1$ rounds before forwarding (UID_i is then the minimal UID seen)

Correctness
Lemma: Only the participant processor with the smallest identifier receives its token back
Proof:
• Let p_i be the participating processor with the smallest UID
• No processor can swallow UID_i
• All tokens must go through p_i, and will be
swallowed
• No other processor can receive its token back

Complexity
Three categories of messages:
• phase one messages
• phase two messages sent before the message of the eventual leader enters its second phase
• phase two messages sent after the message of the eventual leader enters its second phase

Complexity
Lemma: The total number of messages in the first category is at most n.
Proof: The lemma follows because at most one phase one message is forwarded by each processor
• Suppose p_i forwards two phase 1 messages, carrying UID_j and UID_k
• Assume, WLOG, that p_j is closer to p_i than p_k
• Then the phase 1 message with UID_k must go through p_j
• If p_j is awake, then it becomes a phase 2 message
• Otherwise, p_j becomes a relay and does not send its UID
• Either way, p_i cannot forward both as phase 1 messages, a contradiction

Complexity
Lemma: The total number of messages in the second category is at most n
Proof:
• After the first process awakens, it takes at most n rounds before the message with UID_min reaches a participant
• During this time, the token with UID_v is responsible for at most $n / 2^{\mathrm{UID}_v}$ messages
• The maximum number of messages is obtained when UIDs are small (0, 1, …, n-1)
• Max number of messages in second category: $\sum_{v=1}^{n-1} n / 2^{v} \le n$

Complexity
Lemma: The total number of messages in the third category is at most 2n
Proof: analogous to the complexity analysis for VariableSpeeds

In summary:
Message complexity: at most 4n
Time complexity: $n + n \cdot 2^{\mathrm{UID}_{\min}}$

And now for something completely different...
RANDOMIZATION

Randomized Algorithms
Extend transition function to accept as input
• a random number
• from a bounded range
• under some fixed distribution

Why is it important?
The bad news: randomization alone does not generally affect
• impossibility results
  – leader election in anonymous network still impossible!
• worst case bounds
The good news: randomization + weakening of problem statement does

Example: Randomized Leader Election
• Impossibility in anonymous rings still holds
• but can now elect a leader with some probability
• So weaken LE as follows
Safety: In every configuration of every admissible execution, at most one processor is in an elected state
Liveness: At least one processor is elected with some non-zero probability
Behaviors allowed by the weakened specification:
• terminate without a leader
• never terminate

Back to Leader Election
• Use randomization to have processes generate a pseudo identifier
• Use a deterministic leader election algorithm to work with pseudo identifiers
• Not just any deterministic LE algorithm:
  – needs to work correctly if multiple processes generate the same pseudo id
  – a plus is the ability to detect if no leader is elected

A first result
Assume
• synchronous ring
• non-uniform ring
• processors can randomly choose identifiers
Theorem: There is a randomized algorithm which, with probability c > 1/e, elects a leader in a synchronous ring; the algorithm sends $O(n^2)$ messages

The Algorithm
Code for processor p_i
Initially
    pid_i := 1 with probability 1 - 1/n
             2 with probability 1/n
    send <pid_i> to left
upon receiving <S> from right
    if |S| = n then
        if pid_i is the unique max(S) then
            elected := true
        else
            elected := false
    else
        send <S || pid_i> to left
Observations:
• randomization used once
• one execution for each element of $\mathcal{R} = \{1,2\}^n$
Definitions
• exec(R): execution of R in $\mathcal{R}$
• Given a predicate P on executions, Pr[P]: probability of event {R ∈ $\mathcal{R}$ : exec(R) satisfies P}

Analysis
What is the probability that the algorithm terminates with a leader?
$\binom{n}{1} \cdot \frac{1}{n} \left(1 - \frac{1}{n}\right)^{n-1} = \left(1 - \frac{1}{n}\right)^{n-1} = c > \frac{1}{e}$
Message complexity: $O(n^2)$

Not good enough?
Trade off more time and messages for higher probability of success
• if |S| = n and p_i detects no single max in S
  – choose new pid_i
  – restart algorithm
• $\mathcal{R}$ (the set of random choice vectors, previously $\{1,2\}^n$) becomes a set of n-tuples, each of which is a possibly infinite sequence over {1,2}

Analysis
Probability of success in iteration k: $(1-c)^{k-1} \cdot c$
Time complexity:
• worst-case number of iterations: unbounded ($\infty$)
• expected number of iterations: $1/c < e$
Expected value of T: $E[T] = \sum_{x} x \cdot \Pr[T = x]$, with x ranging over the values of T
Expected message complexity: $O(n^2)$

Impossibility of Uniform Algorithms
Theorem: There is no uniform randomized algorithm for leader election in a synchronous anonymous ring that terminates in even a single execution for a single ring size

Summary
• No deterministic solution for anonymous rings
• No solution for uniform anonymous rings (even when using randomization)
• Protocols with $O(n^2)$ and $O(n \log n)$ messages for uniform rings
• $\Omega(n \log n)$ lower bound on message complexity for practical protocols
• $O(n)$ message complexity for uniform synchronous rings
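
As a numerical check on the randomized analysis, the success probability c = (1 - 1/n)^(n-1) and the expected number of iterations 1/c can be evaluated directly. This is a small sketch under stated assumptions: the function names are hypothetical, and `truncated_expectation` approximates the infinite sum for E[T] by cutting it off after a fixed number of terms.

```python
import math


def success_probability(n):
    """Probability that exactly one of n processors draws pseudo-id 2,
    when each draws 2 independently with probability 1/n."""
    return (1 - 1 / n) ** (n - 1)


def expected_iterations(n):
    """Mean of a geometric distribution with success probability c."""
    return 1 / success_probability(n)


def truncated_expectation(c, terms):
    """Partial sum of E[T] = sum_k k * (1 - c)^(k - 1) * c (hypothetical helper)."""
    return sum(k * (1 - c) ** (k - 1) * c for k in range(1, terms + 1))


if __name__ == "__main__":
    for n in (2, 5, 50, 1000):
        c = success_probability(n)
        # c stays above 1/e, so the expected number of iterations stays below e.
        print(n, round(c, 4), round(expected_iterations(n), 4), c > 1 / math.e)
```

For every ring size tried, c stays above 1/e and the expected number of iterations stays below e, which is what makes the expected message complexity of the restarting algorithm a constant multiple of the single-shot cost.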