Set 18: Wait-Free Simulations Beyond Registers
CSCE 668: Distributed Algorithms and Systems
Fall 2011, Prof. Jennifer Welch

Data Types Beyond Registers

- Registers support the operations read and write.
- We've seen wait-free simulations of one kind of register out of another kind: different numbers of values, readers, writers.
- What about (wait-free) simulating a significantly different kind of data type out of registers?
- More generally, what about (wait-free) simulating an object of type X out of objects of type Y?

Key Insight

- The ability of objects of type Y to be used to simulate an object of type X is related to the ability of those data types to solve consensus!
- We are focusing on systems that are
  - asynchronous
  - shared memory
  - wait-free

FIFO Queue Example

Sequential specification of a FIFO queue:
- operation with invocation enq(x) and response ack
- operation with invocation deq and response return(x)
- a sequence of operations is allowable iff each deq returns the oldest enqueued value that has not yet been dequeued (returns ⊥ if the queue is empty)

Consensus Algorithm for n = 2 Using FIFO Queue

Shared objects: one FIFO queue Q and two registers Prefer[0], Prefer[1].
Initially Q = [0] and Prefer[i] = ⊥ for i = 0, 1.

Code for pi:

    Prefer[i] := pi's input       // write my input into my register
    val := deq(Q)                 // use shared queue to arbitrate between the 2 procs
    if val = 0 then
        decide on pi's input      // first one to dequeue the initial 0 wins; decision value is its input
    else
        temp := Prefer[1 - i]     // loser obtains decision value from other proc's register
        decide temp

Implications of Consensus Algorithm Using FIFO Queue

Suppose we want to wait-free simulate a FIFO queue using read/write registers.
Is this possible? No!
If it were possible, we could solve consensus:
- simulate a FIFO queue using registers
- use the simulated queue and the previous algorithm to solve consensus
This would contradict the impossibility of wait-free consensus using only read/write registers.

Extend Algorithm to More Procs?

- Can we use FIFO queues to solve consensus with more than 2 procs?
- The ability to atomically dequeue a value was key to the 2-proc algorithm:
  - one proc learns it is the winner
  - the other learns it is the loser, therefore the id of the winner is obvious
- It is not clear how to handle 3 procs.
- Suppose we have a different data type:

Compare & Swap Specification

    compare&swap(X: shared memory address, old: value, new: value)
        previous := X             // previous is a local var
        if previous = old then X := new
        return previous

Consensus Algorithm Using Compare-and-Swap

Shared object: one C&S object First.
Initially First = ⊥.

    val := compare&swap(First, ⊥, my input)   // if First = ⊥ then replace with my input
    if val = ⊥ then
        decide on my input
    else
        decide val

The single compare&swap simultaneously indicates the winner and the value to be decided by all the losers.

Impossibility of 3-Proc Consensus with FIFO Queue

Theorem (15.3): Wait-free consensus is impossible using FIFO queues and registers if n > 2.

Proof: Same structure as for registers. The key difference arises when considering the situation in which C is bivalent, p0(C) is 0-valent, and p1(C) is 1-valent.

Impossibility of 3-Proc Consensus with FIFO Queues

p0 and p1 must be accessing the same FIFO queue.

Case 1: Both steps are deq's.

    From C (bivalent):
      p0 deq's, then p1 deq's  ->  0-valent
      p1 deq's, then p0 deq's  ->  1-valent

The two resulting configurations look the same to p2, so p2 running alone would decide the same value in both: a contradiction.

Impossibility Proof

Case 2: p0 deq's and p1 enq's.
Case 2.1: The queue is not empty in C.

    From C (bivalent):
      p0 deq's, then p1 enq's  ->  0-valent
      p1 enq's, then p0 deq's  ->  1-valent

On a nonempty queue the two steps commute, so both orders lead to the same configuration, which would have to be both 0-valent and 1-valent: a contradiction.
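The commutation claim in Case 2.1 can be checked on a small sequential model. The sketch below is only an illustration: the helper `apply_steps`, the queue contents `"X"`, and the enqueued value `"B"` are hypothetical names chosen here, not part of the slides. It applies p0's deq and p1's enq in both orders and compares the resulting queue contents and the value p0 dequeues.

```python
from collections import deque


def apply_steps(initial, steps):
    """Apply ('enq', x) / ('deq',) steps in order.

    Returns (final queue contents, list of values returned by the deq steps).
    None plays the role of the empty-queue response.
    """
    q = deque(initial)
    returned = []
    for step in steps:
        if step[0] == "enq":
            q.append(step[1])
        else:
            returned.append(q.popleft() if q else None)
    return list(q), returned


p0_step = ("deq",)       # p0's pending step
p1_step = ("enq", "B")   # p1's pending step (hypothetical value "B")

# Nonempty queue: both orders give the same queue and the same deq result.
order_a = apply_steps(["X"], [p0_step, p1_step])
order_b = apply_steps(["X"], [p1_step, p0_step])

# Empty queue: the two orders differ, so the steps do not commute.
empty_a = apply_steps([], [p0_step, p1_step])
empty_b = apply_steps([], [p1_step, p0_step])
```

With a nonempty queue the two orders are indistinguishable to every process; with an empty queue they are not, which is why the empty-queue case needs a separate argument.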
Impossibility Proof

Case 2: p0 deq's and p1 enq's.
Case 2.2: The queue is empty in C.

    From C (bivalent, queue is empty):
      p0 deq's (queue is still empty)                  ->  0-valent
      p1 enq's, then p0 deq's (queue is empty again)   ->  1-valent

The two resulting configurations look the same to p2.

Impossibility Proof

Case 3: Both p0 and p1 enq (on the same queue).

    From C (bivalent):
      Left:   p0 enq's A, then p1 enq's B                  (0-valent)
              then p0 takes steps until deq'ing A,
              then p1 takes steps until deq'ing B          ->  0-valent
      Right:  p1 enq's B, then p0 enq's A                  (1-valent)
              then p0 takes steps until deq'ing B,
              then p1 takes steps until deq'ing A          ->  1-valent

The two final configurations look the same to p2.
Why do these step sequences of p0 and p1 exist?

Impossibility Proof

Case 3 cont'd: Suppose the step sequence of p0 (running until it dequeues A) does not exist:

    From C (bivalent):
      Left:   p0 enq's A, then p1 enq's B                  (0-valent)
              p0 takes steps until deciding but never deq's A; decides 0
      Right:  p1 enq's B, then p0 enq's A                  (1-valent)
              p0 takes the same number of steps as on the left; never deq's B; also decides 0

Deciding 0 on the right contradicts that configuration being 1-valent.

Impossibility Proof

Case 3 cont'd: Prove the existence of the step sequence of p1 similarly.

Thus there is no wait-free algorithm for consensus with 3 procs using FIFO queues and registers.

Implications

Suppose we want to wait-free simulate a compare&swap object using FIFO queues (and registers).
Is this possible? Not if n > 2!
If it were possible, we could solve consensus using FIFO queues (and registers):
- simulate a compare&swap object using FIFO queues (and registers)
- use the simulated compare&swap object and the c&s algorithm to solve consensus

Generalize these Arguments

The previous results concerning FIFO queues and compare&swap suggest a criterion for determining whether wait-free simulations exist: it is based on the ability of the data types to solve consensus for a certain number of procs.
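For reference, the two consensus algorithms these results build on (the 2-proc FIFO queue algorithm and the compare&swap algorithm) can be sketched in Python. This is a minimal sketch under stated assumptions, not the course's implementation: Python has no hardware compare&swap, so a `threading.Lock` stands in for the atomicity of each shared object, `None` plays the role of ⊥ (so inputs must not be `None`), and the class and helper names (`FifoQueue`, `CompareAndSwap`, `QueueConsensus`, `CasConsensus`, `run_concurrently`) are hypothetical.

```python
import threading
from collections import deque

BOTTOM = None  # stands in for the special "undefined" value


class FifoQueue:
    """Atomic FIFO queue; the lock models atomicity (assumption)."""

    def __init__(self, initial=()):
        self._items = deque(initial)
        self._lock = threading.Lock()

    def deq(self):
        with self._lock:
            return self._items.popleft() if self._items else BOTTOM


class CompareAndSwap:
    """Atomic compare&swap object; the lock models atomicity (assumption)."""

    def __init__(self, initial=BOTTOM):
        self._value = initial
        self._lock = threading.Lock()

    def compare_and_swap(self, old, new):
        with self._lock:
            previous = self._value
            if previous == old:
                self._value = new
            return previous


class QueueConsensus:
    """Consensus for 2 procs (ids 0 and 1) from one queue and two registers."""

    def __init__(self):
        self.q = FifoQueue([0])          # initially Q = [0]
        self.prefer = [BOTTOM, BOTTOM]   # the two shared registers

    def decide(self, i, my_input):
        self.prefer[i] = my_input        # write my input into my register
        val = self.q.deq()               # arbitrate through the queue
        if val == 0:
            return my_input              # winner: dequeued the initial 0
        return self.prefer[1 - i]        # loser: take the winner's input


class CasConsensus:
    """Consensus for any number of procs from one compare&swap object."""

    def __init__(self):
        self.first = CompareAndSwap(BOTTOM)

    def decide(self, my_input):
        val = self.first.compare_and_swap(BOTTOM, my_input)
        return my_input if val is BOTTOM else val


def run_concurrently(calls):
    """Run each zero-argument callable in its own thread; return the results."""
    results = [None] * len(calls)

    def worker(k):
        results[k] = calls[k]()

    threads = [threading.Thread(target=worker, args=(k,)) for k in range(len(calls))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


qc = QueueConsensus()
queue_decisions = run_concurrently([lambda: qc.decide(0, "a"), lambda: qc.decide(1, "b")])

cc = CasConsensus()
cas_decisions = run_concurrently([lambda v=v: cc.decide(v) for v in ("x", "y", "z")])
```

In both cases every proc returns the same value, and that value is some proc's input. Which input wins depends on thread scheduling, so no particular winner should be expected.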
Consensus Number

Data type X has consensus number n if n is the largest number of procs for which consensus can be solved using only objects of type X and read/write registers.

    data type              consensus number
    read/write register    1
    FIFO queue             2
    compare&swap           ∞

Using Consensus Numbers

Theorem (15.5): If data type X has consensus number m and data type Y has consensus number n with n > m, then there is no wait-free simulation of an object of type Y using objects of type X and read/write registers in a system with more than m procs.

[Figure: objects of type X and registers (X, reg, X, reg, ...) used to simulate an object of type Y]

Using Consensus Numbers

Proof: Suppose in contradiction there is a wait-free simulation S of Y using X and registers in a system with k procs, where m < k ≤ n.
Construct a consensus algorithm for k > m procs using objects of type X (and registers):
- Use S to simulate some objects of type Y using objects of type X (and registers).
- Use the (simulated) type Y objects (and registers) in the k-proc consensus algorithm that exists since CN(Y) = n.
This contradicts CN(X) = m.

Corollaries

- There is no wait-free simulation of any object with consensus number > 1 using just read/write registers.
- There is no wait-free simulation of any object with consensus number > 2 using just FIFO queues and read/write registers.

Universality

- Let's now consider positive results relating to consensus number.
- A data type is universal if objects of that type (together with read/write registers) can wait-free simulate any data type.
- Theorem: If data type X has consensus number n, then it is universal in a system with at most n procs.

Proving Universality Result

1. Describe an algorithm that simulates any data type:
   - uses compare&swap (instead of any object with consensus number n)
   - simulation is only non-blocking, which is weaker than wait-free
2. Modify to use any object with consensus number n
3. Modify to be wait-free
4. Modify to bound shared memory used

Non-Blocking

- Non-blocking vs. wait-free is analogous to no-deadlock vs. no-lockout for mutual exclusion.
- Non-blocking simulation: at any point in an execution, if at least one operation is pending (its response is not yet ready to be done), then there is a finite sequence of steps by a single proc that completes one of the pending operations.
- It does not ensure that every pending operation is eventually completed.

Universal Construction

- Keep the history of operations that have been applied to the simulated object as a shared linked list.
- To apply an operation on the simulated object, the invoking proc must insert an appropriate "node" into the linked list:
  - it is convenient to put the newest node at the head of the list
- A compare&swap object is used to keep track of the head of the list.

Details on Linked List

Each linked list node has
- operation invocation
- new state of the simulated object
- operation response
- pointer to previous node (previous op)

[Figure: Head -> (invocation, state, response, before) -> (invocation, state, response, before) -> anchor (initial state)]

Simulation

Initially Head points to the anchor node, which represents the initial state of the simulated object.

When inv is invoked:

    allocate a new linked list node in shared memory, pointed to by local var point
    point.inv := inv
    repeat
        h := Head                                             // h is a local var
        point.state, point.response := apply(inv, h.state)    // apply depends on the simulated data type
        point.before := h
    until compare&swap(Head, h, point) = h    // if Head still points to the same node h points to, then make Head point to the new node
    do the output indicated by point.response

Simulation Figure

[Figure: pi's local vars point and h. point refers to the new node (invocation, state, response, before); h refers to the node (invocation, state, response, before) that Head pointed to when it was read.]

If compare&swap indicates that Head has moved on, then try again to insert the new node, at the new location.

Strengthenings of Algorithm

To replace the compare&swap object with any object with consensus number n (the number of procs):
- define a consensus object (data type version of the consensus problem)
- get around the difficulty that a consensus object can only be used once by adding a consensus object to each linked list node that points to the next node in the list

Strengthenings of Algorithm

- To get a wait-free implementation, use the idea of helping: procs help each other to finish pending operations (not just their own).
- To reduce the size of the linked list (so it doesn't grow without bound), need to keep track of which list nodes can be recycled.

Effect of Randomization

- Suppose we relax the liveness condition for linearizable shared memory: operations must terminate with high probability.
- Now a randomized consensus algorithm can be used to simulate any data type out of any other data type, including read/write registers.
- I.e., the hierarchy collapses.
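The non-blocking simulation from the Simulation slide can be sketched in Python. This is a sketch under stated assumptions rather than the course's code: a `threading.Lock` stands in for the atomicity of the compare&swap object (so the sketch shows the structure of the algorithm, not true lock-freedom), the `read` method on the compare&swap object is an added convenience for `h := Head`, states are immutable tuples so that nodes can share them safely, and the names `SimulatedObject`, `Node`, `queue_apply`, and `history` are hypothetical.

```python
import threading


class CompareAndSwap:
    """Compare&swap object; the lock models atomicity (assumption)."""

    def __init__(self, initial):
        self._value = initial
        self._lock = threading.Lock()

    def read(self):
        with self._lock:
            return self._value

    def compare_and_swap(self, old, new):
        with self._lock:
            previous = self._value
            if previous is old:      # pointer comparison on list nodes
                self._value = new
            return previous


class Node:
    """List node: invocation, new state, response, pointer to previous node."""

    def __init__(self, inv=None, state=None, response=None, before=None):
        self.inv, self.state = inv, state
        self.response, self.before = response, before


class SimulatedObject:
    def __init__(self, apply, initial_state):
        self._apply = apply          # apply(inv, state) -> (new state, response)
        self._head = CompareAndSwap(Node(state=initial_state))  # anchor node

    def invoke(self, inv):
        point = Node(inv=inv)
        while True:                  # repeat ... until the compare&swap succeeds
            h = self._head.read()
            point.state, point.response = self._apply(inv, h.state)
            point.before = h
            if self._head.compare_and_swap(h, point) is h:
                return point.response

    def history(self):
        """Invocations in the order they were applied (oldest first)."""
        invs, node = [], self._head.read()
        while node.before is not None:   # stop at the anchor
            invs.append(node.inv)
            node = node.before
        return invs[::-1]


def queue_apply(inv, state):
    """Sequential specification of a FIFO queue over immutable tuples."""
    if inv[0] == "enq":
        return state + (inv[1],), "ack"
    if state:
        return state[1:], state[0]
    return state, None               # empty queue: None plays the role of the empty response


obj = SimulatedObject(queue_apply, ())


def worker(pid):
    for k in range(50):
        obj.invoke(("enq", (pid, k)))


threads = [threading.Thread(target=worker, args=(pid,)) for pid in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

history = obj.history()
first = obj.invoke(("deq",))
```

Each successful compare&swap fixes the position of one operation in the list, so the list order serves as the linearization order. A proc whose compare&swap keeps failing may retry indefinitely, which is why this version is only non-blocking and the helping mechanism is needed for wait-freedom.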