Computing with Concurrent Objects
speaker: Sara Tucci Piergiovanni
institution: Università di Roma "La Sapienza", Dipartimento di Informatica e Sistemistica, MIDLAB Laboratory

Outline
- Concurrent objects: definition
- The "outside" viewpoint. Linearizability: what makes a concurrent object a "meaningful" programming abstraction
- The "inside" viewpoint. Wait-free implementation: what makes a concurrent object a "possible" programming abstraction
- A "global" look at the universe of concurrent objects. Object hierarchy: what makes a "would-be theoretician" like me happy

June 8-9 2006 Seminars in Distributed Computing - Computing with Concurrent Objects

Lecture 1: the "outside" viewpoint

What is an object?
- An object in languages such as Java and C++ is a container for data.
  - Each object provides a set of methods that are the only way to manipulate that object's state.
  - Each object has a class which describes how its methods behave.
- Object description
  - The application programmer interface (API)
    - pre-condition, describing the object's state before invoking the method
    - post-condition, describing the object's state and return value after the method returns
  - For example, if the FIFO queue object is non-empty (pre-condition), then the deq() method will remove and return the first element (post-condition), and otherwise it will throw an exception (another pre- and post-condition).

Pre and Post Conditions
- Defining objects in terms of pre-conditions and post-conditions makes perfect sense in a sequential model of computation, where a single thread manipulates a collection of objects.
- In this case methods are called one at a time, each method invocation is followed by the corresponding response, and a sequence of method calls can be defined.

[Figure: timeline of a single process p executing enq(a;ok), enq(b;ok), deq(;a), deq(;b), deq() one after the other; each call spans from its method invocation to its method response, and the last deq() ends with an exception.]

Concurrent Model
- If an object's methods can be invoked by concurrent threads, then the method executions can overlap in time, and it no longer makes sense to talk about the order of method calls.
- What does it mean, in a multithreaded program, if a and b are enqueued on a FIFO queue during overlapping intervals? Which will be dequeued first? Where is the trick?

[Figure: timeline with processes p and q; enq(a;ok) and enq(b;ok) overlap and are followed by deq(;?) and deq(;?).]

[Figure: a single method call enq(a;ok) and the queue's state over time.] At the end of the invocation the queue surely contains a, but what happened during the invocation? God knows.

The Linearizability Manifesto
- Each method call should appear to "take effect" instantaneously at some moment between its invocation and response.

[Figure: timeline with processes p and q executing enq(a;ok), enq(b;ok), deq(;b), deq(;a); a point is placed inside each call, and the resulting sequential order S is shown together with the queue states.]

Linearizability: scenario
- Again, is this execution linearizable? Try to put a point for each method call.

[Figure: timeline with processes p, q and r executing enq(a;ok), enq(b;ok), deq(;a), deq(;c), enq(c;ok), deq(;b); a sequential order S and the queue states are shown below.]

Linearizability: scenario
- Again, is this execution linearizable? Try to put a point for each method call.

[Figure: timeline with p: enq(b;ok), enq(c;ok); q: enq(a;ok); r: deq(;a), deq(;c), deq(;b); the queue states are shown below.] No way...

Formalizing Linearizability
- Until now we had fun putting points here and there. Now the game gets harder: we should formalize what putting points means.

Definitions and Basic Notation
- An execution of a concurrent system is modeled by a history H, which is a finite sequence of method invocation and response events.
- A method invocation event is denoted as <inv(op(args), X) p>.
- A method response event is denoted as <res(op(res), X) p>.
  Here op is the name of the method, args is a list of input arguments, res is a list of results (ok for void), X is the name of the object, and p is the name of the process.

[Figure: timeline of process p executing enq(a;ok).]
H: inv(enq(a), X) p, res(enq(ok), X) p

Formalizing Linearizability
- Concurrent execution example and related history

[Figure: p: enq(a;ok), deq(;a); q: enq(b;ok), deq(;b); no two calls overlap.]
H: inv(enq(a), X) p, res(enq(ok), X) p, inv(enq(b), X) q, res(enq(ok), X) q, inv(deq(), X) p, res(deq(a), X) p, inv(deq(), X) q, res(deq(b), X) q

Formalizing Linearizability
- Concurrent execution example and related history

[Figure: p: enq(a;ok), deq(;a); q: enq(b;ok), deq(;b); the two enq calls overlap and the two deq calls overlap.]
H: inv(enq(a), X) p, inv(enq(b), X) q, res(enq(ok), X) p, res(enq(ok), X) q, inv(deq(), X) p, inv(deq(), X) q, res(deq(a), X) p, res(deq(b), X) q

Formalizing Linearizability
Definitions
- A response matches an invocation if their object names agree and their process names agree.
- An invocation is pending in a history if no matching response follows the invocation.
- If H is a history, complete(H) is the maximal subsequence of H consisting only of invocations and matching responses.

Formalizing Linearizability
- Concurrent execution example and related history

[Figure: p: enq(a;ok) followed by a deq() that is still pending; q: enq(b;ok) followed by a deq() that is still pending.]
H: inv(enq(a), X) p, inv(enq(b), X) q, res(enq(ok), X) p, res(enq(ok), X) q, inv(deq(), X) p, inv(deq(), X) q
complete(H): inv(enq(a), X) p, inv(enq(b), X) q, res(enq(ok), X) p, res(enq(ok), X) q

Formalizing Linearizability
Definitions
- A history H is sequential if
  (1) the first event of H is an invocation;
  (2) each invocation, except possibly the last, is immediately followed by a matching response.
  A history that is not sequential is concurrent.
- A process subhistory, H|p (H at p), of a history H is the subsequence of all events in H whose process names are p. (An object subhistory H|X is similarly defined for an object X.)
- Two histories H and H' are equivalent if for every process p, H|p = H'|p.
- A history H is well-formed if each process subhistory H|p of H is sequential (in the following we assume well-formed histories).

Formalizing Linearizability
- Sequential history H

[Figure: p: enq(a;ok), deq(;a); q: enq(b;ok), deq(;b); no two calls overlap.]
H: inv(enq(a), X) p, res(enq(ok), X) p, inv(enq(b), X) q, res(enq(ok), X) q, inv(deq(), X) p, res(deq(a), X) p, inv(deq(), X) q, res(deq(b), X) q
H|p: inv(enq(a), X) p, res(enq(ok), X) p, inv(deq(), X) p, res(deq(a), X) p
H|q: inv(enq(b), X) q, res(enq(ok), X) q, inv(deq(), X) q, res(deq(b), X) q
Both subhistories are sequential, so H is well-formed.

Formalizing Linearizability
- Concurrent history H'

[Figure: p: enq(a;ok), deq(;a); q: enq(b;ok), deq(;b); the two enq calls overlap and the two deq calls overlap.]
H': inv(enq(a), X) p, inv(enq(b), X) q, res(enq(ok), X) p, res(enq(ok), X) q, inv(deq(), X) p, inv(deq(), X) q, res(deq(a), X) p, res(deq(b), X) q
H'|p: inv(enq(a), X) p, res(enq(ok), X) p, inv(deq(), X) p, res(deq(a), X) p
H'|q: inv(enq(b), X) q, res(enq(ok), X) q, inv(deq(), X) q, res(deq(b), X) q
H and H' are equivalent.

Formalizing Linearizability
Definitions
- A history H induces an irreflexive partial order <_H on methods: op1 <_H op2 if res(op1) precedes inv(op2) in H.
- If H is sequential, then <_H is a total order.
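
The definitions above can be tried out on the two example histories. The following is a minimal Python sketch, not part of the original slides: the Event tuple, the helper call() and all function names are assumptions chosen for illustration, and complete() relies on the history being well-formed.

```python
from collections import namedtuple

# kind is "inv" or "res"; val holds the argument (inv) or the result (res).
Event = namedtuple("Event", "kind op val obj proc")

def call(op, arg, result, proc, obj="X"):
    """Hypothetical helper: the invocation and response events of one method call."""
    return Event("inv", op, arg, obj, proc), Event("res", op, result, obj, proc)

def subhistory(h, proc):
    """H|p: the subsequence of all events of H whose process name is proc."""
    return [e for e in h if e.proc == proc]

def is_sequential(h):
    """Invocations and matching responses alternate, starting with an invocation."""
    for i, e in enumerate(h):
        if i % 2 == 0:
            if e.kind != "inv":
                return False
        else:
            prev = h[i - 1]
            if e.kind != "res" or (e.proc, e.obj) != (prev.proc, prev.obj):
                return False
    return True

def is_well_formed(h):
    return all(is_sequential(subhistory(h, p)) for p in {e.proc for e in h})

def equivalent(h1, h2):
    procs = {e.proc for e in h1} | {e.proc for e in h2}
    return all(subhistory(h1, p) == subhistory(h2, p) for p in procs)

def complete(h):
    """complete(H): drop pending invocations (assumes h is well-formed)."""
    def answered(i):
        inv = h[i]
        return any(e.kind == "res" and (e.proc, e.obj) == (inv.proc, inv.obj)
                   for e in h[i + 1:])
    return [e for i, e in enumerate(h) if e.kind == "res" or answered(i)]

def operations(h):
    """One entry [proc, op, inv_index, res_index] per call; res_index is None if pending."""
    ops, open_op = [], {}
    for i, e in enumerate(h):
        if e.kind == "inv":
            open_op[e.proc] = len(ops)
            ops.append([e.proc, e.op, i, None])
        else:
            ops[open_op.pop(e.proc)][3] = i
    return ops

def precedence(h):
    """Pairs (a, b) of call indices with a <_H b: res(a) occurs before inv(b)."""
    ops = operations(h)
    return {(a, b) for a in range(len(ops)) for b in range(len(ops))
            if ops[a][3] is not None and ops[a][3] < ops[b][2]}

ea_i, ea_r = call("enq", "a", "ok", "p")
eb_i, eb_r = call("enq", "b", "ok", "q")
da_i, da_r = call("deq", None, "a", "p")
db_i, db_r = call("deq", None, "b", "q")

H = [ea_i, ea_r, eb_i, eb_r, da_i, da_r, db_i, db_r]    # sequential history
H2 = [ea_i, eb_i, ea_r, eb_r, da_i, db_i, da_r, db_r]   # concurrent history H'
```

With these values, equivalent(H, H2) holds, precedence(H) is a total order on the four calls, and precedence(H2) only orders each enq before each deq.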

[Figure: timeline of H' with p: enq(a;ok), deq(;a) and q: enq(b;ok), deq(;b); the two enq calls overlap and the two deq calls overlap.]
H': inv(enq(a), X) p, inv(enq(b), X) q, res(enq(ok), X) p, res(enq(ok), X) q, inv(deq(), X) p, inv(deq(), X) q, res(deq(a), X) p, res(deq(b), X) q
enq(a) <_H' deq(a)
enq(a) <_H' deq(b)
enq(b) <_H' deq(b)
enq(b) <_H' deq(a)

Formalizing Linearizability
Definitions
- A set S of histories is prefix-closed if whenever H is in S, every prefix of H is also in S.
- A sequential specification for an object is a prefix-closed set of sequential histories for the object.
- A sequential history H is legal if each object subhistory H|X belongs to the sequential specification for X.

Linearizability
A history H is linearizable if it can be extended to a history H' (by appending zero or more response events to H) such that:
L1: complete(H') is equivalent to some legal sequential history S, and
L2: <_H' ⊆ <_S.

Linearizability
- Informally, extending H to H' captures the idea that some pending invocations may have taken effect even though their responses have not yet been returned to the caller. This is visible when a later method call returns a value set by a pending invocation.
- Extending H to H' while restricting attention to complete(H') makes it possible to complete pending methods, or just to ignore them.
- L1 states that complete(H') is equivalent to an apparent sequential interleaving of method calls that does not violate the specification of the object.
- L2 states that this apparent sequential interleaving respects the precedence ordering of methods.
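
For small complete histories on a single FIFO queue, conditions L1 and L2 can be checked by brute force. The sketch below is an illustration under stated assumptions, not an algorithm from the slides: the Op tuple and function names are hypothetical, each call carries the positions of its invocation and response events, and the interval values in the two examples are made up so that the calls overlap as in the scenarios of Lecture 1. Since every process subhistory of a well-formed history is sequential, program order is contained in the precedence order, so any candidate S that respects precedence is also equivalent to the history.

```python
from collections import deque, namedtuple
from itertools import permutations

# inv and res are the positions of the call's two events in the history.
Op = namedtuple("Op", "name value inv res")

def legal_fifo(seq):
    """L1 (legality): the sequential run obeys the FIFO queue specification."""
    q = deque()
    for op in seq:
        if op.name == "enq":
            q.append(op.value)
        elif not q or q.popleft() != op.value:
            # a deq must return the oldest element still in the queue
            return False
    return True

def respects_precedence(seq):
    """L2: if res(op1) precedes inv(op2) in the history, op1 comes first in S."""
    for i, later in enumerate(seq):
        for earlier in seq[:i]:
            if later.res < earlier.inv:
                return False
    return True

def linearizations(ops):
    """All sequential orders S satisfying both L1 and L2 (exponential search)."""
    return [s for s in permutations(ops)
            if respects_precedence(s) and legal_fifo(s)]

# enq(a) and enq(b) overlap, then deq returns b, then deq returns a.
manifesto = [Op("enq", "a", 0, 3), Op("enq", "b", 1, 2),
             Op("deq", "b", 4, 5), Op("deq", "a", 6, 7)]

# p: enq(b), enq(c); q: enq(a); r: deq -> a, deq -> c, deq -> b.
no_way = [Op("enq", "b", 0, 1), Op("enq", "c", 2, 3), Op("enq", "a", 0, 3),
          Op("deq", "a", 4, 5), Op("deq", "c", 6, 7), Op("deq", "b", 8, 9)]

found = linearizations(manifesto)
```

Here found contains the single order enq(b), enq(a), deq(b), deq(a), while linearizations(no_way) is empty because b is enqueued before c by the same process yet dequeued after it.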

Linearizability
- Then, let's try to find S...

[Figure: p: enq(a) still pending, plus a pending deq(); q: enq(b;ok), deq(;b), deq(;a).]
H: inv(enq(a), X) p, inv(enq(b), X) q, res(enq(ok), X) q, inv(deq(), X) p, inv(deq(), X) q, res(deq(b), X) q, inv(deq(), X) q, res(deq(a), X) q

Step 1: extending...
H': inv(enq(a), X) p, inv(enq(b), X) q, res(enq(ok), X) q, res(enq(ok), X) p, inv(deq(), X) p, inv(deq(), X) q, res(deq(b), X) q, inv(deq(), X) q, res(deq(a), X) q

Linearizability
Step 2: completing...
complete(H'): inv(enq(a), X) p, inv(enq(b), X) q, res(enq(ok), X) q, res(enq(ok), X) p, inv(deq(), X) q, res(deq(b), X) q, inv(deq(), X) q, res(deq(a), X) q

[Figure: p: enq(a;ok); q: enq(b;ok), deq(;b), deq(;a).]

Step 3: let me see the partial order...
enq(a) <_H' deq(b) <_H' deq(a)
enq(b) <_H' deq(b) <_H' deq(a)

Step 4: ordering what is not yet ordered...
S: enq(b) <_S enq(a) <_S deq(b) <_S deq(a)
We got it!

To do...
- Linearizability has the following property (locality):
  Theorem: H is linearizable if and only if, for each object x, H|x is linearizable.
- Investigate whether locality holds for the following alternative correctness criteria:
  - Sequential consistency: only L1.
  - Serializability: a history is serializable if it is equivalent to one in which transactions appear to execute sequentially, that is, without interleaving. A transaction is a finite sequence of method calls on a set of objects.

Lecture 2: the "inside" viewpoint

Objects Implementation
- So, let us suppose now that we have a concurrent object to implement, e.g. a stack, a queue, etc.
- How can we implement it? To get a linearizable execution, we can try to use some form of synchronization, e.g. regulating access to the object with locks (mutexes) that define critical sections.
- The only process that can release the lock is the one that acquired it.
- But we also want to cope with failures: we want a process to get a response in finite time, no matter the failures of others.
- So, what if a process fails inside the critical section?

Wait-free object implementation
- The meaning of wait-free computing is exactly the following: each process (that does not crash) calling a method must be able to get a response in finite time, no matter how slow the other processes are and no matter how many of them fail.
- To introduce wait-free computing we will consider the wait-free implementation of two concurrent objects:
  - A renaming object allows the processes to acquire new names from a smaller name space despite possible process crashes.
  - A snapshot object provides the processes with an array-like data structure (with one entry per process) offering two operations. The write operation allows a process to update its own entry. The snapshot operation allows a process to read all the entries in such a way that the reading of the whole array appears as if it were an atomic operation.
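
To make the snapshot object's intended behavior concrete, here is a minimal Python sketch of its sequential specification, that is, what every concurrent execution must appear to do. It is not a wait-free implementation and is not taken from the slides; the class name SnapshotSpec and its method signatures are assumptions.

```python
class SnapshotSpec:
    """Sequential specification of a snapshot object for n processes (hypothetical names)."""

    def __init__(self, n, initial=0):
        # one entry per process
        self._entries = [initial] * n

    def write(self, i, v):
        """Process i updates its own entry."""
        self._entries[i] = v

    def snapshot(self):
        """Return all entries as they are at one single instant."""
        return tuple(self._entries)


spec = SnapshotSpec(3)
before = spec.snapshot()
spec.write(1, 1)
after = spec.snapshot()
```

A wait-free implementation is correct if every concurrent execution is linearizable with respect to this specification.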

Setting
- We consider n processes, of which up to f are faulty (they stop prematurely by crashing).
- As building blocks for our implementations we use some basic concurrent objects, called atomic registers, that behave like registers accessed sequentially (so we are implicitly assuming that the registers themselves have already been implemented; we will come back to this point later).
- Processes access these registers by invoking write() and read() operations.

M-Renaming Problem
- Let us assume that the n processes have arbitrarily large (and distinct) initial names id_1, ..., id_n ∈ [0, ..., N − 1], where n << N. In the M-renaming problem, each process p_i knows only its initial name id_i, and the processes are required to get new names in such a way that the new names belong to the set {0, ..., M − 1}, M << N, and no two processes obtain identical names.
- More formally, the problem is defined by the following three properties:
  - Termination. Each correct process decides a new name.
  - Validity. A decided name belongs to [0, ..., M − 1].
  - Agreement. No two processes decide the same name.

Implementation
- Note that the renaming problem is an allocation problem (one process for each name).
- We assume the presence of multi-writer multi-reader (MWMR) registers.
- We will see a simple and elegant algorithm by Moir and Anderson.
- It uses a mechanism called splitter that is particularly suited to wait-free computing (the splitter has been used to implement wait-free mutual exclusion).

The renaming problem is trivial when no process can commit a crash failure.
In contrast, it has been shown that there is no wait-free solution to the M-renaming problem when M < n + f, where f is an upper bound on the number of processes that can crash.

Wait-free Splitter
[Animation, first run: processes green and red traverse a splitter with shared variables X and Y and outputs stop, right, down. Successive states:]
- X=undefined, Y=false, stop=empty, right=empty, down=empty
- X=green, Y=false
- X=green, Y=true (green then falls asleep)
- X=red, Y=true
- right={red}
- green wakes up
- down={green}
Note that green was slow and red was late: nobody got stop.

[Animation, second run: processes green, red and orange. Successive states:]
- X=undefined, Y=false, stop=empty, right=empty, down=empty
- X=green, Y=false
- X=green, Y=true
- stop={green}
- right={red, orange}
Note that green was on time, red and orange were late, and nobody was slow.

Moir-Anderson: the grid of splitters
[Figure: triangular grid of n(n+1)/2 renaming splitters.]

Snapshot
- The object is made up of n SWMR atomic registers (one per process).
- Two operations:
  - update(v), invoked by p_i, sets p_i's register to the value v
  - snapshot() returns the values of the n registers
- All operations appear as if they were executed instantaneously.
- A process uses update to inform the others of its progress; snapshot lets a process learn what the others are doing.
- collect vs snapshot: collect is not atomic.

Collect vs Snapshot
[Figure: update(1, R1;ok) followed by update(2, R2;ok), with three concurrent snapshot(?) calls in one panel and three concurrent collect(?) calls in the other. Possible values listed for snapshot: [000], [010], [012]. The values listed for collect additionally include [002].]

A simple non-wait-free algorithm
Key mechanism: double collect.
[Figure: a process repeats collect(?) while update(1, R1;ok) and update(2, R2;ok) keep taking place; possible values [000], [010], [002].] How many times is this condition false? It does not return!

A wait-free snapshot - Afek et al.
Key mechanism: double collect plus a helping mechanism.
If a process P sees two consecutive updates issued by the same process Q, it knows that the second update began after P's snapshot began. Then P borrows the snapshot that Q took as part of that update.

Lecture 3: the "global" look

Wait-free computing issues
- The fundamental problem of wait-free computing is to characterize the circumstances under which synchronization problems have wait-free solutions, and to derive efficient solutions when they exist.
- To show that a wait-free implementation exists, just design an algorithm and prove it correct.
- How do we show that such an implementation does not exist?
- Herlihy: a technique to prove impossibility, and an object hierarchy.

The Relative Power of Objects
- Fundamental problem:
  - Given two concurrent objects X and Y, does there exist a wait-free implementation of X by Y?
  - Thanks to Herlihy's classification of objects based on their synchronization power, we can derive impossibility results: if, for example, we have two objects X and Y and Y has less synchronization power than X, then there exists no wait-free implementation of X by Y.
  - How do we define the synchronization power of an object?

The Consensus Number
- Each object is classified by its ability to solve Consensus in a wait-free manner.
- In particular, an object X has consensus number k if, with objects X and atomic registers (all initialized to appropriate values), it is possible to solve wait-free Consensus for k processes but not for k+1 processes.
- For those who do not know Consensus: each process starts with an input value, and eventually all processes agree on a common value that is one of the inputs.
  - agreement: distinct processes never decide on distinct values
  - wait-free: each process decides after a finite number of steps
  - validity: the common decision value d is the input of some process

Consensus
[Figure: processes with inputs 6, 26 and 75 access an object following the accessing rules defined by the consensus protocol; the decided value is 6.]

Wait-free Consensus
[Figure: processes with inputs 26, 6 and 75 access the object; the decided value is 26.]

Objects and Consensus Numbers
- If there exists a wait-free implementation of an object X by Y, and X has consensus number n, then Y has consensus number at least n.
- If an object X has consensus number n and an object Y has consensus number m < n, then there exists no wait-free implementation of X by Y in a system of k > m processes.
- If two objects X and Y have consensus number n, then there exists a wait-free implementation of X by Y (and vice versa) in a system with n processes.
- If two objects X and Y have consensus number n, is it true that there exists no wait-free implementation of X by Y in a system of k > n processes? Open issue...

Objects Classification
Consensus number   Objects
1                  read-write register, counters, atomic snapshot
2                  list, stack, queue, test&set, fetch&add, swap
2m-2               m-multiple assignment
∞                  move and swap, augmented queue, compare&swap, fetch&cons, sticky byte

Objects with Consensus Number 1
- Atomic registers, counters, and other interfering objects that don't return the old value (explained later).
- First observe that any type has consensus number at least 1, since 1-process consensus is trivial.
- To prove that an object has consensus number 1, we should show that it is impossible to solve Consensus (using the object) among two processes (with initial values in the set {0,1}).
  - We show that an object has consensus number exactly 1 by running a Fischer-Lynch-Paterson style argument with 2 processes.

Read-Write Registers & Consensus
[Figure: two processes with inputs 1 and 0 access a read-write register; what do they decide?]

Proof: assume otherwise
Let us assume that there exists a consensus protocol that uses only read() and write() operations.
[Figures: two processes accessing a read-write register, one frame per initial state.]
- Initial state (0,0): the outcome should be 0.
- Initial state (1,1): the outcome should be 1.
- Initial state (1,0): individual frames show one execution whose outcome is 1 and one whose outcome is 0; in general the outcome could be 1, or even 0.
- Initial state (0,1): the outcome could be 0, or even 1.

Proof: reasoning about any protocol
Bivalent and univalent states:
- A protocol state is bivalent if both decision values are still possible.
  - The outcome is not fixed (as for the (0,1) and (1,0) initial states).
  - The protocol execution can be extended to yield one decision value, but can also be extended to yield the other value.
- A protocol state is univalent if all protocol executions from it lead to the same decision value.
  - The outcome is fixed even if not yet known.
- A 0-valent state leads to decision 0.
- A 1-valent state leads to decision 1.

Proof: reasoning about any protocol
To reason about "any" protocol, we abstract the computation in this way: Mr. Red and Mr. Green make "moves" on a protocol state. Starting from the initial state, either Mr. Red or Mr. Green makes the first move, and we reach another protocol state. At this point, again either Mr. Red or Mr. Green makes the second move, and so on.
Possible protocol executions:
- s, s', s'''
- s, s', s'v
- s, s'', sv
- s, s'', sv'
- A "move" can be a read() or a write().

Proof: reasoning about any protocol
[Figure: tree of protocol executions from the initial state down to the final states, whose leaves are labeled decision:1, decision:0, decision:0, decision:1, decision:1, decision:0. Successive frames highlight the bivalent states, the 1-valent states and the 0-valent states.]

Proof: reasoning about any protocol
The initial state with different inputs is bivalent (because at least one failure may occur, the inputs are invisible to the other process, and validity must hold).
To solve consensus we have to reach a bivalent state C that has only univalent successors (otherwise we could stay bivalent forever and the protocol would not be wait-free).
Now we assume that C has a 0-valent and a 1-valent successor, produced by applying operations x and y of Mr. Red and Mr. Green respectively: Cx is 0-valent and Cy is 1-valent.

Proof: reasoning about any protocol
[Figure: the execution tree with the bivalent state C marked; leaves labeled decision:1, decision:0, decision:0, decision:1, decision:1, decision:0.]

Proof: derive a contradiction
- Since we are looking at atomic registers, we have three cases to consider:
  (1) x and y are both reads
  (2) x is a read and y is a write
  (3) x and y are both writes

Proof: derive a contradiction
(1) x and y are both reads
- If Mr. Green's read() comes first, then the protocol decides 1.
- If Mr. Red's read() comes first, then the protocol decides 0.
The idea is the following: to reach a univalent state the two processes must determine who is the winner, that is, who came first. We start from C. If Mr. Red makes the first move, both should decide 0; so both decide 0 if Mr. Red comes before Mr. Green. Now suppose that Mr. Green reads before Mr. Red. This order should lead to the opposite decision...

Proof: derive a contradiction
[Figure: from the bivalent state C, one branch where Mr. Red reads first (0-valent state), and one branch where Mr. Green reads first and then Mr. Red reads (1-valent state).]
But for Mr. Red the states C and Cy are indistinguishable: Mr. Red cannot tell whether Mr. Green did something before him. So you can remove the green arrow and get a contradiction. We get that Cxy is 0-valent and Cyx is 1-valent, although Cxy = Cyx. Contradiction.

Proof: derive a contradiction
(2) x is a read() and y is a write()
- If write() comes first, then the protocol decides 1.
- If read() comes first, then the protocol decides 0.
Suppose that Mr. Red runs before Mr. Green. Then both will decide 0. Suppose now that Mr. Green comes first. Now both will decide 1. However, for Mr. Green the state C is indistinguishable from Cx, since a read leaves no trace. So running Mr. Green to completion gives the same decision value from both Cyx and Cxy. We get that Cxy is 0-valent and Cyx is 1-valent, yet Mr. Green decides the same value from both. Contradiction.

Proof: derive a contradiction
(3) x is a write() and y is a write()
- If Mr. Green's write() comes first, then the protocol decides 1.
- If Mr. Red's write() comes first, then the protocol decides 0.
Suppose that Mr. Red runs before Mr. Green. Then both will decide 0. Suppose now that Mr. Green comes first. Now both will decide 1. However, for Mr. Green the state C is indistinguishable from Cx, because Mr. Green overwrites the value written by Mr. Red. So Cxy = Cy for Mr. Green. We get that Cxy is 0-valent and Cy is 1-valent, yet Mr. Green cannot tell them apart. Contradiction.
Summarizing and Generalizing
• The consensus protocol should allow all processes to discover which process accessed the object first; then all processes decide the value of the process that won.
• Suppose that an object T has a read operation that returns its state and one or more modify-write operations that don't return anything (uninformative).
• We say that the operations of T are interfering if, for any two operations x and y, either:
  - x and y commute: Cxy = Cyx, or
  - one of x and y overwrites the other: Cxy = Cy or Cyx = Cx.
• Any object T whose operations are all uninformative and interfering has consensus number 1, since any two operations either commute or the overwriter can't detect that the first operation happened.

Objects with Consensus Number 2
• Now we look at the characteristics of objects with consensus number 2
• For all these objects there exists a wait-free consensus protocol among two processes:
  - registers with interfering non-trivial read-modify-write operations (RMW registers)
  - objects with non-interfering operations: queues, stacks, lists
• We also derive the intuition for why consensus cannot be solved among n>2 processes using these objects

Non-trivial RMW Registers
• What does it mean to have non-trivial operations? It means that there exists an operation that returns the current value and writes a value obtained by applying a function that is not the identity
  - The operation takes 2 arguments:
    - Register r
    - Function f
  - The operation has the following effects:
    - Returns the value x of r
    - Replaces x with f(x)
• So it is not a simple read (which is what we would get if f were the identity), but a read that leaves evidence behind...
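A minimal sketch of such a register, assuming that a lock is an acceptable stand-in for hardware atomicity in a simulation. The class name `RMWRegister` and the method `rmw` are hypothetical names chosen for this example; they do not come from the slides.

```python
import threading


class RMWRegister:
    """Register with one read-modify-write operation (atomicity simulated by a lock)."""

    def __init__(self, initial):
        self._value = initial
        self._lock = threading.Lock()

    def rmw(self, f):
        """Atomically return the current value x and replace it with f(x)."""
        with self._lock:
            old = self._value
            self._value = f(old)
            return old

    def read(self):
        """A plain read is the trivial case where f is the identity."""
        return self.rmw(lambda x: x)


r = RMWRegister(0)
first = r.rmw(lambda x: x + 1)   # sees the initial value
second = r.rmw(lambda x: x + 1)  # sees the evidence left by the first access
```

The second caller gets a different return value from the first, which is exactly the "evidence" a plain read cannot leave.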
Non-trivial RMW Registers
• test&set
  - The operation has the following effects:
    - Returns the value x of r
    - Replaces x with 1
• fetch&inc
  - The operation has the following effects:
    - Returns the value x of r
    - Replaces x with x+1

Non-trivial RMW Registers
• swap
  - The operation takes an additional argument y
  - The operation has the following effects:
    - Returns the value x of r
    - Replaces x with y
• fetch&add
  - The operation takes an additional argument y
  - The operation has the following effects:
    - Returns the value x of r
    - Replaces x with x+y

Non-trivial RMW Registers
• How can a consensus protocol among two processes be designed?
• Try to think...
  - remember, both processes should discover who was the winner...
  - first step: each process writes its proposed value in a register
  - second step: each process accesses the RMW register, initialized to some value v, with the non-trivial operation
  - who is the winner?
    - the one who reads v
  - who is the loser?
    - the one who does not read v
  - Note that both processes can deduce who is the loser and who is the winner; then they go to the winner's register to pick up the value it proposed

Consensus among two processes
[Figure: initial state (1,0). Each process writes its proposed value (1 or 0) into one of two read/write registers, then accesses the non-trivial RMW register, initialized to v. The winner reads v and writes f(v); the loser does not read v. Both then read the winner's register, read(1), and decide 1.]

What about Queues?
• Since queues have consensus number 2, there exists a two-process consensus protocol using read/write registers and queues
• Consider a wait-free queue object with enqueue and dequeue operations, where dequeue returns empty if the queue is empty.
• To solve 2-process consensus with a wait-free queue:
  - Initialize the queue with a single value (it doesn't matter what the value is).
  - A process wins if it successfully dequeues the initial value and loses if it gets empty.

Non-trivial RMW Registers: no Consensus for n>2
• Why can't a third process tell who was the winner?
  - Let F be a set of functions such that for all fi and fj, either
    - they commute: fi(fj(x)) = fj(fi(x)), or
    - one overwrites the other: fi(fj(x)) = fi(x)
• test&set
  - f(x) = 1, f(f(x)) = 1 = f(x): overwrite
• swap
  - f(y,x) = y, f(y',f(y,x)) = f(y',x): overwrite
• fetch&inc
  - f(x) = x+1, f(f(x)) = x+1+1: commutative

Non-trivial RMW Registers: no Consensus for n>2
• The impossibility intuition for non-trivial interfering RMW registers
  - Suppose that Mr. Green accesses the object with an operation x and Mr. Red with an operation y (e.g. both fetch&inc). The third process, Mr. Orange, which accesses (reads) the object after Mr. Green and Mr. Red, cannot tell which of the two was the winner (in both cases, Mr. Red the winner or Mr. Green the winner, the state is the same).
  - So if we run Mr. Orange to completion we get the same decision value in the executions through Cx and through Cy, which means that Cx and Cy can't be 0-valent and 1-valent. Contradiction.

What about queues?
• Queues, stacks and lists are not interfering objects: their operations neither overwrite nor commute
  - e.g. enq(g)enq(r) ≠ enq(r)enq(g) ≠ enq(g)
• They seem more powerful... if Mr. Orange dequeues g,r or r,g, it can tell who arrived first (Mr. Green in the first case, Mr. Red in the second)
• But even for queues there is no wait-free consensus implementation for n>2, i.e. Mr. Orange is not able to tell who was the winner... why?
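The two-process protocols described earlier (one based on a non-trivial RMW register, one based on a queue holding a single initial value) can be sketched as follows. This is a simulation under stated assumptions, not a wait-free implementation: atomicity comes from a lock and from Python's `queue.Queue`, which is lock-based. The class names, the `decide` method, the `run` helper and the process ids 0 and 1 are all hypothetical choices for this example.

```python
import queue
import threading


class TASRegister:
    """test&set register initialized to v = 0 (atomicity simulated by a lock)."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def test_and_set(self):
        with self._lock:
            old, self._value = self._value, 1
            return old


class RMWConsensus:
    """Two-process consensus from two read/write registers and one test&set."""

    def __init__(self):
        self._proposed = [None, None]  # one read/write register per process
        self._tas = TASRegister()

    def decide(self, pid, value):
        self._proposed[pid] = value            # step 1: publish the proposal
        if self._tas.test_and_set() == 0:      # step 2: read v, so this is the winner
            return value
        return self._proposed[1 - pid]         # loser adopts the winner's value


class QueueConsensus:
    """Two-process consensus from two registers and a queue with one initial value."""

    def __init__(self):
        self._proposed = [None, None]
        self._q = queue.Queue()
        self._q.put("token")                   # the value itself does not matter

    def decide(self, pid, value):
        self._proposed[pid] = value
        try:
            self._q.get_nowait()               # dequeued the initial value: winner
            return value
        except queue.Empty:                    # got "empty": loser
            return self._proposed[1 - pid]


def run(protocol, inputs):
    """Run both processes concurrently and collect their decisions."""
    results = [None, None]

    def worker(pid):
        results[pid] = protocol.decide(pid, inputs[pid])

    threads = [threading.Thread(target=worker, args=(p,)) for p in (0, 1)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


rmw_result = run(RMWConsensus(), ["red", "green"])
queue_result = run(QueueConsensus(), ["red", "green"])
```

In both protocols the loser can safely read the winner's register, because the winner published its proposal before touching the shared object. Which value wins depends on scheduling; only agreement and validity are guaranteed.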
Non-interfering Queues: no consensus for n>2
• Here we give the intuition behind the impossibility:
• Start from C enq(g) enq(r) (1-valent) and C enq(r) enq(g) (0-valent)
  - Run Mr. Red until it does its first deq()
  - Run Mr. Green until it does its first deq()
  - At that point Mr. Orange cannot distinguish between C, C enq(g) enq(r) and C enq(r) enq(g).
• Start from C deq() enq(r) and C enq(r) deq() on a non-empty queue.
  - To lose track of this order (that is, to reach indistinguishable states), it suffices that only 2 witnesses (the two dequeuers) fail. Hence the queue has consensus number ≤ 2.

Objects with consensus number ∞
• Augmented queue
  - Has operations enq(x) and peek(), which returns the first value enqueued but does not remove it. The protocol is to enq my input, then peek and return the first value in the queue.
• Fetch-and-cons
  - Returns the old cdr and adds a new car onto the head of a list. Use the preceding protocol, where peek() = tail(car::cdr)
• Sticky bits
  - Has a write operation that fails unless the register is in its initial state. The protocol is to write my input and then return the result of a read.
• Compare-and-swap
  - Has a CAS(old, new) operation that writes new only if the previous value = old. Use it to build a sticky bit.

Consensus for n processes with CAS
• n registers Ri
• First step: pi writes its proposed value in its register Ri
• Second step: pi checks whether anybody has already accessed the CAS object; if nobody has, it writes its identifier i there with CAS(-1,i). In either case it sets j to the previous value of the object: j = CAS(-1,i).
• If j == -1 (pi is the first), decide the value in Ri; otherwise decide the value in Rj

m-multiple assignment object
• Snapshot means
  - Write any array element
  - Read multiple array elements atomically
• What about the dual problem:
  - Write multiple array elements atomically
  - Scan any array elements
• This problem is called multiple assignment
• It has been proved that m-multiple assignment has consensus number 2m-2

2-multiple assignment has consensus number at least 2
• Here we have a (large) collection of atomic registers augmented by an m-register write operation that performs all the writes simultaneously.
• The intuition for why this is helpful: if Mr. Red atomically writes 1 to R1 and R, while Mr. Green atomically writes 2 to R2 and R, then any process can look at the state of R1, R2 and R and tell which write happened first:
  - If Mr. Orange reads R1 = R2 = empty, then we don't care which went first, because Mr. Orange (or somebody else) already won.
  - If Mr. Orange reads R1 = 1 and then R2 = empty, then Mr. Red went first.
  - If Mr. Orange reads R2 = 2 and then R1 = empty, then Mr. Green went first. (This requires at least one more read after checking the first case.)
  - Otherwise Mr. Orange saw R1 = 1 and R2 = 2. Now it reads R: if it's 1, Mr. Green went first, and vice versa.

Universality of Consensus
• Universality: any type that can implement n-process consensus can, together with atomic registers, give a wait-free implementation of any object in a system with n processes.
• The processes repeatedly use consensus to decide between candidate histories of the simulated object, and a process successfully completes an operation when its operation (tagged to distinguish it from other similar operations) appears in a winning history

Universality of Consensus
• Have an n-process consensus protocol instance for each of a series of phases 0, 1, ...
• The algorithm works as follows for a process p:
  1. Posts the operation it wants to perform to its register.
  2. Reads all the last-phase values and takes their max.
  3. Runs the consensus protocol for the max phase to get the history decided on up to that phase.
  4. If the max-phase history includes the process's pending operation, returns the result that operation would have had in the winning history.
  5. Otherwise, constructs a new history by appending all announced operations to the previous history, and tries to win with that history in phase max+1.
  6. Returns to step 2 if its operation doesn't make it into the winning history.
• This terminates because even if process i doesn't get its value into the winning history, eventually some other process will pick up the announced value and include it.

Summary
• What have we learned?
• To make a tree you need a seed, to make a seed you need a tree... we knew that already... now we also know that
• To make a queue, a stack or a list, one or more atomic registers are not enough.
• To make a queue in a system of 2 processes you need one or more stacks. Is it true that, however many stacks I take, I still cannot make a queue in a system with more than two processes? Nobody has ever proved it...
• To make a compare&swap, not only are atomic registers of no use, but neither are queues...
• To make a compare&swap you need an object with infinite consensus number... consensus implemented among an unknown number of processes will do.

Sources
• M. Herlihy and J. Wing, "Linearizability: A Correctness Condition for Concurrent Objects", ACM Transactions on Programming Languages and Systems (TOPLAS), Volume 12, Issue 3 (July 1990), pp. 463-492.
• M. Herlihy, "Wait-free synchronization", ACM Transactions on Programming Languages and Systems (TOPLAS), Volume 13, Issue 1 (January 1991), pp. 124-149.
• M. Raynal, "Wait-free computing: an introductory lecture", Future Generation Computer Systems, Volume 21, Issue 5 (May 2005), Special issue: Parallel computing technologies, pp. 655-663.
• M. Herlihy and N. Shavit, "Concurrent Objects and Linearizability", www.cs.tau.ac.il/~shanir/multiprocessor-synch-2003

Thank you for your attention!