Computing with Concurrent Objects

Speaker: Sara Tucci Piergiovanni
Institution: Università di Roma “La Sapienza”, Dipartimento di Informatica e Sistemistica, MIDLAB Laboratory
Outline
- Concurrent objects: definition
- The “outside” view-point
  Linearizability: what makes a concurrent object a “meaningful” programming abstraction
- The “inside” view-point
  Wait-free implementation: what makes a concurrent object a “possible” programming abstraction
- A “global” look at the universe of concurrent objects
  Object hierarchy: what makes a “would-be theoretician” like me happy
June 8-9 2006
Seminars in Distributed Computing - Computing with Concurrent Objects
Lecture 1: the “outside” view-point
What is an object?
- An object in languages such as Java and C++ is a container for data.
  - Each object provides a set of methods that are the only way to manipulate that object’s state.
  - Each object has a class, which describes how its methods behave.
- Object description
  - The application programming interface (API): each method is described by
    - a pre-condition, describing the object’s state before the method is invoked;
    - a post-condition, describing the object’s state and return value after the method returns.
  - For example, if the FIFO queue object is non-empty (pre-condition), then the deq() method removes and returns the first element (post-condition); otherwise it throws an exception (another pre- and post-condition).
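The pre- and post-conditions of the FIFO queue can be sketched in a few lines of Python. This is only an illustration: the class name FifoQueue and the exception EmptyQueueError are assumptions, not names from the slides.

```python
from collections import deque


class EmptyQueueError(Exception):
    """Raised by deq() when the pre-condition 'queue is non-empty' fails."""


class FifoQueue:
    def __init__(self):
        self._items = deque()

    def enq(self, x):
        # Post-condition: x is now the last element of the queue.
        self._items.append(x)
        return "ok"

    def deq(self):
        # Pre-condition "empty": the post-condition is an exception.
        if not self._items:
            raise EmptyQueueError("deq() on an empty queue")
        # Pre-condition "non-empty": remove and return the first element.
        return self._items.popleft()


q = FifoQueue()
q.enq("a")
q.enq("b")
first, second = q.deq(), q.deq()
```

With a single caller, the calls happen one at a time, so first is a and second is b; a third deq() raises the exception.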
Pre and Post Conditions
- Defining objects in terms of pre-conditions and post-conditions makes perfect sense in a sequential model of computation, where a single thread manipulates a collection of objects.
- In this case methods are called one at a time, each method invocation is followed by the corresponding response, and a sequence of method calls can be defined.
[Figure: timeline of a single process p executing enq(a;ok), enq(b;ok), deq( ;a), deq( ;b) and a final deq() that raises an exception; each method invocation is followed by its method response.]
Concurrent Model
- If an object’s methods can be invoked by concurrent threads, then the method executions can overlap in time, and it no longer makes sense to talk about the order of method calls.
- What does it mean, in a multithreaded program, if a and b are enqueued on a FIFO queue during overlapping intervals? Which will be dequeued first?
Where is the trick?
[Figure: processes p and q execute enq(a;ok) and enq(b;ok) in overlapping intervals, followed by deq( ;?) and deq( ;?); a second panel zooms in on the method call enq(a;ok) and on the queue’s state.]
At the end of the invocation the queue surely contains a, but what happened during the invocation? God knows.
The Linearizability Manifesto
- Each method call should appear to “take effect” instantaneously at some moment between its invocation and response.
[Figure: p executes enq(a;ok) and q executes enq(b;ok), together with deq( ;b) and deq( ;a), on a time axis; a point inside each interval marks where the call takes effect, and S shows the resulting queue states.]
Linearizability: scenario
- Again, is this execution linearizable? ...try to put a point for each method call...
[Figure: processes p, q and r execute enq(a;ok), enq(b;ok), enq(c;ok), deq( ;a), deq( ;c) and deq( ;b) in overlapping intervals; S shows the queue states for one choice of points.]
Linearizability: scenario
- Again, is this execution linearizable? ...try to put a point for each method call...
[Figure: p executes enq(b;ok) and then enq(c;ok), q executes enq(a;ok), and r executes deq( ;a), deq( ;c) and deq( ;b); S shows the queue states.]
No way...
Formalizing Linearizability
- Until now we had fun putting points here and there... now the game is getting harder: we should formalize what putting points means :-)
Definitions and Basic Notation
- An execution of a concurrent system is modeled by a history H, which is a finite sequence of method invocation and response events.
- A method invocation event is denoted as <inv(op(args), X) p>.
- A method response event is denoted as <res(op(res), X) p>.
  Here op is the name of the method, args is a list of input arguments, res is a list of results (ok for void), X is the name of the object, and p is the name of the process.
[Figure: process p executes enq(a;ok) on a time axis t.]
H: inv(enq(a),X)p, res(enq(ok),X)p
Formalizing Linearizability
- Concurrent execution example and related history
[Figure: p executes enq(a;ok) and then deq( ;a); q executes enq(b;ok) and then deq( ;b); no two calls overlap.]
H: inv(enq(a),X)p, res(enq(ok),X)p, inv(enq(b),X)q, res(enq(ok),X)q,
   inv(deq(),X)p, res(deq(a),X)p, inv(deq(),X)q, res(deq(b),X)q
Formalizing Linearizability
- Concurrent execution example and related history
[Figure: p executes enq(a;ok) and then deq( ;a); q executes enq(b;ok) and then deq( ;b); the two enq calls overlap, and so do the two deq calls.]
H: inv(enq(a),X)p, inv(enq(b),X)q, res(enq(ok),X)p, res(enq(ok),X)q,
   inv(deq(),X)p, inv(deq(),X)q, res(deq(a),X)p, res(deq(b),X)q
Formalizing Linearizability
Definitions
- A response matches an invocation if their object names agree and their process names agree.
- An invocation is pending in a history if no matching response follows the invocation.
- If H is a history, complete(H) is the maximal subsequence of H consisting only of invocations and matching responses.
Formalizing Linearizability
- Concurrent execution example and related history
[Figure: p executes enq(a;ok) and q executes enq(b;ok) in overlapping intervals; p and q then each invoke deq(), and both calls are still pending.]
H: inv(enq(a),X)p, inv(enq(b),X)q, res(enq(ok),X)p, res(enq(ok),X)q,
   inv(deq(),X)p, inv(deq(),X)q
complete(H): inv(enq(a),X)p, inv(enq(b),X)q, res(enq(ok),X)p, res(enq(ok),X)q
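The notions of matching response, pending invocation and complete(H) can be sketched as follows. The Event tuple and its field names are assumptions made for this illustration; the history H is the one of the example above.

```python
from collections import namedtuple

# kind is "inv" or "res"; payload holds the arguments or the results.
Event = namedtuple("Event", "kind op payload obj proc")


def complete(history):
    """Drop every pending invocation, i.e. every invocation that is not
    followed by a response with the same object and process names."""
    kept = []
    for i, e in enumerate(history):
        if e.kind == "inv":
            matched = any(r.kind == "res" and r.obj == e.obj and r.proc == e.proc
                          for r in history[i + 1:])
            if not matched:
                continue  # pending: leave it out
        kept.append(e)
    return kept


H = [Event("inv", "enq", "a", "X", "p"), Event("inv", "enq", "b", "X", "q"),
     Event("res", "enq", "ok", "X", "p"), Event("res", "enq", "ok", "X", "q"),
     Event("inv", "deq", None, "X", "p"), Event("inv", "deq", None, "X", "q")]
completed = complete(H)
```

Here completed contains only the two enq calls, because both deq() invocations are pending.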
Formalizing Linearizability
Definitions
- A history H is sequential if
  - (1) the first event of H is an invocation;
  - (2) each invocation, except possibly the last, is immediately followed by a matching response.
  A history that is not sequential is concurrent.
- A process subhistory H|p (H at p) of a history H is the subsequence of all events in H whose process names are p. (An object subhistory H|X is similarly defined for an object X.)
- Two histories H and H’ are equivalent if for every process p, H|p = H’|p.
- A history H is well-formed if each process subhistory H|p of H is sequential (in the following we will assume well-formed histories).
Formalizing Linearizability
- Sequential history H
[Figure: p executes enq(a;ok) and then deq( ;a); q executes enq(b;ok) and then deq( ;b); no two calls overlap.]
H: inv(enq(a),X)p, res(enq(ok),X)p, inv(enq(b),X)q, res(enq(ok),X)q,
   inv(deq(),X)p, res(deq(a),X)p, inv(deq(),X)q, res(deq(b),X)q
H|p: inv(enq(a),X)p, res(enq(ok),X)p, inv(deq(),X)p, res(deq(a),X)p
H|q: inv(enq(b),X)q, res(enq(ok),X)q, inv(deq(),X)q, res(deq(b),X)q
Both process subhistories are sequential, so H is well-formed.
Formalizing Linearizability
- Concurrent history H’
[Figure: p executes enq(a;ok) and then deq( ;a); q executes enq(b;ok) and then deq( ;b); the two enq calls overlap, and so do the two deq calls.]
H’: inv(enq(a),X)p, inv(enq(b),X)q, res(enq(ok),X)p, res(enq(ok),X)q,
    inv(deq(),X)p, inv(deq(),X)q, res(deq(a),X)p, res(deq(b),X)q
H’|p: inv(enq(a),X)p, res(enq(ok),X)p, inv(deq(),X)p, res(deq(a),X)p
H’|q: inv(enq(b),X)q, res(enq(ok),X)q, inv(deq(),X)q, res(deq(b),X)q
H and H’ are equivalent.
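As a minimal sketch, the definitions of process subhistory, sequential, equivalent and well-formed can be checked mechanically on the two histories H and H’ above. The Event tuple and the helper names are assumptions for this illustration; is_sequential uses the strict alternating reading (invocation, matching response, invocation, ...).

```python
from collections import namedtuple

Event = namedtuple("Event", "kind op payload obj proc")


def sub(history, proc):
    """Process subhistory H|p."""
    return [e for e in history if e.proc == proc]


def is_sequential(history):
    """Events alternate inv, res, inv, ... and each response matches
    (same object, same process) the invocation right before it."""
    for i, e in enumerate(history):
        if e.kind != ("inv" if i % 2 == 0 else "res"):
            return False
        if e.kind == "res":
            prev = history[i - 1]
            if (prev.obj, prev.proc) != (e.obj, e.proc):
                return False
    return True


def equivalent(h1, h2):
    return all(sub(h1, p) == sub(h2, p) for p in {e.proc for e in h1 + h2})


def well_formed(history):
    return all(is_sequential(sub(history, p)) for p in {e.proc for e in history})


def inv(op, arg, p):
    return Event("inv", op, arg, "X", p)


def res(op, val, p):
    return Event("res", op, val, "X", p)


H = [inv("enq", "a", "p"), res("enq", "ok", "p"), inv("enq", "b", "q"), res("enq", "ok", "q"),
     inv("deq", None, "p"), res("deq", "a", "p"), inv("deq", None, "q"), res("deq", "b", "q")]
H2 = [inv("enq", "a", "p"), inv("enq", "b", "q"), res("enq", "ok", "p"), res("enq", "ok", "q"),
      inv("deq", None, "p"), inv("deq", None, "q"), res("deq", "a", "p"), res("deq", "b", "q")]
```

H is sequential, H2 (the history H’) is concurrent but well-formed, and the two are equivalent because each process sees the same events in the same order.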
Formalizing Linearizability
Definitions
- A history H induces an irreflexive partial order →H on methods: op1 →H op2 if res(op1) precedes inv(op2) in H.
- If H is sequential, then →H is a total order.
[Figure: p executes enq(a;ok) and then deq( ;a); q executes enq(b;ok) and then deq( ;b); the two enq calls overlap, and so do the two deq calls.]
H’: inv(enq(a),X)p, inv(enq(b),X)q, res(enq(ok),X)p, res(enq(ok),X)q,
    inv(deq(),X)p, inv(deq(),X)q, res(deq(a),X)p, res(deq(b),X)q
enq(a) →H’ deq(a)
enq(a) →H’ deq(b)
enq(b) →H’ deq(b)
enq(b) →H’ deq(a)
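The precedence order induced by a history can be computed directly from the positions of the events. In this sketch each event is a simplified tuple (kind, call label, process), and the labels such as "deq(a)" are an assumption chosen so that every call has a distinct name.

```python
def precedes(history):
    """Return the set of pairs (op1, op2) such that res(op1) comes
    before inv(op2) in the history."""
    first, last = {}, {}
    for i, (kind, op, _proc) in enumerate(history):
        if kind == "inv":
            first[op] = i
        else:
            last[op] = i
    return {(a, b) for a in last for b in first if last[a] < first[b]}


H_prime = [("inv", "enq(a)", "p"), ("inv", "enq(b)", "q"),
           ("res", "enq(a)", "p"), ("res", "enq(b)", "q"),
           ("inv", "deq(a)", "p"), ("inv", "deq(b)", "q"),
           ("res", "deq(a)", "p"), ("res", "deq(b)", "q")]
order = precedes(H_prime)
```

For this history the result is exactly the four pairs listed above: the two enq calls are not ordered with respect to each other, and neither are the two deq calls.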
Formalizing Linearizability
Definitions
- A set S of histories is prefix-closed if whenever H is in S, every prefix of H is also in S.
- A sequential specification for an object is a prefix-closed set of sequential histories for the object.
- A sequential history H is legal if each object subhistory H|X belongs to the sequential specification for X.
Linearizability
A history H is linearizable if it can be extended to a history H’ (by appending zero or more response events to H) such that:
L1: complete(H’) is equivalent to some legal sequential history S, and
L2: →H’ ⊆ →S.
Linearizability
- Informally, extending H to H’ captures the idea that some pending invocations may have taken effect even though their responses have not yet been returned to the caller. This is visible when some subsequent method call returns a value set by a pending invocation.
- Extending H to H’ while restricting attention to complete(H’) makes it possible to complete pending methods, or just to ignore them.
- L1 states that complete(H’) is equivalent to an apparent sequential interleaving of method calls that does not violate the specification of the object.
- L2 states that this apparent sequential interleaving respects the precedence ordering of methods.
Linearizability
- Then, let’s try to find S...
[Figure: p has a pending enq(a) and a pending deq(); q executes enq(b;ok), deq( ;b) and deq( ;a).]
H: inv(enq(a),X)p, inv(enq(b),X)q, res(enq(ok),X)q, inv(deq(),X)p,
   inv(deq(),X)q, res(deq(b),X)q, inv(deq(),X)q, res(deq(a),X)q
Step 1: extending... (the response of the pending enq(a) is appended to H)
H’: inv(enq(a),X)p, inv(enq(b),X)q, res(enq(ok),X)q, inv(deq(),X)p,
    inv(deq(),X)q, res(deq(b),X)q, inv(deq(),X)q, res(deq(a),X)q, res(enq(ok),X)p
Linearizability
Step 2: completing... (the pending deq() of p is dropped)
complete(H’): inv(enq(a),X)p, inv(enq(b),X)q, res(enq(ok),X)q,
              inv(deq(),X)q, res(deq(b),X)q, inv(deq(),X)q, res(deq(a),X)q, res(enq(ok),X)p
[Figure: p executes enq(a;ok); q executes enq(b;ok), deq( ;b) and deq( ;a).]
Step 3: let me see the partial order...
enq(b) →H’ deq(b) →H’ deq(a)
enq(a) overlaps every other call, so →H’ does not order it.
Step 4: ordering what is not yet ordered...
S: enq(b) →S enq(a) →S deq(b) →S deq(a)   We got it!
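For very small histories, the search for S can be done by brute force. The sketch below is not an efficient algorithm and makes several assumptions: the only object is a FIFO queue, each call is a tuple (op, argument, result, invocation time, response time), a pending call has response time None, and deq() on an empty queue is treated as illegal. The history named ok is a well-formed variant of the worked example (p's pending deq() is left out).

```python
from collections import deque
from itertools import permutations


def legal_queue(calls):
    """Replay the calls sequentially on a FIFO queue (condition L1)."""
    q = deque()
    for op, arg, result, _inv, res_time in calls:
        if op == "enq":
            q.append(arg)
        elif not q:
            return False
        else:
            head = q.popleft()
            # A completed pending deq may return anything; others must match.
            if res_time is not None and head != result:
                return False
    return True


def respects_time(order):
    """Condition L2: no call is placed before one that precedes it."""
    return not any(later[4] is not None and later[4] < earlier[3]
                   for i, earlier in enumerate(order)
                   for later in order[i + 1:])


def linearizable(calls):
    done = [c for c in calls if c[4] is not None]
    pending = [c for c in calls if c[4] is None]
    # Each pending call is either completed (kept) or ignored (dropped).
    for mask in range(2 ** len(pending)):
        chosen = done + [c for i, c in enumerate(pending) if mask >> i & 1]
        if any(respects_time(o) and legal_queue(o) for o in permutations(chosen)):
            return True
    return False


ok = [("enq", "a", None, 0, None), ("enq", "b", "ok", 1, 2),
      ("deq", None, "b", 3, 4), ("deq", None, "a", 5, 6)]
bad = [("enq", "b", "ok", 0, 1), ("enq", "c", "ok", 2, 3), ("deq", None, "c", 4, 5)]
```

The history ok is linearizable only if the pending enq(a) is completed and placed after enq(b). The history bad is not linearizable: enq(b) precedes enq(c), so the first deq() cannot return c.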
To do...
- Linearizability has the following property (locality):
  Theorem: H is linearizable if and only if, for each object x, H|x is linearizable.
- Investigate whether locality holds for the following alternative correctness criteria:
  - Sequential consistency: only L1.
  - Serializability: a history is serializable if it is equivalent to one in which transactions appear to execute sequentially, that is, without interleaving. A transaction is a finite sequence of method calls on a set of objects.
Lecture 2: the “inside” view-point
Objects Implementation
- So, let us suppose now that we have a concurrent object to implement, e.g. a stack, a queue, etc.
- How can we implement it? To get linearizable executions, we can try to use some form of synchronization, e.g. regulating access to the object with locks (mutexes) that define critical sections.
- The only process that can release a lock is the one that acquired it.
- But we also want to cope with failures: we want a process to get a response in a finite time, no matter the failures of others...
- So, what if a process fails inside the critical section?
Wait-free object implementation
- The meaning of wait-free computing is exactly the following: each process (that does not crash) calling a method must be able to get a response in a finite time, no matter how slow the other processes are and regardless of their failures.
- To introduce wait-free computing we will consider the wait-free implementation of two concurrent objects:
  - A renaming object allows the processes to acquire new names from a smaller name space despite possible process crashes.
  - A snapshot object provides the processes with an array-like data structure (with one entry per process) offering two operations. The write operation allows a process to update its own entry. The snapshot operation allows a process to read all the entries in such a way that the reading of the whole array appears as if it were an atomic operation.
Setting
- We consider n processes, of which up to f are faulty (they stop prematurely by crashing).
- As building blocks for our implementations we use some basic concurrent objects, called atomic registers, that behave like registers accessed sequentially (so we are implicitly assuming that the implementation of registers has already been done... we will come back to this point later).
- Processes access these registers by invoking write() and read() operations.
M-Renaming Problem
- Let us assume that the n processes have arbitrarily large (and distinct) initial names id1, ..., idn ∈ [0, ..., N − 1], where n << N. In the M-renaming problem, each process pi knows only its initial name idi, and the processes are required to get new names in such a way that the new names belong to the set {0, ..., M − 1}, M << N, and no two processes obtain identical names.
- More formally, the problem is defined by the following three properties:
  - Termination. Each correct process decides a new name.
  - Validity. A decided name belongs to [0, ..., M − 1].
  - Agreement. No two processes decide the same name.
Implementation
- Note that the renaming problem is an allocation problem (one process for each name).
- We assume the presence of multi-writer multi-reader (MWMR) registers.
- We will see a simple and elegant algorithm by Moir and Anderson.
- It uses a mechanism called a splitter that is particularly suited to wait-free computing (the splitter has been used to implement wait-free mutual exclusion).
Note: the renaming problem is trivial when no process can crash. In contrast, it has been shown that there is no wait-free solution to the M-renaming problem when M < n + f, where f is an upper bound on the number of processes that can crash.
Wait-free Splitter
[Animation: a splitter with shared variables X and Y and three outcomes (stop, right, down), entered by processes green and red. Distinct states shown, in order:]
1. X=undefined, Y=false, stop=empty, right=empty, down=empty (initial state)
2. X=green, Y=false
3. X=green, Y=true (green yawns)
4. X=red, Y=true (green sleeps: zzzz)
5. right={red}
6. green awakes!
7. down={green}
Note that green was slow and red was late: nobody got stop.
Splitter
[Animation, second run, with processes green, red and orange. Distinct states shown, in order:]
1. X=undefined, Y=false, stop=empty, right=empty, down=empty (initial state)
2. X=green, Y=false
3. X=green, Y=true
4. stop={green}
5. right={red, orange}
Note that green was on time, red and orange were late, and nobody was slow.
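The slides show the splitter only through its states, so the following is a sketch of the usual splitter code built on two shared registers X and Y, not a transcription of the slides. The method name enter and the helper step are assumptions; each yield marks a point where another process may be scheduled, which lets the first run above be replayed by hand.

```python
class Splitter:
    def __init__(self):
        self.X = None    # name of the last process that entered
        self.Y = False   # becomes true once some process has passed

    def enter(self, pid):
        self.X = pid                 # write my name into X
        yield
        closed = self.Y              # read Y
        yield
        if closed:
            return "right"           # somebody passed before me: I am late
        self.Y = True                # write Y
        yield
        # read X: if it still holds my name nobody interfered
        return "stop" if self.X == pid else "down"


def step(process):
    """Run one step; return the outcome once decided, else None."""
    try:
        next(process)
        return None
    except StopIteration as done:
        return done.value


s = Splitter()
green, red = s.enter("green"), s.enter("red")
trace = [step(green), step(green), step(green)]   # green sets Y, then sleeps
trace += [step(red), step(red), step(red)]        # red arrives late
trace.append(step(green))                         # green wakes up
```

In this interleaving red gets right and green gets down, as in the first run. A process that runs alone from the initial state gets stop.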
Moir-Anderson: the grid of Splitters
[Figure: a triangular grid of n(n+1)/2 splitters, giving n(n+1)/2-renaming.]
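A possible sketch of renaming with a grid of splitters: every process starts at the top-left splitter, moves right or down according to the outcome, and takes the number of the splitter where it stops as its new name. The function names, the dictionary representation of a splitter and the diagonal numbering of cells are assumptions of this illustration; the scheduler picks the next process to move at random.

```python
import random


def splitter_steps(cell, pid):
    """One pass through a splitter; a yield lets another process run."""
    cell["X"] = pid
    yield
    closed = cell["Y"]
    yield
    if closed:
        return "right"
    cell["Y"] = True
    yield
    return "stop" if cell["X"] == pid else "down"


def rename(grid, pid):
    """Walk the grid from cell (0, 0) until some splitter says stop."""
    r = c = 0
    while True:
        cell = grid.setdefault((r, c), {"X": None, "Y": False})
        outcome = yield from splitter_steps(cell, pid)
        if outcome == "stop":
            d = r + c                       # diagonal of the cell
            return d * (d + 1) // 2 + r     # cells numbered diagonal by diagonal
        if outcome == "right":
            c += 1
        else:
            r += 1


def run(ids, seed=0):
    rng = random.Random(seed)
    grid = {}
    active = {pid: rename(grid, pid) for pid in ids}
    names = {}
    while active:
        pid = rng.choice(sorted(active))    # pick who moves next
        try:
            next(active[pid])
        except StopIteration as done:
            names[pid] = done.value
            del active[pid]
    return names


new_names = run([1000, 42, 77777, 5])
```

With n = 4 processes the new names are distinct and fall in {0, ..., 9}, whatever the interleaving, because at most n − r − c processes can reach the splitter in row r and column c.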
Snapshot
- The object is made up of n SWMR (single-writer, multi-reader) atomic registers, one per process.
- Two operations:
  - update(v), invoked by pi, sets pi’s register to the value v;
  - snapshot() returns the values of the n registers.
- All operations appear as if they were executed instantaneously.
- A process uses update to inform the others of its progress; snapshot allows a process to find out what the others are doing...
- Collect vs snapshot: a collect is not atomic...
Collect vs Snapshot
[Figure: three snapshot(?) calls and three collect(?) calls run concurrently with update(1, R1;ok) and update(2, R2;ok). Possible values listed for the snapshots: [000] [010] [012]. Possible values listed for the collects: [000] [002] [012]; [000] [010] [002]; [000] [002] [010].]
A simple non-wait-free algorithm
Key mechanism: double collect.
How many times is this condition false?
[Figure: a process keeps repeating pairs of collect(?) calls while update(1, R1;ok) and update(2, R2;ok) arrive; possible values: [000] [010] [002]; it does not return!]
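The double-collect idea can be sketched as below. The algorithm on the slide is not available in the text, so this is an assumed reconstruction: each register holds a value and a sequence number, and snapshot() retries until two consecutive collects are identical. The parameters max_tries and interfere are testing hooks added for this illustration, so that a run with a persistent updater can be shown to never succeed.

```python
class DoubleCollectSnapshot:
    def __init__(self, n):
        # One (value, sequence number) register per process.
        self.reg = [(0, 0)] * n

    def update(self, i, v):
        _, seq = self.reg[i]
        self.reg[i] = (v, seq + 1)       # a single write by the owner

    def collect(self):
        # Reads the registers one by one: not atomic as a whole.
        return [self.reg[j] for j in range(len(self.reg))]

    def snapshot(self, max_tries=None, interfere=None):
        tries = 0
        while max_tries is None or tries < max_tries:
            first = self.collect()
            if interfere is not None:
                interfere()              # stands for a concurrent update
            second = self.collect()
            if first == second:          # nobody moved between the collects
                return [value for value, _ in second]
            tries += 1
        return None                      # gave up: the condition was never true


quiet = DoubleCollectSnapshot(3)
quiet.update(1, 1)
quiet.update(2, 2)
busy = DoubleCollectSnapshot(3)
```

When nobody interferes, quiet.snapshot() returns after one double collect. If an update lands between every pair of collects, the condition stays false forever, which is why this algorithm is not wait-free.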
A wait-free snapshot: Afek et al.
Key mechanisms: helping mechanism, double collect.
If a process P sees two consecutive updates issued by the same process R, it knows that the second update began after P’s snapshot began. Then P borrows the snapshot that R took as part of that second update.
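A minimal sketch of the helping mechanism, under the assumption that every update first takes a snapshot and stores it next to the value (the field layout and the interfere testing hook are choices made for this illustration, not details taken from the slide).

```python
class WaitFreeSnapshot:
    def __init__(self, n):
        self.n = n
        # Register i holds (value, sequence number, view taken by the writer).
        self.reg = [(0, 0, [0] * n) for _ in range(n)]

    def _collect(self):
        return [self.reg[j] for j in range(self.n)]

    def snapshot(self, interfere=None):
        moved = [0] * self.n
        old = self._collect()
        while True:
            if interfere is not None:
                interfere()                      # stands for concurrent updates
            new = self._collect()
            changed = [j for j in range(self.n) if old[j][1] != new[j][1]]
            if not changed:
                return [value for value, _, _ in new]   # clean double collect
            for j in changed:
                moved[j] += 1
                if moved[j] == 2:
                    # j updated twice during our scan: its second update
                    # started after our scan did, so its view is safe to borrow.
                    return list(new[j][2])
            old = new

    def update(self, i, v):
        view = self.snapshot()                   # embedded snapshot
        _, seq, _ = self.reg[i]
        self.reg[i] = (v, seq + 1, view)


s = WaitFreeSnapshot(3)
values = iter([10, 20, 30])
borrowed = s.snapshot(interfere=lambda: s.update(1, next(values)))
after = s.snapshot()
```

Here process 1 updates between every pair of collects. After its second update the scanner stops retrying and returns the view stored with that update, so the scan finishes after a bounded number of collects.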
Lecture 3: the “global” look
Wait-free computing issues
- The fundamental problem of wait-free computing is to characterize the circumstances under which synchronization problems have wait-free solutions, and to derive efficient solutions when they exist.
- To show that a wait-free implementation exists, just give an algorithm and prove it correct.
- But how do we show that such an implementation does not exist?
- Herlihy: a technique to prove impossibility, and an object hierarchy.
The Relative Power of Objects
- Fundamental problem:
  - Given two concurrent objects X and Y, does there exist a wait-free implementation of X by Y?
  - Thanks to Herlihy’s classification of objects based on their synchronization power, we can derive impossibility results: if, for example, we have two objects X and Y, and Y has less synchronization power than X, then there exists no wait-free implementation of X by Y.
  - How do we define the synchronization power of an object?
The Consensus Number
- Each object is classified according to its ability to solve Consensus in a wait-free manner.
- In particular, an object X has consensus number k if, with objects X and atomic registers (all initialized to appropriate values), it is possible to solve wait-free Consensus for k processes but not for k+1 processes.
- For those who do not know Consensus: each process starts with an input value, and the processes eventually agree on one of the input values:
  - agreement: distinct processes never decide on distinct values;
  - wait-free: each process decides after a finite number of steps;
  - validity: the common decision value d is the input of some process.
Consensus
[Figure: processes with inputs 6, 26 and 75 access an object according to the accessing rules defined by the consensus protocol; every output shown is 6.]
Wait-free Consensus
[Figure: processes with inputs 26, 6 and 75 access the object; the outputs shown are both 26.]
Objects and Consensus Numbers
- If there exists a wait-free implementation of an object X by Y, and X has consensus number n, then Y has consensus number at least n.
- If an object X has consensus number n and an object Y has consensus number m < n, then there exists no wait-free implementation of X by Y in a system of k > m processes.
- If two objects X and Y have consensus number n, then there exists a wait-free implementation of X by Y (and vice versa) in a system with n processes.
- If two objects X and Y have consensus number n, is it true that there exists no wait-free implementation of X by Y in a system of k > n processes? → Open issue...
Objects Classification
Consensus number 1: read-write register, counters, atomic snapshot
Consensus number 2: list, stack, queue, test&set, fetch&add, swap
Consensus number 2m-2: m-multiple assignment
Consensus number ∞: move and swap, augmented queue, compare&swap, fetch&cons, sticky byte
Objects with Consensus Number 1
- Atomic registers, counters, and other interfering objects that don't return the old value (explained later).
- First observe that any type has consensus number at least 1, since 1-process consensus is trivial.
- To prove that an object has consensus number 1, we should show that it is impossible to solve Consensus (using the object) between two processes (with initial values in the set {0,1}).
  - We show that an object has consensus number exactly 1 by running a Fischer-Lynch-Paterson-style argument with 2 processes.
Read-Write Registers & Consensus
[Figure: two processes with inputs 1 and 0 access a read-write register; what do they decide? (?, ?)]
Proof: assume otherwise
Let us assume that there exists a Consensus protocol that uses only read() and write() operations on read-write registers.
- Initial state (0,0): the outcome should be 0.
- Initial state (1,1): the outcome should be 1.
- Initial state (1,0): two executions are shown; in one the outcome should be 1, in the other it should be 0. So in this case the outcome could be 1 or even 0.
- Initial state (0,1): in this case the outcome could be 0 or even 1.
Proof: reasoning about any protocol
Bivalent & Univalent States:
- A protocol state is bivalent if both decision values are still possible:
  - the outcome is not fixed (as for the (0,1) and (1,0) initial states);
  - the protocol execution can be extended to yield one decision value, but can also be extended to yield the other value.
- A protocol state is univalent if all protocol executions from it yield the same value:
  - the outcome is fixed even if not yet known.
- A 0-valent state leads to deciding 0.
- A 1-valent state leads to deciding 1.
Proof: reasoning about any protocol
To reason about “any” protocol, we abstract the computation in this way: we consider that Mr. Red and Mr. Green make “moves” on a protocol state. E.g. starting from the initial state, either Mr. Red or Mr. Green makes the first “move”, and we get to another protocol state. At this point, again either Mr. Red or Mr. Green makes the second “move”, and so on...
[Figure: tree of possible protocol executions: from state s a first move leads to s’ or s’’, and a second move leads to one of four further states.]
- A “move” can be a read() or a write().
Proof: reasoning about any protocol
[Figure: the tree of protocol executions, from the initial state down to the final states, each labeled with its decision (0 or 1). Successive slides highlight the bivalent states, the 1-valent states and the 0-valent states.]
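The classification of states can be sketched on a small execution tree. The tree below is a made-up example, not the one drawn on the slides: an inner state maps each possible move ("red" or "green") to the next state, and a leaf is the decision value.

```python
def valence(state):
    """Set of decision values still reachable from a state."""
    if isinstance(state, int):           # final state: the decision itself
        return {state}
    return set().union(*(valence(nxt) for nxt in state.values()))


def classify(state):
    values = valence(state)
    return "bivalent" if values == {0, 1} else f"{min(values)}-valent"


# Hypothetical protocol: who moves first and second determines the decision.
tree = {
    "red": {"red": 0, "green": {"red": 0, "green": 0}},
    "green": {"red": {"red": 1, "green": 0}, "green": 1},
}
labels = [classify(tree), classify(tree["red"]),
          classify(tree["green"]), classify(tree["green"]["green"])]
```

The root is bivalent, the state reached after Mr. Red's move is 0-valent, the state reached after Mr. Green's move is still bivalent, and a second move by Mr. Green makes it 1-valent.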
Proof: reasoning about any protocol
- The initial state is bivalent (because at least one failure may occur, inputs are invisible, and validity must hold).
- To solve consensus we have to reach a bivalent state C that has only univalent successors (otherwise we could stay bivalent forever, and the protocol would not be wait-free).
- Now we assume that C has a 0-valent and a 1-valent successor, produced by applying operations x and y of processes Mr. Red and Mr. Green respectively: Cx is 0-valent and Cy is 1-valent.
Proof: reasoning about any protocol
[Figure: the same execution tree, with the bivalent state C marked.]
Proof: derive a contradiction
- Now, since we are looking at atomic registers, we have three cases to consider:
  - (1) x and y are both reads;
  - (2) x is a read and y is a write;
  - (3) x and y are both writes.
Proof: derive a contradiction
(1) x and y are both reads
- If Mr. Green's read() comes first, then the protocol decides 1.
- If Mr. Red's read() comes first, then the protocol decides 0.
The idea is the following: to reach a univalent state, the two processes must determine who the winner is, i.e. who came first.
We start from C. If Mr. Red makes the first move, both processes must decide 0. Now suppose instead that Mr. Green reads before Mr. Red: this order must lead to the opposite decision...
Proof: derive a contradiction
[Figure: from the bivalent state C, if Mr. Green reads first the state is 1-valent, and it remains 1-valent after Mr. Red then reads; if Mr. Red reads first the state is 0-valent.]
But for Mr. Red the states C and Cy are indistinguishable: Mr. Red cannot tell whether Mr. Green did something before him or not. So you can remove the green arrow and get a contradiction!
We get Cxy = Cyx, but Cxy is 0-valent and Cyx is 1-valent. Contradiction.
Proof: derive a contradiction
(2) x is a read and y is a write
- If write() comes first, then the protocol decides 1.
- If read() comes first, then the protocol decides 0.
Let us suppose that Mr. Red runs before Mr. Green. Then both will decide 0. Let us suppose now that Mr. Green comes first. Now both will decide 1. However, for Mr. Green the state C is indistinguishable from Cx, since a read leaves the register unchanged. So running Mr. Green to completion gives the same decision value from both Cyx and Cxy, another contradiction.
We get that Cxy is 0-valent and Cyx is 1-valent, yet Mr. Green decides the same value from both. Contradiction.
Proof: derive a contradiction
(3) x is a write() and y is a write()
‫ ﺱ‬If Mr. Green's write() comes first then the protocol decides 1
‫ ﺱ‬If Mr. Red's write() comes first then the protocol decides 0
Suppose that Mr. Red runs before Mr. Green. Then both will decide 0.
Suppose now that Mr. Green comes first. Now both will decide 1. However,
for Mr. Green the state C is indistinguishable from Cx, because Mr. Green overwrites
the value written by Mr. Red. So Cxy = Cy for Mr. Green. Another
contradiction.
We get that Cxy (0-valent) = Cy (1-valent). Contradiction.
Summarizing and Generalizing
‫ ﺱ‬A consensus protocol must let all processes discover which process
accessed the object first; then all processes decide the value proposed by that
winner.
‫ ﺱ‬Suppose that an object T has a read operation that returns its state and one or more
modify-write operations that don't return anything (uninformative).
‫ ﺱ‬We say that the operations of T are interfering if, for any two operations x and y,
either:
‫ ﺄ‬x and y commute: Cxy = Cyx, or
‫ ﺄ‬one of x and y overwrites the other: Cxy = Cy or Cyx = Cx.
‫ ﺱ‬Any object T whose operations are all uninformative and interfering has consensus number 1,
since any two operations either commute or the overwriter can't detect that the
first operation happened
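The three register cases, and the commute/overwrite definition, can be checked on a toy model. This is only a sketch under stated assumptions: the Config record, the helper names and the written values (0 for Mr. Red, 1 for Mr. Green) are hypothetical and do not come from the slides. A configuration holds the shared register plus what each process has seen so far, and Cxy is written y(x(C)).

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Config:
    reg: int        # value of the shared atomic register
    red: object     # local state of Mr. Red (last value he read)
    green: object   # local state of Mr. Green (last value he read)

# Each operation maps a configuration to the next one.
def red_read(c):    return replace(c, red=c.reg)
def green_read(c):  return replace(c, green=c.reg)
def red_write(c):   return replace(c, reg=0)    # assumed: Mr. Red writes 0
def green_write(c): return replace(c, reg=1)    # assumed: Mr. Green writes 1

def view(c, who):
    """What process `who` can observe: the register and its own local state."""
    return (c.reg, getattr(c, who))

C = Config(reg=-1, red=None, green=None)

# (1) two reads commute: Cxy = Cyx
case1 = green_read(red_read(C)) == red_read(green_read(C))
# (2) x = Red's read, y = Green's write: Green cannot tell Cxy from Cy
case2 = view(green_write(red_read(C)), "green") == view(green_write(C), "green")
# (3) two writes: Green overwrites Red, so Cxy = Cy
case3 = green_write(red_write(C)) == green_write(C)
```

All three flags evaluate to True in this model, which is exactly the indistinguishability each case of the proof relies on.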
Objects with Consensus Number 2
‫ ﺱ‬Now we look at the characteristics of objects with consensus
number 2
‫ ﺱ‬For all these objects there exists a wait-free consensus protocol among
two processes
‫ ﺄ‬registers with interfering non-trivial read-modify-write operations (RMW registers)
‫ ﺄ‬objects with non-interfering operations: queues, stacks, lists
‫ ﺱ‬We also give the intuition for why consensus among n>2 processes
cannot be solved using these objects
Non-trivial RMW Registers
‫ ﺱ‬What does it mean to have non-trivial operations? It means that there
exists an operation that returns the current value and writes a
value obtained by applying a function that is not the identity
‫ ﺄ‬The operation takes 2 arguments:
‫ ﺭ‬Register r
‫ ﺭ‬Function f
‫ ﺄ‬The operation has the following effects:
‫ ﺭ‬Returns value x of r
‫ ﺭ‬Replaces x with f(x): x ← f(x)
‫ ﺱ‬So it is not a simple read (which is what we would get if f were the identity), but a read
that leaves a trace...
Non-trivial RMW Registers
‫ ﺱ‬test&set
‫ ﺄ‬The operation has the following effects:
‫ ﺭ‬Returns value x of r
‫ ﺭ‬Replaces x with 1
‫ ﺱ‬fetch&inc
‫ ﺄ‬The operation has the following effects:
‫ ﺭ‬Returns value x of r
‫ ﺭ‬Replaces x with x+1
Non-trivial RMW Registers
‫ ﺱ‬swap
‫ ﺄ‬The operation takes an additional argument y
‫ ﺄ‬The operation has the following effects:
‫ ﺭ‬Returns value x of r
‫ ﺭ‬Replaces x with y
‫ ﺱ‬fetch&add
‫ ﺄ‬The operation takes an additional argument y
‫ ﺄ‬The operation has the following effects:
‫ ﺭ‬Returns value x of r
‫ ﺭ‬Replaces x with x+y
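The four operations listed above are all instances of one generic operation rmw(r, f). A minimal sketch follows, assuming a Python model in which a lock stands in for the atomicity that real hardware provides (so the model itself is not wait-free). The class and method names are illustrative, not taken from the slides.

```python
import threading

class RMWRegister:
    """Toy read-modify-write register; the lock models an atomic hardware step."""

    def __init__(self, initial=0):
        self._value = initial
        self._lock = threading.Lock()

    def rmw(self, f):
        """Atomically return the current value x and replace it with f(x)."""
        with self._lock:
            old = self._value
            self._value = f(old)
            return old

    # The trivial case: f is the identity, so this is a plain read.
    def read(self):
        return self.rmw(lambda x: x)

    # Non-trivial operations from the slides.
    def test_and_set(self):
        return self.rmw(lambda x: 1)

    def fetch_and_inc(self):
        return self.rmw(lambda x: x + 1)

    def swap(self, y):
        return self.rmw(lambda x: y)

    def fetch_and_add(self, y):
        return self.rmw(lambda x: x + y)

r = RMWRegister(0)
first = r.test_and_set()    # sees the initial value 0 and leaves 1 behind
second = r.test_and_set()   # sees 1: somebody was here before
```

The second caller can tell it was not first, which is the "trace" a plain read would not leave.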
Non-trivial RMW Registers
‫ ﺱ‬How can a consensus protocol between two processes be designed?
‫ ﺱ‬Try to think...
‫ ﺄ‬remember, both processes must discover who the winner was...
‫ ﺄ‬first step: each process writes its proposed value in its own register
‫ ﺄ‬second step: each process accesses the RMW register, initialized to some value v, with the non-trivial operation
‫ ﺄ‬who is the winner?
‫ ﺭ‬the one who reads v
‫ ﺄ‬who is the loser?
‫ ﺭ‬the one who does not read v
‫ ﺄ‬Note that both processes can deduce who the winner and the loser are; they then
go to the winner's register to pick up the value it proposed
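The two steps above can be sketched with test&set as the non-trivial operation and v = 0. This is an illustrative model, not the slides' own code: the class names and the run helper are hypothetical, and the lock only stands in for the atomicity of the RMW register.

```python
import threading

class TestAndSet:
    """RMW register initialized to v = 0; the lock models atomicity."""
    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def test_and_set(self):
        with self._lock:
            old, self._value = self._value, 1
            return old

class TwoProcessConsensus:
    def __init__(self):
        self.proposed = [None, None]   # two read/write registers, one per process
        self.tas = TestAndSet()

    def decide(self, i, value):
        self.proposed[i] = value               # first step: publish my proposal
        if self.tas.test_and_set() == 0:       # second step: did I read v?
            return self.proposed[i]            # winner: decide my own value
        return self.proposed[1 - i]            # loser: pick up the winner's value

def run(values):
    """Run both processes as threads and return their decisions."""
    consensus = TwoProcessConsensus()
    decisions = [None, None]

    def proc(i):
        decisions[i] = consensus.decide(i, values[i])

    threads = [threading.Thread(target=proc, args=(i,)) for i in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return decisions

decisions = run([1, 0])   # both entries are equal, and equal to 1 or 0
```

Because the proposal is written before the test&set, the loser always finds the winner's value already in its register.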
Consensus among two processes
[Figure: initial state (1,0). Each process writes its proposed value (1 and 0 respectively) into one of the two read/write registers, then accesses the non-trivial RMW register, initialized to v. The winner reads v and leaves f(v) behind; the loser does not read v. Both then read the winner's register and decide 1.]
What about Queues?
‫ ﺱ‬Queues have consensus number 2, so there exists a 2-process consensus protocol using
read/write registers and queues
‫ ﺱ‬Take a wait-free queue object with enqueue and dequeue operations, where
dequeue returns empty if the queue is empty. To solve 2-process
consensus with a wait-free queue:
‫ ﺄ‬Initialize the queue with a single value (it doesn't matter what the value is).
‫ ﺄ‬A process wins if it successfully dequeues the initial value and loses if it gets
empty.
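A sketch of the queue-based protocol follows, assuming (as in the RMW protocol) that each process first publishes its proposal in a read/write register so that the loser can pick up the winner's value. The names EMPTY, QueueModel and QueueConsensus are hypothetical, and the lock only models the atomicity of a wait-free queue.

```python
import threading
from collections import deque

EMPTY = object()   # sentinel returned by deq() on an empty queue

class QueueModel:
    """Toy FIFO queue; the lock stands in for a wait-free implementation."""
    def __init__(self, items=()):
        self._items = deque(items)
        self._lock = threading.Lock()

    def enq(self, x):
        with self._lock:
            self._items.append(x)

    def deq(self):
        with self._lock:
            return self._items.popleft() if self._items else EMPTY

class QueueConsensus:
    def __init__(self):
        self.proposed = [None, None]          # read/write registers
        self.queue = QueueModel(["token"])    # single initial value, content irrelevant

    def decide(self, i, value):
        self.proposed[i] = value
        if self.queue.deq() is not EMPTY:     # got the initial value: winner
            return self.proposed[i]
        return self.proposed[1 - i]           # got empty: loser

c = QueueConsensus()
a = c.decide(0, "red")     # process 0 runs first and wins
b = c.decide(1, "green")   # process 1 finds the queue empty
```

Here both a and b end up as "red", the value proposed by the process that dequeued the token.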
Non-trivial RMW Registers: no Consensus for n>2
‫ ﺱ‬Why can't a third process tell who the winner was?
‫ ﺄ‬Let F be a set of functions such that for all fi and fj, either
‫ ﺭ‬They commute: fi(fj(x))=fj(fi(x))
‫ ﺭ‬One overwrites the other: fi(fj(x))=fi(x)
‫ ﺱ‬test&set
‫ ﺄ‬f(x)=1, f(f(x))=1=f(x): overwrite
‫ ﺱ‬swap
‫ ﺄ‬f(y,x)=y, f(y',f(y,x))=y'=f(y',x): overwrite
‫ ﺱ‬fetch&inc
‫ ﺄ‬f(x)=x+1, f(f(x))=x+1+1: commutative
Non-trivial RMW Registers: no Consensus for n>2
‫ ﺱ‬The impossibility intuition for non-trivial interfering RMW operations
‫ ﺄ‬Suppose that Mr. Green accesses the object with an operation x and Mr.
Red with an operation y (e.g. both fetch&inc). A third process, Mr. Orange,
that accesses (reads) the object after Mr. Green and Mr. Red cannot
tell who the winner was between these two (in both cases, i.e. Mr. Red
the winner or Mr. Green the winner, the state is the same).
‫ ﺄ‬So if we run Mr. Orange to completion we get the same decision value after
both Cxy and Cyx, which means that Cx and Cy can't be 0-valent and 1-valent.
Contradiction.
What about queues?
‫ ﺱ‬Queues, stacks and lists are not interfering objects: their operations neither overwrite
nor commute
‫ ﺄ‬e.g. enq(g)enq(r) ≠ enq(r)enq(g) ≠ enq(g)
‫ ﺱ‬They seem more powerful... if Mr. Orange dequeues and sees g,r
or r,g, it can tell who arrived first (Mr. Green in the first case,
Mr. Red in the second)
‫ ﺱ‬But even for queues there is no wait-free consensus implementation
for n>2, i.e. Mr. Orange is not able to tell who the
winner was... why?
Non-interfering Queues: no consensus for n>2
‫ ﺱ‬Here we give the intuition behind the impossibility:
‫ ﺱ‬Start from C enq(g) enq(r) (1-valent) and C enq(r) enq(g) (0-valent)
‫ ﺄ‬Run Mr. Red until its first deq()
‫ ﺄ‬Run Mr. Green until its first deq()
‫ ﺄ‬After these two dequeues, Mr. Orange cannot distinguish between C, C enq(g) enq(r) and C enq(r)
enq(g).
‫ ﺱ‬Start from C deq() enq(r) and C enq(r) deq() on a non-empty
queue.
‫ ﺄ‬To lose the trace of this order (i.e. to reach indistinguishable states), it suffices that
only 2 witnesses (two dequeuers) fail. Hence the queue has consensus number ≤ 2.
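The two situations above can be replayed on the shared queue state alone. This is a rough sketch under explicit assumptions: the queue is taken to be empty in C, only the queue contents are modelled (not the local states of Mr. Red and Mr. Green), and the helper run_ops is hypothetical.

```python
from collections import deque

def run_ops(state, *ops):
    """Apply queue operations, in order, to a copy of `state`; return the new contents."""
    q = deque(state)
    for op in ops:
        if op == "deq":
            if q:
                q.popleft()
        else:                       # op is ("enq", item)
            q.append(op[1])
    return tuple(q)

C = ()                              # assumption: the queue is empty in state C
enq_g, enq_r = ("enq", "g"), ("enq", "r")

after_gr = run_ops(C, enq_g, enq_r)     # Mr. Green enqueued first
after_rg = run_ops(C, enq_r, enq_g)     # Mr. Red enqueued first

# Once Mr. Red and Mr. Green have each performed one deq(), the evidence is gone.
drained_gr = run_ops(after_gr, "deq", "deq")
drained_rg = run_ops(after_rg, "deq", "deq")

# On a non-empty queue, deq() and enq(r) commute.
nonempty = ("a",)
deq_then_enq = run_ops(nonempty, "deq", enq_r)
enq_then_deq = run_ops(nonempty, enq_r, "deq")
```

The two enqueue orders are distinguishable while the items are still in the queue, but the drained states coincide with C, so a third process that looks at the queue afterwards learns nothing about the order.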
Objects with consensus number ∞
‫ ﺱ‬Augmented queue
‫ ﺄ‬Has operations enq(x) and peek(), which returns the first value enqueued but
does not remove it. The protocol is to enq my input, then peek and return the first
value in the queue.
‫ ﺱ‬Fetch-and-cons
‫ ﺄ‬Returns old cdr and adds new car on to the head of a list. Use the preceding protocol
where peek() = tail(car::cdr)
‫ ﺱ‬Sticky bits
‫ ﺄ‬Has a write operation that fails unless the register is in the initial state. The protocol is to
write my input and then return the result of a read.
‫ ﺱ‬Compare-and-swap
‫ ﺄ‬Has a CAS(old, new) operation that writes new only if the previous value = old. Use it to
build a sticky bit.
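The augmented-queue protocol (enq my input, then peek) works for any number of processes. A minimal sketch follows, assuming a lock-based model of the object; the class name, the decide function and the run helper are illustrative and not part of the slides.

```python
import threading

class AugmentedQueue:
    """Queue with enq(x) and peek(); the lock models atomic operations."""
    def __init__(self):
        self._items = []
        self._lock = threading.Lock()

    def enq(self, x):
        with self._lock:
            self._items.append(x)

    def peek(self):
        """Return the first value ever enqueued, without removing it."""
        with self._lock:
            return self._items[0]

def decide(queue, my_input):
    queue.enq(my_input)      # the queue is non-empty from here on
    return queue.peek()      # everybody sees the same first element

def run(inputs):
    """One thread per input; returns the list of decisions."""
    queue = AugmentedQueue()
    decisions = [None] * len(inputs)

    def proc(i):
        decisions[i] = decide(queue, inputs[i])

    threads = [threading.Thread(target=proc, args=(i,)) for i in range(len(inputs))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return decisions

inputs = ["a", "b", "c", "d"]
decisions = run(inputs)      # all four decisions are the same input value
```

Since peek never removes the head, the evidence of who came first can never be destroyed, unlike with deq on a plain queue.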
Consensus for n processes with CAS
‫ ﺱ‬n registers Ri, and one CAS object initialized to -1
‫ ﺱ‬First step: pi writes its proposed value in its register Ri
‫ ﺱ‬Second step: pi checks whether nobody has accessed the object yet and, if so, writes
its identifier i; j is set to the previous value of the object:
j=CAS(-1,i).
‫ ﺱ‬If j==-1 (pi is the first)
decide the value in Ri
else
decide the value in Rj
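The protocol above translates almost line by line into code. The sketch below is a model under assumptions: CASRegister uses a lock to stand in for a hardware compare-and-swap, its cas method returns the previous value (as the slide's j=CAS(-1,i) does), and the class and helper names are hypothetical.

```python
import threading

class CASRegister:
    """Toy compare-and-swap register; the lock models the atomic instruction."""
    def __init__(self, initial):
        self._value = initial
        self._lock = threading.Lock()

    def cas(self, old, new):
        """Write `new` only if the current value equals `old`; return the previous value."""
        with self._lock:
            previous = self._value
            if previous == old:
                self._value = new
            return previous

class CASConsensus:
    def __init__(self, n):
        self.R = [None] * n               # n registers Ri
        self.first = CASRegister(-1)      # -1 means nobody has accessed it yet

    def decide(self, i, value):
        self.R[i] = value                 # first step: write the proposal in Ri
        j = self.first.cas(-1, i)         # second step: j = CAS(-1, i)
        if j == -1:                       # pi is the first
            return self.R[i]
        return self.R[j]                  # somebody else won: decide the value in Rj

def run(values):
    consensus = CASConsensus(len(values))
    decisions = [None] * len(values)

    def proc(i):
        decisions[i] = consensus.decide(i, values[i])

    threads = [threading.Thread(target=proc, args=(i,)) for i in range(len(values))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return decisions

values = [10, 20, 30, 40, 50]
decisions = run(values)     # every process decides the same proposed value
```

Nothing in decide depends on n, which is why the same object works for any number of processes.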
m-multiple assignment object
‫ ﺱ‬Snapshot means
‫ ﺄ‬Write any array element
‫ ﺄ‬Read multiple array elements atomically
‫ ﺱ‬What about the dual problem:
‫ ﺄ‬Write multiple array elements atomically
‫ ﺄ‬Scan any array elements
‫ ﺱ‬This problem is called multiple assignment
‫ ﺱ‬It has been proved that m-multiple assignment has consensus number
2m-2
2-multiple assignment has consensus number at least 2
‫ ﺱ‬Here we have a (large) collection of atomic registers augmented by an m-register write
operation that performs all the writes simultaneously.
‫ ﺱ‬The intuition for why this is helpful is that if Mr. Red atomically writes 1 to R1 and R while
Mr. Green atomically writes 2 to R2 and R, then any process can look at the state of R1, R2
and R and tell which write happened first:
‫ ﺄ‬If Mr. Orange reads R1 = R2 = empty, then we don't care which went first, because
Mr. Orange (or somebody else) already won.
‫ ﺄ‬If Mr. Orange reads R1 = 1 and then R2 = empty, then Mr. Red went first.
‫ ﺄ‬If Mr. Orange reads R2 = 2 and then R1 = empty, then Mr. Green went first. (This requires at
least one more read after checking the first case.)
‫ ﺄ‬Otherwise Mr. Orange sees R1 = 1 and R2 = 2. Now it reads R: if it's 1, Mr. Green went
first, and vice versa.
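The case analysis above can be sketched as a small decision procedure. Assumptions are stated loudly: Mr. Red writes the value 1 to R1 and R, Mr. Green writes 2 to R2 and R, the lock models the atomic 2-register write, and the names MultiAssign and who_went_first are hypothetical.

```python
import threading

EMPTY = None

class MultiAssign:
    """Atomic registers plus an atomic multi-register write (modelled with a lock)."""
    def __init__(self, names):
        self._regs = {name: EMPTY for name in names}
        self._lock = threading.Lock()

    def assign(self, **writes):
        """Write all the given registers in one atomic step."""
        with self._lock:
            self._regs.update(writes)

    def read(self, name):
        with self._lock:
            return self._regs[name]

def who_went_first(m):
    """Return 'red', 'green', or None if neither has written yet."""
    r1 = m.read("R1")
    r2 = m.read("R2")
    if r1 is EMPTY and r2 is EMPTY:
        return None                    # neither wrote before we looked
    if r2 is EMPTY:
        return "red"                   # R1 = 1 and then R2 = empty
    if m.read("R1") is EMPTY:          # extra read of R1 after seeing R2 = 2
        return "green"
    # Both have written: R holds the value of whoever wrote last.
    return "green" if m.read("R") == 1 else "red"

def scenario(*steps):
    m = MultiAssign(["R1", "R2", "R"])
    for who in steps:
        if who == "red":
            m.assign(R1=1, R=1)
        else:
            m.assign(R2=2, R=2)
    return who_went_first(m)

red_then_green = scenario("red", "green")
green_then_red = scenario("green", "red")
```

In the first scenario R ends up holding 2 (Mr. Green overwrote it), so Mr. Red is reported as first; in the second the roles are reversed.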
Universality of Consensus
‫ ﺱ‬Universality: any type that can implement n-process consensus can,
together with atomic registers, give a wait-free implementation of any
object in a system with n processes.
‫ ﺱ‬the processes repeatedly use consensus to decide between candidate
histories of the simulated object, and a process successfully completes
an operation when its operation (tagged to distinguish it from other
similar operations) appears in a winning history
Universality of Consensus
‫ ﺱ‬Have an n-process consensus protocol instance for each of a series of
phases 0, 1, ... .
‫ ﺱ‬The algorithm works as follows for a process p:
1. Posts the operation it wants to perform to its register.
2. Reads all the last-phase values and takes their max.
3. Runs the consensus protocol for the max phase to get the history decided on up
to that phase.
4. If the max-phase history includes the process's pending operation, returns the
result that operation would have had in the winning history.
5. Otherwise, constructs a new history by appending all announced operations to
the previous history, and tries to win with that history in phase max+1.
6. Returns to step 2 if its operation doesn't make it into the winning history.
‫ ﺱ‬This terminates because even if process p doesn't get its operation into the
winning history, eventually some other process will pick up the
announced operation and include it.
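The six steps can be sketched as follows. This is a simplified model rather than a faithful wait-free implementation: the Consensus object is lock-based, the number of phases is bounded by a hypothetical max_phases parameter, phase 0 is pre-decided with the empty history, and all class and function names are assumptions. The simulated object is described by a function apply_op(state, op) returning (new_state, result).

```python
import threading

class Consensus:
    """One-shot consensus object; the lock models the assumed n-process primitive."""
    def __init__(self):
        self._lock = threading.Lock()
        self._decided = False
        self._value = None

    def decide(self, proposal):
        with self._lock:
            if not self._decided:
                self._decided, self._value = True, proposal
            return self._value

class Universal:
    def __init__(self, n, initial_state, apply_op, max_phases=1000):
        self.initial_state = initial_state
        self.apply_op = apply_op
        self.announce = [None] * n        # one register per process
        self.last_phase = [0] * n         # last phase each process saw decided
        self.seq = [0] * n
        self.phases = [Consensus() for _ in range(max_phases)]
        self.phases[0].decide(())         # phase 0: the empty history

    def invoke(self, p, op):
        self.seq[p] += 1
        tagged = (p, self.seq[p], op)     # tag distinguishes similar operations
        self.announce[p] = tagged                         # step 1
        while True:
            k = max(self.last_phase)                      # step 2
            history = self.phases[k].decide(())           # step 3 (phase k is already decided)
            if tagged in history:                         # step 4
                return self._result(history, tagged)
            pending = [a for a in self.announce
                       if a is not None and a not in history]
            self.phases[k + 1].decide(history + tuple(pending))   # step 5
            self.last_phase[p] = k + 1                    # step 6: go back to step 2

    def _result(self, history, tagged):
        """Replay the winning history and return the result of `tagged`."""
        state = self.initial_state
        for entry in history:
            state, result = self.apply_op(state, entry[2])
            if entry == tagged:
                return result

def fetch_and_inc(state, op):
    return state + 1, state               # new state, value returned to the caller

def run(n=4, ops_each=10):
    u = Universal(n, 0, fetch_and_inc)
    results = [[] for _ in range(n)]

    def proc(p):
        for _ in range(ops_each):
            results[p].append(u.invoke(p, "inc"))

    threads = [threading.Thread(target=proc, args=(p,)) for p in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(r for rs in results for r in rs)

all_results = run()
```

With a fetch&inc counter as the simulated object, each of the 40 invocations receives a distinct value from 0 to 39, as expected from a single agreed history.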
Summary
‫ ﺱ‬What have we learned?
‫ ﺱ‬To make a tree you need a seed, to make a seed you need a tree... this we already
knew... now we also know that
‫ ﺱ‬To make a queue, a stack or a list, one or more atomic registers are not enough.
‫ ﺱ‬To make a queue in a system of 2 processes you need one or more stacks. Is it true that,
however many stacks I take, I still cannot make a queue in a system
with more than two processes? Nobody has ever proved it...
‫ ﺱ‬To make a compare&swap, not only are atomic registers of no use, but
neither are queues...
‫ ﺱ‬To make a compare&swap you need an object with infinite consensus number... consensus
implemented among an unknown number of processes will do.
Thank you for your attention!