Concurrent Objects
Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit

Concurrent Computation
[Figure: memory holding objects]

Objectivism
• What is a concurrent object?
  – How do we describe one?
  – How do we implement one?
  – How do we tell if we’re right?

Objectivism
• What is a concurrent object?
  – How do we describe one?
  – How do we tell if we’re right?

FIFO Queue: Enqueue Method
[Figure: q.enq( ) adds an item to the queue]

FIFO Queue: Dequeue Method
[Figure: q.deq()/ removes an item from the queue]

Lock-Based Queue
[Figure: circular array with slots 0 to 7 (capacity = 8) holding items x and y, with head and tail pointers]
• Fields protected by single shared lock
• Initially: head = tail

A Lock-Based Queue
    class LockBasedQueue<T> {
      int head, tail;
      T[] items;
      Lock lock;
      public LockBasedQueue(int capacity) {
        head = 0; tail = 0;
        lock = new ReentrantLock();
        items = (T[]) new Object[capacity];
      }
• Fields protected by single shared lock
• Initially head = tail

Lock-Based deq()
[Figure sequence: the dequeuer acquires the lock ("My turn …") while another thread is waiting to enqueue; it checks that head and tail are not equal, removes x and advances head, then releases the lock so the waiting thread can proceed ("My turn!")]

Implementation: deq()
    public T deq() throws EmptyException {
      lock.lock();
      try {
        if (tail == head)
          throw new EmptyException();
        T x = items[head % items.length];
        head++;
        return x;
      } finally {
        lock.unlock();
      }
    }
• Acquire lock at method start
• If queue empty throw exception
• Queue not empty? Remove item and update head
• Return result
• Release lock no matter what!
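The slides show only the constructor and deq() of the lock-based queue. A minimal runnable sketch of the whole class might look like the following. The enq() method is not on the slides and is written here as an assumption that mirrors deq(); the two exception classes are also assumed, and are made unchecked only to keep the demo short.

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Assumed helper exceptions (unchecked here only to keep the sketch short).
class EmptyException extends RuntimeException {}
class FullException extends RuntimeException {}

class LockBasedQueue<T> {
    private int head = 0, tail = 0;                 // both protected by lock
    private final T[] items;
    private final Lock lock = new ReentrantLock();

    @SuppressWarnings("unchecked")
    LockBasedQueue(int capacity) {
        items = (T[]) new Object[capacity];
    }

    // Not shown on the slides: an assumed enq() that mirrors deq().
    void enq(T x) {
        lock.lock();                                // acquire lock at method start
        try {
            if (tail - head == items.length)
                throw new FullException();
            items[tail % items.length] = x;
            tail++;
        } finally {
            lock.unlock();                          // release lock no matter what
        }
    }

    // Same logic as the deq() on the slides.
    T deq() {
        lock.lock();
        try {
            if (tail == head)
                throw new EmptyException();
            T x = items[head % items.length];
            head++;
            return x;
        } finally {
            lock.unlock();
        }
    }
}

class LockBasedQueueDemo {
    public static void main(String[] args) {
        LockBasedQueue<String> q = new LockBasedQueue<>(2);
        q.enq("x");
        q.enq("y");
        System.out.println(q.deq() + " " + q.deq()); // prints "x y"
    }
}
```

Because every access to head, tail and items happens while holding the one lock, any number of threads may call enq() and deq() on the same instance.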
Now consider the following implementation
• The same thing without mutual exclusion
• For simplicity, only two threads
  – One thread enq only
  – The other deq only

Wait-free 2-Thread Queue
[Figure sequence: circular array with capacity = 8 holding x and y. One thread calls deq(), reads result = x, then performs head++. The other calls enq(z), writes queue[tail] = z, then performs tail++]

Wait-free 2-Thread Queue
    public class WaitFreeQueue<T> {
      int head = 0, tail = 0;
      T[] items;
      public WaitFreeQueue(int capacity) {
        items = (T[]) new Object[capacity];
      }
      public void enq(T x) throws FullException {
        if (tail - head == items.length) throw new FullException();
        items[tail % items.length] = x; tail++;
      }
      public T deq() throws EmptyException {
        if (tail == head) throw new EmptyException();
        T item = items[head % items.length]; head++;
        return item;
      }
    }
• No lock needed!

What is a Concurrent Queue?
• Need a way to specify a concurrent queue object
• Need a way to prove that an algorithm implements the object’s specification
• Let’s talk about object specifications …

Correctness and Progress
• In a concurrent setting, we need to specify both the safety and the liveness properties of an object
• Need a way to define
  – when an implementation is correct
  – the conditions under which it guarantees progress
• Let’s begin with correctness

Sequential Objects
• Each object has a state
  – Usually given by a set of fields
  – Queue example: sequence of items
• Each object has a set of methods
  – Only way to manipulate state
  – Queue example: enq and deq methods

Sequential Specifications
• If (precondition)
  – the object is in such-and-such a state
  – before you call the method,
• Then (postcondition)
  – the method will return a particular value
  – or throw a particular exception.
• And (postcondition, cont’d)
  – the object will be in some other state
  – when the method returns.

Pre and Postconditions for Dequeue
• Precondition:
  – Queue is non-empty
• Postcondition:
  – Returns first item in queue
• Postcondition:
  – Removes first item in queue

Pre and Postconditions for Dequeue
• Precondition:
  – Queue is empty
• Postcondition:
  – Throws Empty exception
• Postcondition:
  – Queue state unchanged

Why Sequential Specifications Totally Rock
• Interactions among methods captured by side-effects on object state
  – State meaningful between method calls
• Documentation size linear in number of methods
  – Each method described in isolation
• Can add new methods
  – Without changing descriptions of old methods

What About Concurrent Specifications?
• Methods?
• Documentation?
• Adding new methods?
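The two pre/postcondition pairs for dequeue given earlier can be written down as executable checks on a purely sequential queue. The sketch below is only an illustration: the class name SpecCheckedQueue, the contents() helper and the unchecked EmptyException are assumptions, not part of the slides.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;

// Assumed helper exception (unchecked to keep the sketch short).
class EmptyException extends RuntimeException {}

// Hypothetical sequential FIFO queue whose deq() checks its own postconditions.
class SpecCheckedQueue<T> {
    private final ArrayDeque<T> state = new ArrayDeque<>();

    void enq(T x) { state.addLast(x); }

    T deq() {
        List<T> before = new ArrayList<T>(state);   // snapshot of the pre-state
        if (before.isEmpty()) {
            // Precondition: queue is empty.
            // Postconditions: throws Empty exception, queue state unchanged.
            throw new EmptyException();
        }
        // Precondition: queue is non-empty.
        T result = state.removeFirst();
        // Postcondition: returns the first item of the pre-state.
        if (result != before.get(0))
            throw new AssertionError("wrong item returned");
        // Postcondition: the new state is the pre-state minus its first item.
        if (!new ArrayList<T>(state).equals(before.subList(1, before.size())))
            throw new AssertionError("wrong post-state");
        return result;
    }

    // Copy of the current state, for inspection.
    List<T> contents() { return new ArrayList<T>(state); }
}
```

Each method is described against the state before and after a single call, which only makes sense when calls do not overlap.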
Methods Take Time
[Timeline figure: the method call q.enq(...) drawn as an interval from its invocation at 12:00 to its response (void) at 12:01]

Sequential vs Concurrent
• Sequential
  – Methods take time? Who knew?
• Concurrent
  – Method call is not an event
  – Method call is an interval.

Concurrent Methods Take Overlapping Time
[Timeline figure: three method calls drawn as intervals along the time axis]

Sequential vs Concurrent
• Sequential:
  – Object needs meaningful state only between method calls
• Concurrent
  – Because method calls overlap, object might never be between method calls

Sequential vs Concurrent
• Sequential:
  – Each method described in isolation
• Concurrent
  – Must characterize all possible interactions with concurrent calls
    • What if two enq() calls overlap?
    • Two deq() calls? enq() and deq()?
…

Sequential vs Concurrent
• Sequential:
  – Can add new methods without affecting older methods
• Concurrent:
  – Everything can potentially interact with everything else

The Big Question
• What does it mean for a concurrent object to be correct?
  – What is a concurrent FIFO queue?
  – FIFO means strict temporal order
  – Concurrent means ambiguous temporal order

Intuitively…
    public T deq() throws EmptyException {
      lock.lock();
      try {
        if (tail == head)
          throw new EmptyException();
        T x = items[head % items.length];
        head++;
        return x;
      } finally {
        lock.unlock();
      }
    }
• All queue modifications are mutually exclusive

Intuitively
• Let’s capture the idea of describing the concurrent via the sequential
[Timeline figure: q.deq and q.enq overlap in time, but each performs its deq or enq step between its own lock() and unlock(), so the steps occur one after the other]
• Behavior is “Sequential”

Linearizability
• Each method should
  – “take effect”
  – Instantaneously
  – Between invocation and response events
• Object is correct if this “sequential” behavior is correct
• Any such concurrent object is
  – Linearizable™

Is it really about the object?
• Each method should
  – “take effect”
  – Instantaneously
  – Between invocation and response events
• Sounds like a property of an execution…
• A linearizable object: one all of whose possible executions are linearizable

Example
[Timeline figure, built up step by step: q.enq(x), q.enq(y), q.deq(x), q.deq(y)]

Example
[Timeline figure, built up step by step: q.enq(x), q.deq(y), q.enq(y)]

Example
[Timeline figure, built up step by step: q.enq(x), q.deq(x)]

Example
[Timeline figure, built up step by step: q.enq(x), q.enq(y), q.deq(y), q.deq(x); annotated “Comme ci” and “Comme ça”]

Read/Write Register Example
[Timeline figure: write(0), write(1), write(2), read(1), read(0)]
Read/Write Register Example
[Timeline figure: write(0), write(1), write(2), read(1), read(0); annotated “write(1) already happened”]

Read/Write Register Example
[Timeline figure: write(0), write(1), write(2), read(1), read(1); annotated “write(1) already happened”]

Read/Write Register Example
[Timeline figure: write(0), write(1), write(2), read(1)]

Talking About Executions
• Why?
  – Can’t we specify the linearization point of each operation without describing an execution?
• Not always
  – In some cases, linearization point depends on the execution

Formal Model of Executions
• Define precisely what we mean
  – Ambiguity is bad when intuition is weak
• Allow reasoning
  – Formal
  – But mostly informal
    • In the long run, actually more important
    • Ask me why!
Split Method Calls into Two Events
• Invocation
  – method name & args
  – q.enq(x)
• Response
  – result or exception
  – q.enq(x) returns void
  – q.deq() returns x
  – q.deq() throws empty

Invocation Notation
    A q.enq(x)
• A: thread
• q: object
• enq: method
• (x): arguments

Response Notation
    A q: void
• A: thread
• q: object
• void: result

Response Notation
    A q: empty()
• A: thread
• q: object
• empty(): exception

History - Describing an Execution
• Sequence of invocations and responses
    H = A q.enq(3)
        A q:void
        A q.enq(5)
        B p.enq(4)
        B p:void
        B q.deq()
        B q:3

Definition
• Invocation & response match if
  – Thread names agree
  – Object names agree
• Example: A q.enq(3) and A q:void match; together they form a method call

Object Projections
    H   = A q.enq(3)
          A q:void
          B p.enq(4)
          B p:void
          B q.deq()
          B q:3
    H|q = A q.enq(3)
          A q:void
          B q.deq()
          B q:3

Thread Projections
    H|B = B p.enq(4)
          B p:void
          B q.deq()
          B q:3

Complete Subhistory
    H = A q.enq(3)
        A q:void
        A q.enq(5)
        B p.enq(4)
        B p:void
        B q.deq()
        B q:3
• An invocation is pending if it has no matching response (here, A q.enq(5))
• May or may not have taken effect
• Discard pending invocations
    Complete(H) = A q.enq(3)
                  A q:void
                  B p.enq(4)
                  B p:void
                  B q.deq()
                  B q:3

Sequential Histories
    A q.enq(3)
    A q:void
    B p.enq(4)
    B p:void
    B q.deq()
    B q:3
    A q.enq(5)
• Each invocation is immediately followed by its matching response (three matches here)
• Final pending invocation OK

Well-Formed Histories
• Per-thread projections sequential
    H   = A q.enq(3)
          B p.enq(4)
          B p:void
          B q.deq()
          A q:void
          B q:3
    H|B = B p.enq(4)
          B p:void
          B q.deq()
          B q:3
    H|A = A q.enq(3)
          A q:void

Equivalent Histories
• Threads see the same thing in both
    H = A q.enq(3)
        B p.enq(4)
        B p:void
        B q.deq()
        A q:void
        B q:3
    G = A q.enq(3)
        A q:void
        B p.enq(4)
        B p:void
        B q.deq()
        B q:3
• H|A = G|A
• H|B = G|B

Sequential Specifications
• A sequential specification is some way of telling whether a
  – Single-thread, single-object history
  – Is legal
• For example:
  – Pre and post-conditions
  – But plenty of other techniques exist …

Legal Histories
• A sequential (multi-object) history H is legal if
  – For every object x
  – H|x is in the sequential spec for x

Precedence
    A q.enq(3)
    B p.enq(4)
    B p:void
    A q:void
    B q.deq()
    B q:3
• A method call precedes another if response event precedes invocation event

Non-Precedence
    A q.enq(3)
    B p.enq(4)
    B p:void
    B q.deq()
    A q:void
    B q:3
• Some method calls overlap one another

Notation
• Given
  – History H
  – method executions m0 and m1 in H
• We say $m_0 \rightarrow_H m_1$, if
  – m0 precedes m1
• Relation $m_0 \rightarrow_H m_1$ is a
  – Partial order
  – Total order if H is sequential

Linearizability
• History H is linearizable if it can be extended to G by
  – Appending zero or more responses to pending invocations
  – Discarding other pending invocations
• So that G is equivalent to
  – Legal sequential history S
  – where $\rightarrow_G \subset \rightarrow_S$

Remarks
• Some pending invocations
  – Took effect, so keep them
  – Discard the rest
• Condition $\rightarrow_G \subset \rightarrow_S$
  – Means that S respects “real-time order” of G

Ensuring $\rightarrow_G \subset \rightarrow_S$
• $\rightarrow_G = \{a \rightarrow c,\ b \rightarrow c\}$
• $\rightarrow_S = \{a \rightarrow b,\ a \rightarrow c,\ b \rightarrow c\}$
[Timeline figure: method calls a, b and c in G and their order in S]

Example
    H = A q.enq(3)
        B q.enq(4)
        B q:void
        B q.deq()
        B q:4
        B q.enq(6)
[Timeline figure: A q.enq(3), B q.enq(4), B q.deq(4), B q.enq(6)]
• Complete this pending invocation: A q.enq(3), by appending A q:void
• Discard this one: B q.enq(6)
    G = A q.enq(3)
        B q.enq(4)
        B q:void
        B q.deq()
        B q:4
        A q:void
• Equivalent sequential history
    S = B q.enq(4)
        B q:void
        A q.enq(3)
        A q:void
        B q.deq()
        B q:4

Concurrency
• How much concurrency does linearizability allow?
• When must a method invocation block?

Concurrency
• Focus on total methods
  – Defined in every state
• Example:
  – deq() that throws Empty exception
  – Versus deq() that waits …
• Why?
  – Otherwise, blocking unrelated to synchronization

Concurrency
• Question: When does linearizability require a method invocation to block?
• Answer: never.
• Linearizability is non-blocking

Non-Blocking Theorem
If method invocation A q.inv(…) is pending in history H, then there exists a response A q:res(…) such that H + A q:res(…) is linearizable

Proof
• Pick linearization S of H
• If S already contains
  – Invocation A q.inv(…) and response,
  – Then we are done.
• Otherwise, pick a response such that
  – S + A q.inv(…) + A q:res(…) is legal
  – Possible because object is total.

Composability Theorem
• History H is linearizable if and only if
  – For every object x
  – H|x is linearizable
• We care about objects only!
  – (Materialism?)

Why Does Composability Matter?
• Modularity
• Can prove linearizability of objects in isolation
• Can compose independently-implemented objects

Reasoning About Linearizability: Locking
    public T deq() throws EmptyException {
      lock.lock();
      try {
        if (tail == head)
          throw new EmptyException();
        T x = items[head % items.length];
        head++;
        return x;
      } finally {
        lock.unlock();
      }
    }
• Linearization points are when locks are released

More Reasoning: Wait-free
    public class WaitFreeQueue<T> {
      int head = 0, tail = 0;
      T[] items;
      public WaitFreeQueue(int capacity) {
        items = (T[]) new Object[capacity];
      }
      public void enq(T x) throws FullException {
        if (tail - head == items.length) throw new FullException();
        items[tail % items.length] = x; tail++;
      }
      public T deq() throws EmptyException {
        if (tail == head) throw new EmptyException();
        T item = items[head % items.length]; head++;
        return item;
      }
    }
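The two-thread queue above can be exercised with one enqueuing and one dequeuing thread. The following is a sketch under stated assumptions, not the slides' code: head and tail are declared volatile (an addition made here so that the Java memory model makes each thread's updates visible to the other), the exceptions are assumed to be unchecked, and the WaitFreeQueueDemo harness is hypothetical.

```java
// Assumed helper exceptions (unchecked to keep the sketch short).
class EmptyException extends RuntimeException {}
class FullException extends RuntimeException {}

// Correct only for ONE enqueuing thread and ONE dequeuing thread.
class WaitFreeQueue<T> {
    // volatile is an assumption added for this sketch.
    private volatile int head = 0, tail = 0;
    private final T[] items;

    @SuppressWarnings("unchecked")
    WaitFreeQueue(int capacity) {
        items = (T[]) new Object[capacity];
    }

    void enq(T x) {
        if (tail - head == items.length) throw new FullException();
        items[tail % items.length] = x;
        tail++;                       // only the enqueuer ever writes tail
    }

    T deq() {
        if (tail == head) throw new EmptyException();
        T item = items[head % items.length];
        head++;                       // only the dequeuer ever writes head
        return item;
    }
}

class WaitFreeQueueDemo {
    // Returns true if n items sent by one thread arrive in FIFO order at the other.
    static boolean transfersInOrder(int n) {
        WaitFreeQueue<Integer> q = new WaitFreeQueue<>(8);
        boolean[] inOrder = {true};
        Thread producer = new Thread(() -> {
            for (int i = 0; i < n; ) {
                try { q.enq(i); i++; }
                catch (FullException e) { Thread.yield(); }   // queue full: retry
            }
        });
        Thread consumer = new Thread(() -> {
            for (int i = 0; i < n; ) {
                try {
                    if (q.deq() != i) inOrder[0] = false;
                    i++;
                } catch (EmptyException e) { Thread.yield(); } // queue empty: retry
            }
        });
        producer.start();
        consumer.start();
        try {
            producer.join();
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return inOrder[0];
    }

    public static void main(String[] args) {
        System.out.println(transfersInOrder(10_000));
    }
}
```

Neither method ever waits for the other thread: each call finishes in a bounded number of its own steps, returning or throwing. The retry loops live in the demo harness, not in the queue.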
More Reasoning: Wait-free
[Same WaitFreeQueue code as on the previous slide]
• Linearization order is the order in which the head and tail fields are modified

Strategy
• Identify one atomic step where method “happens”
  – Critical section
  – Machine instruction
• Doesn’t always work
  – Might need to define several different steps for a given method

Linearizability: Summary
• Powerful specification tool for shared objects
• Allows us to capture the notion of objects being “atomic”
• Don’t leave home without it

Alternative: Sequential Consistency
• History H is Sequentially Consistent if it can be extended to G by
  – Appending zero or more responses to pending invocations
  – Discarding other pending invocations
• So that G is equivalent to a
  – Legal sequential history S
• Differs from linearizability: the condition $\rightarrow_G \subset \rightarrow_S$ is dropped

Sequential Consistency
• No need to preserve real-time order
  – Cannot re-order operations done by the same thread
  – Can re-order non-overlapping operations done by different threads
• Often used to describe multiprocessor memory architectures

Example
[Timeline figure, built up step by step: q.enq(x), q.deq(y), q.enq(y)]
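The only difference between the two definitions is whether the sequential history must respect real-time order across threads. For tiny histories this can be checked by brute force. The sketch below is an illustration only: the HistoryChecker class, its Call record and the numeric invocation/response times are all assumptions, and it handles only complete histories of enq and value-returning deq calls on a single FIFO queue.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.List;

class HistoryChecker {
    // One completed call on a single FIFO queue, seen as the interval [inv, res].
    static final class Call {
        final String thread, value;
        final boolean isEnq;
        final int inv, res;
        Call(String thread, boolean isEnq, String value, int inv, int res) {
            this.thread = thread; this.isEnq = isEnq; this.value = value;
            this.inv = inv; this.res = res;
        }
    }
    static Call enq(String t, String v, int inv, int res) { return new Call(t, true, v, inv, res); }
    static Call deq(String t, String v, int inv, int res) { return new Call(t, false, v, inv, res); }

    static boolean linearizable(List<Call> h) {
        return search(h, new boolean[h.size()], new ArrayDeque<String>(), true);
    }
    static boolean sequentiallyConsistent(List<Call> h) {
        return search(h, new boolean[h.size()], new ArrayDeque<String>(), false);
    }

    // Backtracking search for a legal sequential order of all calls.
    private static boolean search(List<Call> h, boolean[] used,
                                  ArrayDeque<String> queue, boolean realTime) {
        boolean allUsed = true;
        for (int i = 0; i < h.size(); i++) {
            if (used[i]) continue;
            allUsed = false;
            Call c = h.get(i);
            if (!mayGoNext(h, used, c, realTime)) continue;
            if (c.isEnq) {
                used[i] = true; queue.addLast(c.value);
                if (search(h, used, queue, realTime)) return true;
                queue.removeLast(); used[i] = false;
            } else if (c.value.equals(queue.peekFirst())) {   // deq must return the head
                used[i] = true; String v = queue.removeFirst();
                if (search(h, used, queue, realTime)) return true;
                queue.addFirst(v); used[i] = false;
            }
        }
        return allUsed;
    }

    // c may be placed next only if no still-unplaced call is required to precede it.
    private static boolean mayGoNext(List<Call> h, boolean[] used, Call c, boolean realTime) {
        for (int j = 0; j < h.size(); j++) {
            Call d = h.get(j);
            if (used[j] || d == c) continue;
            boolean precedes = d.res < c.inv;     // d's response before c's invocation
            boolean sameThread = d.thread.equals(c.thread);
            if (precedes && (realTime || sameThread)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // Assumed timing: A's enq(x) finishes before B's enq(y) starts; A then dequeues y.
        List<Call> h = Arrays.asList(enq("A", "x", 0, 1), enq("B", "y", 2, 3), deq("A", "y", 4, 5));
        System.out.println(linearizable(h) + " " + sequentiallyConsistent(h)); // prints "false true"
    }
}
```

With the assumed timing, no legal order keeps enq(x) before enq(y), so the history is not linearizable. Sequential consistency may move B's enq(y) ahead of A's enq(x), which makes deq(y) legal.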
Theorem
Sequential Consistency is not composable

FIFO Queue Example
[Timeline figure, History H: one thread performs p.enq(x), q.enq(x), p.deq(y); the other performs q.enq(y), p.enq(y), q.deq(x)]
• H|p is sequentially consistent
• H|q is sequentially consistent
[Figure sequence on the same history: the ordering imposed by p, the ordering imposed by q, the ordering imposed by both, and the result of combining the orders]

Fact
• Most hardware architectures don’t support sequential consistency
• Because they think it’s too strong
• Here’s another story …

The Flag Example
[Timeline figure: one thread performs x.write(1) then y.read(0); the other performs y.write(1) then x.read(0)]
• Each thread’s view is sequentially consistent
  – It went first
• Entire history isn’t sequentially consistent
  – Can’t both go first
• Is this behavior really so wrong?
  – We can argue either way …

Opinion: It’s Wrong
• This pattern
  – Write mine, read yours
• Is exactly the flag principle
  – Beloved of Alice and Bob
  – Heart of mutual exclusion
    • Peterson
    • Bakery, etc.
• It’s non-negotiable!

Peterson's Algorithm
    public void lock() {
      flag[i] = true;
      victim = i;
      while (flag[j] && victim == i) {};
    }
    public void unlock() {
      flag[i] = false;
    }

Crux of Peterson Proof
(1) $\mathit{write}_B(\mathit{flag}[B]=\mathit{true}) \rightarrow \mathit{write}_B(\mathit{victim}=B)$
(2) $\mathit{write}_A(\mathit{victim}=A) \rightarrow \mathit{read}_A(\mathit{flag}[B]) \rightarrow \mathit{read}_A(\mathit{victim})$
(3) $\mathit{write}_B(\mathit{victim}=B) \rightarrow \mathit{write}_A(\mathit{victim}=A)$
• Observation: proof relied on fact that if a location is stored, a later load by some thread will return this or a later stored value.

Opinion: But It Feels So Right …
• Many hardware architects think that sequential consistency is too strong
• Too expensive to implement in modern hardware
• OK if flag principle
  – Violated by default
  – Honored by explicit request

Hardware Consistency
Initially, a = b = 0.
    Processor 0              Processor 1
    mov 1, a     ;Store      mov 1, b     ;Store
    mov b, %ebx  ;Load       mov a, %eax  ;Load
• What are the final possible values of %eax and %ebx after both processors have executed?
• Sequential consistency implies that no execution ends with %eax = %ebx = 0
Slide used with permission of Charles E. Leiserson

Hardware Consistency
• No modern-day processor implements sequential consistency.
• Hardware actively reorders instructions.
• Compilers may reorder instructions, too.
• Why?
• Because most of performance is derived from a single thread’s unsynchronized execution of code.
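The two-processor test above can be mimicked in Java. This sketch assumes Java's volatile fields, whose accesses the Java memory model places in a single total order consistent with each thread's program order, so the outcome where both loads see 0 cannot occur. The class StoreBufferLitmus and the fields r0 and r1 (standing in for %ebx and %eax) are hypothetical names.

```java
class StoreBufferLitmus {
    static volatile int a, b;   // volatile: accesses are totally ordered
    static int r0, r1;          // stand-ins for %ebx and %eax

    // Runs the two-processor test `rounds` times and counts the runs
    // in which both loads returned 0.
    static int countBothZero(int rounds) {
        int bothZero = 0;
        for (int i = 0; i < rounds; i++) {
            a = 0;
            b = 0;
            Thread p0 = new Thread(() -> { a = 1; r0 = b; });   // Processor 0
            Thread p1 = new Thread(() -> { b = 1; r1 = a; });   // Processor 1
            p0.start();
            p1.start();
            try {
                p0.join();
                p1.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return -1;
            }
            if (r0 == 0 && r1 == 0) bothZero++;
        }
        return bothZero;
    }

    public static void main(String[] args) {
        System.out.println(countBothZero(500)); // prints "0"
    }
}
```

If the volatile modifier is removed, the outcome with both loads returning 0 becomes permitted, although a particular run is not guaranteed to exhibit it.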
Instruction Reordering
    Program Order            Execution Order
    mov 1, a     ;Store      mov b, %ebx  ;Load
    mov b, %ebx  ;Load       mov 1, a     ;Store
Q. Why might the hardware or compiler decide to reorder these instructions?
A. To obtain higher performance by covering load latency (instruction-level parallelism).
Slide used with permission of Charles E. Leiserson

Instruction Reordering
    Program Order            Execution Order
    mov 1, a     ;Store      mov b, %ebx  ;Load
    mov b, %ebx  ;Load       mov 1, a     ;Store
Q. When is it safe for the hardware or compiler to perform this reordering?
A. When a ≠ b.
A′. And there’s no concurrency.
Slide used with permission of Charles E. Leiserson

Hardware Reordering
[Figure: processor connected through a network to the memory system, with a store buffer and a load bypass]
• Processor can issue stores faster than the network can handle them ⇒ store buffer.
• Loads take priority, bypassing the store buffer.
• Except if a load address matches an address in the store buffer, the store buffer returns the result.
Slide used with permission of Charles E. Leiserson

X86: Memory Consistency
Thread’s Code: Store1, Store2, Load1, Load2, Store3, Store4, Load3, Load4, Load5
1. Loads are not reordered with loads.
2. Stores are not reordered with stores.
3. Stores are not reordered with prior loads.
4. A load may be reordered with a prior store to a different location but not with a prior store to the same location.
5. Stores to the same location respect a global total order.
• Total Store Ordering (TSO)… weaker than sequential consistency
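Rule 4 is exactly the reordering that breaks "write mine, read yours", so Peterson's algorithm needs its flag principle honored by explicit request. In Java that request can be made with volatile accesses. The sketch below is an assumption-laden illustration rather than the book's code: the thread id is passed explicitly as a parameter, the flags are held in an AtomicIntegerArray (whose get and set have volatile semantics), and the PetersonDemo harness is hypothetical.

```java
import java.util.concurrent.atomic.AtomicIntegerArray;

// Two-thread Peterson lock; thread ids are 0 and 1.
class PetersonLock {
    private final AtomicIntegerArray flag = new AtomicIntegerArray(2); // 1 = interested
    private volatile int victim;

    void lock(int i) {
        int j = 1 - i;
        flag.set(i, 1);                 // write mine
        victim = i;
        // read yours: spin while the other thread is interested and I am the victim
        while (flag.get(j) == 1 && victim == i) Thread.yield();
    }

    void unlock(int i) {
        flag.set(i, 0);
    }
}

class PetersonDemo {
    static int counter;                 // plain field, protected only by the lock

    // Two threads each increment the counter perThread times under the lock.
    static int run(int perThread) {
        PetersonLock lock = new PetersonLock();
        counter = 0;
        Thread[] ts = new Thread[2];
        for (int id = 0; id < 2; id++) {
            final int me = id;
            ts[id] = new Thread(() -> {
                for (int k = 0; k < perThread; k++) {
                    lock.lock(me);
                    counter++;          // critical section
                    lock.unlock(me);
                }
            });
        }
        ts[0].start();
        ts[1].start();
        try {
            ts[0].join();
            ts[1].join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return -1;
        }
        return counter;
    }

    public static void main(String[] args) {
        System.out.println(run(20_000)); // prints "40000"
    }
}
```

With plain (non-volatile) fields for flag and victim, neither the hardware nor the compiler is obliged to keep the store to flag ahead of the load of the other thread's flag, and mutual exclusion can fail.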
Memory Barriers (Fences)
• A memory barrier (or memory fence) is a hardware action that enforces an ordering constraint between the instructions before and after the fence.
• A memory barrier can be issued explicitly as an instruction (x86: mfence)
• The typical cost of a memory fence is comparable to that of an L2-cache access.

X86: Memory Consistency
Thread’s Code: Store1, Store2, Load1, Load2, Store3, Store4, Barrier, Load3, Load4, Load5
• The five x86 ordering rules are unchanged
• Total Store Ordering + properly placed memory barriers = sequential consistency

Memory Barriers
• Explicit Synchronization
• Memory barrier will
  – Flush write buffer
  – Bring caches up to date
• Compilers often do this for you
  – Entering and leaving critical sections

Volatile Variables
• In Java, can ask compiler to keep a variable up-to-date by declaring it volatile
• Adds a memory barrier after each store
• Inhibits reordering, removing from loops, & other “compiler optimizations”
• Will talk about it in detail in later lectures

Summary: Real-World
• Hardware weaker than sequential consistency
• Can get sequential consistency at a price
• Linearizability better fit for high-level software

Linearizability
• Linearizability
  – Operation takes effect instantaneously between invocation and response
  – Uses sequential specification, locality implies composability

Summary: Correctness
• Sequential Consistency
  – Not composable
  – Harder to work with
  – Good way to think about hardware models
• We will use linearizability as our
consistency condition in the remainder of this course unless stated otherwise

Progress
• We saw an implementation whose methods were lock-based (deadlock-free)
• We saw an implementation whose methods did not use locks (lock-free)
• How do they relate?

Progress Conditions
• Deadlock-free: some thread trying to acquire the lock eventually succeeds.
• Starvation-free: every thread trying to acquire the lock eventually succeeds.
• Lock-free: some thread calling a method eventually returns.
• Wait-free: every thread calling a method eventually returns.

Progress Conditions
                               Non-Blocking    Blocking
    Everyone makes progress    Wait-free       Starvation-free
    Someone makes progress     Lock-free       Deadlock-free

Summary
• We will look at linearizable blocking and non-blocking implementations of objects.

This work is licensed under a Creative Commons Attribution-ShareAlike 2.5 License.
• You are free:
  – to Share: to copy, distribute and transmit the work
  – to Remix: to adapt the work
• Under the following conditions:
  – Attribution. You must attribute the work to “The Art of Multiprocessor Programming” (but not in any way that suggests that the authors endorse you or your use of the work).
  – Share Alike. If you alter, transform, or build upon this work, you may distribute the resulting work only under the same, similar or a compatible license.
• For any reuse or distribution, you must make clear to others the license terms of this work. The best way to do this is with a link to
  – http://creativecommons.org/licenses/by-sa/3.0/.
• Any of the above conditions can be waived if you get permission from the copyright holder.
• Nothing in this license impairs or restricts the author's moral rights.