Art of Multiprocessor Programming

advertisement
Concurrent Queues and Stacks
Companion slides for
The Art of Multiprocessor Programming
by Maurice Herlihy & Nir Shavit
The Five-Fold Path
•
•
•
•
•
Coarse-grained locking
Fine-grained locking
Optimistic synchronization
Lazy synchronization
Lock-free synchronization
Art of Multiprocessor Programming
2
Another Fundamental Problem
• We told you about
– Sets implemented by linked lists
• Next: queues
• Next: stacks
Art of Multiprocessor Programming
3
Queues & Stacks
• pool of items
Art of Multiprocessor Programming
4
Queues
deq()/
enq(
)
Total order
First in
First out
Art of Multiprocessor Programming
5
Stacks
pop()/
push(
Total order
Last in
First out
)
Art of Multiprocessor Programming
6
Bounded
• Fixed capacity
• Good when resources an issue
Art of Multiprocessor Programming
7
Unbounded
• Unlimited capacity
• Often more convenient
Art of Multiprocessor Programming
8
Blocking
zzz …
Block on attempt to remove
from empty stack or queue
Art of Multiprocessor Programming
9
Blocking
zzz …
Block on attempt to add to full
bounded stack or queue
Art of Multiprocessor Programming
10
Non-Blocking
Ouch!
Throw exception on attempt to
remove from empty stack or queue
Art of Multiprocessor Programming
11
This Lecture
• Queue
– Bounded, blocking, lock-based
– Unbounded, non-blocking, lock-free
• Stack
– Unbounded, non-blocking lock-free
– Elimination-backoff algorithm
Art of Multiprocessor Programming
12
Queue: Concurrency
enq(x)
tail
head
y=deq()
enq() and deq()
work at different
ends of the object
Art of Multiprocessor Programming
13
Concurrency
y=deq()
enq(x)
Challenge: what if
the queue is empty
or full?
Art of Multiprocessor Programming
14
Bounded Queue
head
tail
Sentinel
Art of Multiprocessor Programming
15
Bounded Queue
head
tail
First actual item
Art of Multiprocessor Programming
16
Bounded Queue
head
tail
deqLock
Lock out other
deq() calls
Art of Multiprocessor Programming
17
Bounded Queue
head
tail
deqLock
enqLock
Lock out other
enq() calls
Art of Multiprocessor Programming
18
Not Done Yet
head
tail
deqLock
enqLock
Need to tell whether
queue is full or
empty
Art of Multiprocessor Programming
19
Not Done Yet
head
tail
deqLock
enqLock
size
1
Max size is 8 items
Art of Multiprocessor Programming
20
Not Done Yet
head
tail
deqLock
enqLock
size
1
Incremented by enq()
Decremented by deq()
Art of Multiprocessor Programming
21
Enqueuer
head
tail
deqLock
enqLock
Lock enqLock
size
1
Art of Multiprocessor Programming
22
Enqueuer
head
tail
deqLock
Read size
enqLock
size
1
OK
Art of Multiprocessor Programming
23
Enqueuer
head
tail
deqLock
No need to
lock tail
enqLock
size
1
Art of Multiprocessor Programming
24
Enqueuer
head
tail
deqLock
enqLock
Enqueue Node
size
1
Art of Multiprocessor Programming
25
Enqueuer
head
tail
deqLock
enqLock
size
12
getAndincrement()
Art of Multiprocessor Programming
26
Enqueuer
head
tail
deqLock
enqLock
size
Release lock
8
2
Art of Multiprocessor Programming
27
Enqueuer
head
tail
deqLock
enqLock
If queue was empty,
notify waiting dequeuers
size
2
Art of Multiprocessor Programming
28
Unsuccesful Enqueuer
…
head
tail
deqLock
Read size
enqLock
size
8
Uh-oh
Art of Multiprocessor Programming
29
Dequeuer
head
tail
deqLock
enqLock
Lock deqLock
size
2
Art of Multiprocessor Programming
30
Dequeuer
head
tail
deqLock
enqLock
size
Read sentinel’s
next field
2
OK
Art of Multiprocessor Programming
31
Dequeuer
head
tail
deqLock
enqLock
Read value
size
2
Art of Multiprocessor Programming
32
Dequeuer
Make first Node
new sentinel
head
tail
deqLock
enqLock
size
2
Art of Multiprocessor Programming
33
Dequeuer
head
tail
deqLock
enqLock
size
Decrement
size
1
Art of Multiprocessor Programming
34
Dequeuer
head
tail
deqLock
enqLock
size
1
Release
deqLock
Art of Multiprocessor Programming
35
Unsuccesful Dequeuer
head
tail
deqLock
enqLock
size
Read sentinel’s
next field
0
uh-oh
Art of Multiprocessor Programming
36
Bounded Queue
public class BoundedQueue<T> {
ReentrantLock enqLock, deqLock;
Condition notEmptyCondition, notFullCondition;
AtomicInteger size;
Node head;
Node tail;
int capacity;
enqLock = new ReentrantLock();
notFullCondition = enqLock.newCondition();
deqLock = new ReentrantLock();
notEmptyCondition = deqLock.newCondition();
}
Art of Multiprocessor Programming
37
Bounded Queue
public class BoundedQueue<T> {
ReentrantLock enqLock, deqLock;
Condition notEmptyCondition, notFullCondition;
AtomicInteger size;
Node head;
Node tail;
int capacity;
enq & deq locks
enqLock = new ReentrantLock();
notFullCondition = enqLock.newCondition();
deqLock = new ReentrantLock();
notEmptyCondition = deqLock.newCondition();
}
Art of Multiprocessor Programming
38
Digression: Monitor Locks
• Java synchronized objects and
ReentrantLocks are monitors
• Allow blocking on a condition rather
than spinning
• Threads:
– acquire and release lock
– wait on a condition
Art of Multiprocessor Programming
39
The Java Lock Interface
public interface Lock {
void lock();
void lockInterruptibly() throws InterruptedException;
boolean tryLock();
boolean tryLock(long time, TimeUnit unit);
Condition newCondition();
void unlock;
}
Acquire lock
Art of Multiprocessor Programming
40
The Java Lock Interface
public interface Lock {
void lock();
void lockInterruptibly() throws InterruptedException;
boolean tryLock();
boolean tryLock(long time, TimeUnit unit);
Condition newCondition();
void unlock;
Release lock
}
Art of Multiprocessor Programming
41
The Java Lock Interface
public interface Lock {
void lock();
void lockInterruptibly() throws InterruptedException;
boolean tryLock();
boolean tryLock(long time, TimeUnit unit);
Condition newCondition();
void unlock;
}
Try for lock, but not too hard
Art of Multiprocessor Programming
42
The Java Lock Interface
public interface Lock {
void lock();
void lockInterruptibly() throws InterruptedException;
boolean tryLock();
boolean tryLock(long time, TimeUnit unit);
Condition newCondition();
void unlock;
}
Create condition to wait on
Art of Multiprocessor Programming
43
The Java Lock Interface
public interface Lock {
void lock();
void lockInterruptibly() throws InterruptedException;
boolean tryLock();
boolean tryLock(long time, TimeUnit unit);
Condition newCondition();
void unlock;
}
Never mind what this method does
Art of Multiprocessor Programming
44
Lock Conditions
public interface Condition {
void await();
boolean await(long time, TimeUnit unit);
…
void signal();
void signalAll();
}
Art of Multiprocessor Programming
45
Lock Conditions
public interface Condition {
void await();
boolean await(long time, TimeUnit unit);
…
void signal();
void signalAll();
}
Release lock and
wait on condition
Art of Multiprocessor Programming
46
Lock Conditions
public interface Condition {
void await();
boolean await(long time, TimeUnit unit);
…
void signal();
void signalAll();
}
Wake up one waiting thread
Art of Multiprocessor Programming
47
Lock Conditions
public interface Condition {
void await();
boolean await(long time, TimeUnit unit);
…
void signal();
void signalAll();
}
Wake up all waiting threads
Art of Multiprocessor Programming
48
Await
q.await()
•
•
•
•
Releases lock associated with q
Sleeps (gives up processor)
Awakens (resumes running)
Reacquires lock & returns
Art of Multiprocessor Programming
49
Signal
q.signal();
• Awakens one waiting thread
– Which will reacquire lock
Art of Multiprocessor Programming
50
Signal All
q.signalAll();
• Awakens all waiting threads
– Which will each reacquire lock
Art of Multiprocessor Programming
51
A Monitor Lock
lock()
Critical Section
waiting room
unlock()
Art of Multiprocessor Programming
52
Unsuccessful Deq
lock()
deq()
Critical Section
waiting room
await()
Oh no,
empty!
Art of Multiprocessor Programming
53
Another One
lock()
deq()
Critical Section
waiting room
await()
Oh no,
empty!
Art of Multiprocessor Programming
54
Enqueuer to the Rescue
Yawn!
Yawn!
lock()
enq( )
Critical Section
waiting room
signalAll()
unlock()
Art of Multiprocessor Programming
55
Monitor Signalling
Yawn!
Yawn!
Critical Section
waiting room
Awakened thread
might still lose lock to
outside contender…
Art of Multiprocessor Programming
56
Dequeuers Signalled
Yawn!
Critical Section
waiting room
Found it
Art of Multiprocessor Programming
57
Dequeuers Signaled
Yawn!
Critical Section
waiting room
Still empty!
Art of Multiprocessor Programming
58
Dollar Short + Day Late
Critical Section
waiting room
Art of Multiprocessor Programming
59
Java Synchronized Methods
public class Queue<T> {
int head = 0, tail = 0;
T[QSIZE] items;
public synchronized T deq() {
while (tail – head == 0)
wait();
T result = items[head % QSIZE]; head++;
notifyAll();
return result;
}
…
}}
Art of Multiprocessor Programming
60
Java Synchronized Methods
public class Queue<T> {
int head = 0, tail = 0;
T[QSIZE] items;
public synchronized T deq() {
while (tail – head == 0)
wait();
T result = items[head % QSIZE]; head++;
notifyAll();
return result;
Each object has an implicit
}
…
lock with an implicit condition
}}
Art of Multiprocessor Programming
61
Java Synchronized Methods
public class Queue<T> {
int head = 0, tail = 0;
T[QSIZE] items;
Lock on entry,
unlock on return
public synchronized T deq() {
while (tail – head == 0)
wait();
T result = items[head % QSIZE]; head++;
notifyAll();
return result;
}
…
}}
Art of Multiprocessor Programming
62
Java Synchronized Methods
public class Queue<T> {
int head = 0, tail = 0;
T[QSIZE] items;
Wait on implicit
condition
public synchronized T deq() {
while (tail – head == 0)
wait();
T result = items[head % QSIZE]; head++;
this.notifyAll();
return result;
}
…
}}
Art of Multiprocessor Programming
63
Java Synchronized Methods
public class Queue<T> {
Signal all threads waiting
on condition
tail = 0;
int head = 0,
T[QSIZE] items;
public synchronized T deq() {
while (tail – head == 0)
this.wait();
T result = items[head % QSIZE]; head++;
notifyAll();
return result;
}
…
}}
Art of Multiprocessor Programming
64
(Pop!) The Bounded Queue
public class BoundedQueue<T> {
ReentrantLock enqLock, deqLock;
Condition notEmptyCondition, notFullCondition;
AtomicInteger size;
Node head;
Node tail;
int capacity;
enqLock = new ReentrantLock();
notFullCondition = enqLock.newCondition();
deqLock = new ReentrantLock();
notEmptyCondition = deqLock.newCondition();
}
Art of Multiprocessor Programming
65
Bounded Queue Fields
public class BoundedQueue<T> {
ReentrantLock enqLock, deqLock;
Condition notEmptyCondition, notFullCondition;
AtomicInteger size;
Node head;
Node tail;
int capacity;
Enq & deq locks
enqLock = new ReentrantLock();
notFullCondition = enqLock.newCondition();
deqLock = new ReentrantLock();
notEmptyCondition = deqLock.newCondition();
}
Art of Multiprocessor Programming
66
Bounded Queue Fields
public class BoundedQueue<T> {
ReentrantLock enqLock, deqLock;
Condition notEmptyCondition, notFullCondition;
AtomicInteger size;
Node head;
Enq lock’s associated
Node tail;
condition
int capacity;
enqLock = new ReentrantLock();
notFullCondition = enqLock.newCondition();
deqLock = new ReentrantLock();
notEmptyCondition = deqLock.newCondition();
}
Art of Multiprocessor Programming
67
Bounded Queue Fields
public class BoundedQueue<T> {
ReentrantLock enqLock, deqLock;
Condition notEmptyCondition, notFullCondition;
AtomicInteger size;
Node head;
Node tail;
size: 0 to capacity
int capacity;
enqLock = new ReentrantLock();
notFullCondition = enqLock.newCondition();
deqLock = new ReentrantLock();
notEmptyCondition = deqLock.newCondition();
}
Art of Multiprocessor Programming
68
Bounded Queue Fields
public class BoundedQueue<T> {
ReentrantLock enqLock, deqLock;
Condition notEmptyCondition, notFullCondition;
AtomicInteger size;
Head and Tail
Node head;
Node tail;
int capacity;
enqLock = new ReentrantLock();
notFullCondition = enqLock.newCondition();
deqLock = new ReentrantLock();
notEmptyCondition = deqLock.newCondition();
}
Art of Multiprocessor Programming
69
Enq Method Part One
public void enq(T x) {
boolean mustWakeDequeuers = false;
enqLock.lock();
try {
while (size.get() == Capacity)
notFullCondition.await();
Node e = new Node(x);
tail.next = e;
tail = tail.next;
if (size.getAndIncrement() == 0)
mustWakeDequeuers = true;
} finally {
enqLock.unlock();
}
…
}
Art of Multiprocessor Programming
70
Enq Method Part One
public void enq(T x) {
boolean mustWakeDequeuers = false;
enqLock.lock();
try {
Lock and unlock
while (size.get() == capacity)
enq lock
notFullCondition.await();
Node e = new Node(x);
tail.next = e;
tail = tail.next;
if (size.getAndIncrement() == 0)
mustWakeDequeuers = true;
} finally {
enqLock.unlock();
}
…
}
Art of Multiprocessor Programming
71
Enq Method Part One
public void enq(T x) {
boolean mustWakeDequeuers = false;
enqLock.lock();
try {
while (size.get() == capacity)
notFullCondition.await();
Node e = new Node(x);
tail.next = e;
tail = tail.next;
if (size.getAndIncrement() == 0)
mustWakeDequeuers = true;
} finally {
enqLock.unlock();
}
Wait
while
queue
is
…
}
Art of Multiprocessor Programming
full …
72
Enq Method Part One
public void enq(T x) {
boolean mustWakeDequeuers = false;
enqLock.lock();
try {
while (size.get() == capacity)
notFullCondition.await();
Node e = new Node(x);
tail.next = e;
tail = tail.next;
if (size.getAndIncrement() == 0)
mustWakeDequeuers = true;
} finally {
enqLock.unlock();
}
when await() returns, you
…
might still fail the test !
}
Art of Multiprocessor Programming
73
Be Afraid
public void enq(T x) {
boolean mustWakeDequeuers = false;
enqLock.lock();
try {
while (size.get() == capacity)
notFullCondition.await();
Node e = new Node(x);
tail.next = e;
tail = tail.next;
if (size.getAndIncrement() == 0)
mustWakeDequeuers = true;
} finally {
enqLock.unlock();
}
After the loop: how do we know the
…
queue won’t become full again?
}
Art of Multiprocessor Programming
74
Enq Method Part One
public void enq(T x) {
boolean mustWakeDequeuers = false;
enqLock.lock();
try {
while (size.get() == capacity)
notFullCondition.await();
Node e = new Node(x);
tail.next = e;
tail = tail.next;
if (size.getAndIncrement() == 0)
mustWakeDequeuers = true;
} finally {
enqLock.unlock();
}
…
Add new
}
Art of Multiprocessor Programming
node
75
Enq Method Part One
public void enq(T x) {
boolean mustWakeDequeuers = false;
enqLock.lock();
try {
while (size.get() == capacity)
notFullCondition.await();
Node e = new Node(x);
tail.next = e;
tail = tail.next;
if (size.getAndIncrement() == 0)
mustWakeDequeuers = true;
} finally {
enqLock.unlock();
}
If queue was empty, wake
…
frustrated dequeuers
}
Art of Multiprocessor Programming
76
Beware Lost Wake-Ups
Yawn!
lock()
enq( )
Critical Section
waiting room
Queue empty
so signal ()
unlock()
Art of Multiprocessor Programming
77
Lost Wake-Up
Yawn!
lock()
enq( )
Critical Section
waiting room
Queue not
empty so no
need to signal
unlock()
Art of Multiprocessor Programming
78
Lost Wake-Up
Yawn!
Critical Section
waiting room
Art of Multiprocessor Programming
79
Lost Wake-Up
Critical Section
waiting room
Found it
Art of Multiprocessor Programming
80
What’s Wrong Here?
Still waiting ….!
Critical Section
waiting room
Art of Multiprocessor Programming
81
Solution to Lost Wakeup
• Always use
– signalAll() and notifyAll()
• Not
– signal() and notify()
Art of Multiprocessor Programming
82
Enq Method Part Deux
public void enq(T x) {
…
if (mustWakeDequeuers) {
deqLock.lock();
try {
notEmptyCondition.signalAll();
} finally {
deqLock.unlock();
}
}
}
Art of Multiprocessor Programming
83
Enq Method Part Deux
public void enq(T x) {
…
if (mustWakeDequeuers) {
deqLock.lock();
try {
notEmptyCondition.signalAll();
} finally {
deqLock.unlock();
}
}
}Are there dequeuers to be signaled?
Art of Multiprocessor Programming
84
Enq Method Part Deux
public void enq(T x) {
Lock and
…
if (mustWakeDequeuers) { unlock deq lock
deqLock.lock();
try {
notEmptyCondition.signalAll();
} finally {
deqLock.unlock();
}
}
}
Art of Multiprocessor Programming
85
Enq Method Part Deux
public void enq(T x) {
Signal dequeuers that
…
queue
is no longer empty
if (mustWakeDequeuers)
{
deqLock.lock();
try {
notEmptyCondition.signalAll();
} finally {
deqLock.unlock();
}
}
}
Art of Multiprocessor Programming
86
The Enq() & Deq() Methods
• Share no locks
– That’s good
• But do share an atomic counter
– Accessed on every method call
– That’s not so good
• Can we alleviate this bottleneck?
Art of Multiprocessor Programming
87
Split the Counter
• The enq() method
– Increments only
– Cares only if value is capacity
• The deq() method
– Decrements only
– Cares only if value is zero
Art of Multiprocessor Programming
88
Split Counter
• Enqueuer increments enqSize
• Dequeuer decrements deqSize
• When enqueuer runs out
– Locks deqLock
– computes size = enqSize - DeqSize
• Intermittent synchronization
– Not with each method call
– Need both locks! (careful …)
Art of Multiprocessor Programming
89
A Lock-Free Queue
head
tail
Sentinel
Art of Multiprocessor Programming
90
Compare and Set
CAS
Art of Multiprocessor Programming
91
Enqueue
head
tail
enq( )
Art of Multiprocessor Programming
92
Enqueue
head
tail
Art of Multiprocessor Programming
93
Logical Enqueue
CAS
head
tail
Art of Multiprocessor Programming
94
Physical Enqueue
head
tail
CAS
Art of Multiprocessor Programming
95
Enqueue
• These two steps are not atomic
• The tail field refers to either
– Actual last Node (good)
– Penultimate Node (not so good)
• Be prepared!
Art of Multiprocessor Programming
96
Enqueue
• What do you do if you find
– A trailing tail?
• Stop and help fix it
– If tail node has non-null next field
– CAS the queue’s tail field to tail.next
• As in the universal construction
Art of Multiprocessor Programming
97
When CASs Fail
• During logical enqueue
– Abandon hope, restart
– Still lock-free (why?)
• During physical enqueue
– Ignore it (why?)
Art of Multiprocessor Programming
98
Dequeuer
head
tail
Read value
Art of Multiprocessor Programming
99
Dequeuer
Make first Node
new sentinel
CAS
head
tail
Art of Multiprocessor Programming
100
Memory Reuse?
• What do we do with nodes after we
dequeue them?
• Java: let garbage collector deal?
• Suppose there is no GC, or we prefer
not to use it?
Art of Multiprocessor Programming
101
Dequeuer
CAS
head
tail
Can recycle
Art of Multiprocessor Programming
102
Simple Solution
• Each thread has a free list of unused
queue nodes
• Allocate node: pop from list
• Free node: push onto list
• Deal with underflow somehow …
Art of Multiprocessor Programming
103
Why Recycling is Hard
head
tail
Want to
redirect
head from
gray to red
zzz…
Free pool
Art of Multiprocessor Programming
104
Both Nodes Reclaimed
head
tail
zzz
Free pool
Art of Multiprocessor Programming
105
One Node Recycled
head
tail
Yawn!
Free pool
Art of Multiprocessor Programming
106
Why Recycling is Hard
head
tail
CAS
OK, here
I go!
Free pool
Art of Multiprocessor Programming
107
Recycle FAIL
head
tail
zOMG what went wrong?
Free pool
Art of Multiprocessor Programming
108
The Dreaded ABA Problem
head
tail
Head reference has value A
Thread reads value A
Art of Multiprocessor Programming
109
Dreaded ABA continued
head
zzz
tail
Head reference has value B
Node A freed
Art of Multiprocessor Programming
110
Dreaded ABA continued
head
Yawn!
tail
Head reference has value A again
Node A recycled and reinitialized
Art of Multiprocessor Programming
111
Dreaded ABA continued
head
tail
CAS
CAS succeeds because references match,
even though reference’s meaning has changed
Art of Multiprocessor Programming
112
The Dreaded ABA FAIL
• Is a result of CAS() semantics
– I blame Sun, Intel, AMD, …
• Not with Load-Locked/Store-Conditional
– Good for IBM?
Art of Multiprocessor Programming
113
Dreaded ABA – A Solution
•
•
•
•
Tag each pointer with a counter
Unique over lifetime of node
Pointer size vs word size issues
Overflow?
– Don’t worry be happy?
– Bounded tags?
• AtomicStampedReference class
Art of Multiprocessor Programming
114
Atomic Stamped Reference
• AtomicStampedReference class
– Java.util.concurrent.atomic package
Can get reference & stamp atomically
Reference
address
S
Stamp
Art of Multiprocessor Programming
115
Concurrent Stack
• Methods
– push(x)
– pop()
• Last-in, First-out (LIFO) order
• Lock-Free!
Art of Multiprocessor Programming
116
Empty Stack
Top
Art of Multiprocessor Programming
117
Push
Top
Art of Multiprocessor Programming
118
Push
Top
CAS
Art of Multiprocessor Programming
119
Push
Top
Art of Multiprocessor Programming
120
Push
Top
Art of Multiprocessor Programming
121
Push
Top
Art of Multiprocessor Programming
122
Push
Top
CAS
Art of Multiprocessor Programming
123
Push
Top
Art of Multiprocessor Programming
124
Pop
Top
Art of Multiprocessor Programming
125
Pop
Top
CAS
Art of Multiprocessor Programming
126
Pop
Top
CAS
mine!
Art of Multiprocessor Programming
127
Pop
Top
CAS
Art of Multiprocessor Programming
128
Pop
Top
Art of Multiprocessor Programming
129
Lock-free Stack
public class LockFreeStack {
private AtomicReference top =
new AtomicReference(null);
public boolean tryPush(Node node){
Node oldTop = top.get();
node.next = oldTop;
return(top.compareAndSet(oldTop, node))
}
public void push(T value) {
Node node = new Node(value);
while (true) {
if (tryPush(node)) {
return;
} else backoff.backoff();
}}
Art of Multiprocessor Programming
130
Lock-free Stack
public class LockFreeStack {
private AtomicReference top = new
AtomicReference(null);
public Boolean tryPush(Node node){
Node oldTop = top.get();
node.next = oldTop;
return(top.compareAndSet(oldTop, node))
}
public void push(T value) {
Node node = new Node(value);
while (true) {
if (tryPush(node)) {
return;
} else backoff.backoff()
tryPush
attempts to push a node
}}
Art of Multiprocessor Programming
131
Lock-free Stack
public class LockFreeStack {
private AtomicReference top = new
AtomicReference(null);
public boolean tryPush(Node node){
Node oldTop = top.get();
node.next = oldTop;
return(top.compareAndSet(oldTop, node))
}
public void push(T value) {
Node node = new Node(value);
while (true) {
if (tryPush(node)) {
return;
Read top value
} else backoff.backoff()
}}
Art of Multiprocessor Programming
132
Lock-free Stack
public class LockFreeStack {
private AtomicReference top = new
AtomicReference(null);
public boolean tryPush(Node node){
Node oldTop = top.get();
node.next = oldTop;
return(top.compareAndSet(oldTop, node))
}
public void push(T value) {
Node node = new Node(value);
while (true) {
if (tryPush(node)) {
return;
} else top
backoff.backoff()
current
will be new node’s successor
}}
Art of Multiprocessor Programming
133
Lock-free Stack
public class LockFreeStack {
private AtomicReference top = new
AtomicReference(null);
public boolean tryPush(Node node){
Node oldTop = top.get();
node.next = oldTop;
return(top.compareAndSet(oldTop, node))
}
public void push(T value) {
Node node = new Node(value);
while (true) {
if (tryPush(node)) {
return;
else backoff.backoff()
Try to} swing
top, return success or failure
}}
Art of Multiprocessor Programming
134
Lock-free Stack
public class LockFreeStack {
private AtomicReference top = new
AtomicReference(null);
public boolean tryPush(Node node){
Node oldTop = top.get();
node.next = oldTop;
return(top.compareAndSet(oldTop, node))
}
public void push(T value) {
Node node = new Node(value);
while (true) {
if (tryPush(node)) {
return;
} else backoff.backoff()
Push calls tryPush
}}
Art of Multiprocessor Programming
135
Lock-free Stack
public class LockFreeStack {
private AtomicReference top = new
AtomicReference(null);
public boolean tryPush(Node node){
Node oldTop = top.get();
node.next = oldTop;
return(top.compareAndSet(oldTop, node))
}
public void push(T value) {
Node node = new Node(value);
while (true) {
if (tryPush(node)) {
return;
} else backoff.backoff()
Create new node
}}
Art of Multiprocessor Programming
136
Lock-free Stack
public class LockFreeStack {
private AtomicReference top = new
AtomicReference(null);
public boolean tryPush(Node node){
Node oldTop = top.get();
If tryPush() fails,
node.next = oldTop;
back off before retrying
return(top.compareAndSet(oldTop,
node))
}
public void push(T value) {
Node node = new Node(value);
while (true) {
if (tryPush(node)) {
return;
} else backoff.backoff()
}}
Art of Multiprocessor Programming
137
Lock-free Stack
• Good
– No locking
• Bad
– Without GC, fear ABA
– Without backoff, huge contention at top
– In any case, no parallelism
Art of Multiprocessor Programming
138
Big Question
• Are stacks inherently sequential?
• Reasons why
– Every pop() call fights for top item
• Reasons why not
– Stay tuned …
Art of Multiprocessor Programming
139
Elimination-Backoff Stack
• How to
– “turn contention into parallelism”
• Replace familiar
– exponential backoff
• With alternative
– elimination-backoff
Art of Multiprocessor Programming
140
Observation
linearizable stack
Push( )
Pop()
Yes!
After an equal number
of pushes and pops,
stack stays the same
Art of Multiprocessor Programming
141
Idea: Elimination Array
Pick at
random
stack
Push( )
Pop()
Pick at
random
Elimination
Array
Art of Multiprocessor Programming
142
Push Collides With Pop
continue
stack
Push( )
Pop()
continue
No need to
access stack
Yes!
Art of Multiprocessor Programming
143
No Collision
Push( )
stack
Pop()
If pushes collide or
If nocollide
collision,
pops
access
stack
access
stack
Art of Multiprocessor Programming
144
Elimination-Backoff Stack
• Lock-free stack + elimination array
• Access Lock-free stack,
– If uncontended, apply operation
– if contended, back off to elimination array
and attempt elimination
Art of Multiprocessor Programming
145
Elimination-Backoff Stack
If CAS fails, back off
Push( )
CAS
Top
Pop()
Art of Multiprocessor Programming
146
Dynamic Range and Delay
Pick random range and
max waiting time based
on level of contention
encountered
Push( )
Art of Multiprocessor Programming
147
Linearizability
• Un-eliminated calls
– linearized as before
• Eliminated calls:
– linearize pop() immediately after matching
push()
• Combination is a linearizable stack
Art of Multiprocessor Programming
148
Un-Eliminated Linearizability
pop(v1)
push(v1)
time
time
Art of Multiprocessor Programming
149
Eliminated Linearizability
Collision
Point
pop(v1)
push(v1)
pop(v2)
push(v2)
Red calls are
eliminated
time
time
Art of Multiprocessor Programming
150
Backoff Has Dual Effect
• Elimination introduces parallelism
• Backoff to array cuts contention on lockfree stack
• Elimination in array cuts down number
of threads accessing lock-free stack
Art of Multiprocessor Programming
151
Elimination Array
public class EliminationArray {
private static final int duration = ...;
private static final int timeUnit = ...;
Exchanger<T>[] exchanger;
public EliminationArray(int capacity) {
exchanger = new Exchanger[capacity];
for (int i = 0; i < capacity; i++)
exchanger[i] = new Exchanger<T>();
…
}
…
}
Art of Multiprocessor Programming
152
Elimination Array
public class EliminationArray {
private static final int duration = ...;
private static final int timeUnit = ...;
Exchanger<T>[] exchanger;
public EliminationArray(int capacity) {
exchanger = new Exchanger[capacity];
for (int i = 0; i < capacity; i++)
exchanger[i] = new Exchanger<T>();
…
}
An array of Exchangers
…
}
Art of Multiprocessor Programming
153
Digression: A Lock-Free
Exchanger
public class Exchanger<T> {
AtomicStampedReference<T> slot
= new AtomicStampedReference<T>(null, 0);
Art of Multiprocessor Programming
154
A Lock-Free Exchanger
public class Exchanger<T> {
AtomicStampedReference<T> slot
= new AtomicStampedReference<T>(null, 0);
Atomically modifiable
reference + status
Art of Multiprocessor Programming
155
Atomic Stamped Reference
• AtomicStampedReference class
– Java.util.concurrent.atomic package
• In C or C++:
reference
address
S
stamp
Art of Multiprocessor Programming
156
Extracting Reference & Stamp
public T get(int[] stampHolder);
Art of Multiprocessor Programming
157
Extracting Reference & Stamp
public T get(int[] stampHolder);
Returns reference to
object of type T
Returns stamp at
array index 0!
Art of Multiprocessor Programming
158
Exchanger Status
enum Status {EMPTY, WAITING, BUSY};
Art of Multiprocessor Programming
159
Exchanger Status
enum Status {EMPTY, WAITING, BUSY};
Nothing yet
Art of Multiprocessor Programming
160
Exchange Status
enum Status {EMPTY, WAITING, BUSY};
Nothing yet
One thread is waiting
for rendez-vous
Art of Multiprocessor Programming
161
Exchange Status
enum Status {EMPTY, WAITING, BUSY};
Nothing yet
One thread is waiting
for rendez-vous
Other threads busy
with rendez-vous
Art of Multiprocessor Programming
162
The Exchange
public T Exchange(T myItem, long nanos)
throws TimeoutException {
long timeBound = System.nanoTime() + nanos;
int[] stampHolder = {EMPTY};
while (true) {
if (System.nanoTime() > timeBound)
throw new TimeoutException();
T herItem = slot.get(stampHolder);
int stamp = stampHolder[0];
switch(stamp) {
case EMPTY: …
// slot is free
case WAITING: … // someone waiting for me
case BUSY: …
// others exchanging
}
}
Art of Multiprocessor Programming
163
The Exchange
public T Exchange(T myItem, long nanos)
throws TimeoutException {
long timeBound = System.nanoTime() + nanos;
int[] stampHolder = {EMPTY};
while (true) {
if (System.nanoTime() > timeBound)
Item and timeout
throw new TimeoutException();
T herItem = slot.get(stampHolder);
int stamp = stampHolder[0];
switch(stamp) {
case EMPTY: …
// slot is free
case WAITING: … // someone waiting for me
case BUSY: …
// others exchanging
}
}
Art of Multiprocessor Programming
164
The Exchange
public T Exchange(T myItem, long nanos)
throws TimeoutException {
long timeBound = System.nanoTime() + nanos;
int[] stampHolder = {EMPTY};
while (true) {
if (System.nanoTime() > timeBound)
throw new TimeoutException();
T herItem = slot.get(stampHolder);
int stamp = stampHolder[0];
Array holds status
switch(stamp) {
case EMPTY: …
// slot is free
case WAITING: … // someone waiting for me
case BUSY: …
// others exchanging
}
}
Art of Multiprocessor Programming
165
The Exchange
public T Exchange(T myItem, long nanos) throws
TimeoutException {
long timeBound = System.nanoTime() + nanos;
int[] stampHolder = {0};
while (true) {
if (System.nanoTime() > timeBound)
throw new TimeoutException();
T herItem = slot.get(stampHolder);
int stamp = stampHolder[0];
switch(stamp) {
case EMPTY:
// slot is free
case WAITING: // someone waiting for me
case BUSY:
// others exchanging
Loop until timeout
}
}}
Art of Multiprocessor Programming
166
The Exchange
public T Exchange(T myItem, long nanos) throws
TimeoutException {
long timeBound = System.nanoTime() + nanos;
int[] stampHolder = {0};
while (true) {
if (System.nanoTime() > timeBound)
throw new TimeoutException();
T herItem = slot.get(stampHolder);
int stamp = stampHolder[0];
switch(stamp) {
case EMPTY:
// slot is free
case WAITING: // someone waiting for me
case BUSY:
// others exchanging
Get other’s item and status
}
}}
Art of Multiprocessor Programming
167
The Exchange
public T Exchange(T myItem, long nanos) throws
TimeoutException {
long timeBound = System.nanoTime() + nanos;
int[]
stampHolder
= three
{0}; possible states
An Exchanger
has
while (true) {
if (System.nanoTime() > timeBound)
throw new TimeoutException();
T herItem = slot.get(stampHolder);
int stamp = stampHolder[0];
switch(stamp) {
case EMPTY: …
// slot is free
case WAITING: … // someone waiting for me
case BUSY: …
// others exchanging
}
}}
Art of Multiprocessor Programming
168
Lock-free Exchanger
EMPTY
Art of Multiprocessor Programming
169
Lock-free Exchanger
CAS
EMPTY
Art of Multiprocessor Programming
170
Lock-free Exchanger
WAITING
Art of Multiprocessor Programming
171
Lock-free Exchanger
In search of
partner …
WAITING
Art of Multiprocessor Programming
172
Lock-free Exchanger
Try to exchange
item and set
status to BUSY
Still waiting …
Slot
CAS
WAITING
Art of Multiprocessor Programming
173
Lock-free Exchanger
Partner showed
up, take item and
reset to EMPTY
Slot
BUSY
item
status
Art of Multiprocessor Programming
174
Lock-free Exchanger
Partner showed
up, take item and
reset to EMPTY
Slot
BUSY
EMPTY
item
status
Art of Multiprocessor Programming
175
Exchanger State EMPTY
case EMPTY: // slot is free
if (slot.CAS(herItem, myItem, EMPTY, WAITING)) {
while (System.nanoTime() < timeBound){
herItem = slot.get(stampHolder);
if (stampHolder[0] == BUSY) {
slot.set(null, EMPTY);
return herItem;
}}
if (slot.CAS(myItem, null, WAITING, EMPTY)){
throw new TimeoutException();
} else {
herItem = slot.get(stampHolder);
slot.set(null, EMPTY);
return herItem;
}
} break;
Art of Multiprocessor Programming
176
Exchanger State EMPTY
case EMPTY: // slot is free
if (slot.CAS(herItem, myItem, EMPTY, WAITING)) {
while (System.nanoTime() < timeBound){
herItem = slot.get(stampHolder);
if (stampHolder[0] == BUSY) {
slot.set(null, EMPTY);
Try to insert myItem and
return herItem;
}}
change state to WAITING
if (slot.CAS(myItem, null, WAITING, EMPTY)){
throw new TimeoutException();
} else {
herItem = slot.get(stampHolder);
slot.set(null, EMPTY);
return herItem;
}
} break;
Art of Multiprocessor Programming
177
Exchanger State EMPTY
case EMPTY: // slot is free
if (slot.CAS(herItem, myItem, EMPTY, WAITING)) {
while (System.nanoTime() < timeBound){
herItem = slot.get(stampHolder);
if (stampHolder[0] == BUSY) {
slot.set(null, EMPTY);
return herItem;
}}
if (slot.CAS(myItem, null, WAITING, EMPTY)){
throw new TimeoutException();
} else {
Spin until either
herItem = slot.get(stampHolder);
myItem
is taken or timeout
slot.set(null,
EMPTY);
return herItem;
}
} break;
Art of Multiprocessor Programming
178
Exchanger State EMPTY
case EMPTY: // slot is free
if (slot.CAS(herItem, myItem, EMPTY, WAITING)) {
while (System.nanoTime() < timeBound){
herItem = slot.get(stampHolder);
if (stampHolder[0] == BUSY) {
slot.set(null, EMPTY);
return herItem;
}}
if (slot.CAS(myItem, null, WAITING, EMPTY)){
throw new TimeoutException();
} else {
was taken,
herItemmyItem
= slot.get(stampHolder);
slot.set(null,
EMPTY);
so return
herItem
return herItem;
that was put in its place
}
} break;
Art of Multiprocessor Programming
179
Exchanger State EMPTY
case EMPTY: // slot is free
ifOtherwise
(slot.CAS(herItem,
myItem,
EMPTY, WAITING)) {
we ran out
of time,
while (System.nanoTime() < timeBound){
try
to reset
status to EMPTY
herItem
= slot.get(stampHolder);
if (stampHolder[0]
and time out== BUSY) {
slot.set(null, EMPTY);
return herItem;
}}
if (slot.CAS(myItem, null, WAITING, EMPTY)){
throw new TimeoutException();
} else {
herItem = slot.get(stampHolder);
slot.set(null, EMPTY);
return herItem;
}
} break;
Art of Multiprocessor Programming
180
Exchanger State EMPTY
case EMPTY: // slot is free
if (slot.compareAndSet(herItem, myItem, WAITING,
BUSY)) {
while (System.nanoTime() < timeBound){
If reset failed,
herItem = slot.get(stampHolder);
someone showed
up after
if (stampHolder[0]
== BUSY)
{ all,
slot.set(null,
so takeEMPTY);
that item
return herItem;
}}
if (slot.compareAndSet(myItem, null, WAITING,
EMPTY)){throw new TimeoutException();
} else {
herItem = slot.get(stampHolder);
slot.set(null, EMPTY);
return herItem;
}
Art ofArt
Multiprocessor
of Multiprocessor
Programming
181
} break;
Programming© Herlihy-Shavit
Exchanger State EMPTY
case EMPTY: // slot is free
if (slot.CAS(herItem, myItem, EMPTY, WAITING)) {
while (System.nanoTime() < timeBound){
herItem = slot.get(stampHolder);
if (stampHolder[0] == BUSY) {
slot.set(null, EMPTY);
Clear
slot and take that item
return
herItem;
}}
if (slot.CAS(myItem, null, WAITING, EMPTY)){
throw new TimeoutException();
} else {
herItem = slot.get(stampHolder);
slot.set(null, EMPTY);
return herItem;
}
} break;
Art of Multiprocessor Programming
182
Exchanger State EMPTY
case EMPTY: // slot is free
if (slot.CAS(herItem, myItem, EMPTY, WAITING)) {
while (System.nanoTime() < timeBound){
herItem = slot.get(stampHolder);
if (stampHolder[0] == BUSY) {
slot.set(null, EMPTY);
If initial CAS failed,
return herItem;
}} then someone else changed status
if (slot.CAS(myItem, null, WAITING, EMPTY)){
EMPTY to WAITING,
throw from
new TimeoutException();
} else {
so retry from start
herItem = slot.get(stampHolder);
slot.set(null, EMPTY);
return herItem;
}
} break;
Art of Multiprocessor Programming
183
States WAITING and BUSY
case WAITING:
// someone waiting for me
if (slot.CAS(herItem, myItem, WAITING, BUSY))
return herItem;
break;
case BUSY:
// others in middle of exchanging
break;
default:
// impossible
break;
}
}
}
}
Art of Multiprocessor Programming
184
States WAITING and BUSY
case WAITING:
// someone waiting for me
if (slot.CAS(herItem, myItem, WAITING, BUSY))
return herItem;
break;
case BUSY:
// others in middle of exchanging
break;
default:
// impossible
someone is waiting to exchange,
break;
}
so try to CAS my item in
}
and change state to BUSY
}
}
Art of Multiprocessor Programming
185
States WAITING and BUSY
case WAITING:
// someone waiting for me
if (slot.CAS(herItem, myItem, WAITING, BUSY))
return herItem;
break;
case BUSY:
// others in middle of exchanging
break;
default:
// impossible
break;
}
If successful, return other’s item,
}
}
otherwise someone else took it,
}
so try again from start
Art of Multiprocessor Programming
186
States WAITING and BUSY
case WAITING:
// someone waiting for me
if (slot.CAS(herItem, myItem, WAITING, BUSY))
return herItem;
break;
case BUSY:
// others in middle of exchanging
break;
default:
// impossible
break;
}
If BUSY,
}
}
other threads exchanging,
}
so start again
Art of Multiprocessor Programming
187
The Exchanger Slot
• Exchanger is lock-free
• Because the only way an exchange can
fail is if others repeatedly succeeded or
no-one showed up
• The slot we need does not require
symmetric exchange
Art of Multiprocessor Programming
188
Back to the Stack: the
Elimination Array
public class EliminationArray {
…
public T visit(T value, int range)
throws TimeoutException {
int slot = random.nextInt(range);
int nanodur = convertToNanos(duration, timeUnit));
return (exchanger[slot].exchange(value, nanodur)
}}
Art of Multiprocessor Programming
189
Elimination Array
public class EliminationArray {
…
public T visit(T value, int range)
throws TimeoutException {
int slot = random.nextInt(range);
int nanodur = convertToNanos(duration, timeUnit));
return (exchanger[slot].exchange(value, nanodur)
visit the elimination array
}}
with fixed value and range
Art of Multiprocessor Programming
190
Elimination Array
public class EliminationArray {
…
public T visit(T value, int range)
throws TimeoutException {
int slot = random.nextInt(range);
int nanodur = convertToNanos(duration, timeUnit));
return (exchanger[slot].exchange(value, nanodur)
}}
Pick a random array entry
Art of Multiprocessor Programming
191
Elimination Array
public class EliminationArray {
…
Exchange
value or time out
public T visit(T value,
int range)
throws TimeoutException {
int slot = random.nextInt(range);
int nanodur = convertToNanos(duration, timeUnit));
return (exchanger[slot].exchange(value, nanodur)
}}
Art of Multiprocessor Programming
192
Elimination Stack Push
public void push(T value) {
...
while (true) {
if (tryPush(node)) {
return;
} else try {
T otherValue =
eliminationArray.visit(value,policy.range);
if (otherValue == null) {
return;
}
}
Art of Multiprocessor Programming
193
Elimination Stack Push
public void push(T value) {
...
while (true) {
if (tryPush(node)) {
return;
} else try {
T otherValue =
eliminationArray.visit(value,policy.range);
if (otherValue == null) {
return;
}
First, try to push
}
Art of Multiprocessor Programming
194
Elimination Stack Push
public void push(T value) {
...
If I failed,
backoff & try to eliminate
while (true)
{
if (tryPush(node)) {
return;
} else try {
T otherValue =
eliminationArray.visit(value,policy.range);
if (otherValue == null) {
return;
}
}
Art of Multiprocessor Programming
195
Elimination Stack Push
public void push(T value) {
...
while (true) {
Value pushed and range to try
if (tryPush(node)) {
return;
} else try {
T otherValue =
eliminationArray.visit(value,policy.range);
if (otherValue == null) {
return;
}
}
Art of Multiprocessor Programming
196
Elimination Stack Push
public void push(T value) {
...
Only pop() leaves null,
while (true) {
so elimination
was successful
if (tryPush(node))
{
return;
} else try {
T otherValue =
eliminationArray.visit(value,policy.range);
if (otherValue == null) {
return;
}
}
Art of Multiprocessor Programming
197
Elimination Stack Push
public void push(T value) {
...
Otherwise, retry push() on lock-free stack
while (true) {
if (tryPush(node)) {
return;
} else try {
T otherValue =
eliminationArray.visit(value,policy.range);
if (otherValue == null) {
return;
}
}
Art of Multiprocessor Programming
198
Elimination Stack Pop
public T pop() {
...
while (true) {
if (tryPop()) {
return returnNode.value;
} else
try {
T otherValue =
eliminationArray.visit(null,policy.range;
if (otherValue != null) {
return otherValue;
}
}
}}
Art of Multiprocessor Programming
199
Elimination Stack Pop
public T pop() {
...
while (true) {
if
(tryPop())
{ other thread is a push(),
If value
not null,
return returnNode.value;
} else so elimination succeeded
try {
T otherValue =
eliminationArray.visit(null,policy.range;
if ( otherValue != null) {
return otherValue;
}
}
}}
Art of Multiprocessor Programming
200
Summary
• We saw both lock-based and lock-free
implementations of
• queues and stacks
• Don’t be quick to declare a data
structure inherently sequential
– Linearizable stack is not inherently
sequential (though it is in worst case)
• ABA is a real problem, pay attention
Art of Multiprocessor Programming
201
This work is licensed under a Creative Commons AttributionShareAlike 2.5 License.
• You are free:
– to Share — to copy, distribute and transmit the work
– to Remix — to adapt the work
• Under the following conditions:
– Attribution. You must attribute the work to “The Art of
Multiprocessor Programming” (but not in any way that
suggests that the authors endorse you or your use of the
work).
– Share Alike. If you alter, transform, or build upon this work,
you may distribute the resulting work only under the same,
similar or a compatible license.
• For any reuse or distribution, you must make clear to others the
license terms of this work. The best way to do this is with a link
to
– http://creativecommons.org/licenses/by-sa/3.0/.
• Any of the above conditions can be waived if you get permission
from the copyright holder.
• Nothing in this license impairs or restricts the author's moral
rights.
Art of Multiprocessor Programming
202
Art of Multiprocessor Programming
203
Download