Data Structures of the Future:
Concurrent, Optimistic, and Relaxed
Dan Alistarh
ETH Zurich
Background
• The First Graph Problem: the Bridges of Königsberg (Euler, 1735)
  • 4 vertices, 7 edges
• Graph Problems, circa 2010
  • Social Graphs: 1 billion vertices, 100 billion edges (~1 TB storage)
• Graph Problems Today
  • Web/Office/Brain Graphs: 100 billion vertices, 100 trillion edges (more than 1 petabyte of storage)
We distribute computation across processors, computers, and data centers.
This changes the way data structures are designed, built, and deployed.
Why Concurrent?
To get speedup on newer hardware.
Scaling: more threads should imply more useful work.
The Problem with Concurrency
[Figure: throughput of parallel event processing (events/second, up to ~6×10⁶) vs. number of threads (0 to 70), for threads draining a shared event queue (event1, event2, event3), comparing a > $10,000 machine with a < $1,000 machine.]
Concurrency can be very bad value for money.
Is this problem inherent?
Inherent Sequential Bottlenecks
Data structures with strong ordering semantics
• Stacks, Queues, Priority Queues, Exact Counters
Theorem: With n threads, any deterministic, strongly ordered data structure has an execution in which a processor takes time linear in n to return.
[Alistarh, Aspnes, Gilbert, Guerraoui, Journal of the ACM, 2014]
This is important because of Amdahl's Law:
• Assume the single-threaded computation takes 1 week
• The inherently sequential component (e.g., the queue) takes 15% = 1 day
• Then the maximum speedup is < 7x, even with infinite threads
To get performance, it is critical to speed up shared data structures.
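The Amdahl bound quoted above is easy to check numerically; a minimal Python sketch (the function name is mine):

```python
def amdahl_speedup(seq_fraction, threads):
    """Amdahl's Law: maximum speedup when seq_fraction of the work
    is inherently sequential and the rest parallelizes perfectly."""
    return 1.0 / (seq_fraction + (1.0 - seq_fraction) / threads)

# A 15% sequential component (1 day out of a week) caps the speedup
# at 1 / 0.15, just under 6.7x, no matter how many threads we add.
print(round(amdahl_speedup(0.15, 10**9), 2))   # -> 6.67
```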
Concurrent Data Structures
Algorithms, data structures, and architectures for
scalable distributed computation.
Theory ↔ Software ↔ Hardware
New Algorithmic and Analytic Ideas.
New Hardware Designs!
New Data Structures!
Discrete Event Simulation
[Figure: a Priority Queue holding task<key, value> entries with keys 1, 3, 4, 5, 7, 8, 11, 15, 18, supporting Search(key), Insert/Delete(k, v), and DeleteMin().]
Extremely useful:
• Graph Operations (Shortest Paths)
• Operating System Kernel
• Time-Based Simulations
We are looking for a fast concurrent Priority Queue.
Methods:
• Get Top Task
• Insert a Task
• Search for a Task
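These operations are exactly what a sequential discrete event simulation loop needs; a minimal Python sketch using the standard-library heap (the follow-up scheduling rule is a made-up example):

```python
import heapq

def simulate(initial_tasks, horizon):
    """Sequential discrete-event loop: repeatedly take the task with the
    smallest timestamp (DeleteMin), process it, possibly insert new tasks."""
    pq = list(initial_tasks)          # (time, task) pairs
    heapq.heapify(pq)
    processed = []
    while pq:
        t, task = heapq.heappop(pq)   # Get Top Task
        if t > horizon:
            break
        processed.append(task)
        # a processed task may schedule a follow-up event (hypothetical rule)
        if task.endswith("!"):
            heapq.heappush(pq, (t + 1, task.rstrip("!")))  # Insert a Task
    return processed

print(simulate([(0, "boot!"), (2, "tick")], horizon=5))  # -> ['boot!', 'boot', 'tick']
```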
The Problem
Target: fast, concurrent Priority Queue.
Lots of work on the topic:
[Sanders97], [Lotan&Shavit00], [Sundell&Tsigas07],
[Linden&Jonsson13], [Lenhart et al. 14], [Wimmer et al.14]
Current solutions are hard to scale:
DeleteMin is highly contended.
Everyone wants the same element!
Concurrent Solution: the SkipList [Pugh90]
• Linked list, sorted by priority
• Each node has a random "height" (geometrically distributed with parameter ½)
• Elements at the same height form their own lists
• Search, Insert, and Delete take logarithmic time on average and can run concurrently [Pugh98, Fraser04]
[Figure: a skiplist holding keys 1, 3, 4, 5, 9, … between head (H) and tail (T). Search(5) descends from the top level, narrowing the interval [H, 9] → [H, 9] → [1, 9] → [5, 9], and stops at key 5.]
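For reference, a minimal sequential Python sketch of the skiplist layout described above (the concurrent variants of [Pugh98, Fraser04] add synchronization on top of the same structure; constants such as MAX_HEIGHT are my choice):

```python
import random

class Node:
    def __init__(self, key, height):
        self.key = key
        self.next = [None] * height   # one forward pointer per level

class SkipList:
    MAX_HEIGHT = 16

    def __init__(self):
        # sentinel head that is taller than any real node
        self.head = Node(float("-inf"), self.MAX_HEIGHT)

    def _random_height(self):
        # geometric with parameter 1/2, as on the slide
        h = 1
        while h < self.MAX_HEIGHT and random.random() < 0.5:
            h += 1
        return h

    def insert(self, key):
        # record, per level, the last node with key smaller than ours
        update = [self.head] * self.MAX_HEIGHT
        cur = self.head
        for lvl in range(self.MAX_HEIGHT - 1, -1, -1):
            while cur.next[lvl] is not None and cur.next[lvl].key < key:
                cur = cur.next[lvl]
            update[lvl] = cur
        node = Node(key, self._random_height())
        for lvl in range(len(node.next)):
            node.next[lvl] = update[lvl].next[lvl]
            update[lvl].next[lvl] = node

    def search(self, key):
        # descend level by level, moving right while the next key is smaller
        cur = self.head
        for lvl in range(self.MAX_HEIGHT - 1, -1, -1):
            while cur.next[lvl] is not None and cur.next[lvl].key < key:
                cur = cur.next[lvl]
        cur = cur.next[0]
        return cur is not None and cur.key == key
```

With the keys from the figure, `search(5)` succeeds and `search(2)` fails, mirroring the Search(5) trace above.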
The SkipList as a PQ
• DeleteMin: simply remove the smallest element from the bottom list
• All processors compete for the smallest element
• Does not scale!
I. Lotan and N. Shavit. Skiplist-Based Concurrent Priority Queues. 2000.
The Idea: Relax!
● We want to choose an item at random with ‘good’ guarantees
● Minimize loss of exactness by only choosing items near the front of the list
● Minimize contention by keeping collision probability low
DeleteMin: The Spray [Alistarh, Kopinsky, Li, Shavit, PPoPP 2015]
procedure Spray()
• At each skiplist level, flip a coin to stay or jump forward
• Repeat for each level, from log n down to 1 (the bottom)
• As if removing a random priority element near the head
[Figure: two example spray walks from starting height 4, e.g. jump, stay, jump, jump.]
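The effect of this coin-flip walk can be simulated; a rough Python sketch, under my simplifying assumption that one forward step at level i skips about 2**i elements (the expected gap at that level in a skiplist), whereas the real SprayList follows actual pointers:

```python
import math
import random

def spray_rank(n_threads, rng):
    """One Spray walk, modeled on element ranks rather than pointers.

    Assumption (mine): a forward step at skiplist level i skips about
    2**i elements; the real SprayList walks the skiplist itself.
    """
    rank = 0
    for level in range(int(math.log2(n_threads)), 0, -1):
        if rng.random() < 0.5:        # flip a coin: stay or jump forward
            rank += 2 ** level
    return rank

rng = random.Random(0)
ranks = [spray_rank(64, rng) for _ in range(10_000)]
# All sprays land within the first ~2n ranks, spread over many distinct
# positions, so concurrent threads rarely collide on the same element.
print(max(ranks), len(set(ranks)))
```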
Spray and pray?
SprayList Probabilistic Guarantees
✓ The maximum value returned by a Spray has rank Õ(n)
  ‐ Sprays aren't too wide
✓ For all x, p(x) = Õ(1/n)
  ‐ Sprays don't cluster too much
✓ If x > y is returned by some Spray, then p(y) = Ω(1/n)
  ‐ Elements do not starve in the list
Here p(x) is the probability that a Spray returns the value at index x.
The Benchmark
• Discrete Event Simulation
• Exact algorithms have negative scaling after 8 threads
• The SprayList is competitive with the random remover (no guarantees, incorrect execution)
In many practical settings (Discrete Event Simulation, shortest paths), priority inversions are not expensive.
DeleteMin: The Spray

  node* DeleteMin() {
      cur <- head;                                  // starting node
      i <- log n;                                   // starting height
      while (i > 0) {
          repeat (rand(0, 1)) {                     // flip a coin: stay or jump?
              cur <- cur->next[i];
          }
          i <- i - 1;                               // descend one level
      }
      v <- cur->val;                                // reached the bottom list
      flag <- Compare-and-Swap(cur->val, v, NULL);  // try to acquire the node
      if (flag == SUCCESS) return cur;
      else RETRY;                                   // lock-free: retry on failure
  }
The SprayList relaxes progress as well!
Relaxed Data Structures
The data structures of our childhood are changing.
The SprayList merges both relaxed semantics and
optimistic progress to achieve scalability.
A relaxation renaissance
[KarpZhang93], [DeoP92], [Sanders98],
[HenzingerKPSS13], [NguyenLP13], [WimmerCVTT14],
[LenhartNP15], [RihaniSD15], [JeffreySYES16]
My Research
Algorithms, data structures, and architectures for
scalable distributed computation.
Theory ↔ Software ↔ Hardware
Algorithms and Data Structures.
Architectures and Systems.
Applications.
Interested?
Internship / Master / PhD.
What’s Next?
Algorithms, data structures, and architectures for
scalable distributed computation.
Theory ↔ Software ↔ Hardware
Algorithms and Data Structures.
Next-Generation Data Structures
Population Protocols
Architectures and Systems.
Low-Latency Transactional Systems
Optimization for the Cloud
Applications.
Large-Scale Graph Processing
Distributed Machine Learning
Backup: SprayList Shortest Paths
DeleteMin: The Spray (pseudocode as above)
Several parameters of the Spray (e.g., the starting height and the jump lengths) can be tuned!
The Scheduler (Intel™ machine, single socket)
The Stochastic Scheduler Model
• Short version: every (non-faulty) thread can be scheduled in each step, with probability > 0.
• Definition: A scheduler is a triple (D_t, A_t, θ)_{t > 0}, where D_t is the scheduling distribution at time t, A_t is the active set at time t, and θ is the probability threshold, such that:
  • At time t, scheduling probabilities are given by D_t
  • Only processes in A_t can be scheduled at time t
  • Each process in A_t is scheduled with probability ≥ θ
• A scheduler is stochastic if θ > 0.
* Lottery OS scheduling, e.g. [Petrou, Milford, Gibson]
Examples
• Assume n processes
• The uniform stochastic scheduler:
  • θ = 1/n
  • Each process is scheduled uniformly at random
• A standard shared-memory adversary:
  • Take any adversarial strategy
  • Let D_t give probability 1 to the process picked by the strategy, and 0 to all others
  • Not stochastic
• Quantum-based schedulers are also easily modeled
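The uniform stochastic scheduler from the first example can be sketched in a few lines of Python (function name and parameters are mine):

```python
import random

def uniform_schedule(n, steps, rng):
    """Uniform stochastic scheduler: all n processes are always active,
    and each step picks one uniformly at random, so theta = 1/n."""
    counts = [0] * n
    for _ in range(steps):
        counts[rng.randrange(n)] += 1   # schedule one process this step
    return counts

counts = uniform_schedule(4, 10_000, random.Random(1))
print(counts)   # each process gets roughly steps/n = 2500 slots
```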