DATA STRUCTURES FOR MULTI

DATA STRUCTURES OPTIMISATION FOR MANY-CORE SYSTEMS
Matthew Freeman | Supervisor: Maciej Golebiewski
CSIRO Vacation Scholar Program 2013-14

The Multi-core Age
• Mobile Phone: 2-4 cores
• PC: 4-16 cores
• Intel Xeon Phi: 61 cores
• CSIRO ‘Bragg’ Compute Cluster: 2048 cores

Programming for multi-cores
[Diagram: Problem -> Machine Instructions -> Execution. Divide the problem across CPU Core 1, CPU Core 2, CPU Core 3 and CPU Core 4.]

Amdahl's Law
• The maximum speedup depends on the percentage of the problem that can run in parallel.
[Chart: Maximum Speedup (axis from 0 to 25) by parallel percentage]
• 95% parallel: 20x speedup
• 90% parallel: 10x speedup
• 75% parallel: 4x speedup
• 50% parallel: 2x speedup
• Single-core processor: 1x speed

Data structures
• Memory (data) is still a shared resource.
[Diagram: in a single-core computer, one CPU core uses the memory (data); in a 4-core computer, four CPU cores share the same memory (data).]

Linked-list (Stack) Data Structure
• A “node” holds data.
• Each node has a link to the next data point.
[Diagram: TOP -> Data -> EMPTY]

Add new item (Push)
We want to add a chunk of data (Data B) to the structure.
[Diagram: Data B, not yet linked; TOP -> Data A -> EMPTY]

Add new item (Push)
Steps for new data B:
1) Find the start of the structure (TOP).
2) Link into the structure.
3) Update TOP.
[Diagram after step 3: TOP (new) -> Data B -> Data A -> NULL]

Resulting structure
• Like stacking dinner plates.
• Only need to keep track of where TOP is to access the rest.
[Diagram: TOP -> Data -> Data -> Data -> Data -> Data -> NULL]

What happens in multi-core systems?
Two threads try to operate on the stack structure:
• Thread 1 attempts at time T.
• Thread 2 attempts at time T + 1 nanosecond.
Because each of the steps takes time to complete, errors occur.

What happens in multi-core systems?
The steps of the two threads interleave:
• Thread 1 reads TOP (1)
• Thread 2 reads TOP (1)
• Thread 1 sets the next pointer (2)
• Thread 2 sets the next pointer (2)
• Thread 1 updates TOP (3)
• Thread 2 updates TOP (3)

Stack failure
Data B is lost forever because it is no longer linked to TOP.
[Diagram: Thread 1's Data B -> Data A -> EMPTY; Thread 2's Data C -> Data A; TOP -> Data C]

How do we fix this?
• Use “data locks”.
• Protect the 3 steps.
• One thread at a time is granted access to the stack.
• That thread completes its operation and releases the lock.
This is the standard approach for multithreaded structures.

Locks
• Easy to use: 2 lines of code added to fix.
  - Get Lock
  - Steps 1, 2, 3
  - Release Lock
× Slow: only one thread at a time can hold the lock.
• The locked section becomes sequential code, which is the code that cannot run in parallel.
• Analogy: merging highway traffic into a single lane.

Lock-free
New method:
• Lock-free data structure.
• Special low-level instructions allow the three steps to take effect as if they were one computer instruction.
• Removes the need for locks.
• The instruction is called a Compare-Exchange.

Lock-free
• Downside: writing lock-free code is difficult (hence the project).
• The Compare-Exchange operation forms the base for writing lock-free code.
• The project takes specifications from research papers and implements them.

Lock-free
• Implemented a range of lock-free optimizations for the stack.
• Used open coding standards (C++, OpenMP).
• Benchmarked on an Intel Xeon Phi 61-core processor.
• The lock-free structure performed about 2x better for pure stack operations.

Summary
• Amdahl’s Law shows that it’s important to optimize sequential sections of code.
• Shared data structures are often sequential bottlenecks.
• Implementing lock-free data structures reduced this bottleneck.
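Worked example: Amdahl's Law
The speedup limits quoted on the Amdahl's Law slide follow from the standard formula 1 / ((1 - p) + p / n), where p is the parallel fraction and n is the number of cores. The sketch below is only an illustration; the function names amdahl_speedup and amdahl_limit are made up for this example and do not come from the project.

```cpp
#include <cassert>

// Amdahl's Law: speedup on n cores when a fraction p (0 <= p < 1)
// of the work can run in parallel. Names here are illustrative only.
double amdahl_speedup(double p, double n) {
    assert(p >= 0.0 && p < 1.0 && n >= 1.0);
    return 1.0 / ((1.0 - p) + p / n);
}

// Upper bound as the number of cores grows without limit: 1 / (1 - p).
double amdahl_limit(double p) {
    assert(p >= 0.0 && p < 1.0);
    return 1.0 / (1.0 - p);
}
```

For instance, amdahl_limit(0.95) gives the 20x ceiling from the chart, while amdahl_speedup(0.95, 61) is about 15.25x, so even 61 cores stay well below the limit when 5% of the code is sequential.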
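Worked example: stack with a lock
The "Get Lock, Steps 1, 2, 3, Release Lock" pattern from the Locks slide might look roughly like this in C++. This is a minimal sketch, assuming std::mutex as the lock; the slides do not say which locking primitive the project used, and the names Node and LockedStack are hypothetical.

```cpp
#include <mutex>

// Illustrative node type: holds data plus a link to the next node.
struct Node {
    int data;
    Node* next;
};

// Hypothetical lock-protected stack; one thread at a time may enter.
class LockedStack {
public:
    ~LockedStack() {
        int ignored;
        while (pop(ignored)) {}                    // free any remaining nodes
    }

    void push(int value) {
        Node* node = new Node{value, nullptr};
        std::lock_guard<std::mutex> guard(lock_);  // Get Lock
        node->next = top_;                         // Steps 1 and 2: read TOP, link in
        top_ = node;                               // Step 3: update TOP
    }                                              // Release Lock (guard destroyed)

    // Returns false when the stack is empty.
    bool pop(int& out) {
        std::lock_guard<std::mutex> guard(lock_);
        if (top_ == nullptr) return false;
        Node* node = top_;
        top_ = node->next;
        out = node->data;
        delete node;
        return true;
    }

private:
    std::mutex lock_;
    Node* top_ = nullptr;
};
```

Every push and pop passes through the same mutex, so this part of the program runs one thread at a time no matter how many cores are available.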
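Worked example: lock-free push with Compare-Exchange
The lock-free slides describe Compare-Exchange as the base operation. One common way to express a lock-free push is sketched below, assuming C++11 std::atomic and its compare_exchange_weak member; the project's actual code is not shown in the slides, and the names LockFreeStack and concurrent_push_count are hypothetical. Pop is deliberately left out because a correct lock-free pop also needs safe memory reclamation, which is beyond this sketch.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

struct Node {
    int data;
    Node* next;
};

// Hypothetical push-only lock-free stack.
class LockFreeStack {
public:
    ~LockFreeStack() {
        // Cleanup assumes no other thread is still using the stack.
        Node* node = top_.load();
        while (node != nullptr) {
            Node* next = node->next;
            delete node;
            node = next;
        }
    }

    void push(int value) {
        Node* node = new Node{value, nullptr};
        // Steps 1 and 2: read TOP and link the new node to it.
        node->next = top_.load(std::memory_order_relaxed);
        // Step 3: update TOP only if it still equals node->next.
        // On failure, compare_exchange_weak stores the current TOP
        // into node->next, so the loop simply retries.
        while (!top_.compare_exchange_weak(node->next, node,
                                           std::memory_order_release,
                                           std::memory_order_relaxed)) {
        }
    }

    // Count nodes. Only meaningful once all pushing threads have finished.
    std::size_t count() const {
        std::size_t n = 0;
        for (Node* node = top_.load(std::memory_order_acquire);
             node != nullptr; node = node->next) {
            ++n;
        }
        return n;
    }

private:
    std::atomic<Node*> top_{nullptr};
};

// Push per_thread values from each of `threads` threads, then count nodes.
std::size_t concurrent_push_count(int threads, int per_thread) {
    assert(threads > 0 && per_thread >= 0);
    LockFreeStack stack;
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&stack, per_thread] {
            for (int i = 0; i < per_thread; ++i) stack.push(i);
        });
    }
    for (auto& w : workers) w.join();
    return stack.count();
}
```

If two threads race as in the interleaving slide, the second thread's Compare-Exchange fails because TOP has changed, and it retries against the new TOP instead of overwriting it, so no node is lost.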
