Interconnect Your Future
Gilad Shainer, VP of Marketing
Dec 2013
Leading Supplier of End-to-End Interconnect Solutions
Comprehensive End-to-End Software Accelerators and Management
• MXM – Mellanox Messaging Acceleration
• FCA – Fabric Collectives Acceleration
• UFM – Unified Fabric Management
• Storage and Data: VSA – Storage Accelerator (iSCSI); UDA – Unstructured Data Accelerator
Comprehensive End-to-End InfiniBand and Ethernet Portfolio
ICs | Adapter Cards | Switches/Gateways | Host/Fabric Software | Metro / WAN | Cables/Modules
Mellanox InfiniBand Paves the Road to Exascale Computing
Accelerating Half of the World’s Petascale Systems
Mellanox Connected Petascale System Examples
TOP500 InfiniBand Accelerated Petascale Capable Machines
• Mellanox FDR InfiniBand systems tripled from Nov’12 to Nov’13
• Accelerating 63% of the Petaflop-capable systems (12 systems out of 19)
FDR InfiniBand Delivers Highest Return on Investment
[Charts: application performance comparisons across four panels; higher is better in each. Source: HPC Advisory Council]
Connect-IB
Architectural Foundation for Exascale Computing
Mellanox Connect-IB: The World’s Fastest Adapter
• The 7th generation of Mellanox interconnect adapters
• World’s first 100Gb/s interconnect adapter (dual-port FDR 56Gb/s InfiniBand)
• Delivers 137 million messages per second – 4X higher than the competition
• Supports the new innovative InfiniBand scalable transport – Dynamically Connected Transport
Connect-IB Provides Highest Interconnect Throughput
[Chart: interconnect throughput, higher is better – Connect-IB FDR (dual port) versus ConnectX-3 FDR, ConnectX-2 QDR, and competing InfiniBand adapters. Source: Prof. DK Panda]
Gain Your Performance Leadership With Connect-IB Adapters
Connect-IB Delivers Highest Application Performance
200% Higher Performance Versus the Competition with Only 32 Nodes
Performance Gap Increases with Cluster Size
Dynamically Connected Transport Advantages
Mellanox ScalableHPC
Accelerations for Parallel Programs
Mellanox ScalableHPC Accelerates Parallel Applications
Programming models served: MPI, SHMEM, and PGAS (a logical shared memory spanning the processes’ memories)

MXM – Mellanox Messaging Acceleration
• Reliable messaging optimized for Mellanox HCAs
• Hybrid transport mechanism
• Efficient memory registration
• Receive-side tag matching

FCA – Fabric Collectives Acceleration
• Topology-aware collective optimization
• Hardware multicast
• Separate virtual fabric for collectives
• CORE-Direct hardware offload

Both layers are built on the InfiniBand Verbs API.
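
Because MXM and FCA plug in underneath the standard APIs, applications need no source changes. As a minimal illustration (not Mellanox sample code), the point-to-point traffic in the sketch below would be carried by the MPI transport layer (MXM, when enabled) and the collective would be a candidate for FCA / CORE-Direct offload; which accelerations apply is decided by the MPI library’s build and runtime configuration, not by the application.

/* scalablehpc_demo.c - plain MPI code; MXM/FCA acceleration, when enabled
 * in the MPI library, applies underneath these calls without source changes. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Point-to-point messaging: handled by the MPI transport layer
     * (e.g. MXM when the library is configured to use it). */
    int token = rank;
    int from  = (rank + size - 1) % size;
    int to    = (rank + 1) % size;
    MPI_Sendrecv_replace(&token, 1, MPI_INT, to, 0, from, 0,
                         MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    /* Collective operation: a candidate for FCA / CORE-Direct offload. */
    double local = (double)rank, sum = 0.0;
    MPI_Allreduce(&local, &sum, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);

    if (rank == 0)
        printf("sum over %d ranks = %.0f\n", size, sum);
    MPI_Finalize();
    return 0;
}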
Mellanox FCA Collective Scalability
[Charts: Reduce and Barrier collective latency (µs), and 8-byte Broadcast bandwidth (KB × processes), versus number of processes (PPN=8), with and without FCA]
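
The curves above come from collective micro-benchmarks with the usual warm-up/measure structure. A minimal sketch of that pattern for the Barrier panel (illustrative only, not the benchmark used for the slide):

/* barrier_latency.c - minimal collective latency loop in the spirit of the
 * plotted benchmarks (not the exact code behind the charts). */
#include <mpi.h>
#include <stdio.h>

#define WARMUP 100
#define ITERS  1000

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    for (int i = 0; i < WARMUP; i++)      /* warm up the fabric and caches */
        MPI_Barrier(MPI_COMM_WORLD);

    double t0 = MPI_Wtime();
    for (int i = 0; i < ITERS; i++)
        MPI_Barrier(MPI_COMM_WORLD);
    double usec = (MPI_Wtime() - t0) * 1e6 / ITERS;

    if (rank == 0)
        printf("%d processes: avg barrier latency %.2f us\n", size, usec);
    MPI_Finalize();
    return 0;
}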
Nonblocking Alltoall (Overlap-Wait) Benchmark
CORE-Direct offload allows the Alltoall benchmark to run with almost 100% of the CPU time available for computation.
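
The overlap-wait pattern behind this result posts the nonblocking collective, computes while the offloaded collective progresses in the HCA, and only then waits. A minimal MPI-3 sketch (buffer sizes and the compute kernel are illustrative):

/* ialltoall_overlap.c - overlap-wait pattern: post MPI_Ialltoall, compute,
 * then wait. With CORE-Direct offload the collective progresses in the HCA
 * while the CPU stays in the compute loop. */
#include <mpi.h>
#include <stdlib.h>

static void compute(double *x, int n)
{
    for (int i = 0; i < n; i++)          /* stand-in compute kernel */
        x[i] = x[i] * 1.000001 + 1.0;
}

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const int count = 1024;              /* elements sent to each rank */
    double *sendbuf = malloc((size_t)size * count * sizeof(double));
    double *recvbuf = malloc((size_t)size * count * sizeof(double));
    double *work    = calloc(1 << 20, sizeof(double));
    for (int i = 0; i < size * count; i++)
        sendbuf[i] = rank;

    MPI_Request req;
    MPI_Ialltoall(sendbuf, count, MPI_DOUBLE,
                  recvbuf, count, MPI_DOUBLE, MPI_COMM_WORLD, &req);

    compute(work, 1 << 20);              /* overlapped computation */

    MPI_Wait(&req, MPI_STATUS_IGNORE);   /* complete the collective */

    free(sendbuf); free(recvbuf); free(work);
    MPI_Finalize();
    return 0;
}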
Accelerator and GPU Offloads
GPUDirect 1.0
[Diagrams: transmit and receive data paths, non-GPUDirect versus GPUDirect 1.0. Without GPUDirect, moving data between GPU memory and the InfiniBand adapter takes two steps through system memory; with GPUDirect 1.0 the GPU and InfiniBand drivers share the same pinned system-memory buffer, so a single copy suffices.]
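
In code, the staging path looks roughly like the sketch below (a hypothetical helper, not Mellanox sample code): the GPU buffer is copied into pinned host memory, which is then handed to MPI/InfiniBand.

/* staged_send.c - GPU data staged through a pinned host buffer before the
 * InfiniBand send (the pre-GPUDirect-RDMA path sketched above). */
#include <mpi.h>
#include <cuda_runtime.h>

void staged_send(const float *d_buf, size_t n, int dest, MPI_Comm comm)
{
    float *h_buf;
    /* Pinned host memory; with GPUDirect 1.0 the CUDA and InfiniBand
     * drivers can share this same pinned region. */
    cudaMallocHost((void **)&h_buf, n * sizeof(float));

    /* 1. Copy GPU memory into system memory. */
    cudaMemcpy(h_buf, d_buf, n * sizeof(float), cudaMemcpyDeviceToHost);

    /* 2. Hand the host buffer to MPI / InfiniBand. */
    MPI_Send(h_buf, (int)n, MPI_FLOAT, dest, 0, comm);

    cudaFreeHost(h_buf);
}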
GPUDirect RDMA
[Diagrams: transmit and receive data paths, GPUDirect 1.0 versus GPUDirect RDMA. With GPUDirect RDMA the InfiniBand adapter reads and writes GPU memory directly over PCIe, bypassing the system-memory staging step entirely.]
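
With GPUDirect RDMA and a CUDA-aware MPI (for example MVAPICH2 or Open MPI built with CUDA support), the device pointer can be passed straight to MPI, and the HCA accesses GPU memory directly. A minimal sketch (hypothetical helper, illustrative sizes):

/* gdr_send.c - CUDA-aware MPI: the device pointer goes straight to MPI_Send;
 * with GPUDirect RDMA the HCA reads GPU memory directly, skipping the
 * system-memory staging copy. */
#include <mpi.h>
#include <cuda_runtime.h>

void gdr_send(size_t n, int dest, MPI_Comm comm)
{
    float *d_buf;
    cudaMalloc((void **)&d_buf, n * sizeof(float));
    cudaMemset(d_buf, 0, n * sizeof(float));

    /* A CUDA-aware MPI accepts device pointers directly. */
    MPI_Send(d_buf, (int)n, MPI_FLOAT, dest, 0, comm);

    cudaFree(d_buf);
}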
Performance of MVAPICH2 with GPUDirect RDMA
[Charts: GPU-GPU internode MPI latency (lower is better, down to 5.49 µs) and bandwidth (higher is better), for message sizes of 1 byte to 4 KB. Source: Prof. DK Panda]
67% Lower Latency, 5X Increase in Throughput
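
The latency figures above come from a GPU-to-GPU ping-pong between two nodes. A minimal CUDA-aware MPI sketch of that measurement pattern (message size and iteration count are illustrative, not those of the published benchmark):

/* gpu_pingpong.c - GPU-GPU ping-pong latency between ranks 0 and 1 using
 * device buffers with a CUDA-aware MPI (illustrative, not the OSU benchmark). */
#include <mpi.h>
#include <cuda_runtime.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int n = 4;                      /* small message, in bytes */
    const int iters = 1000;
    char *d_buf;
    cudaMalloc((void **)&d_buf, n);

    double t0 = MPI_Wtime();
    for (int i = 0; i < iters; i++) {
        if (rank == 0) {
            MPI_Send(d_buf, n, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(d_buf, n, MPI_CHAR, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Recv(d_buf, n, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Send(d_buf, n, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    if (rank == 0)
        printf("one-way latency: %.2f us\n",
               (MPI_Wtime() - t0) * 1e6 / (2.0 * iters));

    cudaFree(d_buf);
    MPI_Finalize();
    return 0;
}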
Performance of MVAPICH2 with GPUDirect RDMA
[Chart: execution time of the HSG (Heisenberg Spin Glass) application on 2 GPU nodes versus problem size. Source: Prof. DK Panda]
Roadmap
Technology Roadmap – One-Generation Lead over the Competition
[Roadmap chart, 2000–2020: Mellanox interconnect generations advance from 20Gb/s through 40Gb/s and 56Gb/s toward 100Gb/s and 200Gb/s, spanning terascale (Virginia Tech (Apple), TOP500 2003), petascale (“Roadrunner”), and exascale-class “Mega Supercomputers” – Mellanox Connected]
Paving The Road for 100Gb/s and Beyond
Recent Acquisitions are Part of Mellanox’s Strategy to Make 100Gb/s Deployments as Easy as 10Gb/s
Copper (Passive, Active) | Optical Cables (VCSEL) | Silicon Photonics
The Only Provider of End-to-End 40/56Gb/s Solutions
Comprehensive End-to-End InfiniBand and Ethernet Portfolio
ICs | Adapter Cards | Switches/Gateways | Host/Fabric Software | Metro / WAN | Cables/Modules
From Data Center to Metro and WAN
x86, ARM and Power-based Compute and Storage Platforms
The Interconnect Provider For 10Gb/s and Beyond
Thank You