Memory Network: Enabling Technology for
Scalable Near-Data Computing
Gwangsun Kim, John Kim (Korea Advanced Institute of Science and Technology)
Jung Ho Ahn (Seoul National University)
Yongkee Kwon (SK Hynix)
Memory Network
[Figure: a Hybrid Memory Cube (HMC) — stacked DRAM layers partitioned into vaults, each with a vault controller on the logic layer, connected through an intra-HMC network to high-speed link I/O ports]
• What does "near"-data processing mean with multiple memories? For an operation such as "compute A+B", Data A and Data B may reside in different memory modules, so one operand is inevitably "far" data.
• A memory network enables scalable near-data computing.
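The "compute A+B" scenario can be sketched with a toy traffic model. Everything here is an illustrative assumption (a chain of cubes, the host at cube 0, hop counting per operand-sized message), not from the talk:

```python
# Toy model: memory cubes connected in a chain; host attached to cube 0.
# Compare data movement (link traversals of one operand-sized message)
# for host-side compute vs. near-data compute over the memory network.

def dist(x, y):
    # Hop count between two cubes on a chain.
    return abs(x - y)

def host_compute_traffic(a, b, host=0):
    # Both operands travel over memory links to the host.
    return dist(host, a) + dist(host, b)

def near_data_traffic(a, b, host=0):
    # Operand A is forwarded through the memory network to B's cube,
    # where "compute A+B" runs; only the result returns to the host.
    return dist(a, b) + dist(b, host)

a, b = 6, 7  # operands live in adjacent cubes, far from the host
print(host_compute_traffic(a, b))  # 13 link traversals
print(near_data_traffic(a, b))     # 8 link traversals
```

When the operands sit near each other but far from the host, routing one operand through the memory network moves less data than pulling both to the host — the intuition behind near-data computing with multiple memories.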
DIVA Processing-in-Memory (PIM) Chip
• Targeted multimedia and irregular applications.
• Proposed a memory network for PIM modules.
• Simple low-dimensional network (e.g., ring)
– High packet hop count → performance & energy inefficiency
• Advanced technology is now available – high off-chip bandwidth
Draper et al., “The architecture of the DIVA processing-in-memory chip”, ICS’02
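To see why a simple low-dimensional network scales poorly, a quick back-of-the-envelope calculation (an illustrative sketch, not from the DIVA paper) shows that the average hop count of a bidirectional ring grows roughly as N/4:

```python
# Average hop count in a bidirectional ring of N nodes grows ~ N/4:
# each extra module adds latency and per-hop energy for average traffic.

def avg_hops_ring(n):
    # Shortest direction around the ring for each possible distance d.
    total = sum(min(d, n - d) for d in range(1, n))
    return total / (n - 1)

for n in (4, 16, 64):
    print(n, avg_hops_ring(n))
# hop count roughly quadruples each time N quadruples
```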
Memory Networks from Micron
[Figure: network-attached memories and local memories connected in a 2D mesh topology]
D. R. Resnick, "Memory Network Methods, Apparatus, and Systems," US Patent Application Publication US20100211721 A1, 2010.
Memory Network Design Issues
• Difficult to leverage a high-radix topology
– Low-radix vs. high-radix topology trade-off
– High-radix topology → smaller network diameter
– Limited number of ports on memory modules
[Figure: low-radix networks vs. high-radix networks]
• Adaptive routing requirement
– Can increase network cost
– Depends on traffic pattern, memory mapping, etc.
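The radix/diameter trade-off above can be made concrete with a small sketch (topology choices and node count are illustrative assumptions): for the same number of modules, more ports per node shrinks the diameter, but memory modules can only afford a few ports.

```python
# Network diameter for the same node count N under different per-node
# radix choices. Illustrative sketch: higher radix -> fewer hops, but
# memory modules have a limited number of ports.
import math

def ring_diameter(n):
    # Radix-2 nodes (one port per direction).
    return n // 2

def mesh2d_diameter(n):
    # Radix-4 nodes on a sqrt(n) x sqrt(n) mesh (assumes square n).
    side = math.isqrt(n)
    return 2 * (side - 1)

def fully_connected_diameter(n):
    # Radix-(n-1) nodes: the high-radix extreme.
    return 1

n = 64
print(ring_diameter(n), mesh2d_diameter(n), fully_connected_diameter(n))
# -> 32 14 1
```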
Memory-centric Network
• Host–memory bandwidth still matters.
– Needed to support conventional applications while adopting NDP.
– NDP involves communication with host processors.
• An MCN leverages the same network for NDP.
[Figure: in a Processor-centric Network (PCN) (e.g., Intel QPI, AMD HyperTransport), memory bandwidth and processor-to-processor bandwidth are separate, so a separate network is required for NDP; in a Memory-centric Network (MCN) [PACT’13], bandwidth utilization is flexible and the same network can be used for NDP]
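The PCN-vs-MCN distinction can be sketched as a toy bandwidth model. The numbers and function names are assumed for illustration only: a PCN partitions bandwidth into fixed pools, while an MCN lets one pool of links serve either kind of traffic.

```python
# Toy bandwidth model (assumed numbers, GB/s): a PCN dedicates separate
# bandwidth to memory traffic and to processor-to-processor traffic,
# while an MCN serves both kinds of traffic from one shared pool.

def pcn_ok(mem_demand, p2p_demand, mem_bw=100, p2p_bw=40):
    # Each traffic class must fit its own fixed partition.
    return mem_demand <= mem_bw and p2p_demand <= p2p_bw

def mcn_ok(mem_demand, p2p_demand, total_bw=140):
    # Any mix of traffic fits as long as the aggregate fits.
    return mem_demand + p2p_demand <= total_bw

# NDP phase with heavy inter-processor communication: total demand fits
# the aggregate bandwidth, but the fixed PCN partition cannot serve it.
print(pcn_ok(30, 90))  # False
print(mcn_ok(30, 90))  # True
```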
Memory Network for Heterogeneous NDP
• NDP not only for CPUs, but also for GPUs.
• Unified memory network for multi-GPU systems [MICRO’14].
• Extending the memory network to heterogeneous NDP.
[Figure: CPUs, GPUs, and FPGAs attached to a unified memory network]
Hierarchical Network
• With the intra-HMC network, the memory network is a hierarchical network.
• NDP requires additional processing elements at the logic layer.
• Need to support various types of traffic:
– Local (on-chip) traffic vs. global traffic
– Conventional memory access traffic vs. NDP-induced traffic
[Figure: Hybrid Memory Cube — stacked DRAM, vault controllers, on-chip channels, and I/O ports forming a concentrated mesh-based intra-HMC network [PACT’13]]
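The effect of concentration in the intra-HMC mesh can be sketched numerically. The vault counts and concentration factors below are illustrative assumptions, not the [PACT’13] parameters:

```python
# Concentration in an intra-HMC mesh (illustrative numbers): grouping
# several vault controllers per mesh router shrinks the on-chip network,
# trading router count/diameter against per-router port count.
import math

def cmesh_stats(vaults, concentration):
    routers = vaults // concentration
    side = math.isqrt(routers)          # assume a near-square mesh
    rows, cols = side, routers // side
    diameter = (rows - 1) + (cols - 1)  # max Manhattan distance
    return routers, diameter

print(cmesh_stats(16, 1))  # plain mesh:        (16, 6)
print(cmesh_stats(16, 4))  # concentrated mesh: (4, 2)
```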
Issues with Memory Network-based NDP
• Power management
– A large number of channels is possible in a memory network
– Power-gating, DVFS, and other circuit-level techniques
• Data placement & migration
– Optimal placement of shared data
– Migration within the memory network
• Consistency & coherence
– Direct memory access by multiple processors
– Heterogeneous processors
Summary
• A memory network can enable scalable near-data processing.
• Leverages recent memory network research:
– Memory-centric network [PACT’13]
– Unified memory network [MICRO’14]
• Intra-HMC design considerations
• Further issues