BlueGene/L Design Fundamentals
Design goals
• Excellent cost/performance ratio (including operational costs)
• Good performance/power and performance/volume ratios
• 65 thousand nodes
Availability
• Redundancy, fault detection and fault tolerance
• Standard proven components for reliability and cost
• Custom advanced components where needed for increased application performance
10/14/03
“BG/L Compute” System-on-a-chip
• ASIC cost/performance advantage
• Embedded processor has power/performance advantage
• System-on-a-chip allows less complexity, denser packaging
BlueGene/L System
• Chip (2 processors, both processors in a single chip): 2.8/5.6 GF/s, 4 MB
• Compute Card (2 chips, 2x1x1): 5.6/11.2 GF/s, 0.5 GB DDR
• Node Board (32 chips, 4x4x2; 16 Compute Cards): 90/180 GF/s, 8 GB DDR
• Cabinet (32 Node boards, 8x8x16): 2.9/5.7 TF/s, 256 GB DDR
• System (64 cabinets, 64x32x32): 180/360 TF/s, 16 TB DDR
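The packaging figures on this slide can be cross-checked with a little arithmetic. The sketch below is only an illustration: it starts from the per-chip peak rates (2.8/5.6 GF/s) and the per-card memory (0.5 GB for 2 chips) quoted on the slide and multiplies them up through the node arrangements. The names `Level`, `LEVELS` and `summarize` are hypothetical and do not come from the original slides.

```python
from dataclasses import dataclass
from math import prod

# Values quoted on the slide.
CHIP_GFLOPS = (2.8, 5.6)       # the two peak rates per chip, in GF/s
CARD_MEMORY_GB = 0.5           # DDR per compute card
CHIPS_PER_CARD = 2


@dataclass(frozen=True)
class Level:
    """One packaging level and its node arrangement (hypothetical helper)."""
    name: str
    shape: tuple  # chips along x, y, z as written on the slide


LEVELS = [
    Level("Compute Card", (2, 1, 1)),
    Level("Node Board", (4, 4, 2)),
    Level("Cabinet", (8, 8, 16)),
    Level("System", (64, 32, 32)),
]


def summarize(level):
    """Return chip count, peak GF/s pair and DDR capacity in GB for a level."""
    chips = prod(level.shape)
    gflops = tuple(chips * rate for rate in CHIP_GFLOPS)
    memory_gb = chips * CARD_MEMORY_GB / CHIPS_PER_CARD
    return {"chips": chips, "gflops": gflops, "memory_gb": memory_gb}


if __name__ == "__main__":
    for level in LEVELS:
        s = summarize(level)
        low, high = s["gflops"]
        print(f"{level.name}: {s['chips']} chips, "
              f"{low:.1f}/{high:.1f} GF/s, {s['memory_gb']:.1f} GB DDR")
```

Under these assumptions the full system comes out at 65,536 chips and 16,384 GB (16 TB) of DDR, and the cabinet at roughly 2.9/5.7 TF/s, matching the slide. The computed system peak is slightly above the quoted 180/360 TF/s, which appears to be a rounded figure.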
The BlueGene/L Networks
• 3 Dimensional Torus: Point-to-point
• Global Tree: Global Operations
• Global Barriers and Interrupts: Low Latency Barriers and Interrupts
• Gbit Ethernet: File I/O and Host Interface
• Control Network: Boot, Monitoring and Diagnostics
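To make the point-to-point torus more concrete, here is a minimal sketch of neighbor addressing on a 3D torus with wraparound, using the 64x32x32 system arrangement quoted in the deck. The function names are hypothetical, and the hop count assumes shortest-path travel along each axis; the slides do not describe the actual routing scheme.

```python
SHAPE = (64, 32, 32)  # full-system torus dimensions from the System slide


def torus_neighbors(coord, shape=SHAPE):
    """Return the six nearest neighbors of a node, wrapping at the edges."""
    neighbors = []
    for axis in range(3):
        for step in (-1, 1):
            n = list(coord)
            n[axis] = (n[axis] + step) % shape[axis]
            neighbors.append(tuple(n))
    return neighbors


def torus_hops(a, b, shape=SHAPE):
    """Minimal hop count between two nodes (assumes shortest-path routing)."""
    total = 0
    for x, y, size in zip(a, b, shape):
        d = abs(x - y)
        total += min(d, size - d)  # go whichever way around the ring is shorter
    return total


if __name__ == "__main__":
    print(torus_neighbors((0, 0, 0)))
    print(torus_hops((0, 0, 0), (32, 16, 16)))
```

Each node has exactly six neighbors, one in each direction along each of the three axes, and because of the wraparound links the farthest node in a 64x32x32 torus is 32 + 16 + 16 = 64 hops away under the shortest-path assumption.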
BlueGene/L Compute ASIC
[Block diagram of the compute ASIC. Recoverable labels:]
• Two 440 CPUs (one labeled I/O proc), each with 32k/32k L1 and a “Double FPU”
• One L2 per CPU, with snoop
• Multiported shared SRAM buffer
• Shared L3 directory for EDRAM, includes ECC
• 4MB EDRAM, used as L3 cache or memory; 1024+144 ECC
• PLB (4:1)
• Internal data paths labeled 128 and 256
• Gbit Ethernet
• JTAG access
• Torus: 6 out and 6 in, each at 1.4 Gbit/s link
• Tree: 3 out and 3 in, each at 2.8 Gbit/s link
• Global interrupt: 4 global barriers or interrupts
• DDR control with ECC: 144 bit wide DDR, 256/512MB
• IBM CU-11, 0.13 µm
• 11 x 11 mm die size
• 25 x 32 mm CBGA
• 474 pins, 328 signal
• 1.5/2.5 Volt
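The link counts and per-link rates on this slide imply a raw per-node bandwidth for each network. The following sketch simply multiplies them out; it ignores protocol overhead and any other effects the slide does not mention, and the constant and function names are hypothetical.

```python
# Link counts and per-link rates as quoted on the slide (one direction).
TORUS_LINKS = 6
TORUS_GBIT_PER_LINK = 1.4
TREE_LINKS = 3
TREE_GBIT_PER_LINK = 2.8


def aggregate_gbit(links, gbit_per_link):
    """Raw aggregate rate in Gbit/s for one direction (in or out)."""
    return links * gbit_per_link


def gbit_to_gbyte(gbit):
    """Convert Gbit/s to GB/s, assuming 8 bits per byte and no overhead."""
    return gbit / 8


if __name__ == "__main__":
    torus = aggregate_gbit(TORUS_LINKS, TORUS_GBIT_PER_LINK)
    tree = aggregate_gbit(TREE_LINKS, TREE_GBIT_PER_LINK)
    print(f"Torus: {torus:.1f} Gbit/s out ({gbit_to_gbyte(torus):.2f} GB/s)")
    print(f"Tree:  {tree:.1f} Gbit/s out ({gbit_to_gbyte(tree):.2f} GB/s)")
```

With these numbers the torus and the tree each come to about 8.4 Gbit/s per node in each direction: the tree has half as many links, but each runs at twice the rate.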
Dual Node Compute Card
• Metral 4000 connector (180 pins)
• Heatsinks designed for 15W
• 54 mm (2.125”)
• 206 mm (8.125”) wide, 14 layers
• 9 x 512 Mb DRAM; 16B interface
32-way (4x4x2) node card
• Midplane (450 pins): torus, tree, barrier, clock, Ethernet service port
• 16 compute cards
• Ethernet-JTAG FPGA
• dc-dc converters
• 2 optional IO cards
• IO Gb Ethernet connectors through tailstock
• Latching and retention
512 Way BG/L Prototype
Compact Footprint