Interconnection Topologies
• Networks scaling with N
• Logical properties:
  • distance: number of links on a route
  • degree: number of links in a switch
  • diameter: maximum routing distance
• Physical properties:
  • length, width

Network Bandwidth
• Local bandwidth: bandwidth available to each node individually
  • effective local bandwidth = b * n / (n + n_E)
  • b is the raw bandwidth (bytes/sec)
  • n is the packet length
  • n_E is the length of the header information
• Aggregate bandwidth
  • Entry bandwidth: sum of the bandwidth of all links from the nodes into the network.
  • Bisection bandwidth: sum of the bandwidth of the minimum set of links that, if removed, partition the network into two equal, unconnected sets of nodes.

Fully Connected Network
• diameter = 1
• degree = N
• cost?
  • bus => O(N), but BW is O(1)
  • crossbar => O(N^2) for BW O(N)

Linear Arrays and Rings
[Figure: linear array; torus; torus arranged to use short wires]
• Linear array
  • Diameter: N-1
  • Average distance: N/3
  • Bisection bandwidth: 1
  • Route A -> B given by relative address R = B - A
    – A and B are node numbers (0 ... N-1)
• Examples:
  • FDDI, SCI, Fibre Channel Arbitrated Loop, KSR1

Multidimensional Meshes and Tori
• d-dimensional array
  • N = k_(d-1) x ... x k_0 nodes
  • described by a d-vector of coordinates (i_(d-1), ..., i_0)
• Routing
  • relative distance: R = (b_(d-1) - a_(d-1), ..., b_0 - a_0)
  • traverse r_i = b_i - a_i hops in each dimension
  • dimension-order routing

Properties
• Diameter:
  • d*(k-1) for a mesh, if k_0 = k_1 = ... = k_(d-1) = k
  • d*k/2 for a torus
• Average distance:
  • d*k/3 for a mesh
  • d*k/4 for a torus
• Degree:
  • d to 2d for a mesh, 2d for a torus
• Bisection bandwidth:
  • k^(d-1) bidirectional links for a mesh, twice that for a torus

[Figure: Diameter (d=2) for mesh and torus; x-axis from 2 to 32, y-axis from 0 to 70]

Hypercubes
• Also called binary n-cubes; n = number of dimensions, number of nodes N = 2^n
• Diameter: n = log N
• Bisection bandwidth: 2^(n-1)
[Figure: hypercubes from 0-D to 5-D]
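The mesh and torus formulas above can be checked with a few lines of Python. This is a minimal sketch, not part of the original slides: the function names are hypothetical, every dimension is assumed to have the same size k, and tuple index 0 is assumed to be the first dimension corrected by dimension-order routing in a mesh (no wraparound links).

```python
from typing import List, Sequence, Tuple

Coord = Tuple[int, ...]


def mesh_diameter(k: int, d: int) -> int:
    """Diameter of a d-dimensional mesh with k nodes per dimension: d*(k-1)."""
    return d * (k - 1)


def torus_diameter(k: int, d: int) -> int:
    """Diameter of a d-dimensional torus with k nodes per dimension: d*k/2."""
    return d * (k // 2)


def bisection_links(k: int, d: int, torus: bool = False) -> int:
    """k^(d-1) bidirectional links for a mesh, twice that for a torus."""
    links = k ** (d - 1)
    return 2 * links if torus else links


def relative_address(a: Sequence[int], b: Sequence[int]) -> Coord:
    """R = (b_i - a_i) for every dimension i."""
    return tuple(bi - ai for ai, bi in zip(a, b))


def dimension_order_route(a: Sequence[int], b: Sequence[int]) -> List[Coord]:
    """Walk from a to b in a mesh, finishing one dimension before the next."""
    current = list(a)
    path = [tuple(current)]
    for dim, r in enumerate(relative_address(a, b)):
        step = 1 if r > 0 else -1
        for _ in range(abs(r)):  # r_i hops in dimension dim
            current[dim] += step
            path.append(tuple(current))
    return path


if __name__ == "__main__":
    # 2D mesh and torus with N = 1024 nodes, i.e. k = 32
    print(mesh_diameter(32, 2), torus_diameter(32, 2))
    # A hypercube is the k = 2 case: diameter n, bisection 2^(n-1)
    print(mesh_diameter(2, 10), bisection_links(2, 10))
    print(dimension_order_route((0, 0), (2, 1)))
```

Treating the hypercube as a mesh with k = 2 reproduces the hypercube figures (diameter n, bisection 2^(n-1)), which is the sense in which k-ary d-cubes give one framework for all of these topologies.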
Gray Code
[Figure: Gray-code node labels for 0-D to 3-D hypercubes; 1-D: (0), (1); 2-D: (0,0), (1,0), (0,1), (1,1); 3-D: (0,0,0) through (1,1,1)]

Routing in Hypercubes
• Each node is connected to the n nodes whose addresses differ from its own in exactly one bit (Gray code).
• Routing:
  • length: number of ones in A xor B
  • dimension-order routing: hops in the non-zero dimensions

Properties of Some Topologies

Topology    Degree      Diameter          Ave Dist        Bisection    Ave Dist (N=1024)
1D Array    2           N-1               N/3             1            huge
1D Ring     2           N/2               N/4             2            huge
2D Mesh     4           2(N^(1/2) - 1)    (2/3) N^(1/2)   N^(1/2)      21
2D Torus    4           N^(1/2)           (1/2) N^(1/2)   2 N^(1/2)    16
Hypercube   n = log N   n                 n/2             N/2          5

• All have some “bad permutations”
  • many popular permutations are very bad for meshes (transpose)
  • randomness in wiring or routing makes it hard to find a bad one!

K-ary Trees
• height d = log_k N
• Fixed degree
• Diameter and average distance are logarithmic
• Bisection BW?

Addressing in k-ary Trees
[Figure: binary tree; Level 1 nodes 0 and 1; Level 0 nodes (0,0), (0,1), (1,0), (1,1)]
• address specified as a d-vector
  • describing the path down from the root (k_d, …, k_0)
• Route up to the common ancestor and down
  • going from A to B: R = B xor A
  • let i be the position of the most significant 1 in R; route up i+1 levels (common ancestor)
  • route down in the direction given by the low i+1 bits of B

Example: Routing in a Tree Network
[Figure: tree with leaves 000, 001, 010, 011, 100, 101, 110, 111]
• Route from 000 to 001: 000 xor 001 = 001
  • position i+1 = 1: 1 level up
  • last bit of B is 1
• Route from 010 to 111: 010 xor 111 = 101
  • position i+1 = 3: 3 levels up
  • last 3 bits of B are 111

Fat Tree
• Routing A -> B:
  • Select a random switch C in the least common ancestor of A and B
  • Take the unique tree route from A to C
  • Take the unique tree route back from C to B
• Let i be the position of the most significant 1 in B xor A; then there are 2^i root nodes to choose from. The longer the routing distance, the more the traffic can be distributed.
• Tree network in a partition of the Altix 4700, Roadrunner

How Many Dimensions in Network?
• d = 2 or d = 3
  • Short wires, easy to build
  • Many hops, low bisection bandwidth
  • Requires traffic locality
• d >= 4
  • Harder to build, more wires, longer average length
  • Fewer hops, better bisection bandwidth
  • Can handle non-local traffic
• k-ary d-cubes provide a consistent framework for comparison
  • N = k^d
  • scale dimension (d) or nodes per dimension (k)

Traditional Scaling: Latency(P)
[Figure: Ave Latency T(n=40) versus Machine Size (N), 0 to 10000; curves labeled d=2, d=3, d=4, k=2, and n/w]
• Assumes equal channel width and 1-cycle routing delay
  • bandwidth 1 byte/cycle
  • dominated by average distance

In the 3-D world
• For N nodes in a 3-cube, the bisection area is O(N^(2/3))
• For larger dimensions the bisection bandwidth is limited to O(N^(2/3)), since the number of wires in physical space is limited. (Dally, IEEE TPDS, 1990)

Equal cost in k-ary d-cubes
• Equal bisection bandwidth?
• Equal number of pins/wires?
• What do we know?
  • switch degree: 2*d
  • diameter = d*(k-1)
  • total links = N*d
  • pins per node = 2wd (w is the width of a link)
  • bisection = k^(d-1) = N/k links in each direction
  • 2Nw/k wires cross the middle

Latency with Equal Bisection Width
• The number of wires crossing the bisection is held constant.
• An N-node hypercube has N bisection links
• 2D torus
  • has 2N^(1/2) bisection links
  • each link can thus be N^(1/2) / 2 times wider
  • for 1 M nodes and d=2, each link can be 512 times wider than in a hypercube.
[Figure: Ave Latency T(n=40) versus Dimension (d), 0 to 30; curves for 256, 1024, 16 k, and 1M nodes]

Latency with Equal Pin Count
[Figure: two panels, Ave Latency T(n=40B) and Ave Latency T(n=140B), versus Dimension (d), 0 to 25; curves for 256, 1024, 16 k, and 1M nodes]
• Baseline d=2 has w = 32 (128 wires per node)
• fix 2dw pins => w(d) = 64/d
• distance goes down with d, but channel time goes up

Embedding d-dim. Dimension into physical space
• Wire density tends to be very high near the bisection and low near the perimeter.
• A 2D mesh has uniform density throughout.
• Higher-order dimensions require longer links. The network cycle time increases logarithmically with the wire length.

Topology Summary
• Rich set of topological alternatives
• Design point depends heavily on cost model
  • nodes, pins, bisection, area, ...
• Wire length or wire delay metrics favor small dimension
• Long (pipelined) links increase optimal dimension
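How strongly the design point depends on the cost model can be seen by computing the channel width each normalization allows. The sketch below is illustrative only: the function names are hypothetical, N = k^d is assumed to hold exactly, the bisection count uses the 2N/k figure from the equal-cost slide, and the pin budget of 128 wires per node is the d=2, w=32 baseline quoted above.

```python
def nodes_per_dimension(n_nodes: int, d: int) -> int:
    """Return k such that n_nodes == k**d (assumes an exact k-ary d-cube)."""
    k = round(n_nodes ** (1.0 / d))
    if k ** d != n_nodes:
        raise ValueError(f"{n_nodes} is not a perfect {d}-th power")
    return k


def bisection_links(n_nodes: int, d: int) -> int:
    """Links crossing the middle of a k-ary d-cube: 2N/k (both directions)."""
    return 2 * n_nodes // nodes_per_dimension(n_nodes, d)


def width_factor_equal_bisection(n_nodes: int, d: int) -> float:
    """How much wider a link can be than in the N-node hypercube
    when the number of wires crossing the bisection is held constant."""
    n = n_nodes.bit_length() - 1  # hypercube dimension, n = log2(N)
    if 1 << n != n_nodes:
        raise ValueError("hypercube comparison needs N to be a power of two")
    return bisection_links(n_nodes, n) / bisection_links(n_nodes, d)


def width_equal_pins(d: int, pins_per_node: int = 128) -> float:
    """Channel width when 2*d*w pins per node are held constant: w(d) = 64/d."""
    return pins_per_node / (2 * d)


if __name__ == "__main__":
    # 1M nodes, d = 2: the case quoted on the equal-bisection slide
    print(width_factor_equal_bisection(2 ** 20, 2))
    for d in (2, 4, 8, 16):
        print(d, width_equal_pins(d))
```

Under equal bisection width the low-dimensional network gets far wider channels, while under equal pin count the width only shrinks as 1/d, which is why the two normalizations favor different dimensions.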