Interconnect Networks Basics

Generic parallel/distributed system architecture
• On-chip interconnects (manycore processor)
• Off-chip interconnects (clusters of servers)
Interconnection network performance
• Latency: how long does it take from the moment a send of 1 byte is issued until the receive of that data is completed?
– Signal propagation delay + router queuing delay
• Bandwidth: how long does it take to send a large amount of data (e.g. 1MB)?
• Examples:
– Ethernet:
• Bandwidth 100Mbps, 1Gbps, 10Gbps, 100Gbps
• Latency: 25us-100us (user level, single hop; try ping between linprog’s)
– InfiniBand
• Bandwidth: 20Gbps, 40Gbps, 54Gbps, 80Gbps, ……
• Latency: 1-3us (user level, single hop)
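A common first-order way to combine the two metrics is to model the time to move n bytes as latency plus n divided by bandwidth. The sketch below uses that model. The function name and the specific figures (50 us and 1 Gbps for Ethernet, 2 us and 40 Gbps for InfiniBand) are illustrative picks from the ranges above, not measurements.

```python
def transfer_time_s(nbytes, latency_s, bandwidth_bps):
    """First-order model: fixed latency plus time to push the bits onto the link."""
    return latency_s + (nbytes * 8) / bandwidth_bps

# Illustrative values chosen from the ranges listed above (assumptions).
ETHERNET = {"latency_s": 50e-6, "bandwidth_bps": 1e9}
INFINIBAND = {"latency_s": 2e-6, "bandwidth_bps": 40e9}

for name, net in (("Ethernet", ETHERNET), ("InfiniBand", INFINIBAND)):
    small = transfer_time_s(1, **net)
    large = transfer_time_s(1_000_000, **net)
    print(f"{name}: 1 byte = {small * 1e6:.2f} us, 1 MB = {large * 1e6:.2f} us")
```

Under this model, small messages are dominated by latency and large messages by bandwidth.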
Interconnection network performance
• Latency and Bandwidth
– Different levels
• User level: the performance that users feel
• Systems level, device level
• Which level will have the highest bandwidth?
– Example: 1Gbps Ethernet delivers about 800Mbps at the system level and 650Mbps at the user level.
• 1Gbps Ethernet, which level?
• 0.115ms ping latency, which level?
– A measurement trap: single pair vs. multiple pairs.
Network components
• Network interface (card)
– Communication between a node and the network
• Link
– Bundle of wires and fibers that carry signals
• Switches
– Connect a fixed number of input channels to a fixed number of output channels.
– In this community, switches may also perform router functions.
Switch
The cross-bar can realize a communication from
any input port to any output port.
• The simplest form is a dedicated computer with memory (e.g. a Linux router).
• The most expensive form has cross-bar functionality: all permutations can be realized simultaneously.
[Figure: a 4x4 cross-bar with input ports 1-4 and output ports 1-4.
– (1, 2, 3, 4) -> (3, 1, 2, 4) can be realized.
– (1, 2, 3, 4) -> (4, 3, 2, 2) cannot; only (1, 2, 3, 4) -> (4, 3, 2, -) can be realized.]
Permutation: (1, 2, 3, 4) -> (3, 1, 2, 4)
A communication pattern where each source appears exactly once and each destination appears exactly once.
The input registers send control signals to the control, routing, and scheduling module indicating the pattern; the control module computes and sets the cross-points (the dots in the figure).
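The permutation condition can be checked mechanically. The sketch below is illustrative only: a pattern is a list where entry i is the output port requested by input port i + 1, and the conflict rule (the first input asking for an output wins, later ones wait) is an assumed policy, not one stated in the slides.

```python
def is_permutation(dests, n_ports):
    """True if every output port 1..n_ports is requested exactly once."""
    return sorted(dests) == list(range(1, n_ports + 1))

def resolve_conflicts(dests):
    """Grant each output to the first input requesting it; others get None."""
    taken = set()
    granted = []
    for d in dests:
        if d in taken:
            granted.append(None)   # output already in use this cycle
        else:
            taken.add(d)
            granted.append(d)
    return granted

print(is_permutation([3, 1, 2, 4], 4))
print(is_permutation([4, 3, 2, 2], 4))
print(resolve_conflicts([4, 3, 2, 2]))
```

With the two patterns from the figure, the first is a permutation and the second is not; resolving the conflict on output 2 leaves input 4 waiting.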
Switch example: 24-port 1Gbps Ethernet switch
• 24 input ports and 24 output ports: each Ethernet jack has one input port and one output port.
• All 24 machines can send and receive
simultaneously.
[Figure: machines, each with an Ethernet card, connected to the switch.]
Alternatives to cross-bars
• A question: why have buffers when we can always do a permutation?
• An N x N cross-bar has O(N^2) cross-points (on/off switches).
– Not scalable, expensive
• An alternative for low-end switches: bus and memory
– When the bus and memory are fast enough, moving data between input and output ports is like a memory copy in a typical computer.
Bus and memory alternative to crossbar
• Realizing (1, 2, 3, 4) -> (4, 3, 2, 1)
– Read from input port 1 to memory A
– Read from input port 2 to memory B
– Read from input port 3 to memory C
– Read from input port 4 to memory D
– Run forwarding logic (find out the output ports)
– Write A to output port 4
– Write B to output port 3
– Write C to output port 2
– Write D to output port 1
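The read, look up, and write sequence above can be sketched as a small simulation. The function name and the dictionary-based forwarding table are hypothetical, chosen only to mirror the steps; a real switch does this in hardware or firmware.

```python
def forward_via_memory(inputs, forwarding_table):
    """inputs: {input_port: packet}; forwarding_table: {input_port: output_port}."""
    memory = {}
    # Step 1: copy every arriving packet from its input port into memory.
    for in_port, packet in inputs.items():
        memory[in_port] = packet
    # Step 2: the forwarding logic looks up the output port for each packet.
    # Step 3: copy each packet from memory to that output port.
    outputs = {}
    for in_port, packet in memory.items():
        outputs[forwarding_table[in_port]] = packet
    return outputs

arrivals = {1: "A", 2: "B", 3: "C", 4: "D"}
table = {1: 4, 2: 3, 3: 2, 4: 1}       # (1, 2, 3, 4) -> (4, 3, 2, 1)
print(forward_via_memory(arrivals, table))
```

Every packet crosses the memory path twice (once in, once out), which is why the bus and memory speed limits the total throughput of this design.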
Bus and memory alternative to crossbar
• A typical northbridge bandwidth is a few GBps. Assume the bandwidth is 4GBps: how many ports can the northbridge support in a 100Mbps Ethernet switch?
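One way to approach this question is a back-of-the-envelope estimate. The sketch below rests on assumptions that are not in the slide: every port receives at its full line rate, each packet crosses the memory path twice (once in, once out), and there is no other overhead. The function name and the `crossings` parameter are illustrative.

```python
def max_ports(memory_bw_bytes_per_s, port_rate_bps, crossings=2):
    """Ports supported if each port's traffic crosses memory `crossings` times."""
    memory_bw_bps = memory_bw_bytes_per_s * 8      # bytes/s -> bits/s
    return memory_bw_bps // (port_rate_bps * crossings)

# 4 GBps memory path, 100 Mbps ports, each packet copied in and out (assumed).
print(max_ports(4_000_000_000, 100_000_000))
```

Changing `crossings` to 1 models a design in which a packet uses the memory path only once, which doubles the estimate.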
Another alternative: multistage interconnection network
• Realize all permutations without controlling
O(N^2) cross-points.
– Clos networks, Benes networks
Each dot is a 2x2 switch with two states (0 and 1).
How to realize 0000->0000, 0001->0001, 0010->1011?
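The saving of a multistage design shows up when the switching elements are counted. The sketch below compares an N x N cross-bar with a Benes network, assuming the standard Benes construction for N a power of two: 2*log2(N) - 1 stages of N/2 two-by-two switches. The function names are illustrative.

```python
def crossbar_crosspoints(n):
    """An n x n cross-bar needs one cross-point per (input, output) pair."""
    return n * n

def benes_switch_count(n):
    """Number of 2x2 switches in a Benes network with n ports (n a power of 2)."""
    if n < 2 or n & (n - 1):
        raise ValueError("n must be a power of two, at least 2")
    log2_n = n.bit_length() - 1
    stages = 2 * log2_n - 1
    return stages * (n // 2)

for n in (16, 1024):
    print(n, crossbar_crosspoints(n), benes_switch_count(n))
```

The cross-bar count grows as N^2, while the Benes count grows as N log N, so the gap widens quickly with the number of ports.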
Switch
• All switches approximate crossbars
– High-end ones are equivalent to or close to crossbars: all permutations can happen simultaneously.
– Low-end ones have limited total (aggregate) bandwidth.
• Example: High end and low end 24 port 1Gbps switch
connecting 24 computers.
– With one pair of Source/destination, the throughput will be
about 800Mbps for both (no difference).
– When 24 pairs send/receive at the same time
• High end one will get 24*800Mbps
• Low end one will get a total of X Mbps, X < 24*800Mbps (X can
sometimes be about 5*800Mbps)
– Different pairs may also have different throughput depending on the
scheduling algorithm.
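The effect of an aggregate bandwidth cap can be illustrated with simple arithmetic. The sketch below assumes the switch shares its aggregate bandwidth evenly among active pairs, which the last bullet notes is not always the case. The function name is hypothetical, and the 800Mbps peak and 5*800Mbps cap are the example figures from above.

```python
def per_pair_throughput_mbps(aggregate_limit_mbps, n_pairs, per_pair_peak_mbps=800):
    """Average per-pair throughput under an aggregate cap, assuming fair sharing."""
    return min(per_pair_peak_mbps, aggregate_limit_mbps / n_pairs)

HIGH_END_LIMIT = 24 * 800   # behaves like a cross-bar
LOW_END_LIMIT = 5 * 800     # example cap from the slide

for pairs in (1, 24):
    print(pairs,
          per_pair_throughput_mbps(HIGH_END_LIMIT, pairs),
          per_pair_throughput_mbps(LOW_END_LIMIT, pairs))
```

With a single pair the two switches look identical; only the 24-pair measurement exposes the difference.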
Network level components
• Topology (what)
– Physical interconnection structure of the network graph.
– Physically limits the performance of the networks.
• Routing algorithm (which)
– Restricts the set of paths that messages can follow.
• Switching strategy (how)
– How data in a message traverses a route (passing routers)
• Flow control mechanism (when)
– When a message or portions of it traverse a route
– What happens when traffic collides
Topology
• How the components are connected.
• Important properties
• Diameter: maximum distance between any two nodes
in the network (hop count, or # of links).
• Nodal degree: how many links connect to each node.
• Bisection bandwidth: the smallest total bandwidth between one half of the nodes and the other half, over all ways of splitting the network into two halves.
• A good topology: small diameter, small nodal
degree, large bisection bandwidth.
Topology
• Regular topologies
– Nodes are connected in some kind of pattern.
• The graph has a structure.
– Nodes are identified by coordinates.
– Routing can usually be pre-determined by the coordinates of the nodes.
• Irregular topologies
– Nodes are connected arbitrarily.
• The graph does not have a structure, e.g. the Internet.
• More extensible than regular topologies.
– Usually use variations of shortest path routing.
Example regular topology: complete binary tree
• Nodal degree = ?
• Diameter = ?
• Bisection bandwidth = ?
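One way to explore these questions is to build a small tree and measure it. The sketch below is illustrative: it uses heap-style numbering (node k has children 2k and 2k + 1, an assumed labeling) and breadth-first search to find hop counts. All function names are hypothetical.

```python
from collections import deque

def complete_binary_tree(depth):
    """Adjacency lists for a complete binary tree; the root is node 1."""
    n = 2 ** (depth + 1) - 1
    adj = {k: [] for k in range(1, n + 1)}
    for k in range(2, n + 1):
        adj[k].append(k // 2)      # link to parent
        adj[k // 2].append(k)      # link from parent
    return adj

def hops_from(adj, src):
    """Breadth-first search: hop count from src to every node."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def diameter(adj):
    return max(max(hops_from(adj, s).values()) for s in adj)

def max_degree(adj):
    return max(len(nbrs) for nbrs in adj.values())

tree = complete_binary_tree(3)     # 15 nodes
print(max_degree(tree), diameter(tree))
```

For bisection, note that removing a single link below the root splits a tree of depth d into a subtree of 2^d - 1 nodes and a remainder of 2^d nodes, so one link carries all traffic between the two halves.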
Example regular topology: ring topology
[Figure: a ring of five nodes, numbered 0 to 4.]
• Nodal degree = ?
• Diameter = ?
• Bisection bandwidth = ?
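For a ring, the three properties can be worked out directly. The sketch below is a minimal illustration with hypothetical function names; it assumes bidirectional links, at least three nodes, and counts bisection in links rather than in bits per second.

```python
def ring_distance(a, b, n):
    """Hop count between nodes a and b on an n-node ring (shorter direction)."""
    d = abs(a - b) % n
    return min(d, n - d)

def ring_properties(n):
    """Return (nodal degree, diameter, bisection width in links) for n >= 3."""
    degree = 2                                   # one neighbor on each side
    diameter = max(ring_distance(0, b, n) for b in range(n))
    bisection_links = 2                          # any split into halves cuts 2 links
    return degree, diameter, bisection_links

print(ring_properties(5))
```

The bisection bandwidth is then two times the bandwidth of a single link, independent of the ring size, which is why rings scale poorly.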
Routing: deciding which path to take from a source to a destination
[Figure: a ring of five nodes, numbered 0 to 4.]
• 0 to 1: 0->1 or 0->4->3->2->1
• Which path to use? This is a routing issue.
• Routing objective:
– Minimize resources used
• Shortest path routing
– Keep the load on all links as balanced as possible (load balancing).
• ???
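The two candidate paths and the two objectives can be made concrete on the five-node ring. The sketch below is illustrative, with hypothetical function names: it enumerates both directions, picks the shorter one, and counts how many flows use each link so that load imbalance becomes visible.

```python
from collections import Counter

def ring_paths(src, dst, n):
    """Return the clockwise and counter-clockwise paths on an n-node ring."""
    cw = [src]
    while cw[-1] != dst:
        cw.append((cw[-1] + 1) % n)
    ccw = [src]
    while ccw[-1] != dst:
        ccw.append((ccw[-1] - 1) % n)
    return cw, ccw

def shortest_path(src, dst, n):
    return min(ring_paths(src, dst, n), key=len)

def link_loads(flows, n):
    """Count how many flows cross each link under shortest path routing."""
    loads = Counter()
    for src, dst in flows:
        path = shortest_path(src, dst, n)
        for u, v in zip(path, path[1:]):
            loads[frozenset((u, v))] += 1
    return loads

print(ring_paths(0, 1, 5))
print(link_loads([(0, 1), (0, 1), (4, 1)], 5))
```

Shortest path routing uses the fewest links per flow, but as the load counts show, it can pile several flows onto the same link while other links stay idle.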
Classification of routing schemes
[Figure: a ring of five nodes, numbered 0 to 4.]
• 0 to 1: 0->1 or 0->4->3->2->1
• Deterministic vs. adaptive
– Deterministic – always the same route
– Adaptive – choose the route depending on traffic conditions
• Minimal routing: always use shortest path
• Source routing: the source node supplies the path
• Destination routing: routing based on destination ID
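The difference between source routing and destination routing is where the path information lives. The sketch below is a minimal illustration on the five-node ring. The function names, the header format, and the per-node table layout are all hypothetical, and the next-hop tables are filled with minimal routes as one possible choice.

```python
N = 5  # five-node ring, nodes 0..4

def source_route(path):
    """Source routing: the header carries the full path picked by the source."""
    header = list(path[1:])        # hops still to be taken
    visited = [path[0]]
    while header:
        visited.append(header.pop(0))   # each node just reads the next entry
    return visited

def minimal_next_hop(node, dst, n=N):
    """Next hop toward dst along the shorter direction of the ring."""
    clockwise_hops = (dst - node) % n
    if clockwise_hops <= n - clockwise_hops:
        return (node + 1) % n
    return (node - 1) % n

# Destination routing: every node keeps a table indexed by destination ID.
NEXT_HOP = {u: {d: minimal_next_hop(u, d) for d in range(N) if d != u}
            for u in range(N)}

def destination_route(src, dst, next_hop=NEXT_HOP):
    node, visited = src, [src]
    while node != dst:
        node = next_hop[node][dst]
        visited.append(node)
    return visited

print(source_route([0, 4, 3, 2, 1]))   # the source may pick a non-minimal path
print(destination_route(0, 1))
```

With source routing the switches need no tables but the header grows with the path length; with destination routing the header stays small but each node must hold a table.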
Switching
• Communication data units:
– Message
– Packet
– Flit
• How a packet passes through a switch.
• Circuit switching: a circuit is set up first, then all data pass through it.
• Packet switching: the whole packet is stored in a switch and then forwarded to the next hop.
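The cost of storing the whole packet at every switch can be estimated with a simple model. The sketch below is a rough comparison under stated assumptions: propagation and routing delays are ignored, all links have the same bandwidth, and the circuit setup time is a given parameter. The function names and example numbers are illustrative.

```python
def circuit_switching_time(nbits, bandwidth_bps, setup_s):
    """Set up the circuit once, then stream the data end to end."""
    return setup_s + nbits / bandwidth_bps

def store_and_forward_time(nbits, bandwidth_bps, n_links):
    """Each hop receives the whole packet before sending it on the next link."""
    return n_links * (nbits / bandwidth_bps)

packet_bits = 8 * 1500          # a 1500-byte packet (example value)
print(store_and_forward_time(packet_bits, 1e9, n_links=3))
print(circuit_switching_time(packet_bits, 1e9, setup_s=20e-6))
```

In this model, store-and-forward latency grows with the number of links on the path, while circuit switching pays a fixed setup cost regardless of message size.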
Flow-control
• Used between hops to make sure that when data is sent, buffer space is available for it at the receiving side.
• Sometimes built into the switching mechanism.
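One common way to guarantee buffer space is credit-based flow control, which is used here only as an example; the slides do not name a specific scheme. In the sketch below, the sender holds one credit per free buffer slot at the receiver and may send only while it has credits. The class and method names are hypothetical.

```python
class CreditLink:
    """One hop: the sender's credits mirror free buffer slots at the receiver."""

    def __init__(self, receiver_buffer_slots):
        self.credits = receiver_buffer_slots
        self.receiver_buffer = []

    def try_send(self, flit):
        """Send only if the receiver is known to have a free slot."""
        if self.credits == 0:
            return False               # hold the data at the sender
        self.credits -= 1
        self.receiver_buffer.append(flit)
        return True

    def receiver_drain(self):
        """Receiver forwards one unit onward and returns a credit."""
        if not self.receiver_buffer:
            return None
        flit = self.receiver_buffer.pop(0)
        self.credits += 1
        return flit

link = CreditLink(receiver_buffer_slots=2)
print([link.try_send(f) for f in ("a", "b", "c")])   # third send must wait
print(link.receiver_drain(), link.try_send("c"))
```

Because a send is refused whenever the credit count is zero, the receiver's buffer can never overflow.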