University of Ottawa
ELG7187 - Scribing
Interconnection Networks on Chip: Topologies, Routing
John-Marc Desmarais
Student Number 3198863
Contents
Interconnection networks on chip: Topologies, Routing
Lecture Review
Definitions
Performance Metrics
Types of Topologies
Topology Examples
Presentation - Autotuning of Network Parameters
Introduction
Adaptive Techniques
Communication Architecture Tuners
Analysis for Design of Communication Architectures
Communication Partitioning
Parameter Generation
Performance Analysis
Example Networks Using CAT
CONCLUSION
Conclusion
Future Work
References
Interconnection networks on chip: Topologies, Routing
Lecture Review
This lecture concentrates on performance and performance metrics of interconnection networks on a
chip. We discussed the difference between direct and indirect networks and showed several examples
of each type. We also looked at the important metrics used for evaluating System on Chip (SoC)
networks.
This review begins by providing basic definitions for the components of an interconnection network and
then presents performance metrics that can be used to evaluate and compare interconnection networks.
Several topologies are then described and analyzed based on these performance metrics.
Definitions
Channel:
A network channel is a conduit over which information traffic can flow. It is synonymous with
connection.
Connections:
A network connection is a conduit over which information traffic can flow. It is synonymous with
channel.
Direct Network:
A direct network is one in which each node contains a router.
Flit:
A flit is the largest amount of network traffic that can be transmitted during one cycle.
Head Flit:
The head flit is the first flit of a packet.
Indirect Network:
An indirect network is one in which not every node contains a router. A router may be
responsible for sending traffic to any of several nodes.
Node:
A network node is a source or a sink for information which will flow on the network.
Network:
A network is a combination of nodes, connections and routers.
Packet:
A packet is a logical division of information that is to be transmitted from one node to another.
It usually consists of several flits.
Router:
A router is a device on the network which contains routing tables and algorithms which describe
how information traffic will flow through the connections.
Tail Flit:
The tail flit is the last flit of a packet.
Topology:
A topology is a formal definition of the layout of nodes and connections. It defines which nodes
are connected to which routers and which routers are interconnected.
Performance Metrics
Degree:
The degree is the number of links to a given node. This can be used to estimate the
number of connection wires in the network.
E.g. in a direct network, a 3x3 router will have a node degree of two, since one of the
connections must connect to the core.
Minimum Hop Count:
The minimum hop count is the minimum number of routers between a source and a destination or,
alternatively, the number of separate connection lines that a packet needs to travel from source
to destination. This can be used to estimate the network latency.
Network Diameter:
The network diameter is the largest minimum hop count in the network. Calculating the
minimum hop count between all of the nodes in the network and then selecting the largest of
these will give the network diameter. This can be used to estimate the worst case network
latency.
Average Minimum Hop Count:
The average minimum hop count is calculated by taking all of the possible source/destination
pairs in the network, counting the minimum number of hops between the two ends of each pair,
and then averaging all of these values.
Latency:
The latency is the amount of time required for a packet to reach its destination through the
network.
Head Latency:
The head latency is the amount of time required for the head flit to reach its destination node
through the network.
Maximum Channel Load:
The maximum channel load is the maximum amount of traffic that can flow between routers
concurrently. When the network reaches its maximum channel load, it becomes saturated and
cannot accept any more traffic.
Bandwidth:
The network bandwidth is defined as the number of bits per second that can be injected into the
network before it saturates.
Also,
$$\omega = \omega_c \cdot f_c$$
where $\omega$ is the network bandwidth, $\omega_c$ is the bandwidth of channel c and $f_c$ is the
frequency of injection into channel c.
Channel Load:
The channel load is the amount of traffic being inserted into a channel measured in bits/second.
Bisection Width:
The bisection width is the number of connections that need to be cut in order to divide the
network into two networks with an equal number of nodes (plus or minus one node). Uniform
traffic implies that ½ of the traffic will pass through this bisection.
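As a minimal sketch of how the hop-count metrics and the bisection width defined above could be
computed for a small network, the following Python example runs a breadth-first search over an
adjacency list. The graph representation, the function names and the two four-node example networks
are illustrative assumptions and do not come from the lecture. A hop is counted here as one link
traversed, the average is taken over distinct source/destination pairs, and the bisection width is
found by brute force, which is only practical for a handful of nodes.

```python
from collections import deque
from itertools import combinations


def min_hops_from(graph, source):
    """Breadth-first search: minimum hop count from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def hop_metrics(graph):
    """Return (network diameter, average minimum hop count) over distinct node pairs.

    Assumes the graph is connected and every link is listed in both directions.
    """
    hops = [d for s in graph for t, d in min_hops_from(graph, s).items() if t != s]
    return max(hops), sum(hops) / len(hops)


def bisection_width(graph):
    """Fewest links cut by any split into two halves whose sizes differ by at most one."""
    nodes = sorted(graph)
    best = None
    for half in combinations(nodes, len(nodes) // 2):
        side = set(half)
        # Count each cut link once, from the node that lies inside `side`.
        cut = sum(1 for u in side for v in graph[u] if v not in side)
        best = cut if best is None else min(best, cut)
    return best


# Hypothetical example networks: a four-node ring and a four-node line.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
line = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}

ring_diameter, ring_average = hop_metrics(ring)
line_diameter, line_average = hop_metrics(line)
```

For the four-node ring this gives a diameter of 2 and a bisection width of 2; for the four-node line
it gives a diameter of 3 and a bisection width of 1.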
Types of Topologies
Switched:
Bus and crossbar networks are switched networks.
A bus network connects inputs to outputs using a single connection line for all nodes, whereas a
crossbar network connects each input to each output using separate connection lines, one for each
input/output pair.
A crossbar network is also known as “fully connected”.
Direct/Indirect:
As previously mentioned, direct networks have a router on each node whereas indirect networks
separate the nodes from the routers.
Static/Dynamic:
A static network has fixed links between routers, whereas a dynamic network can build these connections
on the fly. A fully connected network (crossbar) is a static network, whereas a bus network connects two
nodes on the bus only when these two nodes need to communicate and hence is dynamic.
Topology Examples
Torus:
A torus network is a network in which each node is connected to a fixed number of other nodes
determined by the dimensionality of the torus. For example, a node in a two-dimensional torus has
four connections to other nodes. In general, a node in a k-dimensional torus will connect to 2k
other nodes.
E.g. in a 3x3 two-dimensional torus, each node is connected to
four other nodes. For any NxN two-dimensional torus (N ≥ 3), each node
will be connected to four other nodes.
Degree:
2k for a k dimensional torus.
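Minimum Hop Count:
From (1), the average minimum hop count is
$$H_{min} = \begin{cases} \dfrac{nk}{4}, & k \text{ is even} \\ n\left(\dfrac{k}{4} - \dfrac{1}{4k}\right), & k \text{ is odd} \end{cases}$$
This formula uses k-ary n-cube notation: here k is the number of nodes along each dimension of the
torus and n is the dimensionality. This differs from the other torus expressions in this section,
where n is the number of nodes and k is the dimensionality. For the 3x3 torus (k = 3, n = 2) the
formula gives 2(3/4 − 1/12) = 4/3 hops.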
Number of Connections:
Each node has a degree of 2k, so the number of unidirectional connections is 2𝑛𝑘 (equivalently, 𝑛𝑘
bidirectional links), where n is the number of nodes and k is the dimensionality of the torus.
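The torus properties above can be checked numerically on small examples. The following Python sketch
builds a torus, collects the node degrees, counts the unidirectional connections and measures the
average minimum hop count by breadth-first search. The names `radix` (nodes per dimension) and `dims`
(dimensionality) are chosen here for illustration and are not from the lecture. Averaging over all
ordered source/destination pairs, including a node paired with itself, is an assumption made so that
the result can be compared with the formula from (1) read in k-ary n-cube notation (k nodes per
dimension, n dimensions). The sketch also assumes at least three nodes per dimension, so that the two
neighbours along each dimension are distinct.

```python
from collections import deque
from fractions import Fraction
from itertools import product


def torus_neighbors(node, radix):
    """Neighbours of `node` in a torus with `radix` nodes along every dimension."""
    result = set()
    for axis in range(len(node)):
        for step in (-1, 1):
            coords = list(node)
            coords[axis] = (coords[axis] + step) % radix  # modulo gives the wrap-around link
            result.add(tuple(coords))
    return result


def hop_counts(source, radix):
    """Breadth-first search: minimum hop count from `source` to every node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for nxt in torus_neighbors(current, radix):
            if nxt not in dist:
                dist[nxt] = dist[current] + 1
                queue.append(nxt)
    return dist


def torus_summary(radix, dims):
    """Return (set of node degrees, unidirectional connections, average minimum hop count)."""
    nodes = list(product(range(radix), repeat=dims))
    degrees = {len(torus_neighbors(v, radix)) for v in nodes}
    connections = sum(len(torus_neighbors(v, radix)) for v in nodes)
    total_hops = sum(sum(hop_counts(v, radix).values()) for v in nodes)
    average = Fraction(total_hops, len(nodes) ** 2)  # all ordered pairs, self-pairs included
    return degrees, connections, average


degrees_3x3, connections_3x3, average_3x3 = torus_summary(radix=3, dims=2)
degrees_4x4, connections_4x4, average_4x4 = torus_summary(radix=4, dims=2)
```

For the 3x3 torus every node has degree 4, there are 36 unidirectional connections and the average is
4/3 hops; for the 4x4 torus the values are degree 4, 64 connections and an average of 2 hops.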
Mesh:
If the wrap-around connections are removed from a torus, a mesh network is created. Nodes on the border
edges and corners of a mesh have fewer connections than interior nodes.
E.g. In this 3x3 two dimensional mesh network, the center node is
connected to four other nodes. The border edge nodes are connected
to three other nodes, while the corner nodes are only connected to
two other nodes.
Degree:
2k for an interior node of a k dimensional mesh; border edge and corner nodes have a lower degree.
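Minimum Hop Count:
From (1), the average minimum hop count is
$$H_{min} = \begin{cases} \dfrac{nk}{3}, & k \text{ is even} \\ n\left(\dfrac{k}{3} - \dfrac{1}{3k}\right), & k \text{ is odd} \end{cases}$$
This formula uses k-ary n-cube notation: here k is the number of nodes along each dimension of the
mesh and n is the dimensionality. This differs from the other mesh expressions in this section,
where n is the number of nodes and k is the dimensionality. For the 3x3 mesh (k = 3, n = 2) the
formula gives 2(3/3 − 1/9) = 16/9 hops.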
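Number of Connections:
Each interior node has a degree of 2k, and each node on an edge has at least one fewer connection.
Compared with a torus of the same size, the mesh lacks the wrap-around links: one bidirectional link
(two unidirectional connections) for every row of nodes in every dimension. For a two dimensional
mesh there are √𝑛 rows in each dimension, so the mesh has 2𝑘 ∗ √𝑛 fewer connections than the torus.
For a higher dimensionality, assuming the same number of nodes along every dimension, each dimension
has $n^{(k-1)/k}$ rows, which gives $2k \cdot n^{(k-1)/k}$ fewer connections than the torus.
So the number of connections is $2nk - 2k \cdot n^{(k-1)/k}$, where n is the number of nodes and k is
the dimensionality of the mesh. For k = 2 this is 2𝑛𝑘 − 2𝑘√𝑛.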
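The mesh example can be checked by direct counting. The short Python sketch below builds a mesh
without wrap-around links, tallies how many nodes have each degree and counts the unidirectional
connections. The function names and the `side` parameter (nodes per dimension) are illustrative
assumptions, not part of the lecture. For the two-dimensional 3x3 example it reproduces the degrees
described above (one node of degree four, four of degree three, four of degree two) and a total of
24 unidirectional connections, which matches 2𝑛𝑘 − 2𝑘√𝑛 with n = 9 and k = 2.

```python
from collections import Counter
from itertools import product


def mesh_neighbors(node, side):
    """Neighbours of `node` in a mesh with `side` nodes per dimension (no wrap-around)."""
    result = []
    for axis in range(len(node)):
        for step in (-1, 1):
            coord = node[axis] + step
            if 0 <= coord < side:  # links that would leave the mesh simply do not exist
                result.append(node[:axis] + (coord,) + node[axis + 1:])
    return result


def mesh_degree_histogram(side, dims):
    """Map each degree to the number of nodes that have it."""
    nodes = product(range(side), repeat=dims)
    return Counter(len(mesh_neighbors(v, side)) for v in nodes)


def mesh_connection_count(side, dims):
    """Unidirectional connections: every node contributes one per neighbour."""
    histogram = mesh_degree_histogram(side, dims)
    return sum(degree * count for degree, count in histogram.items())


histogram_3x3 = mesh_degree_histogram(side=3, dims=2)
connections_3x3 = mesh_connection_count(side=3, dims=2)

# Comparison with the two-dimensional expression 2nk - 2k*sqrt(n), where sqrt(n) == side.
n, k = 3 ** 2, 2
expected_3x3 = 2 * n * k - 2 * k * 3
```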
Presentation - Autotuning of Network Parameters
Introduction
For the presentation portion of this paper the following questions were provided:
1. How can one perform monitoring of the network parameters including temporal characteristics of
the traffic and current parameters of the networking protocol?
2. How can one report the parameters in real-time?
3. Which dynamic parameters can be monitored (routing protocol, burstiness of the traffic or priority
levels of the packets, ...)?
4. How can these parameters be used for autotuning? What can be tuned? Can this information help in
dispatching threads?
The presentation concentrated on using Communication Architecture Tuners to monitor and update
network flow protocols on the fly. Two examples of systems that use the Communication Architecture
Tuners to control network flow protocols are then given.
Adaptive Techniques
As interconnection networks decrease in size, unpredictable network congestion and link failures can
occur. If a router continues to send data toward broken links, the network will block. Adaptive
techniques help prevent this blocking.
Even after network design is complete, adaptive techniques can still be used to adapt the network packet
priorities. This adaptation can route packets around dead links, send higher priority packets through
busy links, and allow protocol and priority changes on the fly. Adaptive techniques do not change
the underlying network topology, but can alter some network parameters, thereby changing the on-chip
network behaviour.
Communication Architecture Tuners
A Communication Architecture Tuner (CAT) “provides the underlying communication architecture with
an ability to adapt to runtime variations in the communication needs of system components” (2).
Figure 1 - Example CAT Network
The CAT is a mechanism which allows adaptation of network parameters based on monitored events. In
order to use it most effectively, we need to know which dynamic parameters can be monitored.
Lahiri et al. did their initial research based on monitoring the priority of packets through the network,
but they also go on to claim that any network parameters can be monitored and used to dynamically
alter the network flow. Parameters mentioned include burst modes, burst sizes, endianness, split
transactions etc. (3). Therefore, any component on the interconnection network is a candidate for
communication tuning.
Analysis for Design of Communication Architectures
There are three analysis methods for the design of interconnection networks. These are system
simulation, static estimation and trace-based techniques.
System simulation is a method by which the network and traffic flows are modelled using a computer.
These models are then simulated, analyzed and then altered to determine the best traffic flow protocols
and topologies for a given task and constraints. Simulation is not feasible for large systems and may be
less accurate than any of the other methods.
Static estimation requires the use of static models of communication latencies or power requirements.
This is based on the assumption that network flow scheduling can be performed statically.
Finally, trace-based techniques begin with a detailed simulation of the network and based on flow
patterns, the priority of any given packet between routers can be altered.
The design space for interconnection networks can be explored along a few different
directions. Network topology, communications protocols, and path optimization are three areas for
consideration. As we optimize path size, and as components get faster and their associated networks
get smaller, we also run into problems of network congestion and link failure (2).
Adaptive techniques can assist with these problems. If a link goes down, an adaptive network can route
around the dead area until it is available again. Likewise if there is a good deal of traffic between two
nodes, adaptive networks can route around this or fix priority levels so that packets can get through the
congested channel.
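To make the idea of routing around a dead link concrete, the following Python sketch searches for a
shortest path while skipping any link currently marked as failed. This is only an illustration under
assumptions made here (an undirected adjacency list, a set of failed links and a breadth-first
search); it is not the mechanism used by CAT, LOTTERYBUS or FLEXBUS, and all names are hypothetical.

```python
from collections import deque


def route(graph, source, dest, failed_links=frozenset()):
    """Return a shortest path from `source` to `dest` that avoids failed links, or None.

    `failed_links` holds frozenset({u, v}) entries, so a failure applies in both directions.
    """
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == dest:
            path = []
            while node is not None:  # walk the parent pointers back to the source
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in graph[node]:
            if nxt not in parent and frozenset((node, nxt)) not in failed_links:
                parent[nxt] = node
                queue.append(nxt)
    return None  # the destination is unreachable with the current failures


# Hypothetical four-node ring: 0 - 1 - 2 - 3 - 0.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}

direct = route(ring, 0, 1)
detour = route(ring, 0, 1, failed_links={frozenset((0, 1))})
blocked = route(ring, 0, 1, failed_links={frozenset((0, 1)), frozenset((0, 3))})
```

With all links up the packet takes the single hop from node 0 to node 1. When that link fails, the
search returns the detour through nodes 3 and 2, and once both links at node 0 are down no route
remains.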
The design flow for an interconnection network which can dynamically adapt to changes in network flow
patterns is as follows. (3)
Figure 2 - Design Flow
Communication Partitioning
A communication partition consists of a subset of the communication traffic generated by a component.
A component can generate packets (tx), control flow events (ey), communication requests (cz) and
possibly application-specific properties for the network. A full network partition can be described by its
first and last request; e.g. (<t1•c3>, <t1•c5>) can represent a packet t1 that requires three communication
requests in order to pass through the network. The sub-partitions (or any request for network flow, e.g.
a flit) need to be identified by an identifier and a communication request.
Parameter Generation
Parameters are generated by the CAT to dynamically affect the network flow. These parameters come
from lookup tables (LUTs) containing specific precomputed values for each protocol. Lahiri et al. (3) use a
priority heuristic to populate the LUTs, but heuristics built on other network parameters, such as burst
size, burst mode or packet size, can also be used.
Once these LUTs are populated in the initial design, they can be updated dynamically based on
monitored network parameters according to a given heuristic.
Performance Analysis
CAT LUTs are then updated until no further performance increase is seen. Changes to the LUTs continue
to be made as long as they yield better performance. Once there is no more improvement, there is no
need to change the LUTs unless a change occurs in the monitored network parameters.
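The update-until-no-improvement loop described above can be sketched as a simple greedy search. The
Python example below is a hypothetical illustration, not the heuristic of Lahiri et al. (3): the
function `tune_lut`, the toy latency model (each component receives a share of the channel
proportional to its LUT priority) and the traffic figures are all assumptions made for this sketch.

```python
def tune_lut(lut, measure_latency, candidate_values):
    """Greedy tuning: keep any single-entry change that lowers the measured latency.

    Stops after a full pass over the LUT produces no improvement.
    """
    best = dict(lut)
    best_latency = measure_latency(best)
    keys = list(best)
    improved = True
    while improved:
        improved = False
        for key in keys:
            for value in candidate_values:
                trial = dict(best)
                trial[key] = value
                latency = measure_latency(trial)
                if latency < best_latency:
                    best, best_latency, improved = trial, latency, True
    return best, best_latency


def toy_latency(lut, traffic):
    """Assumed model: latency of a component = its traffic / its share of the channel."""
    total = sum(lut.values())
    return sum(traffic[c] * total / lut[c] for c in lut)


# Hypothetical components and traffic volumes, both starting at priority 1.
traffic = {"cpu": 9, "dma": 1}
initial_lut = {"cpu": 1, "dma": 1}

tuned_lut, tuned_latency = tune_lut(
    initial_lut,
    measure_latency=lambda lut: toy_latency(lut, traffic),
    candidate_values=range(1, 5),
)
```

Under this toy model the search raises the priority of the busier component and then stops, because
no further single change lowers the latency. If the monitored traffic changed, the same function
could be called again starting from the current LUT.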
Example Networks Using CAT
LOTTERYBUS
LOTTERYBUS is a technique for dynamically tuning the network based on CAT. In LOTTERYBUS the
arbiters in Figure 1 are replaced by randomizing arbiters. The LUTs provide a pre-set number of lottery
tickets based on requester type and location in the network. Then, when a bus or channel is free, the
arbiter randomly draws one ticket from those that the LUT has provided and grants access to
the channel to the packet whose ticket is drawn.
LOTTERYBUS is topology and protocol agnostic. The only change from a CAT system is in the channel
arbiters.
LOTTERYBUS can monitor any of the network parameters that can be monitored by any CAT system.
LUTs are still used to assign priorities to packets, but these priorities take the form of the number of
“lottery tickets” provided to each packet. These LUTs can be statically set when designing the network
but can also be updated dynamically based on monitored network parameters.
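The ticket-based arbitration described above can be sketched in software as a weighted random draw.
The Python class below is a minimal illustration and not the hardware arbiter of the LOTTERYBUS paper:
the class name, the dictionary used as the ticket LUT and the requester names are assumptions, and the
draw uses the standard library function `random.choices` with the ticket counts as weights.

```python
import random


class LotteryArbiter:
    """Grants a free channel to one pending requester, chosen in proportion to its tickets."""

    def __init__(self, ticket_lut, seed=None):
        self.ticket_lut = dict(ticket_lut)  # requester name -> number of lottery tickets
        self.rng = random.Random(seed)

    def update_tickets(self, requester, tickets):
        """Dynamic LUT update, e.g. in response to monitored network parameters."""
        self.ticket_lut[requester] = tickets

    def grant(self, pending):
        """Return the winning requester, or None if nobody holding tickets is pending."""
        contenders = [r for r in pending if self.ticket_lut.get(r, 0) > 0]
        if not contenders:
            return None
        weights = [self.ticket_lut[r] for r in contenders]
        return self.rng.choices(contenders, weights=weights, k=1)[0]


# Hypothetical requesters: the CPU holds three tickets, the DMA engine one.
arbiter = LotteryArbiter({"cpu": 3, "dma": 1}, seed=1)
grants = [arbiter.grant(["cpu", "dma"]) for _ in range(4000)]
```

Over many draws the CPU should win roughly three times as often as the DMA engine, while the DMA
engine still gets access regularly. Calling `update_tickets` changes the odds for later draws without
touching the rest of the arbiter.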
FLEXBUS
FLEXBUS is a technique for dynamically tuning the network based on CAT. In FLEXBUS network traffic is
continually monitored and when network channels are not available, the system has the ability to route
around these unavailable links.
FLEXBUS can be applied to any network topology and any communications protocols.
FLEXBUS continually monitors variations in network traffic. If a channel is congested or if link failure
occurs, components on the network can be dynamically routed to different busses.
CONCLUSION
Conclusion
We have seen that the CAT process can be used to configure the network at design time and dynamically
on the fly. Extending this idea, we have seen randomized and dynamic routing systems that can be
used to reduce network latency.
In reference to our introduction questions then:
1. How can one perform monitoring of the network parameters including temporal characteristics of
the traffic and current parameters of the networking protocol?
One can use Communication Architecture Tuners (CAT) in order to monitor the network parameters
and temporal characteristics of the network.
2. How can one report the parameters in real-time?
The Communication Architecture Tuner can automatically update priority lookup tables to affect the
routing of packets on the network. Updates to the lookup tables can happen in real time and it
would not be a stretch to output the tables as a report.
3. Which dynamic parameters can be monitored (routing protocol, burstiness of the traffic or priority
levels of the packets, ...)
Any of the aforementioned network parameters can be monitored and used to update the flow
priorities in the system.
4. How can these parameters be used for autotuning? What can be tuned? Can this information help in
dispatching threads?
Packet priority lookup tables are located in each router and can be tuned dynamically.
Future Work
Knowing that routing heuristics can be based on any network parameters, work can be done to codify
the quality and utility of any and all network parameters for use in a CAT system.
The CAT process is based on reducing network latency while using priority-based routing. It may be
interesting to see if it could be used to dynamically reduce the power requirements of an
interconnection network on a chip.
References
(1) N. Enright Jerger, ECE 1749H: Interconnection Networks for Parallel Computer Architectures –
Topology. http://www.eecg.toronto.edu/~enright/interconnects-topology.pdf. November 2010.
(2) K. Lahiri, S. Dey, and A. Raghunathan, "Design of Communication Architectures for High-Performance
and Energy-Efficient System-on-Chips," book chapter, in Multiprocessor Systems-on-Chips. Morgan
Kaufmann, 2004.
(3) K. Lahiri, A. Raghunathan, G. Lakshminarayana, and S. Dey, "Design of high-performance system-on-chips
using communication architecture tuners," IEEE Transactions on Computer-Aided Design of Integrated
Circuits and Systems, vol. 23, no. 5, pp. 620-636, May 2004.
(4) K. Lahiri, A. Raghunathan, and G. Lakshminarayana, "The LOTTERYBUS on-chip communication
architecture," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 14, no. 6,
pp. 596-608, June 2006.
(5) K. Sekar, K. Lahiri, A. Raghunathan, and S. Dey, "FLEXBUS: a high-performance system-on-chip
communication architecture with a dynamically configurable topology," Proceedings of the 42nd Design
Automation Conference, pp. 571-574, 13-17 June 2005.