18-740 Proposal
Adding Buffers to Bufferless On-chip Networks
Dan Burrows, Kevin Kai-Wei Chang, Eric Rippey
September 27, 2010
I) Problem
Buffers in on-chip networks consume a significant amount of energy. Network designs without buffers have
been proposed, but these designs have been criticized because their performance and energy efficiency
degrade as the packet injection rate increases. Hybrid designs that add buffers which can be turned
on and off based on local congestion have also been proposed. However, it is unclear whether it is necessary
to include such a buffer in every router. We will evaluate designs that contain buffers in a subset of
routers and the algorithms that control the on/off state of these buffers based on local and global
congestion. We define the buffer topology as the set of router positions in the network that have
buffers. We will evaluate how different buffer topologies perform for various network configurations
and workload allocations.
II) Related Work
A bufferless design called WORM-BLESS is proposed in [1]. This design reduces energy consumption
at the cost of some performance loss compared to a traditional buffered network. However, it has a high
control logic cost and does not perform well under high contention. We propose a way to improve
bufferless networks under high contention by adding some buffers with energy-saving flow control.
The Chipper architecture proposed in [2] is a reduced-hardware BLESS refinement. In an effort to
reduce cost, it increases the deflection rate significantly. We will use the simulator used in that paper to
evaluate our designs. We suspect that the slowdown increases greatly when the injection rate goes
beyond 20% because of the high number of deflections. Our project aims to mitigate this by adding some
buffers to decrease the slowdown while using only slightly more area and power.
The authors of [3] state that bufferless on-chip networks are more power efficient than buffered on-chip
networks when the injection rate is low. The paper proposes empty buffer bypassing in the absence of
contention to reduce dynamic power. When injection rates are high, this technique gives buffered
designs a significant power and performance advantage over bufferless designs. The paper mentions
that the power efficiency of this technique suffers from buffer leakage power. The weakness found in
bufferless designs at high injection rates should, to some degree, be mitigated by our approach.
Adding buffers to an on-chip network is described in [4]. That work is not yet published and is
due to be presented at a conference in December. We believe that although their approach is
promising, adding buffers to all routers is unnecessary.
III) Approach
Buffers will be added to certain routers in the bufferless on-chip network described in [2]. We will
evaluate network traffic generated by SPECint2006 benchmarks under various workload allocations to decide
where in the network buffers can be strategically placed to reduce deflections. We will explore the
location and number of the buffers and the algorithms used to control them (turn them on and off),
and compare the resulting designs in terms of latency, throughput, and power.
Our network topology will be a mesh. We will assume that the buffers are either present or not present
at any given node, and evaluate buffer placement patterns. Candidate patterns include buffers on the
central nodes, buffers on the nodes on a main diagonal, random placement, a checkerboard, and
placement of buffers to maximally segment the network into bufferless areas. We anticipate that
varying buffer placement will produce significantly different results.
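
The candidate patterns above can be written down as sets of mesh coordinates. The following is a minimal
sketch, not part of the simulator used in [2]. The function name buffer_placement, the definition of
"central" (nodes at least n // 4 hops from every edge), and the default number of randomly placed buffers
are assumptions made only for illustration. The segmenting pattern is left out because the proposal does
not yet define it precisely.

```python
import random


def buffer_placement(pattern, n, count=None, seed=0):
    """Return the set of (row, col) nodes in an n x n mesh that get a buffer.

    Hypothetical helper for illustration; pattern names follow the text.
    """
    nodes = [(r, c) for r in range(n) for c in range(n)]

    if pattern == "checkerboard":
        # Every other node, so each bufferless node has buffered neighbors.
        return {(r, c) for r, c in nodes if (r + c) % 2 == 0}

    if pattern == "diagonal":
        # Nodes on the main diagonal of the mesh.
        return {(i, i) for i in range(n)}

    if pattern == "central":
        # Assumption: "central" means at least n // 4 hops from every edge.
        margin = n // 4
        lo, hi = margin, n - 1 - margin
        return {(r, c) for r, c in nodes if lo <= r <= hi and lo <= c <= hi}

    if pattern == "random":
        # Assumption: place n buffers unless a count is given.
        k = n if count is None else count
        rng = random.Random(seed)  # seeded so experiments are repeatable
        return set(rng.sample(nodes, k))

    raise ValueError(f"unknown pattern: {pattern}")


if __name__ == "__main__":
    # Print each pattern on a 4x4 mesh; 'B' marks a buffered router.
    for name in ("checkerboard", "diagonal", "central", "random"):
        placed = buffer_placement(name, 4)
        print(name)
        for r in range(4):
            print("".join("B" if (r, c) in placed else "." for c in range(4)))
```

Representing a buffer topology as a set of coordinates keeps the placement independent of the router
model, so the same set could be handed to whatever per-node configuration the simulator exposes.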
A variety of algorithms could be used to control the buffers where they are present. These algorithms
include a counter that is incremented by packet deflections and decremented by deflection-free cycles,
with thresholds for different amounts of buffer to be used. Different sleep states could be used based
on estimated future need. Contention in the next cycle could be estimated from a hash of which
inputs are currently receiving flits and how full the buffer is. We anticipate that the search space for
buffer sizing algorithms will be of similar size to the space of possible branch predictors.
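
The deflection counter described above might look like the sketch below. It is one possible reading of
the idea, not a finalized design: the class name, the threshold values, the saturation limit, and the
choice to add the number of deflected flits per cycle (rather than 1) are all assumptions for
illustration.

```python
class DeflectionCounterController:
    """Per-router saturating counter that selects how much buffer to power on.

    Hypothetical sketch; thresholds and max_count are placeholder values.
    """

    def __init__(self, thresholds=(4, 8, 16), max_count=31):
        self.thresholds = tuple(sorted(thresholds))
        self.max_count = max_count
        self.count = 0

    def update(self, deflections):
        """Advance one cycle and return the buffer level to enable.

        deflections: number of flits this router deflected in the cycle.
        """
        if deflections > 0:
            # Deflections push the counter up, saturating at max_count.
            self.count = min(self.max_count, self.count + deflections)
        else:
            # A deflection-free cycle decays the counter toward zero.
            self.count = max(0, self.count - 1)
        return self.level()

    def level(self):
        """Number of thresholds reached: 0 means the buffer is fully off."""
        return sum(self.count >= t for t in self.thresholds)


if __name__ == "__main__":
    ctrl = DeflectionCounterController(thresholds=(2, 4), max_count=7)
    trace = [1, 1, 2, 0, 0, 0, 0]  # deflections observed per cycle
    for cycle, d in enumerate(trace):
        print(cycle, d, ctrl.update(d))
```

Because the counter only decays by one per quiet cycle, a burst of deflections keeps the buffer on for a
while after contention ends. Whether that hysteresis helps or wastes leakage power is one of the
questions the evaluation would need to answer.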
IV) Experimental Methodology
We will evaluate our designs and algorithms by modifying the simulator used in [2] and running the
SPECint2006 benchmark suite. We will measure the throughput, latency, and power of our designs and
compare them against those of the Chipper design.
V) Research Plan
Our goal is to demonstrate that the performance and power efficiency of the bufferless design can be
improved by placing buffers that can be turned on and off in a subset of nodes.
Milestone 1:
We will verify that the simulator is operational. We will implement our buffer topologies and test their
performance on the benchmarks. The different topologies will be tested on the same three network
configurations analyzed in [1] to determine how network configuration affects the optimal buffer
topology.
Milestone 2:
We will develop candidate flow control algorithms that use local and global congestion information and
implement them in the simulator for evaluation. Based on the simulation results, we will propose an
algorithm for how workloads should be allocated among nodes in the network to make use of the
buffers. This allocation will be influenced by the location of buffers in the network topology.
VI) References
[1] Thomas Moscibroda and Onur Mutlu, "A Case for Bufferless Routing in On-Chip Networks,"
Proceedings of the 36th International Symposium on Computer Architecture (ISCA), pages 196-207,
Austin, TX, June 2009.
[2] Chris Fallin, Chris Craik, and Onur Mutlu, "Chipper: A Low-complexity Bufferless NoC Router," SAFARI
Group Technical Report No. 2010-001.
[3] George Michelogiannakis, Daniel Sanchez, William J. Dally, and Christos Kozyrakis, "Evaluating
Bufferless Flow Control for On-chip Networks," Proceedings of the Fourth ACM/IEEE International
Symposium on Networks-on-Chip (NOCS), pages 9-16, 2010.
[4] Syed Ali Raza Jafri, Yu-Ju Hong, Mithuna Thottethodi, and T. N. Vijaykumar, "Adaptive Flow Control
for Robust Performance and Energy," MICRO 2010.