18-740 Proposal
Adding Buffers to Bufferless On-chip Networks
Dan Burrows, Kevin Kai-Wei Chang, Eric Rippey
September 27, 2010

I) Problem

Buffers in on-chip networks use a significant amount of energy. Network designs without buffers have been proposed, but these designs have been criticized because their performance and energy efficiency degrade as the packet injection rate increases. Bufferless designs that include buffers that can be turned on and off based on local congestion have also been proposed. However, it is unclear whether it is necessary to include such a buffer in every router. We will evaluate designs that contain buffers in a subset of routers, along with the algorithms that control the on/off state of these buffers based on local and global congestion. We define the buffer topology as the positions of the routers in the network that have buffers. We will evaluate how different buffer topologies perform for various network configurations and workload allocations.

II) Related Work

A bufferless design called WORM-BLESS is proposed in [1]. This design reduces energy consumption at the cost of some performance loss compared to a traditional buffered network. However, it has a high control logic cost and does not perform well under high contention. We propose to improve bufferless networks under high contention by adding some buffers with energy-saving flow control.

The Chipper architecture proposed in [2] is a reduced-hardware refinement of BLESS. In an effort to reduce cost, it increases the deflection rate significantly. We will use the simulator used in that paper to evaluate our designs. We suspect that the slowdown would increase greatly when the injection rate goes beyond 20% due to the high number of deflections. Our project tries to mitigate this by adding some buffers to decrease the slowdown while using only slightly more area and power.

The authors of [3] state that bufferless on-chip networks are more power efficient than buffered on-chip networks when the injection rate is low. The paper proposes empty buffer bypassing in the absence of contention to reduce dynamic power. When injection rates are high, this technique gives buffered designs a significant power and performance advantage over bufferless designs. The paper mentions that the power efficiency of this technique suffers from buffer leakage power. The weakness of bufferless designs at high injection rates should, to some degree, be mitigated by our approach.

Adding buffers to an on-chip network is described in [4], which is not yet published and is due to be presented at a conference in December. We believe that although their approach is promising, adding buffers to all routers is unnecessary.

III) Approach

Buffers will be added to certain routers in the bufferless on-chip network described in [2]. We will evaluate network traffic generated by SPECint2006 benchmarks and various work allocations to decide where in the network buffers can be strategically placed to reduce deflections. The location and number of the buffers and the algorithms used to control them (turning them on and off) will be explored and then compared in terms of latency, throughput, and power.

Our network topology will be a mesh. We will assume that buffers are either present or not present at any given node, and we will evaluate buffer placement patterns. Candidate patterns include buffers on the central nodes, buffers on the nodes on a main diagonal, random placement, a checkerboard, and placement of buffers to maximally segment the network into bufferless areas. We anticipate that varying buffer placement will produce significantly different results. A variety of algorithms could be used to control the buffers where they are present.
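
As a rough illustration of the placement patterns named above, the sketch below generates four of them (central, main diagonal, checkerboard, random) as sets of (x, y) router coordinates on an n x n mesh. It is not taken from the simulator in [2]; the function names, the `width` parameter of the central block, and the fixed random seed are assumptions made for this example, and the segmenting placement is left out because the text does not specify it.

```python
import random


def diagonal(n):
    """Buffers on the main diagonal of an n x n mesh."""
    return {(i, i) for i in range(n)}


def checkerboard(n):
    """Buffers on every other node, like the dark squares of a checkerboard."""
    return {(x, y) for x in range(n) for y in range(n) if (x + y) % 2 == 0}


def central(n, width=2):
    """Buffers on the central width x width block (width is an assumed parameter)."""
    lo = (n - width) // 2
    return {(x, y) for x in range(lo, lo + width) for y in range(lo, lo + width)}


def random_placement(n, count, seed=0):
    """Buffers on `count` distinct nodes chosen at random (seeded for repeatability)."""
    nodes = [(x, y) for x in range(n) for y in range(n)]
    return set(random.Random(seed).sample(nodes, count))


def render(n, buffered):
    """Draw the mesh as text: 'B' for a buffered router, '.' for a bufferless one."""
    return "\n".join(
        "".join("B" if (x, y) in buffered else "." for x in range(n))
        for y in range(n)
    )


if __name__ == "__main__":
    for name, pattern in [
        ("diagonal", diagonal(4)),
        ("checkerboard", checkerboard(4)),
        ("central", central(4)),
        ("random", random_placement(4, 5)),
    ]:
        print(name)
        print(render(4, pattern))
```

Representing a buffer topology as a set of coordinates keeps each pattern independent of the router model, so the same set could be handed to whatever configuration mechanism the simulator exposes.
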
Candidate control algorithms include a counter that is incremented by packet deflections and decremented by deflection-free cycles, with thresholds that select how much buffer capacity is used. Different sleep states could be used based on estimated future need. Contention in the next cycle could be estimated based on a hash of which inputs are currently receiving flits and how full the cache is. We anticipate that the search space for buffer sizing algorithms will be similar in size to the space of possible branch predictors.

IV) Experimental Methodology

We will evaluate our designs and algorithms by modifying the simulator used in [2] and running the SPECint2006 benchmark suite. We will measure the throughput, latency, and power of our designs and compare them to the Chipper design.

V) Research Plan

Our goal is to demonstrate that the performance and power efficiency of the bufferless design can be improved by placing buffers, which can be turned on and off, in a subset of nodes.

Milestone 1: We will verify that the simulator is operational. We will implement our buffer topologies and test their performance on the benchmarks. The different topologies will be tested on the same three network configurations analyzed in [1] to determine how network configuration affects the optimal buffer topology.

Milestone 2: We will have potential flow control algorithms, using local and global congestion information, that we can implement in the simulator for evaluation. Based on the simulation results, we will propose an algorithm for how workloads should be allocated among nodes in the network to make use of the buffers. This allocation will be influenced by the location of buffers in the network topology.

VI) References

[1] Thomas Moscibroda and Onur Mutlu, "A Case for Bufferless Routing in On-Chip Networks," Proceedings of the 36th International Symposium on Computer Architecture (ISCA), pages 196-207, Austin, TX, June 2009.
[2] Chris Fallin, Chris Craik, and Onur Mutlu, "Chipper: A Low-complexity Bufferless NoC Router," SAFARI Group Technical Report No. 2010-001.
[3] George Michelogiannakis, Daniel Sanchez, William J. Dally, and Christos Kozyrakis, "Evaluating Bufferless Flow Control for On-chip Networks," Proceedings of the Fourth ACM/IEEE International Symposium on Networks-on-Chip (NOCS), pages 9-16, 2010.
[4] Syed Ali Raza Jafri, Yu-Ju Hong, Mithuna Thottethodi, and T. N. Vijaykumar, "Adaptive Flow Control for Robust Performance and Energy," MICRO 2010.
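
The counter-based buffer control mentioned in Section III (a counter incremented by packet deflections and decremented by deflection-free cycles, with thresholds selecting how much buffer is used) could be sketched as follows. This is a minimal illustration under stated assumptions, not the controller that will be implemented in the simulator from [2]: the class name, the threshold values, the slot counts, and the saturation limit are all hypothetical.

```python
class DeflectionCounter:
    """Saturating counter that maps recent deflection history to a buffer size."""

    def __init__(self, thresholds=(4, 8, 12), sizes=(0, 1, 2, 4), max_count=15):
        # Once the counter reaches thresholds[i], sizes[i + 1] slots are enabled.
        # All numeric values here are illustrative assumptions.
        assert len(sizes) == len(thresholds) + 1
        self.thresholds = thresholds
        self.sizes = sizes
        self.max_count = max_count
        self.count = 0

    def tick(self, deflections):
        """Update once per cycle with the number of deflections seen this cycle."""
        if deflections > 0:
            # Each deflection increments the counter, up to the saturation limit.
            self.count = min(self.max_count, self.count + deflections)
        else:
            # A deflection-free cycle decrements the counter, down to zero.
            self.count = max(0, self.count - 1)
        return self.active_slots()

    def active_slots(self):
        """Number of buffer slots that should be powered on right now."""
        level = sum(1 for t in self.thresholds if self.count >= t)
        return self.sizes[level]


if __name__ == "__main__":
    ctrl = DeflectionCounter()
    trace = [1, 2, 1, 0, 3, 2, 0, 0, 0, 0]  # deflections per cycle (made-up trace)
    print([ctrl.tick(d) for d in trace])
```

Because the counter only decays by one per quiet cycle, buffers stay on for a while after a burst of deflections; how quickly they should turn off is one of the parameters the proposed exploration would need to tune.
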
