International Journal of Futuristic Machine Intelligence & Application (ISSN: 2395-308x) Vol 1 Issue 1

Pipelined Architecture for High Throughput and Low Cost Single Cycle Router VOQ for On-Chip Networks

Saurabh R. Tambde, M. Tech. (VLSI), PCE, Nagpur (M.H.)
Prof. Mrs. P. J. Suryawanshi, M. Tech. (VLSI), PCE, Nagpur (M.H.)

Abstract: Network-on-chip (NoC) designs are a compromise among latency, power dissipation, and energy, and the balance is usually fixed at design time. However, fixing all parameters, such as buffer size, at design time can cause either excessive power dissipation or higher latency. The communication latency of the NoC is one of the factors that significantly affects application performance on Systems-on-Chip (SoCs). To reduce NoC latency, we propose a low-latency router architecture that uses virtual output queuing (VOQ) to shorten the processing time of a packet transfer. A multiple VOQ architecture, in which each input port maintains more than one queue for each output channel, is also proposed to improve router throughput. Experimental results show that, in a 4x4 two-dimensional mesh network, the proposed router reduces communication latency by 25% and area cost by 67.3% compared with the look-ahead speculative virtual channel router.

Index Terms- Network-on-chip, virtual output queuing, throughput, processing time, communication latency.

1. Introduction

Network-on-Chip (NoC) is an appealing alternative for communication in SoCs, offering high throughput, low latency, and scalability. Using a network to replace global wiring has advantages in structure, performance, and modularity. The popular NoC architectures are shared-medium networks, direct and indirect networks, and large-scale systems, chosen for their high performance and scalability.
In such an architecture, each node contains a network interface block called a router, which handles communication and connects directly to its neighboring routers. In the single VOQ scheme, traffic congestion at the output ports increases significantly under heavy network load. This increases the queuing latency of packets and degrades network performance. To reduce this congestion, a multiple VOQ architecture is proposed.

NoC is a general-purpose on-chip communication concept that offers high throughput, which is a basic requirement for dealing with the complexity of modern systems. All links in a NoC can be used simultaneously for data transmission, which provides a high level of parallelism and makes NoC an attractive replacement for typical communication architectures such as shared buses or point-to-point dedicated wires. Apart from throughput, the NoC platform is scalable and has the potential to keep pace with technology advances [3].

This paper presents two low-latency router architectures, using single VOQ and multiple VOQ, that simplify routing computation and VC allocation and address the problem discussed in the survey paper [1].

2. Proposed Router Architecture

I Previous System Architecture

Based on a survey of earlier research, we designed the proposed system as a parallel architecture for a single-cycle router that uses the virtual output queuing (VOQ) scheme to shorten the processing time of a packet transfer. In this architecture, each input port maintains a dedicated virtual channel (VC) for each output channel (single VOQ) [4]. Since each input VC is reserved for one output channel, the pipeline of a packet transfer can be shortened to two stages: switch allocation and switch traversal.
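The single VOQ arrangement can be illustrated with a small cycle-level model. The sketch below is not the authors' hardware design: the class name VOQRouter, its methods, the round-robin arbitration policy, and the limit of one flit per input port per cycle are all assumptions made for illustration. It only shows how keeping one queue per output channel at each input port lets switch allocation and switch traversal be resolved within the same modeled cycle.

```python
from collections import deque

NUM_PORTS = 5  # the routers compared in this paper have 5 ports


class VOQRouter:
    """Toy cycle-level model of a single-VOQ router (hypothetical, illustrative)."""

    def __init__(self, num_ports=NUM_PORTS):
        self.num_ports = num_ports
        # voq[i][o] holds flits that arrived on input i and are bound for output o.
        self.voq = [[deque() for _ in range(num_ports)] for _ in range(num_ports)]
        # Assumed policy: one round-robin pointer per output port.
        self.rr = [0] * num_ports

    def enqueue(self, in_port, out_port, flit):
        """Store a flit in the queue reserved for its output port."""
        self.voq[in_port][out_port].append(flit)

    def cycle(self):
        """Model one clock cycle: allocate the switch and traverse it.

        Returns a dict {out_port: flit} of the flits that left the router.
        Outputs are visited sequentially here; real hardware arbitrates
        them in parallel.
        """
        sent = {}
        busy_inputs = set()  # assumption: each input sends at most one flit per cycle
        for out in range(self.num_ports):
            for k in range(self.num_ports):
                inp = (self.rr[out] + k) % self.num_ports
                if inp not in busy_inputs and self.voq[inp][out]:
                    sent[out] = self.voq[inp][out].popleft()
                    busy_inputs.add(inp)
                    self.rr[out] = (inp + 1) % self.num_ports
                    break
        return sent


router = VOQRouter()
router.enqueue(0, 2, "A")  # input 0 -> output 2
router.enqueue(1, 2, "B")  # input 1 -> output 2, contends with A
router.enqueue(1, 3, "C")  # input 1 -> output 3, held in a separate queue
print(router.cycle())  # {2: 'A', 3: 'C'}
print(router.cycle())  # {2: 'B'}
```

In this example flit C leaves in the first cycle even though flit B, which arrived earlier on the same input port, is still waiting for output 2. With a single FIFO per input port, C would have been held behind B.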
By speculatively performing switch allocation and switch traversal in parallel, the packet transfer can be completed in only one clock cycle, as shown in Fig 1.

Fig 1: Virtual output queuing scheme.

II Proposed VOQ Architecture

The proposed design with the single VOQ architecture reduces area cost by 67.3% and communication latency by 25% compared with the look-ahead speculative VC router. The multiple VOQ architecture offers an alternative that significantly improves router throughput and reduces the hardware amount by 15.6% and communication latency by 25% compared with the look-ahead speculative VC router. Compared with a conventional VC router, the multiple VOQ architecture reduces the hardware amount by about 15% and latency by 25% [2, 5].

Fig 2: Multiple VOQ Architecture.

III Implemented VOQ Router

The router performs a packet transfer in a single cycle by carrying out switch traversal and switch allocation in parallel. We also reduced head-of-line (HOL) blocking by implementing multiple FIFO memories at the input ports, which reduces congestion in the data queues at those ports. The proposed design with the single VOQ architecture reduces hardware cost by around 60% and communication latency by 25% [1, 6].

Fig 3: Implemented VOQ Router.

3. Results

The results were obtained using a Virtex-5 XC5VFX70T device with 11,200 slices and 148 blocks of Block RAM. The output is shown below.

Fig 4: Output of the VOQ Router.

A comparison of the implemented router with the low-latency router given in the survey is summarized in the table below.
Table 1: Comparison of conventional routers with the proposed VOQ router.

Router            Port number  VC number  Data width  Buffer size  Slices  LUTs   Flip-flops  Block RAM  Frequency
Baseline router   5            4          16-bit      4-flit       794     3,055  510         10         159 MHz
LA + Spec router  5            4          16-bit      4-flit       842     3,279  820         10         97 MHz
VOQ router        5            4          16-bit      4-flit       1,158   2,592  240         10         68 MHz

As the table above shows, the VOQ router offers advantages over the conventional routers, notably in LUT and flip-flop usage (2,592 LUTs and 240 flip-flops).

4. Conclusion

The performance of a conventional VC router can be improved by employing the VOQ scheme. VOQ shortens the packet transfer pipeline, which helps minimize communication latency and simplifies the hardware design. This paper presented two low-latency router architectures, using single VOQ and multiple VOQ, that simplify routing computation and VC allocation. The router performs a packet transfer in a single cycle by carrying out switch traversal and switch allocation in parallel. We also reduced head-of-line (HOL) blocking by implementing multiple FIFO memories at the input ports, which reduces congestion in the data queues at those ports. The single VOQ design reduces hardware cost by around 60% and communication latency by 25%. The multiple VOQ architecture also reduces latency and improves router throughput; its hardware amount is reduced by 15% and its latency by 25% compared with a conventional VC router.

5. References

[1] Saurabh R. Tambde, Prof. Mrs. P. J. Suryawanshi, “Survey paper on Virtual Output Queuing Based Router for On-chip Networks”, International Journal of Computer, Information Technology & Bioinformatics (IJCITB), ISSN: 2278-7593, Volume-2, Issue-3.
[2] Son Truong Nguyen, Shigeru, “A Low Cost Single-Cycle Router Based on Virtual Output Queuing for On-Chip Networks”, 13th Euromicro Conference on Digital System Design, IEEE, 2010.
[3] R. Mullins, A. West and S. Moore, “Low-Latency Virtual-Channel Routers for On-Chip Networks”, Proceedings of the 31st Annual International Symposium on Computer Architecture, pp. 188-197, Jun. 2004.
[4] D. Bertozzi and L. Benini, “Xpipes: A Network-on-Chip Architecture for Gigascale Systems-on-Chip”, IEEE Circuits and Systems Magazine, pp. 18-31, Q2 2004.
[5] A. Kumar, L.-S. Peh, P. Kundu and N. K. Jha, “Express Virtual Channels: Towards the Ideal Interconnection Fabric”, Proceedings of International Symposium on Computer Architecture (ISCA’07), pp. 150-161, Jun. 2007.
[6] Tobias Bjerregaard, Shankar Mahadevan, “A Survey of Research and Practices of Network-on-Chip”, ACM Computing Surveys, Vol. 38, March 2006.