Vodafone Chair Mobile Communications Systems, Prof. Dr.-Ing. G. Fettweis

An Accurate and Scalable Analytic Model for Round-Robin Arbitration in Network-on-Chip
Erik Fischer and Gerhard Fettweis
Speaker: Erik Fischer
7th International Symposium on Networks-on-Chip (NOCS) 2013

Outline
• Motivation
• System Model
• Service Time Estimation
• Performance Evaluation

TU Dresden E. Fischer and G. Fettweis Slide 2

Motivation
[Diagram: "MPSoC Design Flow Today" vs. "Design of Future Many-Core SoC"; labels: Given, Future applications, Application Modeling & Simulation, Automated Guidance, Concept, Designers Experience, Architecture, HW design & chip production, ???]
• 1000s of cores on a single chip!
• ↑ computation, ↑ reliability
• Complex design space
• Needs fast models!

System Model
Assumptions on router level:
• N_i buffered inputs
• N_o (unbuffered) outputs
• Known mean arrival rates λ_i
• Known mean service rate μ (service incl. arbitration, switching and forwarding)
• Known forwarding probabilities f_{i,o}
• Round-robin arbitration
[Diagram: router with input FIFOs, routing & arbitration, crossbar switch]

System Model
Decoupling of service time and queueing model
[Diagram: "Depends on …"; labels: Topology, Routing, Traffic, Arbitration, Service Classes, Traffic / Service Statistics]
Advantages:
• Simplified queueing model
• More flexible to adapt for other arbitration schemes and service classes

Service Time Estimation
Service Time Model
1. Contention Probability: probability that contention occurs when a packet requests service to be routed from input i to output o
2. Contention Resolution: probability that a packet is forced to wait under a given contention situation

Service Time Estimation
Contention Probability
• Path utilization from input i to output o: [equation]
• Auxiliary binary "truth table" matrix B: covers all possible collision scenarios, e.g., for the case of 3 inputs: [matrix]

Service Time Estimation
Contention Resolution
• Decision probability: [equation]
• Collision resolution distribution: [equation]
• Number of contending inputs for output o

Service Time Estimation
• Waiting probability: [equation; terms labeled "Contention Probability" and "Decision Probability"]
• Service times: [system of equations]

Service Time Estimation
Iterative Service Time Estimation Algorithm
1. Start: precompute B and λ_{i,o}
2. Initialize x_{i,o} = x
3. Calculate ρ_{i,o} and ρ_i
4. If ∃ ρ_i > 1: exit with error
5. Otherwise: calculate N_C, r, d, w
6. Update x_{i,o}
7. Check: x_{i,o} converged? If true, finish; if false, repeat from step 3

Performance Evaluation – Single Router
Comparison of service times with:
• Cycle-accurate sim.
• Ref. model [1]
Setup: center router [4,4] of 8x8 2D-mesh; Poisson inputs; uniform network traffic
[Figure: service time vs. injection rate (flits/cycle/module), 0 to 0.45; panels: Northern router input, Eastern router input, Router input of module; curves: Sim, RRMod, RefMod]

[1] O. Lysne, "Towards a generic analytical model of wormhole routing networks," Microprocessors and Microsystems, vol. 21, no. 7–8, pp. 491–498, 1998.

Performance Evaluation – Network
Comparison of mean packet latencies with:
• Cycle-accurate sim.
• Ref. models [2][3][1]
Setup: 8x8 2D-mesh; deterministic
XY routing; flit-based switching; application-specific network traffic
[Figure: average packet latency [clock cycles] (5 to 50) vs. packet injection rate (pkt/cycle) (5.5 to 8); curves: Cycle-accurate Simulation, Round-Robin Model, Reference Model 1, Reference Model 2, Reference Model 3]

[2] U. Ogras, P. Bogdan, and R. Marculescu, "An analytical approach for network-on-chip performance analysis," Computer-Aided Design of Integrated Circuits and Systems, IEEE Transactions on, vol. 29, no. 12, pp. 2001–2013, Dec. 2010.
[3] E. Fischer, A. Fehske, and G. Fettweis, "A flexible analytic model for the design space exploration of many-core network-on-chips based on queueing theory," in Proc. of SIMUL, 2012.

Conclusions
• Analytic model allows for fast NoC performance analysis at an early stage of design space exploration
• Decoupling of service time and queueing model simplifies the arbiter model and makes it more flexible
• Round-robin service time model considers contention probability as well as contention resolution
• Iterative solution algorithm
• High accuracy of round-robin service time estimation
• Mean error <6% of packet latencies for the analyzed case of an 8x8 NoC
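
The control flow of the iterative service time estimation algorithm can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the per-path arrival rate λ_{i,o} = λ_i · f_{i,o} and the path utilization ρ_{i,o} = λ_{i,o} · x_{i,o} are assumed forms, and the step "calculate N_C, r, d, w" is delegated to a caller-supplied, hypothetical `update` function. The names `estimate_service_times` and `toy_update` are illustrative, and `toy_update` is a placeholder, not the round-robin waiting model from the talk.

```python
from typing import Callable, List

Matrix = List[List[float]]


def estimate_service_times(
    lam: List[float],                            # mean arrival rate per input
    f: Matrix,                                   # forwarding probabilities f[i][o]
    x_init: float,                               # initial service time for all paths
    update: Callable[[Matrix, Matrix], Matrix],  # hypothetical: (rho_io, x) -> new x
    tol: float = 1e-9,
    max_iter: int = 1000,
) -> Matrix:
    n_in, n_out = len(f), len(f[0])
    # Assumption: per-path arrival rate is lambda_i * f[i][o].
    lam_io = [[lam[i] * f[i][o] for o in range(n_out)] for i in range(n_in)]
    x = [[x_init] * n_out for _ in range(n_in)]
    for _ in range(max_iter):
        # Assumption: path utilization is lambda_io * x_io.
        rho_io = [[lam_io[i][o] * x[i][o] for o in range(n_out)]
                  for i in range(n_in)]
        rho_i = [sum(row) for row in rho_io]
        if any(r > 1.0 for r in rho_i):
            # Corresponds to the "exit with error" branch of the flowchart.
            raise ValueError("an input utilization exceeds 1")
        x_new = update(rho_io, x)
        delta = max(abs(x_new[i][o] - x[i][o])
                    for i in range(n_in) for o in range(n_out))
        x = x_new
        if delta < tol:
            return x
    raise RuntimeError("service times did not converge")


def toy_update(rho_io: Matrix, x: Matrix) -> Matrix:
    """Placeholder update (NOT the model from the talk): a base service time
    of 1 cycle, inflated by half the utilization that the other inputs
    direct at the same output."""
    n_in, n_out = len(rho_io), len(rho_io[0])
    return [[1.0 * (1.0 + 0.5 * sum(rho_io[j][o] for j in range(n_in) if j != i))
             for o in range(n_out)] for i in range(n_in)]
```

A call such as `estimate_service_times([0.1, 0.1], [[0.5, 0.5], [0.5, 0.5]], 1.0, toy_update)` returns a matrix of per-path service times. Passing the update step as a function mirrors the decoupling argued for in the talk: another arbitration scheme would only replace `update`, leaving the loop unchanged.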