SYNTHESIS OF NETWORKS ON CHIPS FOR 3D SYSTEMS ON CHIPS
Srinivasan Murali, Ciprian Seiculescu, Luca Benini, Giovanni De Micheli
Presented by Puqing Wu

Context
-- 3D chip
   -- Smaller footprint
   -- Shorter wires: delay, power
   -- Routing congestion…
-- NoC: helps reduce the number of TSVs (Through-Silicon Vias)
[Figure copied from reference [2]]

Contribution
-- Synthesis approach (power focus)
   -- NoC topology
   -- NoC switch placement
-- Compared to a 3D optimized mesh
   -- 38% power reduction
   -- 25% latency reduction

Design Approach
[Figure copied from reference [1]]

Input of Synthesis
-- Specification
   -- Cores: name, size, position, layer assignment
   -- Communication: bandwidth, latency, message type
-- Technology constraints: # TSVs

Synthesis Procedure
-- Power, area and timing models of switches and TSVs

Output of Synthesis
-- Topologies
-- Switch layer placement and position
-- Resulting power, latency and area

Knobs: # TSVs
-- Area
-- Place and route
[Figure copied from reference [2]]

Knobs: # Switches
-- # Switches = # Cores / ports per switch
-- Fewer switches lead to longer links, and thus higher link power consumption (congestion?)
-- More switches lead to more switching activity, and thus higher switch power
-- More ports per switch lead to a lower operating frequency

Connecting Cores and Switches
-- # Switches per layer:
   -- Min: # Cores / max # ports (limited by frequency)
   -- Max: # Cores / 1
[Figure adapted from reference [1]]

Connecting Cores to Switches
[Figure adapted from reference [1]]
-- Questions:
   -- How to group cores?
   -- Different connectivity across layers?

Connecting Switches to Switches
-- Constraints for cost computation
   -- max_ill: INF
      -- When the number of links is within 2 to 3 of max_ill, assign soft_INF
   -- max_switch_size: INF
      -- When a switch's connectivity limit is exceeded, create indirect switches
-- Questions:
   -- What is the cost?
   -- How are switches connected based on the cost analysis?
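The switch-count bounds from the "Connecting Cores and Switches" slide (minimum = # cores / max # ports, maximum = one switch per core) can be illustrated with a short sketch. This is only an illustration under stated assumptions, not the procedure from reference [1]: the function name `switch_count_range`, the rounding up of the minimum, and the example numbers are hypothetical, and the sketch ignores the ports needed for switch-to-switch links.

```python
import math


def switch_count_range(num_cores, max_ports):
    """Return (min, max) number of switches to sweep for one layer.

    Assumption: the minimum is rounded up so that every core can be
    attached to a port; the slides only state "# Cores / max # ports".
    """
    if num_cores < 1 or max_ports < 1:
        raise ValueError("num_cores and max_ports must be positive")
    min_switches = math.ceil(num_cores / max_ports)
    max_switches = num_cores  # "# Cores / 1": one switch per core
    return min_switches, max_switches


# Hypothetical layer: 12 cores, switches limited to 5 ports
# (the port limit stands in for the frequency constraint).
lo, hi = switch_count_range(12, 5)
for n in range(lo, hi + 1):
    print(f"candidate design point: {n} switches")
```

Each value in the swept range would correspond to one candidate design point whose power, latency and area are then evaluated.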

Placing Switches
-- Solve a linear program that minimizes the objective obj
[Taken from reference [1]]
-- Also remove overlaps, pipeline long links and add NIs to cores

Case Study: Triple Video Object Plane Decoder
[Figure copied from reference [1]]

Case Study
[Figure copied from reference [1]]

Case Study
-- Frequency (lowest f results in minimum power)
-- Max switch size
-- Switch count sweep
-- Power consumption
[Figure copied from reference [1]]

Comparisons with Mesh
-- Testbench: D_36_4, D_36_6 and D_36_8; D_35_bot, D_65_pipe
-- 38% power saving
-- 24.5% latency reduction?
[Figure copied from reference [1]]

Impact of the ILL Constraint
-- A very tight constraint on the number of ILLs (inter-layer links) significantly increases power and latency (Questions)
[Figure copied from reference [1]]

References
[1] Srinivasan Murali, Ciprian Seiculescu, Luca Benini, and Giovanni De Micheli. Synthesis of Networks on Chips for 3D Systems on Chips. Proceedings of the 2009 Asia and South Pacific Design Automation Conference, January 19-22, 2009, Yokohama, Japan.
[2] Dae Hyun Kim, Saibal Mukhopadhyay, and Sung Kyu Lim. Through-Silicon-Via Aware Interconnect Prediction and Optimization for 3D Stacked ICs. SLIP'09, July 26-27, 2009, San Francisco, California, USA.