Measuring the Gap between FPGAs and ASICs
Published in IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2007
Authors: Ian Kuon, Jonathan Rose (ECE, University of Toronto)
Presenter: Sang-Kyo Han (ECE, University of Maryland)

My Motivations for Paper Selection
• I am interested in FPGA power reduction.
• I need reference data comparing the area, performance and power consumption of FPGAs and ASICs.

Contents
1. Introduction
2. Historical Measurements
3. New Comparison Methodology
4. FPGA CAD Flow
5. ASIC CAD Flow
6. Comparison Metrics
7. Results
8. Conclusion

Introduction
Motivations of the Research
• It makes it easier for system architects to choose between FPGAs and ASICs as their implementation medium.
• FPGA makers seeking to improve FPGAs can gain insight from quantitative measurements.
Focus on a comparison between a 90nm CMOS SRAM-programmable FPGA and a 90nm CMOS standard-cell ASIC
• To measure the area, performance and power consumption gap
More meaningful than past comparisons
• Wide range of benchmarks and real empirical experiments

Historical Measurements
S.D. Brown [1992]
• Cursorily reported the logic density gap between FPGAs and MPGAs
P.S. Zuchowski [ICCAD, 2002]
• Found delay, gate density and dynamic power consumption gaps between FPGA lookup tables (LUTs) and ASICs
• The cause of the variability of the values across process generations is unclear
S.J. Wilton [JSSC, 2005]
• Examined area and delay, but the ASIC values were estimated
• Covered only a single module

New Comparison Methodology
To provide a more definitive measurement:
• Implemented a large set of benchmark circuits in FPGAs and standard cells
• Selected the benchmarks carefully (details on the next slide)
• Altera Stratix II FPGA based on TSMC's 90nm process, and an ASIC based on STMicroelectronics' 90nm process

New Comparison Methodology: Benchmark Selection
Considered a variety of benchmarks
• The choice of benchmarks can significantly impact the results
Two critical factors for selection:
• The HDL RTL should be synthesized similarly by the different tools used for the FPGA and the ASIC. (The two synthesis tools were judged sufficiently similar by comparing the number of registers each one inferred.)
• The designs should be able to make use of the block memories and dedicated multipliers.

FPGA CAD Flow
Altera Quartus-II Software
• Synthesis: QIS
• PNR (Placement and Routing): Fitter
• Static Timing Analysis (STA): Timing Analyzer
Logic synthesis is the process by which RTL is turned into a design implementation in terms of logic gates.
STA measures the critical path, which determines the operating frequency of the design.
Repeated the entire CAD flow five times using five different seeds
• The final operating frequency of the design can vary depending on the random seed given to the placement tool
Flow diagram: RTL Design Description -> Synthesis: QIS -> PNR: Fitter -> STA: Timing Analyzer

ASIC CAD Flow: Synthesis
ASIC Synthesis Tool: Synopsys Design Compiler
• Analyzing the HDL sources and setting constraints for compilation
• Gate-level optimization to improve performance
• DFT (Design For Testability) to test for manufacturing defects
• The desired clock period is adjusted from the unrealistic 0.5ns constraint to the critical path delay.
• Netlist and constraints saved for the PNR tools

ASIC CAD Flow: PNR
ASIC PNR Tool: Cadence SOC Encounter
• Floorplan and placement: the target row utilization, which is the percentage of the row area taken up by the standard cells, was set to 85%.
• Clock tree insertion and routing
• Post-routing optimization to improve performance
• DRC and final netlist
RC extraction: Synopsys Star-RCXT
Final timing and power analysis: Synopsys PrimeTime and PrimePower

Comparison Metrics
Area
• FPGA: actual silicon area of the resources used by the design
• ASIC: final core area of the placed and routed design
Speed
• Static timing analysis was used to measure the critical path. (Timing analysis determines the maximum clock frequency.)
• FPGA: timing analysis tool in Quartus-II
• ASIC: Synopsys PrimeTime
Power
• Preferred approach: simulate the placed and routed design with testbench vectors (not available for most designs)
• Statistical vectorless estimation: estimate toggle rates and probabilities at nodes

Results: Area
Area ratio
• The hard heterogeneous blocks significantly reduce the area gap.
• Heterogeneous blocks are fundamentally similar to an ASIC implementation except for the programmable interface.
• FPGAs take 40 times more area than ASICs when only logic is used.

Results: Speed
Ratio of the FPGA's critical path delay to the ASIC's critical path delay for each module
• ASICs are designed for the worst-case process.
• It is fairer to compare ASIC performance to that of the slowest FPGA speed grade.
• The slower speed grade parts cause a larger performance gap.
• FPGAs are 4.3 times slower than ASICs when only logic is used.

Results: Power
Ratio of FPGA dynamic power consumption to ASIC power consumption
• FPGAs consume 12 times more dynamic power than ASICs when only logic is used.
• The slower speed grade parts cause a larger performance gap.
• For static power, no useful result was found. (However, the static power gap and the area gap are correlated.)
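Each gap reported in the results is a ratio of an FPGA measurement to the corresponding ASIC measurement, taken per benchmark and then averaged. The sketch below illustrates that arithmetic only. The benchmark names and delay values are made up for illustration, and the geometric mean is an assumption (it is a common way to average ratios; the slides do not say which average was used). The helper names `gap_ratio`, `geometric_mean` and `max_frequency_mhz` are hypothetical.

```python
import math


def gap_ratio(fpga_value, asic_value):
    """Return the FPGA/ASIC ratio for one benchmark and one metric."""
    if fpga_value <= 0 or asic_value <= 0:
        raise ValueError("measurements must be positive")
    return fpga_value / asic_value


def geometric_mean(ratios):
    """Average a collection of positive ratios with the geometric mean (assumed)."""
    ratios = list(ratios)
    if not ratios:
        raise ValueError("need at least one ratio")
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


def max_frequency_mhz(critical_path_ns):
    """Maximum clock frequency implied by a critical path delay in nanoseconds."""
    if critical_path_ns <= 0:
        raise ValueError("critical path delay must be positive")
    return 1000.0 / critical_path_ns


# Hypothetical critical path delays in ns: (FPGA, ASIC). Not data from the paper.
delays_ns = {
    "bench_a": (8.0, 2.0),
    "bench_b": (9.0, 3.0),
    "bench_c": (12.0, 2.0),
}

# Per-benchmark delay ratio, then one summary number for the whole set.
delay_ratios = {name: gap_ratio(f, a) for name, (f, a) in delays_ns.items()}
mean_delay_gap = geometric_mean(delay_ratios.values())

for name, ratio in delay_ratios.items():
    print(f"{name}: FPGA is {ratio:.1f}x slower")
print(f"average delay gap: {mean_delay_gap:.2f}x")
```

The same two helpers apply unchanged to area (silicon area in the same units on both sides) and to dynamic power, since each metric is compared as FPGA value over ASIC value.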

Conclusion
• This paper has presented empirical measurements quantifying the gap between FPGAs and ASICs.
• For logic-only circuits, FPGAs are on average 40 times larger in area, 3.2 times slower, and consume 12 times more dynamic power than ASICs.
• The use of hard multipliers and dedicated memories enables a substantial reduction in the area and power consumption gaps but has a relatively minor impact on the delay difference.

The End
Thank You
Q&A