sangkyo_20071030.ppt

Measuring the Gap
between FPGAs and ASICs
Published in IEEE Transactions on Computer-Aided
Design of Integrated Circuits and Systems, 2007
Authors: Ian Kuon, Jonathan Rose
(ECE, University of Toronto)
Presenter: Sang-Kyo Han
(ECE, University of Maryland)
1
My Motivations for Paper Selection
I have an interest in FPGA power reduction.
I need reference data comparing the area, performance, and
power consumption of FPGAs and ASICs.
2
Contents
1. Introduction
2. Historical Measurements
3. New Comparison Methodology
4. FPGA CAD Flow
5. ASIC CAD Flow
6. Comparison Metrics
7. Results
8. Conclusion
3
Introduction
Motivations of the Research


It makes it easier for system architects to choose between
FPGAs and ASICs as their implementation medium.
FPGA makers seeking to improve FPGAs can gain insight from
quantitative measurements.
Focus on a Comparison


Between a 90nm CMOS SRAM-programmable FPGA and a
90nm CMOS standard cell ASIC
To measure the area, performance, and power consumption
gaps
More meaningful than past comparisons

Wide range of benchmarks and real empirical experiments
4
Historical Measurements
S.D. Brown [1992]

Cursorily reported the logic density gap between FPGAs and MPGAs
P.S. Zuchowski [ICCAD, 2002]


Reported delay, gate density, and dynamic power consumption gaps
between FPGA lookup tables (LUTs) and ASIC gates
The cause of the variation in these values across process generations is unclear
S.J. Wilton [JSSC, 2005]


Examined area and delay, but only estimated the ASIC values
Considered only a single module
5
New Comparison Methodology
To provide a more definitive measurement



Implemented a large set of benchmark circuits in FPGAs and
standard cells
Carefully selected the benchmarks (detailed on the next slide)
Used an Altera Stratix II FPGA built on TSMC’s 90nm process and
an ASIC built on STMicroelectronics’ 90nm process
6
New Comparison Methodology
Benchmark Selection



Considered a variety of benchmarks
Benchmark choice can significantly impact the results
Two critical factors for selection:
• The HDL RTL should be synthesized similarly by the
different tools used for FPGA and ASIC.
(The two synthesis tools were judged sufficiently similar
by comparing the number of registers inferred by each
synthesis process.)
• The designs should be able to make use of the block
memories and dedicated multipliers.
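The register-count check in the first selection factor could be sketched roughly as below. The function name and the 5% tolerance are assumptions made for illustration; the slide does not state the threshold that was used.

```python
def register_counts_match(fpga_regs: int, asic_regs: int,
                          tolerance: float = 0.05) -> bool:
    """Return True when the register counts inferred by the two synthesis
    tools differ by at most `tolerance` (relative to the larger count).

    The 5% default is an assumed value, not one taken from the slides.
    """
    if fpga_regs <= 0 or asic_regs <= 0:
        raise ValueError("register counts must be positive")
    difference = abs(fpga_regs - asic_regs)
    return difference / max(fpga_regs, asic_regs) <= tolerance


# Hypothetical counts: 1000 vs 1030 registers is within 5%, 1000 vs 1200 is not.
print(register_counts_match(1000, 1030))
print(register_counts_match(1000, 1200))
```

A benchmark that fails this check would be dropped, since a large mismatch suggests the two tools interpreted the RTL differently.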
7
FPGA CAD FLOW
Altera Quartus-II Software

Synthesis: QIS (Quartus II Integrated Synthesis)
• Logic synthesis is a process by which RTL is
turned into a design implementation in terms
of logic gates.
PNR (Placement and Routing): Fitter
Static Timing Analysis: Timing Analyzer
• STA measures the critical path, which
determines the operating frequency of the
design.
Repeated the entire CAD flow five times
using five different seeds
• The final operating frequency of the design can
vary depending on the random seed given to
the placement tool.
Flow: RTL Design Description → Synthesis (QIS) → PNR (Fitter) → STA (Timing Analyzer)
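The five-seed repetition can be sketched as a small sweep. Here `run_flow` is a hypothetical wrapper (not a Quartus II API) that runs synthesis, fitting, and timing analysis for one seed and returns the critical path delay in ns. Keeping the fastest run is an assumption about how the five results are combined.

```python
from typing import Callable, Iterable, Tuple


def fastest_seed_delay_ns(run_flow: Callable[[int], float],
                          seeds: Iterable[int]) -> Tuple[int, float]:
    """Run the flow once per seed and return (seed, delay_ns) of the fastest run.

    `run_flow` is a hypothetical callable standing in for the whole CAD flow.
    """
    results = [(seed, run_flow(seed)) for seed in seeds]
    if not results:
        raise ValueError("at least one seed is required")
    # The smallest critical path delay corresponds to the highest clock frequency.
    return min(results, key=lambda item: item[1])


# Made-up delays standing in for five real place-and-route runs.
fake_delays = {1: 5.2, 2: 4.0, 3: 4.6, 4: 4.9, 5: 4.3}
print(fastest_seed_delay_ns(fake_delays.__getitem__, [1, 2, 3, 4, 5]))
```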
8
ASIC CAD FLOW: Synthesis
ASIC Synthesis






Tool: Synopsys Design Compiler
Analyzing HDL sources and setting
constraints for compilation
Gate-level optimization to
improve performance
DFT (Design for Testability) insertion to
allow testing for manufacturing defects
The desired clock period is
adjusted from the unrealistic
0.5ns constraint to the measured
critical path delay.
Netlist and constraints saved for
the PNR tools
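The clock-period adjustment described above amounts to a two-pass compile, sketched below. `compile_design` is a hypothetical callable (not a Design Compiler command) that takes a clock period constraint in ns and returns the achieved critical path delay in ns.

```python
from typing import Callable, Tuple


def two_pass_synthesis(compile_design: Callable[[float], float],
                       initial_period_ns: float = 0.5) -> Tuple[float, float]:
    """Compile once with an unrealistically tight period, then recompile with
    the period set to the critical path delay from the first pass.

    Returns (first_pass_delay_ns, second_pass_delay_ns).
    """
    first_delay = compile_design(initial_period_ns)
    # Second pass: constrain the clock to what the first pass actually achieved.
    second_delay = compile_design(first_delay)
    return first_delay, second_delay


# Toy stand-in: the design cannot run faster than 2.4 ns whatever is requested.
def toy_compile(period_ns: float) -> float:
    return max(period_ns, 2.4)


print(two_pass_synthesis(toy_compile))
```

The tight first constraint pushes the tool to optimize as hard as it can, and the second pass gives it a constraint it can actually meet.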
9
ASIC CAD FLOW: PNR
ASIC PNR


Tool: Cadence SoC Encounter
Floorplan and placement
• The target row utilization, i.e., the
percentage of the row area occupied by
the standard cells, was set to 85%.
Clock tree insertion and routing
Post-routing optimization to improve
performance
DRC and final netlist
RC extraction: Synopsys Star-RCXT
Final timing and power analysis:
Synopsys PrimeTime and PrimePower
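As a rough illustration of what an 85% row utilization target means for area, the sketch below converts a total standard-cell area into the row area the floorplan must provide. The function name and the example area are hypothetical, and the calculation ignores power rings, I/O, and any other floorplan overhead.

```python
def required_row_area_um2(cell_area_um2: float, row_utilization: float = 0.85) -> float:
    """Return the row area needed so that the standard cells occupy
    `row_utilization` of it (utilization = cell area / row area)."""
    if not 0.0 < row_utilization <= 1.0:
        raise ValueError("row utilization must be in (0, 1]")
    if cell_area_um2 < 0.0:
        raise ValueError("cell area must be non-negative")
    return cell_area_um2 / row_utilization


# Hypothetical design: 85,000 um^2 of cells needs about 100,000 um^2 of rows.
print(required_row_area_um2(85_000.0))
```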
10
Comparison Metrics
Area


FPGA: Actual silicon area of the resources used by the design
ASIC: Final core area of the placed and routed design
Speed



Static timing analysis was used to measure the critical path.
(Timing analysis determines the maximum clock frequencies.)
FPGA: timing analysis tool in Quartus-II
ASIC: Synopsys PrimeTime
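The relation between the critical path delay reported by static timing analysis and the maximum clock frequency is simply the reciprocal. A minimal sketch, with a hypothetical function name and a made-up delay:

```python
def fmax_mhz(critical_path_delay_ns: float) -> float:
    """Convert a critical path delay in ns into a maximum clock frequency in MHz."""
    if critical_path_delay_ns <= 0.0:
        raise ValueError("critical path delay must be positive")
    # 1 / (delay in ns) gives GHz, so multiply by 1000 to get MHz.
    return 1000.0 / critical_path_delay_ns


# A 4 ns critical path corresponds to a 250 MHz maximum clock.
print(fmax_mhz(4.0))
```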
Power


Preferred approach: simulate the post-place-and-route design
with testbench vectors (not available for most designs).
Statistical vectorless estimation: estimate toggle rates and
static probabilities at nodes
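To show how toggle rates feed a dynamic power estimate, the sketch below applies the textbook switching-power model, in which each transition on a node dissipates (1/2)·C·Vdd². This is only an illustration. The actual models inside the vendor power tools are not described in the slides, and the node values here are made up.

```python
from typing import Iterable, Tuple


def dynamic_power_watts(nodes: Iterable[Tuple[float, float]], vdd_volts: float) -> float:
    """Sum the switching power over all nodes.

    Each node is (capacitance_farads, toggle_rate_per_second), where the toggle
    rate counts transitions per second in either direction.
    """
    return sum(0.5 * cap * vdd_volts ** 2 * toggles for cap, toggles in nodes)


# Hypothetical example: two 10 fF nodes at 1.2 V toggling 100M and 50M times/s.
example_nodes = [(10e-15, 100e6), (10e-15, 50e6)]
print(dynamic_power_watts(example_nodes, 1.2))
```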
11
Results
Area




Area ratio
Hard heterogeneous blocks
significantly reduce the
area gap.
Heterogeneous blocks are
fundamentally similar to an
ASIC implementation, except for
their programmable interface.
FPGAs take 40 times more
area than ASICs when only
logic is used.
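An average gap across benchmarks can be computed from per-design FPGA/ASIC ratios. The sketch below uses a geometric mean, which is a common choice for averaging ratios. Treat the choice of mean as an assumption, since the slides do not say how the average was formed, and note that the numbers are made up.

```python
import math
from typing import Sequence


def geometric_mean_ratio(fpga_values: Sequence[float],
                         asic_values: Sequence[float]) -> float:
    """Return the geometric mean of the per-benchmark FPGA/ASIC ratios."""
    if len(fpga_values) != len(asic_values) or not fpga_values:
        raise ValueError("need two non-empty sequences of equal length")
    ratios = [fpga / asic for fpga, asic in zip(fpga_values, asic_values)]
    # Averaging the logarithms avoids overflow when multiplying many ratios.
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


# Hypothetical areas (arbitrary units) for two benchmarks: ratios of 40 and 10.
print(geometric_mean_ratio([40.0, 10.0], [1.0, 1.0]))
```

The same helper applies unchanged to delay or dynamic power values.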
12
Results
Speed




Ratio of the FPGA’s
critical path delay to
the ASIC’s for each module.
ASICs are designed for the
worst-case process. It is fairer
to compare ASIC
performance to that of the
slowest FPGA speed grade.
The slower speed grade parts
cause a larger performance
gap.
FPGAs are 4.3 times slower
than ASICs when only logic
is used.
13
Results
Power




Ratio of FPGA dynamic
power consumption to ASIC
dynamic power consumption
FPGAs consume 12 times
more dynamic power than
ASICs when only logic is used.
For static power, no useful
conclusions could be drawn.
(However, the static power gap
and the area gap are
correlated.)
14
Conclusion
This paper has presented empirical measurements
quantifying the gap between FPGAs and ASICs.
For logic-only circuits, FPGAs show on average 40
times larger area, 3.2 times slower speed, and 12
times more dynamic power consumption than ASICs.
The use of hard multipliers and dedicated memories
enables a substantial reduction in the area and power
consumption gaps but has a relatively minor impact on the
delay difference.
15
The End
Thank You
Q&A
16