ECE 448 Lecture 16
ASIC Front-End Design
ECE 448 – FPGA and ASIC Design with VHDL
George Mason University

Two competing implementation approaches

ASIC (Application Specific Integrated Circuit)
• designed all the way from behavioral description to physical layout
• designs must be sent for expensive and time-consuming fabrication in a semiconductor foundry

FPGA (Field Programmable Gate Array)
• no physical layout design; design ends with a bitstream used to configure a device
• bought off the shelf and reconfigured by designers themselves

FPGAs vs. ASICs

ASICs
• High performance
• Low power
• Low cost (but only in high volumes)

FPGAs
• Off-the-shelf
• Low development costs
• Short time to market
• Reconfigurability

ASIC Design Example – Factoring circuit/GMU
[Figure: chip layout with Global Memory and Local Memory blocks labeled]

ASIC 130 nm vs. Virtex II 6000, Factoring/GMU
• Area of Xilinx Virtex II 6000 FPGA: 19.68 mm × 19.80 mm (estimation by R.J. Lim Fong, MS Thesis, VPI, 2004)
• Area of an ASIC with equivalent functionality: 2.7 mm × 2.82 mm
• Area ratio: 51x

ASICs vs. FPGAs
Source: I. Kuon, J. Rose, University of Toronto, “Measuring the Gap Between FPGAs and ASICs,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 26, no. 2, Feb 2007.
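The 51x area figure quoted for the factoring circuit can be checked directly from the die dimensions given on that slide. The Python sketch below only does that arithmetic; the variable names are illustrative and not part of the lecture, and the FPGA dimensions are themselves an estimate (R.J. Lim Fong, 2004).

```python
# Die dimensions in mm, copied from the slide "ASIC 130 nm vs. Virtex II 6000".
# Variable names are illustrative assumptions, not taken from the lecture.
FPGA_WIDTH_MM, FPGA_HEIGHT_MM = 19.68, 19.80  # Xilinx Virtex II 6000 (estimated)
ASIC_WIDTH_MM, ASIC_HEIGHT_MM = 2.7, 2.82     # 130 nm ASIC, equivalent functionality

# Area of each die, then the FPGA-to-ASIC area ratio.
fpga_area_mm2 = FPGA_WIDTH_MM * FPGA_HEIGHT_MM
asic_area_mm2 = ASIC_WIDTH_MM * ASIC_HEIGHT_MM
area_ratio = fpga_area_mm2 / asic_area_mm2

print(f"FPGA area: {fpga_area_mm2:.2f} mm^2")
print(f"ASIC area: {asic_area_mm2:.2f} mm^2")
print(f"FPGA/ASIC area ratio: {area_ratio:.1f}x")
```

The ratio comes out at roughly 51, which matches the value on the slide.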

Simplified ASIC Design Flow

Front-End Design
• Synthesis
• Timing Analysis

Back-End Design
• Floorplanning
• Placement
• Clock Tree Synthesis
• Routing
• Design for Manufacturing

Major ASIC Toolsets
• Cadence
• Magma

Simplified ASIC Design Flow: Synopsys Tools

Front-End Design
• Synthesis: Design Analyzer
• Timing Analysis: PrimeTime

Back-End Design (Astro)
• Floorplanning
• Placement
• Clock Tree Synthesis
• Routing
• Design for Manufacturing

A Complete Placed and Routed Chip
[Figure: placed and routed chip with an IP block labeled]

What is “Physical Layout”?
[Figure: the same circuit in transistor (device) view and in physical (layout) view, with VDD, GND, IN, OUT, PMOS and NMOS labeled]
Physical Layout – Topography of devices and interconnects, made up of polygons that represent different layers of material (diffusion, polysilicon, metal, contact, etc.)

Process of Device Fabrication
• Devices are fabricated vertically on a silicon substrate wafer by layering different materials in specific locations and shapes on top of each other
• Each of many process masks defines the shapes and locations of a specific layer of material (diffusion, polysilicon, metal, contact, etc.)
• Mask shapes, derived from the layout view, are transformed to silicon via photolithographic and chemical processes
[Figure: silicon substrate shown in layout or mask (aerial) view and in wafer (cross-sectional) view]

Wafer Representation of Layout Polygons
[Figure: 0.25 um layout in aerial (layout) view and wafer cross-sectional view, with Input, Output, VDD, GND, PMOS and NMOS labeled]

Front-End Design Flow

Simplified RTL Synthesis
[Flowchart] Write RTL HDL code → Simulate (OK? No: iterate; Yes: continue) → Synthesize RTL code to gates (gate-level netlist) → Constraints met? (No: iterate; Yes: continue) → Gate-level testing OK? (No: iterate; Yes: continue) → Proceed with back-end processing

VHDL vs. Verilog

VHDL                     Verilog
Government developed     Commercially developed
Ada based                C based
Strongly typed           Mildly typed
Difficult to learn       Easy to learn
More powerful            Less powerful

Logic Synthesis

VHDL description:

  architecture MLU_DATAFLOW of MLU is
    signal A1: STD_LOGIC;
    signal B1: STD_LOGIC;
    signal Y1: STD_LOGIC;
    signal MUX_0, MUX_1, MUX_2, MUX_3: STD_LOGIC;
  begin
    A1 <= A when (NEG_A='0') else not A;
    B1 <= B when (NEG_B='0') else not B;
    Y  <= Y1 when (NEG_Y='0') else not Y1;
    MUX_0 <= A1 and B1;
    MUX_1 <= A1 or B1;
    MUX_2 <= A1 xor B1;
    MUX_3 <= A1 xnor B1;
    with (L1 & L0) select
      Y1 <= MUX_0 when "00",
            MUX_1 when "01",
            MUX_2 when "10",
            MUX_3 when others;
  end MLU_DATAFLOW;

[Figure: circuit netlist produced from the VHDL description]

TCL – Tool Command Language
• Created by John Ousterhout of UC Berkeley
• Scripting language
  • Makes it very simple to automate routine tasks.
• Extension language
  • Used to customize tools with user- or company-specific applications.
• Nearly all modern EDA tools have a TCL interface.
• Very simple to learn and use.

TCL Example

  # Create the directory if it does not exist, then check that it is
  # a writable directory. Exits with status 1 on failure, returns 1 on success.
  # Note: echo is a command of the tool shell; plain tclsh would use puts.
  proc rfmdIfNotDirMkdir { directory } {
    if {! [file exists $directory]} {
      file mkdir $directory;
    }
    if {! [file isdirectory $directory]} {
      echo "Could not make \"$directory\"";
      exit 1;
    } elseif {! [file writable $directory]} {
      echo " \"$directory\" is not writable";
      exit 1;
    } else {
      return 1;
    }
  }

TCL References
• Practical Programming in Tcl and Tk, by Brent B.
Welch and Ken Jones
• Tcl/Tk in a Nutshell, by Paul Raines and Jeff Tranter

Basic Synthesis Flow
[Figure: basic synthesis flow]

Synthesis using Design Compiler
[Figures: synthesis using Design Compiler]

Synthesis script (1)

  designer = "Pawel Chodowiec"
  company = "George Mason University"
  search_path = "./opt3/synopsys/TSMCHOME/digital/Front_End/timing_power/tcb013ghp_200a "
  link_library = "* tcb013ghptc.db"
  /* Typical case library */
  target_library = "tcb013ghptc.db "
  symbol_library = "tcb013ghp.sdb "

  /* Directory configuration */
  src_directory = ~/exam1/vhdl/
  report_directory = ~/exam1/reports/
  db_directory = ~/exam1/db/

Synthesis script (2)

  /* Packages can only be read */
  read_file -format vhdl -rtl src_directory + "components.vhd"

  blocks = {regne, upcount, RAM_16Xn_DISTRIBUTED, exam1}
  foreach (block, blocks) {
    block_source = src_directory + block + ".vhd"
    read_file -format vhdl -rtl block_source
    analyze -format vhdl -lib WORK block_source
  }
  current_design block
  /* All commands now apply to the entity "exam1" */

Synthesis script (3)

  uniquify
  /* Creates unique instances of multiply referenced entities */
  link
  check_design
  /* Checks the current design for consistency */

  /*******************************************/
  /* apply block attributes and constraints */
  /*******************************************/
  create_clock -period 10 clk
  /* Defines that the port "clk" of the current design is the clock for the design.
     Period = 10 ns, 50% duty cycle.
     Use the -waveform option to define a duty cycle other than 50%. */
  set_operating_conditions NCCOM
  /* Normal Case Commercial Operating Conditions */

Synthesis script (4)

  /***************************************************/
  /* Apply these constraints to the top-level entity */
  /***************************************************/
  set_max_fanout 100 block
  set_clock_latency 0.1 find(clock, "clk")
  set_clock_transition 0.01 find(clock, "clk")
  set_clock_uncertainty -setup 0.1 find(clock, "clk")
  set_clock_uncertainty -hold 0.1 find(clock, "clk")
  set_load 0 all_outputs()
  set_input_delay 1.0 -clock clk -max all_inputs()
  set_output_delay -max 1.0 -clock clk all_outputs()
  set_wire_load_model -library tcb013ghptc -name "TSMC8K_Fsg_Conservative"

Wireload model basics (1)
[Figure: wireload model basics]

Wireload model basics (2)
[Figure: wireload model basics]

Synthesis script (5)

  set_dont_touch block
  compile -map_effort medium
  change_names -rules vhdl
  vhdlout_architecture_name = "sort_syn"
  vhdlout_use_packages = {"IEEE.std_logic_1164"}
  write -f db -hierarchy -output db_directory + "exam1.db"
  /*write -f vhdl -hierarchy -output db_directory + "exam1_syn.vhd"*/
  report -area > report_directory + "exam1.report_area"
  report -timing -all > report_directory + "exam1.report_timing"

Results of synthesis

Area report after synthesis (1)

  report_area
  Information: Updating design information...
  (UID-85)
  ****************************************
  Report : area
  Design : exam1
  Version: V-2003.12-SP1
  Date   : Tue Nov 15 20:39:06 2005
  ****************************************
  Library(s) Used:
      tcb013ghptc (File: /opt3/synopsys/TSMCHOME/digital/Front_End/timing_power/tcb013ghp_200a/tcb013ghptc.db)

Area report after synthesis (2)

  Number of ports:          75
  Number of nets:          346
  Number of cells:         107
  Number of references:     28

  Combinational area:       10593.477539
  Noncombinational area:    14295.521484
  Net Interconnect area:    undefined (Wire load has zero net area)

  Total cell area:          24888.976562
  Total area:               undefined

Critical Path (1)
• Critical Path – the longest path from outputs of registers to inputs of registers
[Figure: in → D/Q flip-flop → logic (delay t_{logic}) → D/Q flip-flop → out, both flip-flops clocked by clk]
t_{Critical} = t_{FF-P} + t_{logic} + t_{FF-setup}

Critical Path (2)
• Min. Clock Period = Length of the Critical Path
• Max. Clock Frequency = 1 / Min. Clock Period

Clock Jitter
• The rising edge of the clock does not occur precisely periodically
• May cause faults in the circuit
[Figure: clk waveform]

Clock Skew
• The rising edge of the clock does not arrive at the clock inputs of all flip-flops at the same time
[Figure: two flip-flop chains (in → D/Q → D/Q → out) with a delay element in the clk path]

Timing report after synthesis (1)

  ****************************************
  Report : timing
          -path full
          -delay max
          -max_paths 1
  Design : exam1
  Version: V-2003.12-SP1
  Date   : Tue Nov 15 20:39:06 2005
  ****************************************
  Operating Conditions: NCCOM   Library: tcb013ghptc
  Wire Load Model Mode: segmented

Timing report after synthesis (2)

  Startpoint: in_addr(1) (input port clocked by clk)
  Endpoint: RegSUM/Q_reg[34] (rising edge-triggered flip-flop clocked by clk)
  Path
  Group: clk
  Path Type: max

  Des/Clust/Port           Wire Load Model            Library
  ------------------------------------------------------------------
  exam1                    TSMC8K_Fsg_Conservative    tcb013ghptc
  RAM_16Xn_DISTRIBUTED     ZeroWireload               tcb013ghptc
  exam1_DW01_cmp2_32_0     ZeroWireload               tcb013ghptc
  exam1_DW01_cmp2_32_1     ZeroWireload               tcb013ghptc
  exam1_DW01_add_35_0      ZeroWireload               tcb013ghptc
  regne_1                  ZeroWireload               tcb013ghptc
  regne_2                  ZeroWireload               tcb013ghptc
  regne_n35                ZeroWireload               tcb013ghptc

Timing report after synthesis (3)

  Point                                              Incr     Path
  ------------------------------------------------------------------
  clock clk (rise edge)                              0.00     0.00
  clock network delay (ideal)                        0.10     0.10
  input external delay                               1.00     1.10 f
  in_addr(1) (in)                                    0.00     1.10 f
  U98/Z (CKMUX2D1)                                   0.13     1.23 f
  Memory/ADDR[1] (RAM_16Xn_DISTRIBUTED)              0.00     1.23 f
  Memory/U41/ZN (INVD1)                              0.08     1.31 r
  Memory/U343/Z (OR3D1)                              0.10     1.41 r
  Memory/U338/ZN (INVD2)                             0.20     1.61 f
  Memory/U40/ZN (MOAI22D0)                           0.17     1.78 f
  Memory/U350/Z (OR4D1)                              0.26     2.03 f
  Memory/DATA_OUT[0] (RAM_16Xn_DISTRIBUTED)          0.00     2.03 f

Timing report after synthesis (4)

  add_96xplusxplus/B[0] (exam1_DW01_add_35_0)        0.00     2.03 f
  add_96xplusxplus/U9/Z (AN2D0)                      0.12     2.15 f
  add_96xplusxplus/U1_1/CO (CMPE32D1)                0.10     2.25 f
  add_96xplusxplus/U1_2/CO (CMPE32D1)                0.10     2.34 f
  add_96xplusxplus/U1_3/CO (CMPE32D1)                0.10     2.44 f
  add_96xplusxplus/U1_4/CO (CMPE32D1)                0.10     2.54 f
  add_96xplusxplus/U1_5/CO (CMPE32D1)                0.10     2.63 f
  add_96xplusxplus/U1_6/CO (CMPE32D1)                0.10     2.73 f
  add_96xplusxplus/U1_7/CO (CMPE32D1)                0.10     2.82 f
  add_96xplusxplus/U1_8/CO (CMPE32D1)                0.10     2.92 f
  add_96xplusxplus/U1_9/CO (CMPE32D1)                0.10     3.02 f
  add_96xplusxplus/U1_10/CO (CMPE32D1)               0.10     3.11 f
  add_96xplusxplus/U1_11/CO (CMPE32D1)               0.10     3.21 f
  add_96xplusxplus/U1_12/CO (CMPE32D1)               0.10     3.31 f
  add_96xplusxplus/U1_13/CO (CMPE32D1)               0.10     3.40 f
  add_96xplusxplus/U1_14/CO (CMPE32D1)               0.10     3.50 f

Timing report after synthesis (5)

  add_96xplusxplus/U1_15/CO (CMPE32D1)               0.10     3.60 f
  add_96xplusxplus/U1_16/CO (CMPE32D1)               0.10     3.69 f
  add_96xplusxplus/U1_17/CO (CMPE32D1)               0.10     3.79 f
  add_96xplusxplus/U1_18/CO (CMPE32D1)               0.10     3.88 f
  add_96xplusxplus/U1_19/CO (CMPE32D1)               0.10     3.98 f
  add_96xplusxplus/U1_20/CO (CMPE32D1)               0.10     4.08 f
  add_96xplusxplus/U1_21/CO (CMPE32D1)               0.10     4.17 f
  add_96xplusxplus/U1_22/CO (CMPE32D1)               0.10     4.27 f
  add_96xplusxplus/U1_23/CO (CMPE32D1)               0.10     4.37 f
  add_96xplusxplus/U1_24/CO (CMPE32D1)               0.10     4.46 f
  add_96xplusxplus/U1_25/CO (CMPE32D1)               0.10     4.56 f
  add_96xplusxplus/U1_26/CO (CMPE32D1)               0.10     4.66 f
  add_96xplusxplus/U1_27/CO (CMPE32D1)               0.10     4.75 f
  add_96xplusxplus/U1_28/CO (CMPE32D1)               0.10     4.85 f
  add_96xplusxplus/U1_29/CO (CMPE32D1)               0.10     4.94 f
  add_96xplusxplus/U1_30/CO (CMPE32D1)               0.10     5.04 f
  add_96xplusxplus/U1_31/CO (CMPE32D1)               0.10     5.14 f

Timing report after synthesis (6)

  add_96xplusxplus/U7/Z (AN2D0)                      0.10     5.24 f
  add_96xplusxplus/U5/Z (AN2D0)                      0.08     5.32 f
  add_96xplusxplus/U4/Z (CKXOR2D0)                   0.15     5.47 f
  add_96xplusxplus/SUM[34] (exam1_DW01_add_35_0)     0.00     5.47 f
  RegSUM/R[34] (regne_n35)                           0.00     5.47 f
  RegSUM/U32/Z (AO21D0)                              0.11     5.57 f
  RegSUM/Q_reg[34]/D (EDFQD1)                        0.00     5.57 f
  data arrival time                                           5.57

Timing report after synthesis (7)

  clock clk (rise edge)                             10.00    10.00
  clock network delay (ideal)                        0.10    10.10
  clock uncertainty                                 -0.10    10.00
  RegSUM/Q_reg[34]/CP (EDFQD1)                       0.00    10.00 r
  library setup time                                -0.12     9.88
  data required time                                          9.88
  ------------------------------------------------------------------
  data required time                                          9.88
  data arrival time                                          -5.57
  ------------------------------------------------------------------
  slack (MET)                                                 4.31

Static Timing Analysis

Static Timing Analysis Review
• Tools will calculate all paths from sequential start point to sequential end
point.
• The worst-case path will be used for setup analysis, and the best-case path will be used for hold analysis.
• All paths are considered for design rule checking.

Review of Setup and Hold Checks
[Figure: setup and hold checks]

False and Multicycle paths
• False path
  • Very slow signals, such as reset or test mode enable, that are not used under normal conditions are classified as false paths.
• Multicycle path
  • Paths that take more than one clock cycle are known as multicycle paths.
  • Multicycle paths have to be defined in the analyzer, which then takes those constraints into account when synthesizing.

Multicycle path - Example
[Figure: multicycle path example]

Optimization criteria

Degrees of freedom and possible trade-offs
• speed
• area
• power
• testability

Degrees of freedom and possible trade-offs
• speed
• latency
• area
• throughput

VHDL Coding for Synthesis

Recommended rules for Synthesis
• Do not split combinational paths across hierarchy
• Register all outputs
• Do not implement glue logic between blocks; partition them well
• Separate designs on functional boundaries
• Keep block sizes reasonable

Avoid hierarchical combinational blocks
[Figure: Block A (reg1, Combinatorial Logic1) → Block B (Combinatorial Logic2) → Block C (Combinatorial Logic3, reg2)]
Not recommended design practice:
• The path between reg1 and reg2 is divided between three different blocks.
• Due to hierarchical boundaries, optimization of the combinational logic cannot be achieved.
• Synthesis tools (Synopsys) maintain the integrity of the I/O ports, so combinational optimization cannot be achieved between blocks (unless “grouping” is used).

Recommended way to handle Combinational Paths
[Figure: Block A (reg1) → Block C (Combinatorial Logic1 & Logic2 & Logic3, reg2)]
Recommended practice:
• All the combinational circuitry is grouped in the same block, whose output is connected to the destination flip-flop.
• This allows optimal minimization of the combinational logic during synthesis.
• It also allows a simplified description of the timing interface.

Register all outputs
[Figure: Block X (reg1) driving two instances of Block Y (reg2, reg3)]
Registering all outputs simplifies the synthesis design environment:
• Inputs to the individual blocks arrive within the same relative delay (caused by wire delays).
• There is no real need to specify output requirements, since paths start at flip-flop outputs.
• Take care of fanouts: as a rule of thumb, keep the fanout to 16 (dependent on the technology and on the components being driven by the output).

NO GLUE LOGIC between blocks
[Figure: Top containing Block X (reg1) and Block Y (reg3)]
• No glue logic between blocks, no matter what the temptation.
• Under time pressure, when a bug is found that could be fixed simply by adding some glue logic: RESIST THE TEMPTATION!!!
• At this level in the hierarchy, this implementation will not allow the glue logic to be absorbed within any lower-level block.

Separate design with different goals
[Figure: Top containing a time-critical path driving reg1 and slow logic driving reg3]
• reg1 may be driven by a time-critical function, and hence will have different optimization constraints.
• reg3 may be driven by slow logic, so there is no need to constrain it for speed.

Optimization based on design requirements
[Figure: Top with a speed-optimized block (time-critical path, reg1) and an area-optimized block (slow logic)]
• Use different entities to partition design blocks.
• This allows different constraints during synthesis to optimize for area or speed or both.

Separate FSM with random logic
• Separation of the FSM and the random logic allows you to use FSM-optimized synthesis.
[Figure: Top containing an FSM block (reg1; use FSM optimization tool) and a Random Logic block (reg3; standard optimization techniques used)]

Maintain a reasonable block size
• Partition your design such that each block is between 1000 and 10000 gates (this is strictly tool and technology dependent).
• The larger the blocks, the longer the run time, so quick iterations cannot be done.
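
As a closing worked check, the setup slack shown in "Timing report after synthesis (7)" can be reproduced from the numbers in the report, and the rule "Max. Clock Frequency = 1 / Min. Clock Period" applied to it. The Python sketch below is only an illustration: the constant and function names are not from the lecture, and the minimum-period estimate assumes that the data arrival time stays at 5.57 ns when the clock period is reduced (the input delay is a fixed 1.0 ns constraint), which holds only for this one reported path and for pre-layout wire-load estimates.

```python
# Values in ns, copied from "Timing report after synthesis (6)" and "(7)".
# Names are illustrative assumptions, not part of the lecture or of any tool.
CLOCK_PERIOD = 10.00         # create_clock -period 10
CLOCK_NETWORK_DELAY = 0.10   # set_clock_latency 0.1
CLOCK_UNCERTAINTY = 0.10     # set_clock_uncertainty -setup 0.1
LIBRARY_SETUP_TIME = 0.12    # setup time of the capturing flip-flop
DATA_ARRIVAL_TIME = 5.57     # arrival at RegSUM/Q_reg[34]/D


def data_required_time(period):
    """Latest time at which data may arrive at the capturing flip-flop."""
    return period + CLOCK_NETWORK_DELAY - CLOCK_UNCERTAINTY - LIBRARY_SETUP_TIME


def setup_slack(period):
    """Setup slack = data required time - data arrival time."""
    return data_required_time(period) - DATA_ARRIVAL_TIME


slack = setup_slack(CLOCK_PERIOD)

# Assumption: arrival time does not change with the period, so the period
# can shrink by exactly the slack before this path fails setup.
min_period = CLOCK_PERIOD - slack
max_frequency_mhz = 1000.0 / min_period  # 1 / ns = GHz, so 1000 / ns = MHz

print(f"data required time: {data_required_time(CLOCK_PERIOD):.2f} ns")
print(f"slack:              {slack:.2f} ns")
print(f"min clock period:   {min_period:.2f} ns")
print(f"max clock freq:     {max_frequency_mhz:.1f} MHz")
```

The required time and slack agree with the report (9.88 ns and 4.31 ns). The derived minimum period of about 5.69 ns is an estimate for this path only; a real bound would come from re-running timing analysis with a tighter clock constraint.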