Topics ■ ■ CAD systems. ■ Simulation. ■ Placement and routing. ■ Layout analysis. Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 CAD systems Tools aren’t very useful if they don’t talk to each other. ■ Design interchange languages: – VHDL (TM), Verilog (TM) (function and structure); – EDIF (netlists); – GDS, CIF (masks). Page 1 Copyright 1998 Prentice Hall PTR Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 CAD tool interactions xlate a tool 1 tool 2 tool 1 tool 2 xlate b xlate c xlate d database tool 4 tool 3 database (hub-and-spoke) ■ Often want to iteratively improve design. ■ Back annotation updates a more-abstract design with information from later design stages. – Example: annotate logic schematic with extracted parasitic Rs and Cs. Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 tool 4 translator Page 3 Copyright 1998 Prentice Hall PTR Copyright 1998 Prentice Hall PTR Back annotation xlate e tool 3 Page 2 ■ Back annotation requires tools to know more about each other. Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 Page 4 Copyright 1998 Prentice Hall PTR Event-driven simulation ■ Event-driven simulation is designed for digital circuit characteristics: – small number of signal values; – relatively sparse activity over time. ■ Event-driven simulators try to update only those signals which change in order to reduce CPU time requirements. Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 Page 5 Copyright 1998 Prentice Hall PTR Event-driven simulation example Event-driven simulator structure ■ An event is a change in a signal value. ■ A timewheel is a queue of events (data structure is circular, hence the term wheel). ■ Simulator traces structure of circuit to determine causality of events—event at input of one gate may cause new event at gate’s output. Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 Copyright 1998 Prentice Hall PTR Event-driven simulation example, cont’d ■ A Page 6 Events at primary inputs: – A changes at t=1; – B changes at t=2. C D ■ B Immediate causality: – C changes at t=3 when both inputs to NOR are 0. logic network behavior ■ Event propagation: – D changes at t=4. Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 Page 7 Copyright 1998 Prentice Hall PTR Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 Page 8 Copyright 1998 Prentice Hall PTR Delay models ■ Unit-delay simulators assume that each component has a one-unit delay. Model function but not performance. ■ Variable-delay simulators allow each component to have its own delay. Accuracy of performance estimates from variabledelay simulators depends on how well circuits can be extracted to digital model. Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 Page 9 Copyright 1998 Prentice Hall PTR Switch simulation ■ Special type of event-driven simulation optimized for MOS transistors. ■ Treats transistor as switch. Takes capacitance into account to model charge sharing, etc. ■ Can also be enhanced to model transistor as resistive switch. Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 Page 10 Copyright 1998 Prentice Hall PTR Switch simulation example, cont’d Switch simulation example ■ Node g may be connected to either power supply, but signals on that node are terminated by gate of transistor. ■ To solve for values of a and b nodes, must first solve for value of g node. – If g=1, then a=b. – If g=0, other parts of circuit determine a and b independently. Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 Page 11 Copyright 1998 Prentice Hall PTR Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 Page 12 Copyright 1998 Prentice Hall PTR Switch simulation and charge sharing ■ Closed transistor connects source and drain nodes. Want to determine voltages of source/drain nodes taking into account capacitance. ■ Capacitance determines node size. Use size of connected nodes to determine new value of nodes. ■ Result may be X (unknown). Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 Page 13 Copyright 1998 Prentice Hall PTR Placement metrics ■ Layout synthesis ■ Two critical phases of layout design: – placement of components on the chip; – routing of wires between components. ■ Placement and routing interact, but separating layout design into phases helps us understand the problem and find good solutions. Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 Page 14 Copyright 1998 Prentice Hall PTR Wire length as a quality metric Quality metrics for layout: – area; – delay. ■ Area and delay deterined in part by wiring. ■ How do we judge a placement without wiring? Estimate wire length without actually performing routing. bad placement Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 Page 15 Copyright 1998 Prentice Hall PTR Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 good placement Page 16 Copyright 1998 Prentice Hall PTR Wire length measures ■ Estimate wire length by distance between components. ■ Possible distance measures: – Euclidean distance (sqrt(x2 + y2)); – Manhattan distance (x + y). ■ Multi-point nets must be broken up into trees for good estimates. Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 Page 17 Copyright 1998 Prentice Hall PTR Placement by partitioning Placement techniques ■ Can construct an initial solution, improve an existing solution. ■ Pairwise interchange is a simple improvement metric: – Interchange a pair, keep the swap if it helps wire length. – Heuristic determines which two components to swap. Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 Page 18 Copyright 1998 Prentice Hall PTR Min-cut bisecting partitioning ■ Works well for components of fairly uniform size. ■ Partition netlist to minimize total wire length using min-cut criterion. ■ Partitioning may be interpreted as 1-D or 2D layout. Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 Page 19 Copyright 1998 Prentice Hall PTR Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 Page 20 Copyright 1998 Prentice Hall PTR Min-cut bisecting partitioning, cont’d ■ ■ Swapping A and B: Powerful but CPU-intensive optimization technique. ■ Analogy to annealing of metals: – B drags 1 net; – A drags 3 nets; – total cut increase: 4 nets. ■ Simulated annealing – temperature determines probability of a component jumping position; – probabilistically accept moves. – start at high temperature, cool to lower temperature to try to reach good placement. Conclusion: probably not a good swap, but must be compared with other pairs. Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 Page 21 Copyright 1998 Prentice Hall PTR Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 Routing ■ Net ordering is a major problem. Order in whch nets are routed determines quality fo result. Net ordering is a heuristic. Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 Page 23 Copyright 1998 Prentice Hall PTR Maze (or area) routing ■ Major phases in routing: – global routing assigns nets to routing areas; – detailed routing designs the routing areas. ■ Page 22 Copyright 1998 Prentice Hall PTR Will find shortest path for a single wire, if such a path exists. ■ Two phases: – Label nodes with distance, radiating from source. – Use distances to trace from sink to source, choosing a path that always decreases distance to source. Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 Page 24 Copyright 1998 Prentice Hall PTR Maze routing example Mask-Level Layout Tools ■ Layout compaction: push mask polygons as close to each other respecting design rules. ■ Design-rule checking: verify that mask patterns obey the design rules. ■ Layout extraction: reconstruct transistor circuit from the layout including parasitics. ■ Layout versus schematic (LVS): compare extracted transistor netlist with original one. Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 Page 25 Copyright 1998 Prentice Hall PTR Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 Topics ■ Timing analysis. ■ Logic synthesis. ■ Sequential machine optimizations. ■ Hardware/software co-design. Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 Page 27 Copyright 1998 Prentice Hall PTR Page 26 Copyright 1998 Prentice Hall PTR Timing analysis ■ Unlike simulation, timing analysis is valueindependent—doesn’t require specifying inputs. ■ Simulation can be optimistic—you may not apply worst-case input vector. ■ Timing analysis can be pessimistic, but that is safer than optimistic. Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 Page 28 Copyright 1998 Prentice Hall PTR Signal delay example Must apply worst case to find longest delay: Timing analysis procedure ■ Two major steps: – build graph with elemental delays; – traverse graph to find longest path. ■ Must model 0-1 and 1-0 delays independently for more accurate total delay. ■ Use value analysis to prune impossible paths. Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 Page 29 Copyright 1998 Prentice Hall PTR Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 Switch circuit example Page 30 Copyright 1998 Prentice Hall PTR Switch circuit example, cont’d ■ Make assumptions about primary inputs. ■ Primitive path delay: RC delay from power supply or primary input to transistor gate or primary output. ■ Primitive path (p0, p1, p2, p3) delays computed from RC analysis. ■ Each path forms an edge in timing analysis graph. Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 Page 31 Copyright 1998 Prentice Hall PTR Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 Page 32 Copyright 1998 Prentice Hall PTR Switch circuit example, cont’d ■ Timing analysis graph is analyzed to find worst-case delay through entire circuit. ■ Timing graph structure: – nodes are sources and sinks of primitive delay paths; – edges represent primitive delay paths. Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 Page 33 Copyright 1998 Prentice Hall PTR Current/signal flow analysis Timing analysis pessimism ■ False paths create unexcercisable paths which make delay pessimistic. Can be identified using analysis algorithms. ■ Some transistor configurations only allow current flow in one direction—other direction of current/signal flow is a false path. Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 Page 34 Copyright 1998 Prentice Hall PTR False current path example shift1 and shift2 are never simultaneously active. Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 Page 35 Copyright 1998 Prentice Hall PTR Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 Page 36 Copyright 1998 Prentice Hall PTR False current path example, cont’d ■ Many paths in barrel shifter cannot be exercised. ■ Driver gates enforce current flow in one direction on data lines, eliminating some paths through the pass transistors. ■ Path analysis which does not take into account feasible current flow will identify infeasible long paths. Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 Page 37 Copyright 1998 Prentice Hall PTR Transistor sizing ■ Once transistor-level critical path has been identified, transistors can be sized to optimize delay. ■ Transistor sizing is cast as optimization problem to meet performance goal while minimizing total active area. Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 Logic synthesis ■ Goal: create a logic gate network which performs a given set of functions. ■ Input is Boolean formulas; gates also implement Boolean functions. ■ Logic synthesis: – maps onto available gates; – restructures for delay, area, testability, power, etc. Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 Page 39 Copyright 1998 Prentice Hall PTR Page 38 Copyright 1998 Prentice Hall PTR Logic synthesis phases ■ Technology-independent optimizations work on logic representations that do not directly model logic gates. ■ Technology-dependent optimizations work in the available set of logic gates. ■ Transformation from technologyindependent to technology-dependent is called library binding or technology mapping. Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 Page 40 Copyright 1998 Prentice Hall PTR Boolean network Boolean network example ■ A Boolean network is the main representation of the logic functions for technology independent optimizations. ■ Each node can be represented as sum-ofproducts (or product-of-sums). ■ Provides multi-level structure, but functions in the network need not correspond to logic gates. Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 Page 41 Copyright 1998 Prentice Hall PTR Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 Support: set of variables used by a function. ■ Transitive fanout: all the primary outputs and intermediate variables of a function. ■ Transitive fanin: all the primary inputs and intermediate variables used by a function. Transistive fanin determines a cone of logic. primary inputs Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 cone Page 43 Copyright 1998 Prentice Hall PTR Technology-independent logic optimization Terms ■ Page 42 ■ Simplification rewrites node to simplify its form. ■ Network restructuring introduces new nodes for common factors, collapses several nodes into one new node. ■ Delay restructuring changes factorization to reduce path length. output Copyright 1998 Prentice Hall PTR Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 Page 44 Copyright 1998 Prentice Hall PTR Cost in the Boolean network ■ Don’t know exact gate structure, but can estimate final network cost: – area estimated by number of literals (true or complement forms of variables); – delay estimated by path length. Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 Page 45 Copyright 1998 Prentice Hall PTR Simplification ■ Rewrites a node to reduce the number of literals in the node. ■ Function defined by: – on-set: set of inputs for which output is 1; – off-set: set of inputs for which output is 0; – don’t-care-set: set of inputs for which output is don’t-care. Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 Page 46 Functions, covers, and cubes Copyright 1998 Prentice Hall PTR Cover example ■ Each way to write a function as a sum-ofproducts is a cover since it covers the onset. ■ A cover is composed of cubes—product terms which define a subspace cube in the function space. Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 Page 47 Copyright 1998 Prentice Hall PTR Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 Page 48 Copyright 1998 Prentice Hall PTR Covers and optimizations ■ ■ Larger cover: – x1’ x2’ x3’ + x1 x2’ x3’ + x1’ x2 x3’ + x1 x2 x3 – requires four cubes, 12 literals. ■ Partially-specified functions Smaller cover: Don’t-cares can be implemented in either the on-set or off-set. ■ Don’t-cares provide the greatest opportunities for minimization in many cases. – x2’ x3’ + x1’ x3’ + x1 x2 x3 – requires three cubes, seven literals; – x1’ x2 x3’ is covered by two cubes. Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 Page 49 Copyright 1998 Prentice Hall PTR Partially-specified function example: problem Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 Page 50 Copyright 1998 Prentice Hall PTR Partially-specified function example: cover = element of ON-SET = element of ON-SET = element of DC-SET = element of DC-SET Solution requires: two cubes, three literals. Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 Page 51 Copyright 1998 Prentice Hall PTR Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 Page 52 Copyright 1998 Prentice Hall PTR Espresso Partial collapsing ■ Well-known two-level logic optimizer. ■ Espresso optimization loop (see p. 479): – expand; – make irredundant; – reduce. ■ a before Page 53 Copyright 1998 Prentice Hall PTR Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 Factorization ■ ■ Based on division: Algebraic division: don’t take into account Boolean simplification (e.g. a· 1=1). Less expensive than Boolean division. Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 F f2 Page 55 Copyright 1998 Prentice Hall PTR b c d e f g after Page 54 Copyright 1998 Prentice Hall PTR Factorization using division – formulate candidate divisor; – test how it divides into the function; – if g = f/c, we can use c as an intermediate function for f. ■ o2 f3 Optimization loop is designed to refine cover to reduce its size. Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 o1 Three steps: – generate potential common factors and compute literal savings if used; – choose factors to substitute into network; – restructure the network to use the new factors. ■ Algebraic/Boolean divison can be used to implement first step. Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 Page 56 Copyright 1998 Prentice Hall PTR Factorization for delay ■ Remove factors from critical path, add them off critical path: Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 Page 57 Copyright 1998 Prentice Hall PTR Library binding ■ Also known as technology mapping. ■ Rewrites Boolean network in terms of available logic functions. ■ Can optimize for both area and delay. ■ Can be viewed as a pattern matching problem. Tries to find pattern match which minimizes area/delay cost. Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 Technology mapping procedure ■ Write Boolean network in canonical NAND form. ■ Write each library gate in canonical NAND form. Assign cost to each library gate. ■ If network is a tree, can use dynamic programming to select minimum-cost cover of network by library gates. Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 Page 59 Copyright 1998 Prentice Hall PTR Page 58 Copyright 1998 Prentice Hall PTR Breaking into trees ■ Not optimal, but reasonable cuts usually work OK. Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 Page 60 Copyright 1998 Prentice Hall PTR Technology mapping example, cont’d Technology mapping example after three levels of matching Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 Page 61 after four levels of matching Copyright 1998 Prentice Hall PTR Technology mapping example, cont’d ■ Try to cover tree from primary inputs to primary outputs. ■ Proceed one gate at a time. At the next level, select minimum-cost cover at that point. ■ Latest selection may require removing some earlier matches. Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 Page 63 Copyright 1998 Prentice Hall PTR Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 Page 62 Copyright 1998 Prentice Hall PTR Sequential machine optimizations ■ State assignment: choose codes for states. ■ Create common factors in states by assigning them close codes: – s0 = 110 (x0 x1 x2’); s1 = 111 (x0 x1 x2); – s0 + s1 = x0 x1 . ■ Program KISS: minimizes symbolic state machine to find states which should be coded close together. Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 Page 64 Copyright 1998 Prentice Hall PTR Coding procedure ■ Symbolic minimization creates constraints on coding—sets of states which should be coded closely together. ■ If all constraints cannot be satisfied in minimum number of bits, can add bits to code to allow constraints to be satisfied. Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 Page 65 Copyright 1998 Prentice Hall PTR Adding code bits code groups: s1, s2 s2, s3 s3, s4, s2 3-bit encoding necessary if states in all code groups should have distance 1. Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 Hardware/software co-design ■ Use programmable CPUs along with specialized logic blocks to implement a particular application. ■ CPUs can implement background functions much more efficiently than dedicated logic: higher utilization of logic. ■ Important in systems-on-silicon. Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 Page 67 Copyright 1998 Prentice Hall PTR Page 66 Copyright 1998 Prentice Hall PTR Co-synthesis styles ■ Hardware/software partitioning: – Architectural template consists of 1 CPU + n ASICs. – Put operations on CPU or ASIC. ■ Distributed system synthesis: – No fixed architectural template. – Allocate function units and communication, schedule operations. Revised by SG: February 25, 2002 Modern VLSI Design 2e: Chapter 10 Page 68 Copyright 1998 Prentice Hall PTR