Standard Cell Architecture for High Frequency Operation
Peter Hsu, Ph.D.
Chief Architect, Microprocessor Development
Toshiba America Electronic Components, Inc.
Created 14 March 2001 at the University of Wisconsin in Madison

Disclaimer
The ideas, data and conclusions presented here are solely those of the Author, and do not in any way represent Toshiba Corporation policy or strategy.

Layout Architecture for High Frequency Operation

Introduction
High Frequency is Difficult!
– Many Issues:
  • Signal Integrity, Power Dissipation, ...
– My Approach:
  • Disciplined Methodology
  • Global Optimization
Outline
– Layout
– Circuits
– Analysis

Layout Strategy
Leverage Advanced Technologies
– Local Interconnect
– Flip-Chip Area Array I/O
CAD Tool Compatibility
– Parasitic Estimation, Extraction
Complex, High Frequency Designs
– Robust Power Grid
– Flexible Macro Embedding

Metal Usage
Dimensions are for nominal 0.12µm generation process
[Figure: metal layer usage. Labels: Top Metal: Flip-Chip Solder Pads; Global Wires; VDD; Signal; Clock (2x); Short Via; VSS; Contact; Local Interconnect (M0): Tungsten, Aluminum or Copper. Dimension labels: 300nm, 300nm, 600nm, 450nm, 900nm, 450nm, 300nm, 200nm, 150nm, 150nm.]

Standard Cell Layout
[Figure: two cell rows with cells U1 and U2 (pins A, Z). Labels: Minimum Power Rail 6 Tracks From Edge; Unrelated Wire; U1.A; U1.Z; U2.A; U2.Z; VDD; VSS; Cell Row Power Vias (1 every 6 Tracks); Crosspoint Power Vias; Local Interconnect; Minimum Cell 3 Tracks; Minimum Pin Width 2 Tracks; Pins Must Stagger; Smallest Cell 13 Tracks; Double Height Cell.]

Area Array I/O
[Figure: flip-chip area array with embedded SRAM macro. Labels: Core VSS; I/O VSS; Core VDD; I/O VDD; Signal; 1.2m; 225µm pitch; 225µm; 5; I/O Macro (50Kµm²); Largest SRAM Macro without sacrificing I/O (16 KBytes); Cell 2.5µm². SRAM macro labels: 640µm; 670µm; Cell Array, 256 Rows, 256 Columns (twice); Decoder; Sense Amp. (twice); 538µm; 102µm; 307µm; 56µm.]

I/O Macro Cell
Self-Contained
– 5 Signals
– VDDQ, VSSQ
– ESD Protection
– Latch-Up Ring
SoC Flexibility
– Many I/O Types
– Different Voltages
Routing Porosity
– 50% Channels Free in Global Wiring Layers
– Short Output Trace on Top Metal (Electromigration)
[Figure labels: Top Metal; M6; M5; M4; M3; Free Routing Channels; I/O Macro Uses M0+M1+M2.]

SRAM Metal Usage
[Figure: SRAM cell metal layers. Labels: Bit Lines; VSS; VDD; VSS; Signals; VSS; Word Line #1; VDD; Word Line #2; M2; M1; 6-Transistor Cell (1.2 × 2.1 µm); SRAM Macro Uses M0+M1+M2; M3 Global Wires (1 or 2 Pitch); CAD Tool Inserts M3:M2 Power Vias.]

Word Line Shielding
Zigzag Minimizes Coupling from M3 Signals to M2 Word Lines when SRAM is Rotated
[Figure labels: Blocked Tracks; M3 Global Wires; Bit Lines; Cell Array; Sense Amp.; Decoder; Signals; VDD; Signals; Signals; VSS.]

Rationale
“Effective Area”
– Actual Footprint + Routing Disturbance
– Larger, More Porous Layout is Faster
  • Bigger Transistors
  • More Space around Bit Lines
  • Shielding
SoC
– Complex Microarchitecture
– Many Small SRAMs

Circuit Design
Building Blocks
– Latch Array
  • Malleable, Porous, Multi-Port SRAM
– Dynamic Wire-OR Gate
  • High Fan-in, Safe, CAD Compatible
Power Dissipation
– Double Edge Flipflop
  • 50% Clock Tree, 30% Peak Chip-Wide
– Interpolation Cells

Latch Array
[Figure: latch array schematic. Labels: Write Address; Write Enable; Write Data; Read Address; Read Data; Decoder (write and read); Write Pulse Generator; Latch + Tristate Driver; Combinatorial Read Path; Test Mode; May Buffer during Place&Route; CK; pin labels D, Q, G, E.]

Dynamic Wire-OR Gate
Highest Leverage
– Dynamic vs. Static Keeper
Safe, CAD Compatible
– Limit Wire Length using Timing Driven Placement
– No Dynamic Inputs
[Figure labels: Sized for Max. Length; Receiver Cell; Output; Clock; Max. Length by Max-Load, Max-Transition Spec.; Input D1; Input DN; Sized for 1; Limit Max. N by Clock Max-Fanout Spec.; Clock; Sized for Max-Fanout; Driver Cell; pin labels D, Q, G.]

Double-Edge Flipflop
Low Power
– Clock ½ Frequency
– Light Clock Load
  • 2 Large + 4 Small
Small, Fast§
– 15P + 15N Transistors
Safe, Flexible
– Fully Static
– Supports Scan
[Figure labels: D; Q; Ck; Switching Nodes with Constant “1” Data.]
§ B. Nikolic, et al., “Sense Amplifier-Based Flip-Flop,” ISSCC 1999.

Interpolation Cells
For Post Route In-Place Optimization
Same Footprint, Shorter Transistors
[Figure labels: 2/3 Power; 5/6 Power; Full Power; 1X Cell; 2X Cell; 4X Cell.]

Analysis
Signal Integrity
– Parasitics “Accurate By Construction”
  • Uniform Metal Density
  • Majority Coupling to Power Rails (Shielding)
Speed Yield
– Balanced with Resources
  • Area, Power, Design Time
– Goal: Adequate Confidence

Uniform Metal Density
[Figure: cells U1 and U2 (pins A, Z) between VDD and VSS. Labels: Post Route Metal Usage; Algorithmically Generated Filled Metal; Uniform Density on all Layers (except Local Interconnect).]

Advantages
Design
– Accurate Estimation
  • Capacitance has Low Variance
– Known Coupling
  • 50% to Adjacent Power Line
– Quick Feedback
  • Interconnect-Only Extraction is Accurate
Manufacturing
– Uniform Etch Resist Loading

Asymmetric Rise-Fall Delays
[Figure labels: Delay; Duty Cycle; Slow; Slow; Same Size P Transistors; Same Size N Transistors; Same; Elongates; Shrink.]

Pros and Cons
Advantages
– More Compact Cells, Faster Circuits
Disadvantages
– Need Careful Analysis, Greater Margin
Strategy:
– Main Library
  • Asymmetric, “No Wasted Space”
– Symmetric Subset
  • Gated Clocks, Write Pulse Buffering, ...

Speed Yield Management
[Figure: process variation plot over P and N transistor speed. Labels: Fast P; Slow P; Slow N; Fast N; Transistors; Process Center; Target Design and Characterize Library Here; “Four Corner” Analysis Correct Operation; Mature Process Variation; Maximum Process Variation; Possibly Impossible to Meet Performance Goal, or Needlessly High Effort.]

Conclusions
“Precision Physical Design”
– Global
  • Power Grid
  • Macro Routing Porosity
– Methodical
  • Signal Integrity
  • Parasitic Extraction
  • Timing Uncertainties (Coupling)
– Confident
  • Correctness and Speed
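
Worked check: SRAM macro dimensions
The dimension labels on the Area Array I/O and SRAM Metal Usage slides appear to be mutually consistent, and a short calculation makes that visible. The sketch below is an editorial cross-check, not part of the original talk. It assumes (this is an inference from the numbers, not stated on the slides) that the 1.2 µm cell dimension runs along the columns, the 2.1 µm dimension along the rows, that a 56 µm decoder sits between two 256 × 256 cell arrays, and that a 102 µm sense amplifier strip is stacked on the array height. All constant and function names are hypothetical.

```python
# Editorial cross-check of the SRAM macro figures quoted on the slides.
# The mapping of each label to a direction is an assumption (see lead-in).

CELL_W_UM = 1.2       # 6-transistor cell width (assumed column direction)
CELL_H_UM = 2.1       # 6-transistor cell height (assumed row direction)
ROWS = 256
COLS = 256
ARRAYS = 2            # two cell arrays per macro (from the figure labels)
DECODER_UM = 56       # assumed decoder width between the two arrays
SENSE_AMP_UM = 102    # assumed sense amplifier strip height
IO_PITCH_UM = 225     # flip-chip pad pitch


def sram_macro_summary():
    """Return derived sizes for comparison against the slide labels."""
    array_w = COLS * CELL_W_UM           # compare with the 307 µm label
    array_h = ROWS * CELL_H_UM           # compare with the 538 µm label
    return {
        "cell_area_um2": CELL_W_UM * CELL_H_UM,                # ~2.5 µm²
        "array_width_um": array_w,
        "array_height_um": array_h,
        "macro_width_um": ARRAYS * array_w + DECODER_UM,       # ~670 µm
        "macro_height_um": array_h + SENSE_AMP_UM,             # ~640 µm
        "capacity_bytes": ARRAYS * ROWS * COLS // 8,           # 16 KBytes
        "io_macro_area_um2": IO_PITCH_UM * IO_PITCH_UM,        # ~50K µm²
    }


if __name__ == "__main__":
    for name, value in sram_macro_summary().items():
        print(f"{name}: {value:.2f}")
```

Under these assumptions the derived macro is about 670 µm by 640 µm and holds 16 KBytes, and one pad pitch squared is about 50K µm², which matches the labels on the slides.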
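
Sketch: post-route metal fill on one track
The Uniform Metal Density slide mentions algorithmically generated fill metal but does not describe the algorithm. The following is a minimal one-dimensional sketch of the general idea only, not the author's method: given the routed wire segments on a single track, it fills every remaining gap with metal, keeping a spacing margin next to each wire. The function names, the integer grid units, and the spacing rule are all assumptions made for illustration.

```python
# Hypothetical illustration of gap filling on a single routing track.
# Positions are integer grid units; segments are half-open (start, end).

def fill_track(length, wires, spacing=0):
    """Return fill segments covering the gaps between routed wires.

    `spacing` grid units are kept empty on each side of every wire.
    """
    fill = []
    cursor = 0  # first position where fill metal may begin
    for start, end in sorted(wires):
        gap_end = start - spacing
        if gap_end > cursor:
            fill.append((cursor, gap_end))
        # Skip past this wire plus its spacing margin.
        cursor = max(cursor, end + spacing)
    if cursor < length:
        fill.append((cursor, length))
    return fill


def density(length, segments):
    """Fraction of the track covered by the given segments."""
    return sum(end - start for start, end in segments) / length


if __name__ == "__main__":
    track_length = 20
    routed = [(2, 5), (9, 12)]
    filled = fill_track(track_length, routed, spacing=1)
    print("fill segments:", filled)
    print("density before fill:", density(track_length, routed))
    print("density after fill:", density(track_length, routed + filled))
```

In this toy case the track goes from 30% to 80% metal coverage. A real flow would work in two dimensions across all layers and would tie the fill to the power rails, which is what gives the known coupling to an adjacent power line that the Advantages slide describes.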