Part V: Design Optimizations
Andrew B. Kahng
UCSD and Blaze DFM, Inc.
abk@ucsd.edu

Three Trends
Trend 1: Reactions to “failure of WYSIWYG”
• Shape (litho, etch) and thickness (CMP) simulators
• Geometric criteria (process-window hot-spot checkers, etc.) before electrical criteria (Iddq, FMax variation, etc.)
• Library/IP development use models before full-chip use models
• Analyses before optimizations
Trend 2: Reactions to “uncontrollable variation”
• Experiments with statistical analysis tools
Trend 3: Commoditization of IDM internal technologies
• Defect-oriented yield analyses: critical area analysis
• Simple layout methodologies: post-route via/contact doubling

DAC-2006 DFM Tutorial: Nagaraj, Schoellkopf, Smayling, Wong, Kahng

Some Moderate Failures of Imagination
Linear extrapolation
• Larger guardbands
• More design rules
• Better equipment
Putting the “virtual fab” or “litho simulator” onto the designer’s desktop
Statistical timing analysis
Industry-wide regression
• DFM’s first wave: “All I want is what IBM has been making and using internally for the past 10 years…”

Proposed Precepts for DFM
Don’t assume what doesn’t exist
• Example: “detailed process information”
• What drives or even allows the process to improve?
• Process evolves over time/design with long time constant
• What improves the design today may hurt it tomorrow
Don’t mess with anything golden
• Handoff: GDSII/OASIS formats, BSIM4 model, .lib model
• Signoff: If the design is closed, don’t un-close it !!!
• Analyses: RC extraction, performance, litho simulation
• Private: Litho setup, OPC recipes
Don’t assume a “new silicon engineer”
• 21st-Century IC designer = deep and broad (“from C to OPC”?)
• But not unboundedly so → separation of concerns is a good thing
• Don’t ask a designer to become a lithography engineer
• Don’t ask lithography engineers to understand the design

Where We Are Today
Huge $$$ still left on the table
• “Left on table” = recoverable by improved design technology without any process or productivity change
• Many concrete examples exist!
• Will recover much of this in the next 3-4 years?
  • Power: 0.5 x full technology node
  • Area: 0.3 x full technology node
  • Frequency: 1.0 x full technology node
  • Variability control: 1.0 x full technology node
Simulation- and analysis-centric “first wave” of DFM
• Still has some “failures of imagination”
Near-term goals
• Embrace variation and optimize parametric yield
• Give clear ROI for products

Outline
• Detailed Placement for Process Window Enhancement
• CMP Fill at 65nm and Below
• Auxiliary Pattern Methodology for Cell-Based OPC
• Crosstalk Awareness in SSTA
• Other

Bias OPC
[Figure: original design (or mask) → lithography process → wafer patterns, compared with OPC design (or mask) → lithography process → wafer patterns]
• Mask design is modified to match photo-resist edges to layout edges using a layout sizing technique
• Bias OPC is limited in how much it can enhance process margins with respect to defocus and exposure dose

SRAF (Sub-Resolution AF)
[Figure: layout (or mask) design and wafer structure (SEM) for #SB = 0, 1, 2; process margin plot (180nm) of CD vs. DOF for SB=0, SB=1, SB=2]
CD (nm): 160 (#SB = 0), 177 (#SB = 1), 182 (#SB = 2)
SRAF = Scattering Bar (SB)
SRAFs enhance process window (focus, exposure dose)
• Extremely narrow lines → do not print on wafer
• More SBs help to enhance DOF margin and to meet the target CD

SRAFs and Bossung Plots
[Figure: Bossung plots of CD (nm) vs. DOF (um) for bias OPC and SRAF OPC]
Bossung plot
• Measurement to evaluate lithographic manufacturability
• For through-pitch process margin, maximize the common process window
• Horizontal axis: Depth of Focus (DOF); Vertical axis: CD
SRAF OPC
• Improves process margin of isolated pattern
• Larger overlap of process window between dense and isolated lines

Forbidden Pitches
[Figure: allowable CD (nm) vs. pitch (nm), 100 to 1500 nm, for W/O OPC (best DOF), W/O OPC (defocus), bias OPC (defocus) and SRAF OPC (defocus); regions labeled #SB=1 to #SB=4 and “Forbidden”]
Some pitches do not allow for sufficient SRAF
• → Lowers printability, DOF and exposure margins
• → Called the forbidden pitch
Bias OPC → NOT allowable CD for intermediate and large pitches
SRAF OPC has intervals of allowed and forbidden pitches → Must avoid forbidden pitches in layout

Layout Composability for SRAFs
[Figure: two layouts with feature spacings x and x+δx, labeled “Better than”]
Small set of allowed feature spacings
• Perturbation makes bad-printing layout assist-correct
Two components of SRAF-aware methodology
• Assist-correct libraries
  • Library cell layout should avoid all forbidden pitches
  • Intelligent library design
• Assist-correct placement ← THIS TOPIC
  • Intelligent whitespace adjustment in the placer

AFCorr: SRAF-Correct Placement
[Figure: placement before AFCorr (forbidden pitch at a cell boundary) and after AFCorr]
By adjusting whitespace, additional SRAFs can be inserted between cells
• Resist image improves and avoids open fault at worst-case defocus
Problem: Perturb given placement minimally to achieve as much SRAF insertion as possible

Horizontal AFCorr (H-AFCorr)
[Figure: before and after H-AFCorr; horizontal perturbation at a cell boundary removes a forbidden pitch]
• Horizontal-forbidden pitch is caused by interactions of poly geometries in the same row
• H-AFCorr is cell placement perturbation in the horizontal direction to avoid H-forbidden pitches

Vertical AFCorr (V-AFCorr)
[Figure: before and after V-AFCorr; forbidden pitch at a cell boundary between rows]
Vertical-forbidden pitch is caused by interactions of poly geometries between cell rows
• → Adjust cell row in the left or right direction to remove forbidden pitch
• → Space becomes assist-correct

Perturbation (H- + V-AFCorr)
[Figure: H-AFCorr and V-AFCorr perturbations combined]
AFCorr: H-AFCorr + V-AFCorr
• Adjusting whitespace → additional SRAFs → fewer forbidden pitches

Minimum Perturbation Approach
Objective:
• Reduce forbidden pitch violation
• Reduce weighted CD degradation with defocus
• Minimum perturbation: preserve timing
Constraint:
• Placement site width must be respected
How:
• One standard cell row at a time
• Solve each cell row by dynamic programming

Feasible Placement Perturbation
[Figure: adjacent cells $C_{a-1}$ and $C_a$ at locations $x_{a-1}$ and $x_a$, with width $w_{a-1}$ and border-to-poly spacings $S_{a-1}^{RP}$ and $S_a^{LP}$]
Minimize $\sum_i |\delta_i|$
s.t. $\delta_a + \delta_{a-1} + S_{a-1}^{RP} + S_a^{LP} + (x_a - x_{a-1} - w_{a-1}) \in AF$
• $w_i$ and $x_i$ = width and location of cell $C_i$
• $\delta_i$ = perturbation of location of cell $C_i$
• $AF$ = set of allowed spacings
• RP, LP = boundary poly shapes with overlapping y-spans
• $S$ = spacing from cell border to boundary poly

Dynamic Programming Solution
$COST(1,b) = |x_1 - b|$   // subrow up through cell 1, location b
$COST(a,b) = \lambda(a)\,|x_a - b| + \min_{x_a - SRCH \le i \le x_a + SRCH} [\,COST(a-1,i) + HCost(a,b,a-1,i) + VCost(a,b)\,]$
// SRCH = maximum allowed perturbation of cell location
• HCost = “forbidden-pitch cost” = sum over horizontal adjacencies of $slope(j) \cdot |HSpace - AF_j|$ s.t. $AF_{j+1} > HSpace \ge AF_j$
• VCost = “forbidden-pitch cost” = sum over vertical adjacencies of $slope(j) \cdot |VSpace - AF_j|$ s.t. $AF_{j+1} > VSpace \ge AF_j$
• $\lambda$ = proportional to the timing criticality of cell a
• Slope = $\Delta CD / \Delta Pitch$ = CD degradation per unit space between AF values
• $AF_j$ = closest assist-feasible spacing ≤ HSpace

Experimental Setup
KLA-Tencor’s Prolith
• Model generation for OPCpro
• Best focus / worst (0.5 micron) defocus
• Calculating forbidden pitches
Mentor’s OPCpro, SBar SVRF
• OPC, SRAF insertion, ORC (Optical Rule Check)
Cadence SOC Encounter
• Placement & Route
Synopsys Design Compiler
• Synthesis
Benchmark design
• ALU from OpenCore.org
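
The row-by-row dynamic program described under “Dynamic Programming Solution” can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes a single row (so VCost is omitted), a single slope value for all AF intervals, spacings measured from cell edge to cell edge (the border-to-poly offsets S are taken as zero), and candidate sites for each cell within SRCH of that cell's own original location. The names `pitch_cost` and `afcorr_row` are hypothetical.

```python
from bisect import bisect_right

INF = float("inf")

def pitch_cost(space, allowed, slope):
    """Forbidden-pitch cost: slope times the excess of `space` over the
    closest allowed spacing AF_j <= space (allowed must be sorted)."""
    j = bisect_right(allowed, space) - 1
    if j < 0:
        return INF  # tighter than every allowed spacing (assumed infeasible)
    return slope * (space - allowed[j])

def afcorr_row(x, w, allowed, srch, slope=1.0, lam=None):
    """x[a], w[a]: original site and width of cell a, in placement sites.
    Returns (total cost, new site per cell) for one cell row."""
    n = len(x)
    lam = lam if lam is not None else [1.0] * n  # timing-criticality weights
    cand = [range(max(0, xa - srch), xa + srch + 1) for xa in x]
    # cost[a][b] = best cost of the subrow up through cell a, with cell a at b
    cost = [{b: lam[0] * abs(x[0] - b) for b in cand[0]}]
    back = [{}]
    for a in range(1, n):
        cost.append({})
        back.append({})
        for b in cand[a]:
            best, arg = INF, None
            for i, prev in cost[a - 1].items():
                space = b - (i + w[a - 1])
                if space < 0:
                    continue  # cells would overlap
                c = prev + pitch_cost(space, allowed, slope)
                if c < best:
                    best, arg = c, i
            if arg is not None:
                cost[a][b] = lam[a] * abs(x[a] - b) + best
                back[a][b] = arg
    if not cost[-1]:
        raise ValueError("no overlap-free placement within SRCH")
    b = min(cost[-1], key=cost[-1].get)
    total, sites = cost[-1][b], [b]
    for a in range(n - 1, 0, -1):  # trace back the chosen sites
        b = back[a][b]
        sites.append(b)
    sites.reverse()
    return total, sites

# Toy row: two cells of width 4 at sites 0 and 5; allowed spacings are 0 and 3.
total, sites = afcorr_row(x=[0, 5], w=[4, 4], allowed=[0, 3], srch=2, slope=3.0)
print(total, sites)
```

In the toy row the original spacing of one site falls between the allowed values, so the sketch trades one site of perturbation for an assist-correct spacing. Raising `lam` for a timing-critical cell makes the program prefer to move its neighbor.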

Experimental Metrics
SB Count
• Total number of scattering bars or SRAFs inserted in the design
• Higher number of SRAFs indicates less through-focus variation and is hence desirable
Forbidden Pitch Count
• Number of border poly geometries estimated as having greater than 10% CD error through-focus
EPE Count
• Number of edge fragments on border poly geometries having greater than 10% edge placement error at the worst defocus level

Results: Increased SB Count
[Figure: total SB count and SB count difference vs. utilization (90% to 50%), with and without AFCorr, at 130nm and 90nm]
• SB count increases as utilization decreases due to increased whitespace
• #SB increases after AFCorr placement → Better DOF

Results: Reduced F/P and EPE
[Figure: reduction (%) in EPE count and forbidden pitch (F/Pitch) count vs. utilization (90% to 50%), at 130nm and 90nm]
Forbidden pitch count
• 89%~100% in 130nm, 93%~100% in 90nm
EPE Count
• 80%~98% in 130nm, 83%~100% in 90nm

Impact on Other Design Metrics
                     Utilization 90%     Utilization 80%     Utilization 70%
Node   Metric        Orig     AFCorr     Orig     AFCorr     Orig     AFCorr
130nm  #EPE          8772     2267       5975     962        4976     274
       R/T (s)       6721     6732       6839     6899       6878     6932
       GDS (MB)      42.9     41.9       41.8     42.3       42.2     42.2
       Delay (ns)    4.21     4.49       4.547    4.444      4.501    4.371
90nm   #EPE          7523     1262       4813     532        2131     107
       R/T (s)       4835     5011       5451     5535       5529     5632
       GDS (MB)      41.1     42.3       41.2     43.2       42.2     42.3
       Delay (ns)    2.478    2.305      2.458    2.602      2.522    2.47
• Data size → 3%, OPC run time → 4%, Cycle time → 6%
• Other impacts are negligible and/or at inherent noise level, compared to large improvement in printability metrics

AFCorr Summary
• AFCorr is an effective approach to achieve assist feature compatibility in physical layout
• Up to 100% reduction of forbidden pitch and EPE
• Relatively negligible impacts on GDSII size, OPC runtime, and design clock cycle time
  • Compared to huge improvement in printability

Etch Dummy Insertion Problem
[Figure: resist CD and etch CD (nm) vs. space (nm), 100 to 2100 nm; layout showing active, poly, SRAF and etch dummy]
• Etch skew increases as pitch of primary pattern increases → Etch dummy → Reduce poly-to-poly space → Reduce etch skew
• Etch dummies are placed outside of the diffusion-layer (or active-layer) region

Etch Dummy Correction Problem
[Figure: layouts with and without forbidden pitch, showing poly, active, etch dummy, assist feature, and a missing assist feature]
Given a standard-cell layout,
• determine perturbations to inter-cell spacings so as to simultaneously insert SRAFs in forbidden pitches and insert etch dummies.

Technique 1: SAEDM (SRAF-Aware Etch Dummy Method)
[Figure: before SAEDM (SRAF missing: L = R) and after SAEDM (SRAF inserted: L ≠ R); active, SRAF, poly and etch dummy]
• Typical etch dummy rule: fixed rule of active-to-etch dummy spacing
• SAEDM: flexible etch dummy rule according to active-to-etch dummy spacing
  • Calculate left poly-to-dummy and right poly-to-dummy spacings to insert assist features and etch dummies simultaneously
  • Inserted etch dummies have asymmetric active-to-dummy spacings

Technique 2: AFCorr + EtchCorr Placement
Correctness Requirements
Key Idea: Change whitespace distribution of standard-cell placement → best printability
• Maximize number of assist features (AFCorr)
• Optimal location of etch dummy (EtchCorr)
AS (ES): sets of feasible spaces between two gates that allow insertion of required assist features (etch dummy)

Algorithmic Approach: Corr Technique
Dynamic Programming (DP)
The “AFCorr and EtchCorr” problem can be solved by dynamic programming (DP): Corr = AFCorr + EtchCorr
Cost(a, b): the cost of placing cell a at placement site number b
• Component 1: perturbation component (x_a - b) from the original placement of cell a, measured in placement sites
• Component 2: AFCost and EtchCost correspond to the printability deterioration of resist and etch CD, respectively
λ: a factor that decides the relative importance of preserving the initial placement versus the final EtchCorr benefit achieved. α and β are user-defined weights for AFCost and EtchCost, respectively
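
The slide lists the components of Cost(a, b) but not the recurrence itself. One way to combine them, mirroring the AFCorr recurrence given earlier, is written below. The exact form is an assumption made for illustration: the argument lists of AFCost and EtchCost and the index i (which ranges over the candidate sites of cell a-1) are not taken from the slide.

```latex
\[
\mathrm{Cost}(a,b) \;=\; \lambda\,\lvert x_a - b\rvert
\;+\; \min_{i}\Bigl[\mathrm{Cost}(a-1,i)
\;+\; \alpha\,\mathrm{AFCost}(a,b,a-1,i)
\;+\; \beta\,\mathrm{EtchCost}(a,b,a-1,i)\Bigr]
\]
```

Under this reading, setting β = 0 would reduce Corr to an AFCorr-style program, and a larger λ keeps cells closer to their original sites.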

Design and Evaluation Flow
[Figure: flow diagram. Blocks: Typical Design Flow; Lithography & etch model generation; Modified library & netlist; Placement; Post-placement (Corr); Route; Typical GDSII; Etch dummy generation based on SAEDM; Etch dummy and SRAF insertion rules, forbidden pitch; Assist and etch dummy corrected GDSII (more amenable to SRAF and etch dummy insertion); SB OPC (SB insertion, model-based OPC); OPCed GDS]
Quality metrics
• Printability: #Etch dummy and #SB; EPEs of resist and etch
• Performance: delay, OPC run time
Novel design flow: adds forbidden pitch and SRAF insertion rules, and the SAEDM and Corr techniques, to the typical design flow

Experimental Results
[Figure: (left) total SB and etch dummy counts and their differences vs. utilization (90% to 50%), with and without EtchCorr; (right) reduction (%) vs. utilization for “W SAEDM and W/O EtchCorr (Resist)”, “W SAEDM and W EtchCorr (Resist)” and “W SAEDM and W EtchCorr (Etch)”]
• Number of total SRAFs and etch dummies increases due to increased whitespace
• Forbidden Pitch Count reduction of photo process → 58%-97% with SAEDM and 90%-100% with (SAEDM + Corr)
• Forbidden Pitch Count reduction of etch process → 77%-97% with (SAEDM + Corr)

Corr Summary
• Corr placement perturbation with SAEDM can achieve up to 100% reduction in number of cell border poly geometries having forbidden pitch violations.
• The corresponding reduction in EPE is up to 100% (resist CD) and 97% (etch CD).
• SB count and etch dummy counts, which indicate less through-focus CD variation and etch skew, increase up to 10.8% and 18.6%, respectively.
• The increases in data size, OPC running time and maximum delay due to Corr are within 3%, 4% and 6%, respectively.

Outline
• Detailed Placement for Process Window Enhancement
• CMP Fill at 65nm and Below
• Auxiliary Pattern Methodology for Cell-Based OPC
• Crosstalk Awareness in SSTA
• Other

BEOL Contribution to Variation
Parameter                                                                      Delay Impact
BEOL metal (metal mistrack, thin/thick wires)                                  -10% → +25%
Environmental (voltage islands, IR drop, temperature)                          ±15%
Device fatigue (NBTI, hot electron effects)                                    ±10%
Vt and Tox device family tracking (can have multiple Vt and Tox families)      ±5%
Model/hardware uncertainty (per cell type)                                     ±5%
N/P mistrack (fast rise/slow fall, fast fall/slow rise)                        ±10%
PLL (jitter, duty cycle, phase error)                                          ±10%
→ Scalable optimal CMP fill (metal, STI, timing, fill pattern)
→ Combinatorial methods for redundant via insertion
→ “Religious questions”

CMP and DFM
[Figure: CMP linked to design timing and power through R, C parasitics, and to lithographic manufacturability through topography and depth of focus]
• CMP and fill effects
  • Cu erosion and dishing cause resistance change
  • Dummy fill to aid CMP in achieving planarity causes capacitance change
  • Topographic variation translates to focus variation for imaging of subsequent layers → reduced process window → linewidth variation → R, C variation
• CMP interacts closely with design as well as lithography

Fixed-Dissection Regime
[Figure: n × n layout with overlapping w × w windows at offset w/r, divided into tiles]
• To make filling more tractable, monitor only a fixed set of w × w windows
  • offset = w/r (example shown: w = 4, r = 4)
• Partition n × n layout into nr/w × nr/w fixed dissections
• Each w × w window is partitioned into r² tiles
• Basic rules: upper / lower bounds on window densities (original layout + inserted fill)
  • Example: windows have w = 100um
  • Each window divided into r = 4 “steps”
  • Step distance = 25um
  • → 20mm, 10LM ASIC chip will have 6.4 million “tiles” (20mm / 25um = 800 tiles per side, 800 × 800 tiles per layer, 10 layers)

Previous / New Objectives in Density Control
Objective for Manufacture = Min-Var
• Minimize window density variation subject to upper bound on window density
Objective for Design = Min-Fill
• Minimize total amount of added fill features subject to upper bound on window density variation
NEW !!!
• Multi-layer and multi-window constraints
• Fully staggered fill patterning and/or wire-like (“track”) fill
• Maximize via fill
• Maximize smoothness of density
• Drive with CMP (post-polish wafer topography) simulation
• Handle analog symmetry requirements
• …

Previous Works on Fill Synthesis
Kahng et al.
• First LP-based approach for Min-Var objective
  • Minimize $M$ s.t.
    $M \ge |dens(W_i) - dens(W_j)| \;\; \forall i, j$ (force minimum variation)
    $|dens(W_i) - dens(W_j)| \le K \;\; \forall$ neighbors $i, j$ (smoothness)
    where dens(W) = density(orig layout + added fill) in all tiles of W
    (Problem: there are millions of tiles in the chip!)
• Iterated Monte-Carlo/greedy methods, hierarchical and multiple-layer fill methods
Wong et al.
• LP-based approaches for Min-Fill objective
• LP-based approaches for multiple-layer fill problem and dual-material fill problem

What Would “Optimum” CMP Fill Look Like?
Kahng et al.
1998: Linear Programming (LP) approach for Min-Var objective
• Minimize $M$ s.t.
  • Minimize variation: $M \ge |dens(W_i) - dens(W_j)|$ ∀ window pairs $W_i, W_j$
  • Enforce smoothness: $|dens(W_i) - dens(W_j)| \le K$ ∀ window pairs $W_i, W_j$ that are neighbors
  • dens(W) = sum of original layout + added fill in all tiles of W
  • Variables in LP = amounts of fill added into each tile: $0 \le f_{ijk} \le s_{ijk}$, where $f_{ijk}$ are the variables that we optimize and $s_{ijk}$ is the “fill slack” computed by initial layout analysis
• Difficulty: There are millions of variables in this LP !!!

“Difficult” image sensor chip
Solution         Min. D    Max. D    delta D   # of Fill   Avg. Smoothness
Original         0.1652    0.4717    0.3065    ---         0.0508
minVar           0.4153    0.5448    0.1295    784,968     0.0234
minFill          0.3234    0.4717    0.1483    416,773     0.0317
maxSmoothness    0.3945    0.5243    0.1298    711,429     0.0174

Density Variation and Smoothness
                                --------- Variation ---------   Smoothness
Cases           Window Size     Minimum    Maximum    Density    Average Delta   Fill Area
                (um)            Density    Density    Range      Density         (um x um)
Original        25              0.0000     0.7600     0.7600     0.1339
                50              0.0000     0.7600     0.7600
                100             0.0080     0.7033     0.6953
minVar          25              0.1914     0.7600     0.5686     0.0435          302915
                50              0.2273     0.7600     0.5327
                100             0.2355     0.7033     0.4678
maxSmoothness   25              0.1914     0.7600     0.5686     0.0298          308204
                50              0.2273     0.7600     0.5327
                100             0.2354     0.7033     0.4679
minFill         25              0.1504     0.7600     0.6096     0.0532          201952
                50              0.1555     0.7600     0.6045
                100             0.1612     0.7033     0.5421

Religious Questions in BEOL DFM
• Should CMP fill be owned by the routing / timing closure tool or by the DRC / PG tool?
  • Answer: proper fill is best achieved today post-layout by a tool that maintains the signoff
• Must fill be “timing-driven”, or is “timing-aware” sufficient?
  • Answer: “Timing-aware” is likely sufficient through the 45nm node
• Are CMP and litho simulations for “more accurate parasitics and signoff” really necessary?
  • Answer: Probably not.
CDs and thickness variations are “self-compensating” w.r.t. timing. Guardbands are reasonable. There is a big mess with existing calibrations of the RC extraction tool to silicon.
• If two solutions both meet the spec, are they of equal value?
• How elaborate must cost functions and layout knobs be for EDA tools to understand via yield / reliability, EM, etc.?
• ...

“Intelligent” Fill Goals for 65nm and beyond
• True timing- and SI-awareness
  • Driven by internal engines for incremental extraction, delay calculation, static timing/noise analysis
  • Open Question: is this done by the router? Or post-layout processing?
• True multi-layer, multi-window global optimization of effective density smoothness and uniformity
  • Recall: millions of “tiles”. Can we optimize all fill on all layers simultaneously?
• Analog fill, capacitor fill, via fill
• Floating, grounded and track fill
• Standalone, ECO, and ripup-refill use models
• Supports thickness bias models (CMP predictors)
• Key technology for managing BEOL variability and enhancing parametric yield

Density Histogram of Pre-/Post-Fill
[Figure: “oxide” density histograms: original (∆D = 31%), minFill (∆D = 15%), minVar (∆D = 13%)]

Generate Symmetric Fill (Analog Regions)
[Figure: analog cell with fill placed about an axis of symmetry]
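
The ∆D values quoted for the density histograms appear to be window-density ranges (maximum minus minimum window density) of the kind monitored in the fixed-dissection regime. A minimal sketch of that bookkeeping follows, assuming a square grid of tiles, windows of r × r tiles sliding one tile at a time, and per-tile feature areas as input. The function names and the toy numbers are hypothetical.

```python
def window_densities(tile_area, tile_size, r):
    """tile_area[i][j] = feature area (layout + fill) in tile (i, j).
    Returns the density of every overlapping window of r x r tiles."""
    n = len(tile_area)
    window_area = float((r * tile_size) ** 2)
    dens = []
    for i in range(n - r + 1):
        row = []
        for j in range(n - r + 1):
            covered = sum(tile_area[i + di][j + dj]
                          for di in range(r) for dj in range(r))
            row.append(covered / window_area)
        dens.append(row)
    return dens

def density_range(dens):
    """Return (min density, max density, delta D) over all windows."""
    flat = [d for row in dens for d in row]
    return min(flat), max(flat), max(flat) - min(flat)

# Toy layout: 3 x 3 tiles of size 1, windows of 2 x 2 tiles.
tiles = [[1.0, 0.0, 0.0],
         [0.0, 0.0, 0.0],
         [0.0, 0.0, 1.0]]
before = density_range(window_densities(tiles, 1.0, 2))

# Add fill to two empty tiles (amounts would be bounded by each tile's slack).
filled = [row[:] for row in tiles]
filled[0][2] += 1.0
filled[2][0] += 1.0
after = density_range(window_densities(filled, 1.0, 2))
print(before, after)
```

In this toy case the two fill insertions bring every window to the same density, which is the Min-Var ideal. A Min-Fill formulation would instead stop adding fill as soon as the range met its bound.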

Timing-Driven Fill: Early Ideas
General guidelines:
• Minimize total number of fill features
• Minimize fill feature size
• Maximize space between fill features
• Maximize buffer distance between original and fill features
Sample observations in literature
• Motorola [Grobman et al., 2001]: key parameters are fill feature size and buffer distance
• Samsung [Lee et al., 2003]: floating fills must be included in chip-level RC extraction and timing analysis to avoid timing errors
• MIT MTL [Stine et al., 1998]: proposed a rule-based area fill methodology to minimize added interconnect coupling capacitance

Extensions
• Consider impact due to fill on overlap and fringe capacitance
  • Directly impacts dynamic power (CV²f)
• Multi-layer filling for better CMP modeling and timing paths across different layers
• Use fill to intentionally benefit timing robustness
  • Shortcut power/ground distribution networks → better IR drop
  • Extra capacitance for hold time critical paths → more robust timing
• Integrate a simplified CMP model in fill insertion and intermediate RC estimation
Let’s look at some possibilities for timing-aware flow and CMP model integration

Timing-Aware and Timing-Driven Use Models
[Figure: flow diagram. Inputs: SPEF, SDC; SI / timing reports; list of critical nets; GDS / LEF/DEF / OA; tech file / DRM; user parameters. Blocks: P&R (ECO) with DEF / DB; Intelligent Fill (ECO) producing DEF’ / DB’ (or GDS); external CMP model; RCX; GDS topo map; Intelligent Fill producing GDS’ / DEF’ / OA’, reports and SPEF’; SI / timing (to signoff analyses). A legend distinguishes the timing-aware and timing-driven paths]

Critical Net File Example
    MULT.C[46] {
    M1 M1 1.4 \
    M1 M2 1.12 \
    M2 M1 1.12 \
    M2 M2 1.4 \
    M2 M3 1.12 \
    M3 M2 1.12 \
    M3 M3 1.4 \
    M3 M4 1.12 \
    M4 M3 1.12 \
    M4 M4 1.4 \
    M4 M5 1.12 \
    M5 M4 1.12 \
    M5 M5 1.4 \
    M5 M6 1.12 \
    M6 M5 1.12 \
    M6 M6 1.4 \
    M6 M1_2B 2.24 \
    M1_2B M6 2.24 \
    M1_2B M1_2B 2.8 \
    M1_2B M2_2B 2.24 \
    M2_2B M1_2B 2.24 \
    M2_2B M2_2B 2.8 \
    }
    …
• Reading an entry: “M3 M2 1.12” means that for a net segment in M3, block 1.12um from the segment in M2.
• For each of the top K critical nets, e.g., block out areas in: (1) layer below the net (2) layer of the net (3) layer above the net

M2 Fragment Showing Timing-Aware Keepout
[Figure: M2 layout fragment with timing-aware keepout regions]

Timing-Aware Keepout Illustration (M4)
[Figure: M4 route, M4 fill and M4 keepout]

Density Variation vs. #Critical Nets
[Figure: Metal 5 layer density range (13.0% to 14.0%) vs. number of critical nets chosen (100 to 900)]
• Density range is a weak function of # of critical nets.
• Blaze IF can compensate for the loss of potential fill areas.
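
The critical net file and its keepout rule can be illustrated with a short sketch. It assumes the three fields of each entry are (layer of the net segment, layer in which fill is blocked, blocking distance in um), as the M3/M2 example in the slide suggests, and that “blocking” means bloating the segment's bounding rectangle by that distance on every side. Both the parser and the rectangle convention are hypothetical, not the tool's actual format handling.

```python
import re

def parse_critical_net_file(text):
    """Return {net name: {(segment layer, blocked layer): distance in um}}."""
    nets = {}
    for name, body in re.findall(r"(\S+)\s*\{(.*?)\}", text, flags=re.S):
        rules = {}
        for entry in body.split("\\"):  # entries end with a backslash
            fields = entry.split()
            if len(fields) == 3:
                seg_layer, blocked_layer, dist = fields
                rules[(seg_layer, blocked_layer)] = float(dist)
        nets[name] = rules
    return nets

def keepout_rect(segment, blocked_layer, rules):
    """segment = (layer, x1, y1, x2, y2) in um. Returns the fill keepout
    rectangle on blocked_layer, or None if no rule applies."""
    layer, x1, y1, x2, y2 = segment
    d = rules.get((layer, blocked_layer))
    if d is None:
        return None
    return (blocked_layer, x1 - d, y1 - d, x2 + d, y2 + d)

# Excerpt of the example file (M3 entries only).
sample = r"""MULT.C[46] {
M3 M2 1.12 \
M3 M3 1.4 \
M3 M4 1.12 \
}
"""
nets = parse_critical_net_file(sample)
rules = nets["MULT.C[46]"]
m3_segment = ("M3", 10.0, 5.0, 20.0, 5.2)  # made-up coordinates
for blocked in ("M2", "M3", "M4"):         # layer below, same layer, layer above
    print(keepout_rect(m3_segment, blocked, rules))
```

A fill tool following this scheme would subtract the union of such rectangles from the fillable area on each layer before placing fill.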

Timing-Aware and Power-Aware Fill
Design: Image Processor (1.3mm x 1.3mm, 90nm, 8 metal layers, maxSmooth)

TIMING-AWARE FILL                                  No fill   Fill w/o CNF   Fill w/ CNF
# of violating endpoints                           0         5              0
Worst endpoint slacks (ns)
  ..ICACHE/ICACHE/MyBusy_R_reg/D                   0.000     -0.084         0.000
  ..COPIF3/COPIFX/COPLOGIC1/CWRDATA_R_reg[31]/D    0.040     -0.050         0.044
  ..ICACHE/ICACHE/IC_HALT_S_R_reg[1]/D             0.045     -0.034         0.045
  ..ICACHE/ICACHE/IC_HALT_S_R_reg[0]/D             0.048     -0.019         0.048
  ..ICACHE/ICACHE/IC_HALT_S_R_reg[2]/D             0.048     -0.003         0.048
Layout density variation, Metal1 to Metal8 (two sets of values appear in the source)
  0.659  0.747  0.769  0.703  0.684  0.665  0.600  0.613
  0.659  0.805  0.721  0.804  0.748  0.730  0.630  0.613
POWER-AWARE FILL
Dynamic power (mW): 21.229  20.471  20.131

Intelligent Fill With CMP Modeling
[Figure: three fill flows labeled (1) TOMORROW?, (2) AFTER TOMORROW??, (3) AFTER AFTER TOMORROW???. Each takes layout, design data and fill constraints into Intelligent Fill and produces a post-fill layout and reports that are checked by a signoff CMP model. Variants shown: uniform effective density objective; uniform effective density + step height objective with an external CMP model (GDS, topo map); uniform effective density + step height objective with an internal CMP model]

Approximating the Signoff CMP Model
Calibration data for each grid point:
• X (um), Y (um)
• Density
• Cu thickness (A)
• Dielectric thickness (A)
• Optional: Pre-CMP Cu thickness, trench depth, barrier thickness, etc.
[Figure: test layouts → signoff CMP model (or silicon) → topography predictions (or measurements) → internal CMP model (approximation of signoff CMP model); layout, design data and fill constraints → Intelligent Fill (uniform effective density + step height objective) → post-fill layout, reports → signoff CMP model]

Multi-Layer Fill Optimization
Min-Var Optimization
• $M \ge |D_{max,3} - D_{min,3}|$
• $M \ge |D_{max,2} - D_{min,2}|$
• $M \ge |D_{max,1} - D_{min,1}|$
• Layers co-optimized for minimum density variation
[Figure: stacked layers M1, M2, M3 with axes X, Y and T]
RISC CPU Core Example (90nm)
Layer     Original Thickness Variation (A)   Post-Fill Thickness Variation (A)
Metal1    662                                492
Metal2    1642                               1217
Metal3    1270                               1300
Metal4    1969                               1658
Metal5    1657                               1608
Metal6    1935                               1711
Metal7    1835                               1670
• M1 topography impacts M3 topography

CD Variation Due To Topography
[Figure: side view showing thickness variation over regions with dense and sparse layout; top view showing CD variation when a line is patterned over a region with uneven wafer topography, i.e., under conditions of varying defocus]
Goal: OPC technique that is aware of post-CMP topography

Topography-Aware OPC (TOPC) Flow
[Figure: standard OPC flow (library & technology, GDSII, SOPC, SOPCed GDSII) and TOPC flow (CMP simulation, DOF model database, DOF marking layer, input GDSII for TOPC, TOPC, TOPCed GDSII)]
• A map of thickness variation from CMP simulation is converted to defocus marking layers and then fed into GDSII for TOPC
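
The conversion from a CMP thickness map to defocus marking layers can be pictured as a binning step. The sketch below is only an illustration under strong assumptions: defocus is taken to equal the thickness deviation from a reference plane, deviations are quantized into bins of fixed width, and each bin becomes one marking layer listing the grid cells it covers. The function name, the units (nm) and the sample values are all hypothetical.

```python
def defocus_marking_layers(thickness_nm, reference_nm, bin_width_nm):
    """thickness_nm: {(ix, iy): post-CMP thickness at that grid cell}.
    Returns {defocus bin index: sorted list of grid cells in that bin}."""
    layers = {}
    for cell, t in thickness_nm.items():
        # Assumed model: defocus = thickness deviation from the reference plane
        defocus_bin = round((t - reference_nm) / bin_width_nm)
        layers.setdefault(defocus_bin, []).append(cell)
    return {k: sorted(v) for k, v in layers.items()}

# Made-up 2 x 2 thickness map, reference plane at 500 nm, 50 nm bins.
thickness = {(0, 0): 510, (1, 0): 560, (0, 1): 460, (1, 1): 595}
layers = defocus_marking_layers(thickness, reference_nm=500, bin_width_nm=50)
for bin_index in sorted(layers):
    print(bin_index * 50, "nm defocus:", layers[bin_index])
```

Each resulting bin would then select which defocus-specific OPC model is applied to the shapes under its grid cells, which is how the marking layers are used in the flow described above.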

TOPC Results
[Figure: number of EPEs vs. DOF (um), 0 to 0.6, for SOPC and TOPC, panels (a) and (b); CASE I: 53% improvement, CASE II: 90% improvement]
Test Case   Original GDS (MB)   SOPC GDS (MB)   SOPC Runtime (min)   TOPC GDS (MB)   TOPC Runtime (min)
CASE I      2.3                 3.8             35                   4.2             43
CASE II     2.3                 3.8             35                   4.4             45
• TOPC achieves up to 90% reduction in edge placement errors.
• The improvement in process window comes at the cost of some increase in data volume and OPC runtime.

Conclusions: Futures for CMP/Fill in DFM
Goal: Design convergence
• Integrate design intent and physical models
• CMP simulation + fill pattern synthesis + RCX + timing/SI driven
Performance awareness
• Maintain timing and SI closure
• “Multi-use” fill: IR drop management, decap creation
• Device layer: STI CMP modeling / fill synthesis, etch dummy
Topography awareness
• Close the loop back to RCX, fill pattern synthesis, OPC guidance
Intelligent fill pattern synthesis
• Minimum variation and smoothness in addition to density bounds
• Handle MANY constraints at once: multi-window, multi-layer, etc.
• Optional mixing of grounded and floating fill
• Mask data volume control (e.g., shot-size aware, compressible fill)

References
Thy-Lai Tung, “A Method for Die-Scale Simulation of CMP Planarization,” Proc. of SISPAD, pp. 65-68, 1997.
Brian E. Stine, Dennis O. Ouma, Rajesh R. Divecha, Duane S. Boning, James E. Chung, Dale L. Hetherington, C. Randy Harwood, O. Samuel Nakagawa and Soo-young Oh, “Rapid Characterization and Modeling of Pattern-Dependent Variation in Chemical-Mechanical Polishing,” IEEE Trans. on Semiconductor Manufacturing, Vol. 11, No. 1, pp. 129-140, Feb. 1998.
Duane S.
Boning, William P. Moyne, Taber H. Smith, James Moyne, Ronald Telfeyan, Arnon Hurwitz, Scott Shellman and John Taylor, “Run by Run Control of Chemical-Mechanical Polishing,” IEEE Trans. on Components, Packaging and Manufacturing, Vol. 19, No. 4, Oct. 1996.
Xuan Zeng, Mingyuan Li, Wenqing Zhao, Pushan Tang and Dian Zhou, “Parasitic and Mismatch Modeling for Optimal Stack Generation,” Proc. of ISCAS, pp. 193-196, 2000.
Yu Chen, Andrew B. Kahng, Gabriel Robins and Alexander Zelikovsky, “Hierarchical Dummy Fill for Process Uniformity,” Proc. of ASP-DAC, pp. 139-144, 2001.
Ruiqi Tian, Robert Boone, Sejal Chheda, Brad Smith, Xiaoping Tang, Ed Travis and D. F. Wong, “Proximity Dummy Feature Placement and Selective Via Sizing for Process Uniformity in a Trench-First-Via-Last Dual-Inlaid Metal Process,” Proc. of IITC, pp. 48-50, 2001.
Ruiqi Tian, Xiaoping Tang and D. F. Wong, “Dummy Feature Placement for Chemical-Mechanical Uniformity in a Shallow Trench Isolation Process,” IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems, Vol. 21, No. 1, pp. 63-71, Jan. 2002.
Andrew B. Kahng, Gabriel Robins, Anish Singh and Alexander Zelikovsky, “Filling Algorithms and Analyses for Layout Density Control,” IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems, Vol. 18, No. 4, Apr. 1999.
Yu Chen, Puneet Gupta and Andrew B. Kahng, “Performance-Impact Limited Area Fill Synthesis,” Proc. of DAC, pp. 22-27, 2003.
Lei He, Andrew B. Kahng, King H. Tam and Jiang Xiong, “Variability-Driven Considerations in the Design of Integrated-Circuits Global Interconnects,” Proc. of VMIC, pp. 214-221, 2004.
Lei He, Andrew B. Kahng, King H. Tam and Jiang Xiong, “Simultaneous Buffer Insertion and Wire Sizing Considering Systematic CMP Variation and Random Leff Variation,” Proc. of ISPD, pp. 78-85, 2005.
Atsushi Kurokawa, Toshiki Kanamoto, Tetsuya Ibe, Akira Kasebe, Chang Wei Fong, Tetsuro Kage, Yasuaki Inoue and Hiroo Masuda, “Dummy Filling Methods for Reducing Interconnect Capacitance and Number of Fills,” Proc. of ISQED, pp. 586-591, 2005. 60 DAC-2006 DFM Tutorial: Nagaraj, Schoellkopf, Smayling, Wong, Kahng Andrew B. Kahng 30 References Brian E. Stine, Duane S. Boning, James E. Chung, Lawrence Camilletti, Frank Kruppa, Edward R. Equi, William Loh, Sharad Prasad, Moorthy Muthukrishnan, Daniel Towery, Micheal Berman and Ashook Kapoor, “The Physical and Electrical Effects of Metal-Fill Patterning Practices for Oxide Chemical-Mechanical Polishing Processes,” IEEE Trans. on Electron Devices, Vol. 45, No. 3, pp. 665-679, Mar. 1998. J.-K. Park, K.-H. Lee, Y.-K. Park, and J.-T. Kong, “An Exhaustive Method for Characterizing the Interconnect Capacitance Considering the Floating Dummy-Fills by Employing an Efficient Field Solving Algorithm,” Proc. of SISPAD, pp. 98-101, 2000. Dennis Ouma, Duane S. Boning, James Chung, Greg Shinn, Leif Olsen and John Clark, “An Integrated Characterization and Modeling Methodology for CMP Dielectric Planarization,” Proc. of IITC, pp. 67-69, 1998. Keun-Ho Lee, Jin-Kyu Park, Young-Nam Yoon, Dai-Hyun Jung, Jai-Pil Shin, Young-Kwan Park and Jeong-Taek Kong, “Analyzing the Effects of Floating Dummy-Fills: From Feature Scale Analysis to Full-Chip RC Extraction,” Proc. of IEDM, pp.31.3.1-31.3.4, 2001. Yu Chen, Andrew B. Kahng, Gabriel Robins and Alexander Zelikovsky, “Area Fill Synthesis for Uniform Layout Density,” IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems, Vol. 21, No. 10, pp. 1132-1147, Oct. 2002. Ruiqi Tian, D. F. Wong and Robert Boone, “Model-Based Dummy Feature Placement for Oxide Chemical-Mechanical Polishing Manufacturability,” IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems, Vol. 20, No. 7, pp. 902-910, Jul. 2001. 
Brian Lee, “Modeling of Chemical Mechanical Polishing for Shallow Trench Isolation,” Ph.D. Thesis, MIT, 2002. Dennis Ouma, “Modeling of Chemical Mechanical Polishing for Dielectric Planarization,” Ph.D. Thesis, MIT, 1998. Tae Hong Park, “Characterization and Modeling of Pattern Dependencies in Copper Interconnects for Integrated Circuits,” Ph.D. Thesis, MIT, 2002. Tamba E. Gbondo-Tugbawa, “Chip-Scale Modeling of Pattern Dependencies in Copper Chemical Mechanical Polishing Processes,” Ph.D. Thesis, MIT, 2002. DAC-2006 DFM Tutorial: Nagaraj, Schoellkopf, Smayling, Wong, Kahng Andrew B. Kahng 61 Outline Detailed Placement for Process Window Enhancement CMP Fill at 65nm and Below Auxiliary Pattern Methodology for Cell-Based OPC Crosstalk Awareness in SSTA Other DAC-2006 DFM Tutorial: Nagaraj, Schoellkopf, Smayling, Wong, Kahng Andrew B. Kahng 62 31 Motivation OPC is mask modification to match photo-resist edge to layout edge It takes a long time • 12 days for OPC + MDP • 30 days for a hot lot to go through entire process It is expensive: many licenses, many CPUs Auxiliary pattern (AP) technique • Minimizes CD difference between cell-based OPC (COPC) and design-based OPC (DOPC) • Enables cell-based timing modeling • Helps OPC runtime and cell re-spins for ECO DAC-2006 DFM Tutorial: Nagaraj, Schoellkopf, Smayling, Wong, Kahng 63 Andrew B. Kahng The Ideal of Cell-Based OPC Original Standard-Cell GDSII AND2X1 AND2X1 NAND2X2 NAND2X2 NOR2X4 NOR2X4 P&R … … XOR2X8 OPC SBAR OPCed Standard-Cell GDSII XOR2X8 OPCed IC Design Cell-based OPC is a solution for saving of OPC runtime • Master cell layouts are corrected before placement • P&R steps are performed with corrected master cells • OPCed IC design can be completed almost instantly after P&R Æ OPC run time is negligible ( 1~2 hours ) DAC-2006 DFM Tutorial: Nagaraj, Schoellkopf, Smayling, Wong, Kahng Andrew B. 

Why Cell-Based OPC Doesn’t Work

[Figure: cell without a neighboring cell vs. cell with a neighboring cell; OPC re-run]

“Optical radius” of pattern interactions is between 4λ and 6λ (λ=193nm)
OPC must be re-corrected in the interaction areas between cells of a standard-cell block

Cell-Based Timing Modeling Also Fails

[Figure: standard GDSII (AND2X1, NAND2X2, NOR2X4, …, XOR2X8); OPC, SBAR, PrintImage; change SPICE netlist based on PrintImage; cell-based timing library]

OPC, SBAR and PrintImage are applied to each master cell
SPICE netlist of each cell is changed based on the PrintImage result, and the cell timing model is then characterized
Problem: Model is inaccurate due to CD errors of gates located near boundaries of cell instances

Auxiliary Pattern (AP) Methodology

Observation: nearest neighbor of a given feature is the dominant influence on proximity CD error
Idea: Insert “auxiliary patterns” (APs) to shield poly patterns near the cell boundary from proximity effects
AP minimizes CD difference between cell-based OPC and conventional model-based OPC
Example: “Vertical Type-1 AP”: L=R=50nm; Horizontal AP: 40nm (in 90nm processes)
Restricted design rule approach needed to maintain required minimum values of A and B
  • A = Space between border poly and vertical AP
  • B = Space between border active-layer and vertical AP

[Figure: cell outline with APs; labels A, B, L=R]

Proximity Shielding – Line Body

[Figure: difference error (nm) vs. space S (nm) for Difference (DOPC-COPC w/o AP) and Difference (DOPC-COPC w/ AP); inset shows the test pattern structure with measurement window and AP]

Test pattern structure
  • Width = 100, Pitch = 300, AP-to-outline space = 90nm
Maximum CD difference between cell-OPC and standard full design-OPC:
  • 2.98nm without AP and 0.98nm with AP

Proximity Shielding – Line End

[Figure: difference (nm) vs. space S (nm) for Line-end error w/o OPC, Difference (DOPC-COPC w/o AP) and Difference (DOPC-COPC w/ AP); inset shows the test pattern structure with AP]

Test pattern structure
  • Minimum space between line-ends to insert AP = 320nm
Maximum CD difference between COPC and DOPC
  • 10.1nm without AP and 2.7nm with AP

Proximity Shielding – Contact

[Figure: difference error (nm) vs. space S (nm) for Difference (DOPC-COPC w/o AP) and Difference (DOPC-COPC w/ AP); Pitch = 300nm, Width = 100nm, Contact overlap = 200x200 (nm); inset shows the test pattern structure with AP]

Test pattern structure
  • Space between poly and cell outline = 50nm
  • Min. space between polys to insert an AP = 380nm
Maximum CD difference between COPC and DOPC
  • 4.37nm without AP and 1.2nm with AP

AP Flow Includes Placement Optimization

[Figure: AP flow. Boxes: Standard Cell GDSII, AP Generation, SRAF Insertion, OPC, OPCed Standard Cell GDSII; Placement, Post-Placement Optimization, AP-Correct Placement, Route, OPC GDSII]

Idea: Use whitespace in the standard-cell block to maximize number of AP-augmented cell instances, and benefit from cell-based OPC
Recall: *CORR technique in first part of this talk!
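
The whitespace idea above can be illustrated with a small sketch. This is not the post-placement optimizer from the talk: it is a simplified greedy pass, written under the assumption that a row is a list of cell widths in placement sites and that one empty site between neighbors is enough room for an AP. The function name `spread_row` and its parameters are hypothetical, and the sketch ignores the original cell locations, timing and wirelength.

```python
def spread_row(widths, row_sites, gap=1):
    """Place cells left to right, leaving `gap` empty sites after a cell
    while spare whitespace remains (hypothetical, simplified model).

    widths    -- cell widths in placement sites, in left-to-right order
    row_sites -- total number of sites in the row
    Returns (positions, abutting), where `abutting` counts neighbor pairs
    that still touch and so could not receive an AP between them.
    """
    total = sum(widths)
    if total > row_sites:
        raise ValueError("cells do not fit in the row")
    spare = row_sites - total          # whitespace available in this row
    positions, abutting, x = [], 0, 0
    for i, w in enumerate(widths):
        positions.append(x)
        x += w
        if i == len(widths) - 1:
            break                      # no neighbor to the right of the last cell
        if spare >= gap:
            x += gap                   # spend whitespace on a one-site gap
            spare -= gap
        else:
            abutting += 1              # out of whitespace: cells abut
    return positions, abutting


print(spread_row([3, 2, 4], row_sites=11))  # ([0, 4, 7], 0)
print(spread_row([3, 2, 4], row_sites=10))  # ([0, 4, 6], 1)
```

With enough whitespace every neighbor pair gets a gap; as utilization rises, abutting pairs remain, which matches the qualitative trend of more leftover cases at higher utilization.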

Result of AP Insertion After Post-Placement Opt (PO)

UTIL %   “ALU” W/O PO   “ALU” W PO   “AES” W/O PO   “AES” W PO
90%      7683           4906         9573           5054
80%      3802           207          7292           1250
70%      1300           0            3023           0
60%      1204           0            2113           0
50%      702            0            1315           0

Placement Opt tries to put one placement site between cells
Post-placement optimization can lead to 100% cell-based OPC with utilizations of < 70%
80+% of model-based OPC work is eliminated at 80% utilization
-> Cell-Based OPC Becomes Practical!

Outline

Detailed Placement for Process Window Enhancement
CMP Fill at 65nm and Below
Auxiliary Pattern Methodology for Cell-Based OPC
Crosstalk Awareness in SSTA
Other

Variability

Increased variability in nanometer VLSI designs
  • Process:
      • OPC -> Lgate
      • CMP -> thickness
      • Doping -> Vth
  • Environment:
      • Supply voltage -> transistor performance
      • Temperature -> carrier mobility µ and Vth
These (PVT) variations result in circuit performance variation

[Figure: PVT parameter distributions (p1, p2) mapped to gate/net delay distributions (d1, d2)]

Timing Analysis

Min/Max-based
  • Pessimistic
Corner-based
  • Intra-die variation
  • Computationally expensive
  • Inter-die variation
Statistical
  • pdf for delays
  • Reports timing yield

[Figure: FF, combinational logic, FF timing path with D, Q, CLK and min/max delays]

Block-Based vs. Path-Based

Represent signal arrival times as random variables
  • Block-based
      • Each timing node has an arrival time distribution
      • Static worst case analysis
      • Efficient for circuit optimization
  • Path-based
      • Each timing node for each path has an arrival time distribution
      • Corner-based or Monte Carlo analysis
      • Accurate for signoff analysis

[Figure: gate delay pdfs and arrival time pdfs on a small timing graph with nodes I, A, B, C, D]

Corner vs. Statistical Timing

[Figure only; content not recoverable]

Key Challenge

[Figure: delay vs. technology node (130nm, 90nm, 65nm); typical case]

No improvement with worst case sign-off
Over-design -> difficult timing closure
How to reduce design margin?

Solutions

Reduce process window
  • Fast yet accurate OPC simulator
  • Accurate RC extraction
  • Timing calculation with “real” RC
  • Reduce systematic variation
SSTA
  • Accurate manufacturing process model (foundry)
  • SSTA can handle non-Gaussian distribution (EDA)
  • SI-aware SSTA (EDA)

Example of Emerging Flow

[Figure: old flow vs. new flow. Boxes: Physical design, Litho simulation, CMP simulation, Foundry model, RC extraction, STA, Stat. RCx, SSTA]
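
The over-design argument behind the Key Challenge and Solutions slides can be made concrete with a short sketch. It assumes, purely for illustration, a path of stages whose delays are independent Gaussians (a strong simplification; real SSTA also models correlation), and compares a worst-case sum of per-stage 3σ corners with the 3σ point of the path-delay distribution. The stage values and function names are hypothetical.

```python
import math


def corner_delay(stages, k=3.0):
    """Worst-case sign-off: every stage sits at its own mu + k*sigma corner."""
    return sum(mu + k * sigma for mu, sigma in stages)


def statistical_delay(stages, k=3.0):
    """k-sigma point of the path delay, assuming independent Gaussian stages."""
    mean = sum(mu for mu, _ in stages)
    var = sum(sigma ** 2 for _, sigma in stages)
    return mean + k * math.sqrt(var)


def timing_yield(stages, clock_period):
    """P(path delay <= clock_period) under the same independence assumption."""
    mean = sum(mu for mu, _ in stages)
    std = math.sqrt(sum(sigma ** 2 for _, sigma in stages))
    z = (clock_period - mean) / std
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Hypothetical path: 16 stages, each 100 ps mean and 10 ps sigma.
path = [(100.0, 10.0)] * 16
print(corner_delay(path))        # 2080.0
print(statistical_delay(path))   # 1720.0
print(round(timing_yield(path, 1720.0), 5))  # 0.99865
```

In this toy case the corner-based number carries 360 ps of extra margin for the same 3σ confidence, which is the kind of design margin the statistical flow tries to recover.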

Current SSTA Tools

Main players in SSTA: IBM, Extreme-DA, Magma, Synopsys
  • Common key features
      • Ability to handle Global, Spatial and Independently Random variations statistically
      • Handling of uncorrelated, fully correlated or partially correlated variation parameters, with multiple types of distributions
      • Sensitivity analysis: analyze delay/slew sensitivity to particular process parameters, enabling improved robustness
      • Handling correlation in reconvergent paths
      • Statistical tool kit: min/max/add/sub operations
  • Common drawbacks
      • Signal integrity blind
      • Dynamic variation missing
      • Cannot handle non-Gaussian distributions

Current SSTA Tools

  • IBM
      • Based on EinsTimer
      • Emphasis on speed of analysis and optimization/repair
      • Multi-mode/multi-corner analysis in a single runtime
      • EinsVAT can analyze mixed corners
  • Synopsys
      • Later to market
      • Emphasis on accuracy
      • Will support statistical RC extraction
  • Extreme-DA
      • Startup
      • Statistical RC extraction
      • Handles spatial correlations
      • Sensitivity analysis
      • Block-based SSTA
      • Variational delay calculation

SSTA Correlations

Delays and signal arrival times are random variables
Correlations come from
  • Spatial
      • inter-chip, intra-chip, random variations
  • Re-convergent fanout
  • Multiple-input switching
  • Cross-coupling
  • ……

[Figure: gates g1, g2, g3 with pairwise correlations corr(g1, g2), corr(g1, g3), corr(g2, g3)]

Multiple-Input Switching Probability

Simultaneous signal switching at multiple inputs of a gate leads to up to 20% (26%) mismatch in gate delay mean (standard deviation) [Agarwal-Dartu-Blaauw-DAC’04]

[Figure: gate delay]
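
The "statistical tool kit" operations and the role of correlation can be sketched for the simplest case of two jointly Gaussian arrival times. The sketch below uses Clark's classical moment-matching formulas for the max of two correlated Gaussians; this is a common textbook choice and is not claimed to be what any of the tools listed above implement. Function names are hypothetical.

```python
import math


def _pdf(x):
    """Standard normal probability density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def _cdf(x):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def stat_add(m1, s1, m2, s2, rho=0.0):
    """Mean and sigma of X1 + X2 for jointly Gaussian X1, X2 with correlation rho."""
    return m1 + m2, math.sqrt(s1 * s1 + s2 * s2 + 2.0 * rho * s1 * s2)


def stat_max(m1, s1, m2, s2, rho=0.0):
    """Mean and sigma of max(X1, X2), matched to the first two moments (Clark)."""
    a2 = s1 * s1 + s2 * s2 - 2.0 * rho * s1 * s2
    if a2 <= 1e-18:
        # X1 - X2 is (numerically) constant: the larger mean always wins.
        return (m1, s1) if m1 >= m2 else (m2, s2)
    a = math.sqrt(a2)
    alpha = (m1 - m2) / a
    mean = m1 * _cdf(alpha) + m2 * _cdf(-alpha) + a * _pdf(alpha)
    second = ((m1 * m1 + s1 * s1) * _cdf(alpha)
              + (m2 * m2 + s2 * s2) * _cdf(-alpha)
              + (m1 + m2) * a * _pdf(alpha))
    return mean, math.sqrt(max(second - mean * mean, 0.0))


# Two arrival times with equal moments: correlation changes the max.
print(stat_max(100.0, 10.0, 100.0, 10.0, rho=0.0))
print(stat_max(100.0, 10.0, 100.0, 10.0, rho=1.0))   # (100.0, 10.0)
```

With ρ = 1 the two arrivals move together and the max is just either input; with ρ = 0 the mean of the max shifts upward, which is why ignoring reconvergent or spatial correlation changes the reported timing.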

Crosstalk Aggressor Alignment

We consider an equally significant source of uncertainty in SSTA: gate delay variation induced by crosstalk aggressor alignment

[Figure: multiple-input switching (MIS) vs. crosstalk aggressor alignment (CAA)]

Problem Formulation (SSTA-SI)

Given
  • a system of coupled interconnects with their driver gates
  • statistical signal arrival time variation at the inputs of the driver gates, and
  • statistical process parameter variations for the interconnects and their driver gates
Find
  • statistical signal arrival time variations at the outputs of the system

Methodology

Process variation extraction
Performance characterization
Probabilistic symbolic analysis
PDF propagation

Process Variation Extraction

A signal arrival time is a function of multiple process parameter variabilities
  • global (inter-die)
  • location dependent (intra-die)
  • purely random
Polynomial approximation
Principal Component Analysis (PCA) gives a smaller set of uncorrelated r.v.’s

\[
x = f(r_1, r_2, \ldots), \qquad
\Pr(r_i) = \frac{1}{\sqrt{2\pi}\,\sigma_{r_i}} \, e^{-\frac{(r_i - \mu_{r_i})^2}{2\sigma_{r_i}^2}}
\]

Performance Characterization

Delay calculation for sampled crosstalk alignments
Least mean square regression for piecewise quadratic polynomial approximation

\[
\tau =
\begin{cases}
d_2 & x' < t_0 \\
a_0 + a_1 x' + a_2 x'^2 & t_0 < x' < t_1 \\
d_0 & t_1 < x' < t_2 \\
b_0 + b_1 x' + b_2 x'^2 & t_2 < x' < t_3 \\
d_1 & t_3 < x'
\end{cases}
\]

PDF Propagation

Given
  • Joint probability density function of k random variables x
  • A piecewise polynomial function y = f(x)
Find
  • Probability density function of y
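
The characterization and propagation steps above can be sketched end to end on synthetic data. Everything numeric here is an assumption made for illustration: the "true" delay curve stands in for circuit-simulator runs, a single quadratic piece with a flat value outside it replaces the full five-piece model, and the crosstalk alignment is taken as Gaussian with a 10 ps standard deviation. The sketch also swaps in Monte Carlo sampling for the propagation step, whereas the slides describe an analytical integration.

```python
import random


def fit_quadratic(xs, ys):
    """Least-squares fit y ~ c0 + c1*x + c2*x^2 via the 3x3 normal equations."""
    s = [sum(x ** k for x in xs) for k in range(5)]            # power sums
    b = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    m = [[s[i + j] for j in range(3)] + [b[i]] for i in range(3)]
    for col in range(3):                                        # Gauss-Jordan
        piv = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [a - f * c for a, c in zip(m[r], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]


T0, T1 = -30.0, 30.0   # assumed breakpoints of the quadratic piece (ps)
D_FLAT = 20.0          # assumed delay when the aggressor is far away (ps)


def true_delay(x):
    """Synthetic stand-in for a simulated delay at alignment x (ps)."""
    return 35.0 - x * x / 60.0 if T0 <= x <= T1 else D_FLAT


# Performance characterization: sample alignments, then regress.
xs = [T0 + 2.0 * i for i in range(31)]          # -30, -28, ..., 30 ps
ys = [true_delay(x) for x in xs]
c0, c1, c2 = fit_quadratic(xs, ys)


def model_delay(x):
    """Fitted piecewise model: quadratic inside the window, flat outside."""
    return c0 + c1 * x + c2 * x * x if T0 <= x <= T1 else D_FLAT


# PDF propagation by sampling the alignment distribution (assumed N(0, 10 ps)).
rng = random.Random(0)
samples = [model_delay(rng.gauss(0.0, 10.0)) for _ in range(20000)]
mean = sum(samples) / len(samples)
std = (sum((d - mean) ** 2 for d in samples) / len(samples)) ** 0.5
print(f"fit: c0={c0:.3f} c1={c1:.4f} c2={c2:.5f}")
print(f"delay mean={mean:.2f} ps, std={std:.2f} ps")
```

Because the synthetic curve is exactly quadratic inside the window, the regression recovers its coefficients; with real simulation data the fit would only be approximate and the breakpoints would have to be chosen per net.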

PDF Propagation

Integration of conditional probabilities in the variable space

\[
\Pr(y = \tau)
= \sum_{R_i} \int_{\vec{x} \in R_i} \Pr(\vec{x} \mid y = \tau)\, d\vec{x}
= \sum_{R_i} \int_{\vec{x} \in R_i} \Pr(x_1) \Pr(x_2) \cdots
  \Pr\!\left(x_k = f_{R_i}^{-1}(x_1, x_2, \ldots, x_{k-1}, y = \tau)\right)
  dx_1\, dx_2 \cdots dx_{k-1}
\]

Analytical inverse function is available for order-d polynomial (d<5)

Signal Integrity Aware Statistical Timing Analysis

Input:
  • Coupled interconnects and driver gates
  • Input signal arrival time distributions
  • Process variations
Output:
  • Output signal arrival time distributions
1. Process variation extraction
2. Performance characterization
3. Probabilistic symbolic analysis
4. PDF propagation

Implementation

STA-SI goes through an iteration of timing window refinement for reduced pessimism of worst case analysis
SSTA-SI goes through an iteration of signal arrival time pdf refinement with reduced deviations

Runtime Analysis

Performance characterization for N sampled crosstalk alignments takes O(N) time, where N = min(t3-t0, 6σ of crosstalk alignment) / time_step
Regression takes O(N) time
Computing output signal arrival time distribution takes constant time, e.g., updating in an iterative SSTA

Experimental Setting

Coupled global interconnects and 16X inverter drivers in 70nm Berkeley Predictive Technology Model
Extracted coupled interconnects of 451 resistors and 1637 ground and coupling capacitors and 16X inverter drivers in 130nm industry designs

70nm           L (um)   W (um)   S (um)   T (um)
global         1000     0.45     0.45     1.20
intermediate   200      0.14     0.14     0.35
local          30       0.10     0.10     0.20
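
For the PDF propagation integral in this part, the single-variable case (k = 1) may be easier to read. The block below is the standard change-of-variables result, written with the Jacobian factor explicit; it is offered as a reading aid under the assumption that f is invertible on each region R_i, and the quadratic example and its symbols c_0, c_2 are illustrative rather than taken from the slides.

```latex
% Single random variable x with density p_X, y = f(x), f invertible on each region R_i:
\[
p_Y(\tau) \;=\; \sum_{R_i} p_X\!\left(f_{R_i}^{-1}(\tau)\right)
\left| \frac{d f_{R_i}^{-1}(\tau)}{d\tau} \right|
\]
% Illustrative quadratic piece y = c_0 + c_2 x^2 with c_2 > 0 on the region x >= 0:
\[
f^{-1}(\tau) = \sqrt{\frac{\tau - c_0}{c_2}}, \qquad
\left| \frac{d f^{-1}(\tau)}{d\tau} \right| = \frac{1}{2\sqrt{c_2\,(\tau - c_0)}},
\qquad \tau > c_0
\]
```

The closed-form inverse is what makes the per-region terms cheap to evaluate, consistent with the remark that analytical inverses exist for polynomials of order below 5.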

Interconnect Delay Distribution

[Figure: probability vs. interconnect delay (ps); SPICE and model curves for Tr = 10ps, 20ps, 50ps and 100ps]

For a pair of 1000um coupled global interconnects in 70nm BPTM technology, with 10, 20, 50 and 100ps input signal transition time, and crosstalk alignment in a normal distribution N(0, 10ps)

Driver Gate Delay Distribution

For a pair of 1000um coupled global interconnects in 70nm BPTM technology, with 10, 20, 50 and 100ps input signal transition time, and crosstalk alignment distribution N(0, 10ps)

Interconnect Output Signal Arrival Time Distribution

Test case 1: 1000µm interconnects of 70nm BPTM technology

               Delay                          Output
               SPICE          Model           SPICE           Model           % diff
3σ    Tr(ps)   µ      σ       µ      σ        µ       σ       µ       σ       µ       σ
50    50       3.83   0.85    3.82   0.83     29.4    16.2    29.7    16.6    0.78    2.46
100   100      3.82   0.92    3.84   0.83     54.8    32.8    55.6    33.9    1.52    3.38
200   200      3.78   0.96    3.78   0.82     105.2   65.9    106.3   67.0    1.06    1.65

Test case 2: interconnects in a 130nm industry design

               Delay                          Output
               SPICE          Model           SPICE           Model           % diff
3σ    Tr(ps)   µ      σ       µ      σ        µ       σ       µ       σ       µ       σ
50    100      4.29   0.16    4.30   0.15     54.5    16.4    53.4    16.1    -2.09   -0.05
100   100      4.30   0.18    4.30   0.17     54.8    32.9    54.1    33.0    -0.17   0.18
200   200      4.25   0.18    4.25   0.16     105.2   65.9    104.9   66.1    -0.28   0.35

Driver Gate Output Signal Arrival Time Distribution

Test case 1: 1000µm interconnects of 70nm BPTM technology

               Delay                            Output
               SPICE           Model            SPICE            Model            % diff
3σ    Tr(ps)   µ       σ       µ        σ       µ       σ        µ       σ        µ       σ
50    50       52.8    8.86    52.74    8.42    78.6    12.19    77.3    12.66    -1.65   3.86
100   100      61.4    16.0    61.85    15.9    112.9   24.9     113.7   24.43    0.71    -2.16
200   200      74.3    23.2    74.13    23.1    177.4   52.9     173.8   53.83    -2.03   1.72

Test case 2: coupled interconnects in a 130nm industry design

               Delay                            Output
               SPICE           Model            SPICE            Model            % diff
3σ    Tr(ps)   µ       σ       µ        σ       µ       σ        µ       σ        µ       σ
50    50       169.7   0.81    168.84   0.8     195.4   16.4     193.6   16.2     -0.92   -1.03
100   200      198.0   1.5     197.68   1.5     299.5   32.96    291.8   33.6     -2.57   1.61
200   200      198.8   2.52    198.73   2.5     301.9   66.49    297.8   65.8     -1.36   -0.93

Outline

Detailed Placement for Process Window Enhancement
CMP Fill at 65nm and Below
Auxiliary Pattern Methodology for Cell-Based OPC
Crosstalk Awareness in SSTA
Other

Parametric Yield Optimization – Blaze MO

[Figure: Blaze MO in the flow. Design side: models, libraries, rules; RTL, SP&R, PV; GDSII. Manufacturing side: RET, Mask, FEOL, BEOL, Test.]

Design driven
  • Turn design requirements into manufacturing directives
  • Intercept at the hand-off -> the first manufacturing step is software
Parametric focus
  • Improve leakage, timing, variability, and yield
No major changes
  • “Same” data, design flow, golden signoff, manufacturing handoff

Design-Specific Manufacturing

[Figure callouts]
  • Aggressive leakage reduction objective; desired length 94-96nm
  • Gate on setup-critical path; desired length 88-90nm
  • Gate on hold-critical path; desired length 90-92nm

Blaze MO: Silicon QOR impact (even “post-tapeout”)
  • Small increase in gate length -> large reduction in leakage power and variability
  • Benefit to customer: Reduce leakage power by 20%, leakage variability by 30%
  • Benefit to manufacturing: Same process offers targeted value to customer: power, speed, variability, parametric yield
  • Blaze MO design kits available from major foundries at 90nm, 65nm
*Patent Pending

Blaze MO Results – ARM926 Block

Leakage cut by 25%; Variability cut in half

Yield Boost at Sort: (A,B) Silicon Results

[Figure: normalized yield vs. normalized IDDQ (1.35V) and normalized yield vs. normalized FMAX, POR vs. BLAZE]

Blaze MO optimization consistently gives lower IDDQ and higher total yield over the entire FMAX-IDDQ range of interest.

Increased Value Likely in 45nm Node

Parametric yield and variability improvements from CD biasing are likely to remain significant at 45nm node
At 45nm, multi-Vt knob for leakage reduction may disappear
  • Reduced supply voltages do not leave enough headroom for HVT device
  • -> Gate length biasing is the main leakage reduction technique available at device level
For foundry processes, 5nm of CD bias likely to be permitted
Example 45nm low-power strategy scenario
  • Two distinct types of library cell layouts, e.g., with 40nm and 60nm gates
  • CD biasing range of 40-45nm (positive biasing only) for 40nm gates
  • CD biasing range of 55-65nm (both negative and positive biasing) for 60nm gates
  • With this range of available biasing options, gate-length biasing will likely continue to offer significant potential for leakage and variability reduction

Other Topics of Interest?

Restricted layout methodologies?
… (your questions here)