- Stanford University

advertisement
Design for Yield Using Statistical Design
Fabian Klass
Director of Technology and Manufacturing
EE 380 Computer Systems Colloquium, Stanford University
February 7, 2007
8/31/05
P.A. Semi, Inc. - Company Confidential
1
Outline
About P.A.Semi
Process Variability
Statistical Models
Circuit Examples
Statistical Timing
PVT Margin
Test Structures
CAD Challenges
Summary
2
About P.A. Semi
Santa Clara-based fabless processor company
Power ArchitectureTM Licensee
Design our own Power Architecture processors
Only 3rd company after IBM and Freescale
Noted industry veterans combine in 150-strong organization
Venture backed by Bessemer, Venrock, and Highland Capital
Currently engaged with over 100 customers across different
market segments
Strategically partnered with IBM
Breakthrough processor solution focused on low power
@ high performance
Scalable 64-Bit Power multicore architecture
Redefines high performance (2GHz) at ultra low power (4W)
39 patents filed and 11 more patents in progress towards filing
3
Target Markets
Compute
Server Blades
Digital
Entertainment
Embedded
Boards
Routers
Critical Requirements
Power Efficiency
High Performance
Cost Efficiency
Throughput Efficiency
Open source OS/Tools etc
Game
Players
Imaging
Systems
Storage
Systems
Switches
Wireless
Basestations
4
The Challenge
Power
Management
Process
Variability
5
Process Variability
Three main reasons why process variability has
become so important:
Moore's law:
Exponential growth in device integration
Billions of devices per die in 65nm and beyond
Shrinking devices:
Gate oxides approaching a few Angstroms
Fewer dopants under the gate (~102)
Ultra low VDD:
VDD scaling < 1V to manage power.
Vt not scaling, limited by leakage.
Less headroom, more sensitivity to ∆Vt.
6
Process Variation
Global: die-to-die, wfr-to-wfr,
and lot-to-lot variations caused
by changes in:
Tox
Xtor W & L
N/PWELL doping
N/PMOS flatband voltage
Stress-induced effects
Global
Local: within-the-die variations
caused by:
Xtor W & L mismatch
Vt mismatch
ACLV
Local
7
Width and Length Mismatch
Caused by variations in the lithographic process
Width and Length variations are uncorrelated
Small transistors more sensitive to W/L changes
Intel 65nm 6T SRAM cell
65nm CMOS NAND cell
8
Vth Mismatch
Random fluctuations due to
relatively small number of
dopants in the channel
Vth=0.78V
Vth variance is inversely
proportional to transistor
area
170
Dopants
Pelgrom's Law:
V th =K / W × L
Vth=0.56V
Source: Asenov A. “Random Dopant Induced Threshold Voltage Lowering and Fluctuations in Sub-0.1 um
MOSFET's: A 3-D 'Atomistic' Simulation Study” IEEE Trans. On Electron Devices, Vol 45, No 12, Dec 1998
9
Statistical Models
Provided by most
foundries
More realistic than
corner models.
NMOS
SF
Cover the full design
space.
TT
Foundries typically
offer a 3σ process.
The number of local
sigma is determined
by the designer.
FF
1 23
SS
45
σL (Local)
1 2
3 4
σG (Global)
FS
PMOS
10
Monte Carlo
Monte Carlo involves simulating a circuit over a wide
range of randomly chosen devices parameters
The result is a distribution plot of design constraints,
e.g., delay or noise margin
Typically tens of thousands simulations needed,
including Vdd and Temp sweeps
Delay
Distribution
0.400
0.300
0.200
0.100
0.000
Vth
3. 2. 2. 2. 2. 2. 1. 1. 1. 1. 1. 0. 0. 0. 0. 0. -0 -0 -0 -0 -1 -1 -1 -1 -1 -2 -2 -2 -2 -2 -3
0 8 6 4 2 0 8 6 4 2 0 8 6 4 2 0 .2 .4 .6 .8 .0 .2 .4 .6 .8 .0 .2 .4 .6 .8 .0
Delay
11
When to use statistical analysis
Usage limited to process-sensitive circuits:
Races
Contention
Mismatch
Usage limited to high-usage circuits:
SRAM cells
Register file cells
Flip-flops
Sensamps
Usage limited to highly-critical circuits:
Max and min critical paths
12
How many Sigmas?
Failure criteria:
(µ - N σG – M σL) > Safe Margin
where:
µ is the mean
N is determined by the foundry and is typically 3.
M is determined by the number of instances of the circuit
being analyzed:
# of instances
100
1,000
10,000
100,000
1,000,000
10,000,000
M
2.33
3.09
3.72
4.26
4.75
5.20
Example: 1MB SRAM needs M=5 sigma for the bit design.
13
Example 1: 6T SRAM Cell
Find stable VDD window for
6T SRAM cell (1MB)
Flow:
Run Monte Carlo SNM sims
Find µ, σG, & σL across VDD
Define safe margin
Plot 3σG and 5σL curves
0.250
Mean
Find Vdd window where
SNM > Safe margin.
SNM [MV]
0.200
3σG
0.150
0.100
Safe 0.050
Margin
5σL
0.000
0.5
0.55
0.6
0.65
0.7
0.75
0.8
Vdd [V]
0.85
0.9
0.95
1
14
Example 2: Sense Amplifier
Find min VDIFF for sensamp
Flow
Run Monte Carlo
Plot passing ratio vs. ∆Vin
Find µ & σL for sensamp
Find µ & σL for SRAM Iread
100%
90%
Min VDIFF:
70%
 SA
1− M ×
2
IREAD
/ IREAD  
PASS
V DIFF =
80%
60%
50%
40%
30%
σSA
20%
10%
0%
-200
-150
-100
-50
0
50
100
Delta Vin [mV]
150
200
15
Other Circuits
Other possible applications for statistical circuit design:
Dynamic logic
Latches
Register files cells
Pulsed flops
Level shifters
Analog circuits
Advantages:
All circuits designed to a target sigma
Avoid weak links
Avoid overdesign
16
Statistical Timing
Each gate has a mean and sigma. Sigmas can be
computed using Monte Carlo
The sigma of a path is determined by adding (i.e., sumsquare) the sigmas of individual gates
Critical Path
σ2
σ1
σn
 Path = 21 22... 2n
17
Speed Distribution
Each chip has a local distribution on top of the global
distribution due to local variations
Not all parts within a [+3σ, -3σ] window will yield above
target due to local variations
Speed Distribution
Local
σLocal
+3σGlobal
Global
σGlobal
−3σGlobal
18
OCV Ratio and Yield
On-chip Variability (OCV) = σLocal
OCV Ratio = σLocal / σGlobal
Speed yield strongly dependent on OCV ratio
Speed Distribution
0.400
NormDist
0.300
OCV Ratio
0.200
OCV Ratio=2.0
OCV Ratio=1.5
OCV Ratio=1.0
OCV Ratio=0.5
OCV Ratio=0.0
0.100
0.000
3. 2. 2. 2. 2. 2. 1. 1. 1. 1. 1.Speed
0. 0. 0. 0. 0. -0 -0 -0 -0 -1 -1 -1 -1 -1 -2 -2 -2 -2 -2 -3
0 8 6 4 2 0 8 6 4 2 0 8 6 4 2 0 .2 .4 .6 .8 .0 .2 .4 .6 .8 .0 .2 .4 .6 .8 .0
19
Yield Examples
Speed yield is affected by the shape of the timing
histogram:
These three histograms have
very different speed yields
(OCV Ratio=1.5):
These three histograms have
the same speed yield (OCV
Ratio=0.9):
Slack
90ps
60ps
30ps
0
-30ps
Top Path [ps]
2GHz Yield
Slack
90ps
60ps
30ps
0
-30ps
Top Path [ps]
2GHz Yield
Hist#1
10000
1000
100
10
0
500
37.5%
Hist#1
10000
1000
100
10
0
500
89.8%
Hist#2
1000
100
10
0
0
470
74.0%
Hist#2
10000
1000
350
0
0
470
89.8%
Hist#3
1000
100
10
5
1
530
60.8%
Hist#3
10000
1000
100
0
1
530
89.8%
20
Margining Races
Races need to be margined for PVT variations.
Fixed PVT margin (conventional):
D 21 =D2− D1m× D 1 D 2 D21 
Fixed PVT sigma:
D 21 =D 2− D 1M    D 1D 2D 21 
Drawbacks of fixed margin:
Pessimistic for long delays
Optimistic for short delays
D21
D1
D2
Advantages of fixed sigma:
Accurate (pseudo-statistical)
M can be tuned for a specific design
21
Margining Races (cont.)
Fixed sigma:
PVT margin varies with the logic depth
PVT margin varies with Vdd
50.0%
PVT Margin [%]
40.0%
Lower
Vdd
30.0%
20.0%
10.0%
0.0%
1
2
3
4
5
6
7
8
9 10 11 12 13 14 15 16 17 18
Logic Depth
22
Measuring Process Variability
Test structures were developed to measure process
variability.
A testchip was built in a 65nm, triple-Vt, dual-oxide
CMOS process.
Data was collected across dies, wafers, lots, and across
voltage and temperature.
Measured data was used to:
Validate statistical SPICE models
Monitor process development
Determine design margins
Predict circuit limited yield
23
A Racer Circuit
A Racer circuit measures
on-die process variations
in Si.
> 100 copies of the Racer
module are placed
across the die.
The spread in the
location of the leading
“1” provides an
indication of the process
variability.
1/2
0
FF
FF
FF 1
1
FF
FF 1
0
FF
FF 0
1
clk_en
CKG
MUX
XOR/
XNOR
0
FF
FF
1
FF
FF
0
0
CKG
1
0
select_0_L
select_1_L
MUX
2
4
6
clk
24
Racer Results
Bar Chart
Yield @ 4 Sigma
100%
800
600
400
200
0
1000
800
600
400
200
0
Lower Vdd
Racer data shows large spreads
at low Vdd.
Data can be used to predict
circuit yield across Vdd.
Low Vdd is the yield limiter!
1000
1000
800
600
400
200
0
1000
800
600
400
200
0
Yield
75%
1000
800
600
50%
Lower Vdd
400
200
0
1000
25%
800
600
400
200
0%
3%
6%
8%
10%
11%
#Inverters
13%
14%
15%
17%
0
0
1
2
3
4
5
6
7
8
#Inverters
9
Binned Inverter Count
25
10
A Leaker Circuit
A Leaker circuit measures
leakage spread (Ioff/Ion) in
Si.
It measures Ioff/Ion by
sensing a tied-off skewed
inverter with a 2P:1N
inverter and latching to a
flop.
Multiples copies of the
leaker module are placed
on the die.
Separate modules are used
for standard Vt, low Vt, and
high Vt devices.
1024xWN
LK
FF 1
512xWN
LK
FF 1
LK
FF 1
LK
FF
LK
FF
2xW
N
WN
0
0
clk
WP
WN
26
Leaker Results
IOFF/ION RATIO
50
45
DISTRIBUTION
40
35
30
HVT
SVT
25
LVT
20
15
10
5
0
0
1
2
3
4
5
6
7
8
9
IOFF/ION RATIO (Log2)
10
11
12
IOFF/ION RATIO MEAN + 4 SIGMA
20.0
IOFF/ION RATIO [Log2]
Leaker data was collected
across voltage and
temperature.
Distributions were
generated and µ/σ data
was obtained.
Ioff/Ion ratio worse at low
Vdd for all Vt devices
Ioff/Ion ratio worse at high
temperature for all Vt
devices
18.0
16.0
14.0
12.0
HVT
SVT
10.0
LVT
8.0
Safe
Margin
6.0
4.0
2.0
0.0
0.66V
0.76V
VDD
0.91V
1.1V
27
CAD Challenges
Applications for statistical design:
Timing
Power
ERC
Reliability
Main Challenges:
Run time: Running Monte Carlo on a library would take
years!
Tools need to be 'context aware': Ex:Timing optimization
depends on the shape of the timing histogram
Pseudo-statistical approach
Using statistical methods without running Monte Carlo.
28
CAD Challenges (cont.)
Cell based designs
Library characterization should produce µ,σ.
Timing analyzer output should be speed yield.
Transistor level design
In-situ characterization to generate µ,σ
Timing analyzer to create µ,σ for macro
ERC/Reliability
Statistically derived design rules
Waivers based on distributions and yield impact
Yield, Yield, Yield
Tools should predict yield as a metric for signoff.
29
CAD: What's missing
Tool Integration
Integration of DFM and DFY tools to predict:
Manufacturing yield
Functional Yield
Speed yield
Overall product yield
Validation
Validation of DFY tools in Silicon
Justification of investment
30
Summary
Ignoring process variability may lead to non-functional
designs or suboptimal yields.
DFY will become more relevant as Vdd continues to
scale and device geometries keep shrinking.
Circuit solutions alone will not be sufficient if Moore's
law continues.
Process variability need to be handled at higher levels
of the design process
Future designs will incorporate:
Self-checking logic
Self-correcting logic
Redundant logic (besides SRAMs)
Wearout compensation mechanisms.
31
Thank You
The P.A. Semi name and the P.A. Semi logo and combinations thereof are trademarks of P.A. Semi, Inc.
The Power name is a trademark of International Business Machines Corporation, used under license therefrom.
SPECint and SPECfp are registered trademarks of the Standard Performance Evaluation Corporation (SPEC).
All other trademarks are the property of their respective owners.
8/31/05
P.A. Semi, Inc. - Company Confidential
32
Download