IC Manufacturing and Yield
ECE/ChE 4752: Microelectronics Processing Laboratory
Gary S. May
April 15, 2004
Outline
• Introduction
• Statistical Process Control
• Statistical Experimental Design
• Yield
Motivation
• IC manufacturing processes must be stable, repeatable, and of high quality to yield products with acceptable performance.
• All persons involved in manufacturing an IC (including operators, engineers, and management) must continuously seek to improve manufacturing process output and reduce variability.
• Variability reduction is accomplished by strict process control.
Production Efficiency
• Determined by actions both on and off the manufacturing floor
• Design for manufacturability (DFM): intended to improve production efficiency
[Figure: circuit design and process design take place off the manufacturing floor; high-volume manufacturing takes place on the floor]
Variability
• The most significant challenge in IC production
• Types of variability:
  - human error
  - equipment failure
  - material non-uniformity
  - substrate inhomogeneity
  - lithography spots
Deformations
• Variability leads to deformations.
• Types of deformations:
  1) Geometric:
     - lateral (across wafer)
     - vertical (into substrate)
     - spot defects
     - crystal defects (vacancies, interstitials)
  2) Electrical:
     - local (per die)
     - global (per wafer)
Outline
• Introduction
• Statistical Process Control
• Statistical Experimental Design
• Yield
Statistical Process Control
• SPC = a powerful collection of problem-solving tools used to achieve process stability and reduce variability
• Primary tool = the control chart, developed by Dr. Walter Shewhart of Bell Laboratories in the 1920s
Control Charts
• Quality characteristic measured from a sample, plotted versus sample number or time
• Control limits typically set at ±3s from the center line (s = standard deviation)
Control Chart for Attributes
• Some quality characteristics cannot be easily represented numerically (e.g., whether or not a wire bond is defective).
• In this case, the characteristic is classified as either "conforming" or "non-conforming", and there is no numerical value associated with the quality of the bond.
• Quality characteristics of this type are referred to as attributes.
Defect Chart
• Also called the "c-chart"
• Control chart for the total number of defects
• Assumes that the number of defects in samples of constant size follows a Poisson distribution, in which the probability of observing x defects is

  P(x) = (c^x e^(-c)) / x!

  where x is the number of defects and c > 0
Control Limits for C-Chart
• The c-chart with ±3s control limits is given by:

  UCL = c + 3√c
  Center line = c
  LCL = c - 3√c

  (assuming c is known)
Control Limits for C-Chart
• If c is unknown, it can be estimated from the average number of defects per sample, c̄.
• In this case, the control limits become:

  UCL = c̄ + 3√c̄
  Center line = c̄
  LCL = c̄ - 3√c̄
Example
Suppose the inspection of 25 silicon wafers yields 37 defects. Set up a c-chart.

Solution:
Estimate c using

  c̄ = 37/25 = 1.48

This is the center line. The UCL and LCL are found as follows:

  UCL = c̄ + 3√c̄ = 5.13
  LCL = c̄ - 3√c̄ = -2.17

Since -2.17 < 0, we set the LCL = 0.
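A minimal Python sketch of this c-chart calculation (the function and variable names are my own, chosen for illustration):

```python
import math

def c_chart_limits(total_defects, num_samples):
    """Estimate c-chart center line and +/-3-sigma control limits.

    total_defects: total defects counted over all inspected samples
    num_samples:   number of equal-size samples inspected
    """
    c_bar = total_defects / num_samples           # estimated mean defects per sample
    ucl = c_bar + 3 * math.sqrt(c_bar)            # upper control limit
    lcl = max(0.0, c_bar - 3 * math.sqrt(c_bar))  # lower control limit (floored at 0)
    return c_bar, ucl, lcl

# Example from the slide: 37 defects found on 25 wafers
center, ucl, lcl = c_chart_limits(37, 25)
print(center, ucl, lcl)   # ~1.48, ~5.13, 0.0
```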
Defect Density Chart
• Also called the "u-chart"
• Control chart for the average number of defects over a sample of n products
• If there are c total defects among the n products, the average number of defects per product is

  u = c/n
Control Limits for U-Chart
• The u-chart with ±3s control limits is given by:

  UCL = ū + 3√(ū/n)
  Center line = ū
  LCL = ū - 3√(ū/n)

  where ū is the average number of defects over m groups of size n
Example
Suppose an IC manufacturer wants to establish a defect density chart. Twenty different samples of size n = 5 wafers are inspected, and a total of 183 defects are found. Set up the u-chart.

Solution:
Estimate ū using

  ū = c/(mn) = 183/[(20)(5)] = 1.83

This is the center line. The UCL and LCL are found as follows:

  UCL = ū + 3√(ū/n) = 3.64
  LCL = ū - 3√(ū/n) = 0.02
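The corresponding u-chart calculation as a Python sketch (again, the names are my own):

```python
import math

def u_chart_limits(total_defects, num_groups, group_size):
    """Estimate u-chart center line and +/-3-sigma control limits.

    total_defects: total defects over all groups
    num_groups:    number of inspected groups (m)
    group_size:    products per group (n)
    """
    u_bar = total_defects / (num_groups * group_size)   # average defects per product
    spread = 3 * math.sqrt(u_bar / group_size)
    return u_bar, u_bar + spread, max(0.0, u_bar - spread)

# Example from the slide: 183 defects over 20 groups of 5 wafers
print(u_chart_limits(183, 20, 5))   # ~(1.83, 3.64, 0.02)
```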
Control Charts for Variables
• In many cases, quality characteristics are expressed as specific numerical measurements.
• Example: the thickness of a film.
• In these cases, control charts for variables can provide more information regarding manufacturing process performance.
Control of Mean and Variance
• Control of the mean is achieved using an x̄-chart, where:

  x̄ = (x1 + x2 + ... + xn)/n = (1/n) Σ xi   (sum over i = 1 to n)

• Variance can be monitored using the s-chart, where:

  s² = [1/(n - 1)] Σ (xi - x̄)²   (sum over i = 1 to n)
Control Limits for Mean

  UCL = x̿ + 3√(s̄²/n)
  Center = x̿
  LCL = x̿ - 3√(s̄²/n)

where the grand average is:

  x̿ = (x̄1 + x̄2 + ... + x̄m)/m
Control Limits for Variance

  UCL = s̄ + 3 (s̄/c4) √(1 - c4²)
  Center = s̄
  LCL = s̄ - 3 (s̄/c4) √(1 - c4²)

where:

  s̄ = (1/m) Σ si   (sum over i = 1 to m)

and c4 is a constant (tabulated as a function of the sample size n)
Modified Control Limits for Mean
• The limits for the x̄-chart can also be written as:

  UCL = x̿ + 3s̄/(c4√n)
  LCL = x̿ - 3s̄/(c4√n)
Example
Suppose x̄- and s-charts are to be established to control linewidth in a lithography process, and 25 samples of size n = 5 are measured. The grand average for the 125 lines is 4.01 µm. If s̄ = 0.09 µm, what are the control limits for the charts?

Solution:
For the x̄-chart:

  UCL = x̿ + 3s̄/(c4√n) = 4.14 µm
  LCL = x̿ - 3s̄/(c4√n) = 3.88 µm
Example
Solution (cont.):
For the s-chart:

  UCL = s̄ + 3 (s̄/c4) √(1 - c4²) = 0.19 µm
  LCL = s̄ - 3 (s̄/c4) √(1 - c4²) = 0
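A Python sketch of both sets of limits; c4 = 0.9400 is the standard tabulated constant for samples of size n = 5 (the function name and structure are my own):

```python
import math

# Standard c4 constant for samples of size n = 5 (from SPC tables)
C4_N5 = 0.9400

def xbar_s_limits(grand_avg, s_bar, n, c4):
    """3-sigma control limits for the x-bar chart and the s-chart."""
    mean_spread = 3 * s_bar / (c4 * math.sqrt(n))
    s_spread = 3 * (s_bar / c4) * math.sqrt(1 - c4**2)
    xbar_limits = (grand_avg - mean_spread, grand_avg + mean_spread)   # (LCL, UCL)
    s_limits = (max(0.0, s_bar - s_spread), s_bar + s_spread)          # (LCL, UCL)
    return xbar_limits, s_limits

# Slide example: grand average 4.01 um, s-bar 0.09 um, n = 5
print(xbar_s_limits(4.01, 0.09, 5, C4_N5))
# ((~3.88, ~4.14), (0.0, ~0.19))
```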
Outline
• Introduction
• Statistical Process Control
• Statistical Experimental Design
• Yield
Background
• Experiments allow us to determine the effects of several variables on a given process.
• A designed experiment is a test or series of tests involving purposeful changes to variables in order to observe the effect of those changes on the process.
• Statistical experimental design is an efficient approach for systematically varying these process variables and determining their impact on process quality.
• Application of this technique can lead to improved yield, reduced variability, reduced development time, and reduced cost.
Comparing Distributions
• Consider the following yield data (in %).
• Is Method B better than Method A?

  Wafer   Method A   Method B
  1       89.7       84.7
  2       81.4       86.1
  3       84.5       83.2
  4       84.8       91.9
  5       87.3       86.3
  6       79.7       79.3
  7       85.1       82.6
  8       81.7       89.1
  9       83.7       83.7
  10      84.5       88.5
  Avg     84.24      85.54
Hypothesis Testing
• We test the hypothesis that B is better than A using the null hypothesis:

  H0: µA = µB

• Test statistic:

  t0 = (ȳB - ȳA) / [sp √(1/nA + 1/nB)]

  where ȳA and ȳB are the sample means of the yields, nA and nB are the number of trials for each sample, and the pooled variance is

  sp² = [(nA - 1)sA² + (nB - 1)sB²] / (nA + nB - 2)
Results
• Calculations: sA = 2.90, sB = 3.65, sp = 3.30, and t0 = 0.88.
• Use Appendix K to determine the probability of computing a given t-statistic with a certain number of degrees of freedom.
• We find that the probability of a t-statistic as large as 0.88 with nA + nB - 2 = 18 degrees of freedom arising by chance is 0.195.
• This means that there is only a 19.5% chance that the observed difference between the mean yields is due to pure chance.
• We can be 80.5% confident that Method B is really superior to Method A.
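As a cross-check (not part of the original slides), the same comparison can be run with SciPy's pooled two-sample t-test; the one-sided alternative requires SciPy 1.6 or later:

```python
from scipy import stats

# Yield data (%) from the "Comparing Distributions" table
method_a = [89.7, 81.4, 84.5, 84.8, 87.3, 79.7, 85.1, 81.7, 83.7, 84.5]
method_b = [84.7, 86.1, 83.2, 91.9, 86.3, 79.3, 82.6, 89.1, 83.7, 88.5]

# Pooled two-sample t-test (equal variances assumed), one-sided alternative B > A
t_stat, p_one_sided = stats.ttest_ind(method_b, method_a, alternative='greater')
print(t_stat, p_one_sided)   # t ~ 0.88, p ~ 0.195
```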
Analysis of Variance
• The previous example shows how to use hypothesis testing to compare 2 distributions.
• It's often important in IC manufacturing to compare several distributions.
• We might also be interested in determining which process conditions in particular have a significant impact on process quality.
• Analysis of variance (ANOVA) is a powerful technique for accomplishing these objectives.
ANOVA Example
• Defect densities (cm^-2) for 4 process recipes (treatments):

  Recipe 1 (n1 = 4): 62, 60, 63, 59
  Recipe 2 (n2 = 6): 63, 67, 71, 64, 65, 66
  Recipe 3 (n3 = 6): 68, 66, 71, 67, 68, 68
  Recipe 4 (n4 = 8): 56, 62, 60, 61, 63, 64, 63, 59

• k = 4 treatments
• n1 = 4, n2 = n3 = 6, n4 = 8; N = 24
• Treatment means: ȳ1 = 61, ȳ2 = 66, ȳ3 = 68, ȳ4 = 61
• Grand average: ȳ = 64
Sums of Squares
• Within treatments:

  SR = Σt Σi (yti - ȳt)²   (t = 1 to k, i = 1 to nt)

• Between treatments:

  ST = Σt nt (ȳt - ȳ)²   (t = 1 to k)

• Total:

  SD = Σt Σi (yti - ȳ)²   (t = 1 to k, i = 1 to nt)
Degrees of Freedom
• Within treatments: νR = N - k
• Between treatments: νT = k - 1
• Total: νD = N - 1
Mean Squares
• Within treatments: sR² = SR/νR
• Between treatments: sT² = ST/νT
• Total: sD² = SD/νD
ANOVA Table for Defect Density

  Source               Sum of Squares   Degrees of Freedom   Mean Square   F-ratio
  Between Treatments   ST = 228         νT = 3               sT² = 76.0    sT²/sR² = 13.6
  Within Treatments    SR = 112         νR = 20              sR² = 5.6
  Total                SD = 340         νD = 23              sD² = 14.8
Conclusions
• If the null hypothesis were true, sT²/sR² would follow the F distribution with νT and νR degrees of freedom.
• From Appendix L, the significance level for the F-ratio of 13.6 with 3 and 20 degrees of freedom is 0.000046.
• This means that there is only a 0.0046% chance that the means are equal.
• In other words, we can be 99.9954% sure that real differences exist among the four different processes in our example.
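Using the recipe data as reconstructed in the ANOVA example table, a quick SciPy check of the F-ratio and significance level (an illustration, not part of the original slides):

```python
from scipy import stats

# Defect densities (cm^-2) for the four process recipes
recipe1 = [62, 60, 63, 59]
recipe2 = [63, 67, 71, 64, 65, 66]
recipe3 = [68, 66, 71, 67, 68, 68]
recipe4 = [56, 62, 60, 61, 63, 64, 63, 59]

# One-way ANOVA: F-ratio and significance level
f_ratio, p_value = stats.f_oneway(recipe1, recipe2, recipe3, recipe4)
print(f_ratio, p_value)   # F ~ 13.6, p ~ 4.6e-5
```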
Factorial Designs
• Experimental design: an organized method of conducting experiments in order to extract the maximum information from a limited number of experiments
• Goal: systematically explore the effects of input variables, or factors (such as processing temperature), on responses (such as yield)
• All factors are varied simultaneously, as opposed to "one-variable-at-a-time"
• Factorial designs consist of a fixed number of levels for each of a number of factors and experiments at all possible combinations of the levels.
2-Level Factorials
• Ranges of factors are discretized into minimum, maximum, and "center" levels.
• In a 2-level factorial, the minimum and maximum levels are used together in every possible combination.
• A full 2-level factorial with n factors requires 2^n runs.
• The combinations of a 3-factor experiment can be represented as the vertices of a cube.
[Figure: cube whose vertices (±1, ±1, ±1) represent the 8 runs of a 2-level, 3-factor design]
2^3 Factorial CVD Experiment
• Factors: temperature (T), pressure (P), flow rate (F)
• Response: deposition rate (D)

  Run   P   T   F   D (Å/min)
  1     -   -   -   d1 = 94.8
  2     +   -   -   d2 = 110.96
  3     -   +   -   d3 = 214.12
  4     +   +   -   d4 = 255.82
  5     -   -   +   d5 = 94.14
  6     +   -   +   d6 = 145.92
  7     -   +   +   d7 = 286.71
  8     +   +   +   d8 = 340.52
Main Effects
• Effect of a single variable on the response
• Computation method: find the difference between the average deposition rate when pressure is high and the average rate when pressure is low:

  P = dp+ - dp- = 1/4[(d2 + d4 + d6 + d8) - (d1 + d3 + d5 + d7)] = 40.86

  where P = pressure effect, dp+ = average deposition rate when pressure is high, and dp- = average rate when pressure is low
• Interpretation: the average effect of increasing pressure from its lowest to highest level is to increase the deposition rate by 40.86 Å/min.
• The other main effects, for temperature and flow rate, are computed in a similar manner.
• In general: main effect = y+ - y-
Interaction Effects
• Example: the pressure-by-temperature interaction (P × T).
• This is one-half the difference between the average temperature effects at the two levels of pressure:

  P × T = dPT+ - dPT- = 1/4[(d1 + d4 + d5 + d8) - (d2 + d3 + d6 + d7)] = 6.89

• The P × F and T × F interactions are obtained similarly.
• Interaction of all three factors (P × T × F): the average difference between any two-factor interaction at the high and low levels of the third factor:

  P × T × F = dPTF+ - dPTF- = -5.88
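The contrast calculations above are easy to script. A short NumPy sketch (my own illustration) that reproduces the main effects and interactions from the d1..d8 values:

```python
import numpy as np

# Deposition rates (A/min) from the 2^3 CVD factorial, in standard (Yates) order
d = np.array([94.8, 110.96, 214.12, 255.82, 94.14, 145.92, 286.71, 340.52])

# Coded factor levels (-1/+1) for each run in standard order
P = np.array([-1, 1, -1, 1, -1, 1, -1, 1])
T = np.array([-1, -1, 1, 1, -1, -1, 1, 1])
F = np.array([-1, -1, -1, -1, 1, 1, 1, 1])

def effect(signs, response):
    """Average response at the high level minus average at the low level."""
    return response[signs > 0].mean() - response[signs < 0].mean()

print("P effect:", effect(P, d))                  # ~40.86
print("T effect:", effect(T, d))                  # ~162.84
print("F effect:", effect(F, d))                  # ~47.90
print("PT interaction:", effect(P * T, d))        # ~6.89
print("PTF interaction:", effect(P * T * F, d))   # ~-5.88
```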
Yates Algorithm
• It can be tedious to calculate effects and interactions for factorial experiments using the method described above.
• The Yates algorithm provides a quicker method of computation that is relatively easy to program.
• Although the Yates algorithm is relatively straightforward, modern analysis of statistical experiments is usually done with commercially available statistical software packages.
• A few of the more common packages: RS/1, SAS, and Minitab
Yates Procedure
• The design matrix is arranged in standard order (the 1st column has alternating - and + signs, the 2nd column has successive pairs of - and + signs, the 3rd column has four - signs followed by four + signs, etc.)
• Column y contains the response for each run.
• The 1st four entries in column (1) are obtained by adding the pairs together, and the next four are obtained by subtracting the top number from the bottom number of each pair.
• Column (2) is obtained from column (1) in the same way.
• Column (3) is obtained from column (2).
• To get the effects, divide the column (3) entries by the divisor.
• The 1st element in the identification (ID) column is the grand average of all observations; the remaining identifications are derived by locating the plus signs in the design matrix.
Yates Algorithm Illustration

  P   T   F   y        (1)      (2)      (3)      Div   Eff      ID
  -   -   -   94.8     205.76   675.70   1543.0   8     192.87   Avg
  +   -   -   110.96   469.94   867.29   163.45   4     40.86    P
  -   +   -   214.12   240.06   57.86    651.35   4     162.84   T
  +   +   -   255.82   627.23   105.59   27.57    4     6.89     PT
  -   -   +   94.14    16.16    264.18   191.59   4     47.90    F
  +   -   +   145.92   41.70    387.17   47.73    4     11.93    PF
  -   +   +   286.71   51.78    25.54    122.99   4     30.75    TF
  +   +   +   340.52   53.81    2.03     -23.51   4     -5.88    PTF
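A compact implementation of the Yates algorithm itself (an illustrative sketch, not taken from the slides); it reproduces the contrast column and effects in the table above:

```python
import numpy as np

def yates(response):
    """Yates algorithm for a full 2^k factorial whose runs are in standard order.

    Applies k passes of pairwise sums and differences and returns the final
    column of contrasts (column (3) in the table above for k = 3).
    """
    col = np.asarray(response, dtype=float)
    k = int(np.log2(len(col)))
    for _ in range(k):
        pairs = col.reshape(-1, 2)
        col = np.concatenate([pairs.sum(axis=1), pairs[:, 1] - pairs[:, 0]])
    return col

# CVD deposition rates d1..d8 in standard order
d = [94.8, 110.96, 214.12, 255.82, 94.14, 145.92, 286.71, 340.52]
contrasts = yates(d)
divisors = np.array([8, 4, 4, 4, 4, 4, 4, 4])   # 2^k for the average, 2^(k-1) for effects
effects = contrasts / divisors
labels = ["Avg", "P", "T", "PT", "F", "PF", "TF", "PTF"]
for name, eff in zip(labels, effects):
    print(f"{name:>3}: {eff:8.2f}")
# Avg ~192.87, P ~40.86, T ~162.84, PT ~6.89, F ~47.90, PF ~11.93, TF ~30.75, PTF ~-5.88
```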
Fractional Factorial Designs
• A disadvantage of 2-level factorials is that the number of experimental runs increases exponentially with the number of factors.
• Fractional factorial designs are constructed to eliminate some of the runs needed in a full factorial design.
• For example, a half-fraction design with n factors requires only 2^(n-1) runs.
• The trade-off is that some higher-order effects or interactions may not be estimable.
Fractional Factorial Example
• A 2^(3-1) fractional factorial design for the CVD experiment:

  Run   P   T   F
  1     -   -   +
  2     +   -   -
  3     -   +   -
  4     +   +   +

• The new design is generated by writing the full 2^2 design for P and T, then multiplying those columns to obtain F.
• Drawback: since we used P × T to define F, we can't distinguish between the P × T interaction and the F main effect.
• The two effects are confounded.
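A tiny NumPy sketch (illustrative only) of how the half-fraction is generated by multiplying the P and T columns:

```python
import numpy as np

# Full 2^2 design for P and T (coded -1/+1), runs in standard order
P = np.array([-1, 1, -1, 1])
T = np.array([-1, -1, 1, 1])

# Half-fraction generator: define F as the product of the P and T columns,
# which confounds the F main effect with the P x T interaction
F = P * T

for run, (p, t, f) in enumerate(zip(P, T, F), start=1):
    print(run, p, t, f)
# 1 -1 -1  1
# 2  1 -1 -1
# 3 -1  1 -1
# 4  1  1  1
```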
Outline
• Introduction
• Statistical Process Control
• Statistical Experimental Design
• Yield
Definitions
• Yield: the percentage of devices or circuits that meet a nominal performance specification.
• Yield can be categorized as functional or parametric.
  - Functional yield: also referred to as "hard yield"; characterized by open or short circuits caused by defects (such as particles).
  - Parametric yield: proportion of functional product that fails to meet performance specifications for one or more parameters (such as speed, noise level, or power consumption); also called "soft yield".
Functional Yield
• Y = f(Ac, D0)
• Ac = critical area (the area where a defect has a high probability of causing a fault)
• D0 = defect density (# defects/unit area)
Poisson Model
• Let: C = # of chips on a wafer, M = # of defect types
• C^M = number of unique ways in which M defects can be distributed on C chips
• Example: If there are 3 chips and 3 defect types (such as metal open, metal short, and metal 1 to metal 2 short), then there are

  C^M = 3^3 = 27

  possible ways in which these 3 defects can be distributed over the 3 chips.
Unique Fault Combinations
[Table: enumeration of all 27 unique ways of distributing the three defect types M1, M2, and M3 among chips C1, C2, and C3 (e.g., all three defects on C1; M1 and M2 on C1 with M3 on C2; and so on)]
Poisson Derivation
• If one chip contains no defects, the number of ways to distribute the M defects among the remaining chips is:

  (C - 1)^M

• Thus, the probability that a given chip will have no defects of any type is:

  (C - 1)^M / C^M = (1 - 1/C)^M

• Substituting M = C·Ac·D0, the yield (the fraction of chips with zero defects) is:

  Y = lim (C→∞) (1 - 1/C)^(C·Ac·D0) = exp(-Ac·D0)

• For N chips to have zero defects, this becomes:

  Y = [exp(-Ac·D0)]^N = exp(-N·Ac·D0)
Murphy's Yield Integral
• Murphy proposed that the defect density should not be treated as constant.
• D should be summed over all circuits and substrates using a normalized probability density function f(D).
• The yield can then be calculated using the integral

  Y = ∫ exp(-Ac·D) f(D) dD   (integrated from D = 0 to ∞)

• Various forms of f(D) exist and form the basis for many analytical yield models.
Probability Density Functions
Poisson Model
• The Poisson model assumes f(D) is a delta function:

  f(D) = δ(D - D0)

  where D0 is the average defect density.
• Using this density function, the yield is

  Y = ∫ exp(-Ac·D) δ(D - D0) dD = exp(-Ac·D0)   (integrated from 0 to ∞)
Uniform Density Function
• Murphy initially investigated a uniform density function.
• Evaluating the yield integral for the uniform density function gives:

  Y_uniform = [1 - exp(-2·D0·Ac)] / (2·D0·Ac)
Triangular Density Function
• Murphy later believed that a Gaussian distribution would be a better reflection of the true defect density function.
• He approximated the Gaussian with a triangular function, resulting in the yield expression:

  Y_triangular = {[1 - exp(-D0·Ac)] / (D0·Ac)}²

• The triangular model is widely used in industry today to determine the effect of manufacturing process defect density.
Seeds Model
• Seeds theorized that high yields were caused by a large population of low defect densities and a small proportion of high defect densities.
• He proposed an exponential density function:

  f(D) = (1/D0) exp(-D/D0)

• This implies that the probability of observing a low defect density is higher than that of observing a high defect density.
• Substituting this function into the Murphy integral yields:

  Y_exponential = 1 / (1 + D0·Ac)

• Although the Seeds model is simple, its yield predictions for large-area substrates are too optimistic.
Negative Binomial Model
• Uses the Gamma distribution
• Density function:

  f(D) = [Γ(α) β^α]^(-1) D^(α-1) exp(-D/β)

• Average defect density: D0 = αβ
[Figure: f(D) versus D/D0 for cluster parameter values α = 1, 2, 3]
Negative Binomial (cont.)
• Yield:

  Y_gamma = (1 + Ac·D0/α)^(-α)

• α = "cluster" parameter (must be empirically determined)
  - α high: variability of defects is low (little clustering); the gamma function approaches a delta function; the negative binomial model reduces to the Poisson model
  - α low: variability of defects is significant (much clustering); the gamma model reduces to the Seeds exponential model
• If Ac and D0 are known (or can be measured), the negative binomial model is an excellent general-purpose yield predictor.
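To make the yield formulas concrete, here is a small Python sketch (my own illustration, not from the slides) that evaluates the Poisson, uniform, triangular, Seeds, and negative binomial models for the same Ac·D0 product; the Ac·D0 = 0.5 value is an arbitrary example:

```python
import math

def poisson_yield(ad):            # Y = exp(-Ac*D0)
    return math.exp(-ad)

def murphy_uniform_yield(ad):     # Y = (1 - exp(-2*Ac*D0)) / (2*Ac*D0)
    return (1 - math.exp(-2 * ad)) / (2 * ad)

def murphy_triangular_yield(ad):  # Y = ((1 - exp(-Ac*D0)) / (Ac*D0))^2
    return ((1 - math.exp(-ad)) / ad) ** 2

def seeds_yield(ad):              # Y = 1 / (1 + Ac*D0)
    return 1 / (1 + ad)

def negative_binomial_yield(ad, alpha):  # Y = (1 + Ac*D0/alpha)^(-alpha)
    return (1 + ad / alpha) ** (-alpha)

# Example: critical area times defect density, Ac*D0 = 0.5 (hypothetical value)
ad = 0.5
for name, y in [("Poisson", poisson_yield(ad)),
                ("Uniform", murphy_uniform_yield(ad)),
                ("Triangular", murphy_triangular_yield(ad)),
                ("Seeds", seeds_yield(ad)),
                ("Neg. binomial (alpha=2)", negative_binomial_yield(ad, 2.0))]:
    print(f"{name:24s} {y:.3f}")
```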
Parametric Yield
• Evaluated using "Monte Carlo" simulation:
  - Let all parameters vary at random according to a known distribution (usually normal)
  - Measure the resulting distribution in performance
• Recall:

  IDnsat = (µn·Cox/2)(W/L)(VGS - VTn)²

• Or: IDnsat = f(tox, VTn), since Cox = εox/tox
Input Distributions
• Assume the mean (µ) and standard deviation (σ) are known for tox and VTn.
[Figure: assumed normal input distributions for tox and VTn]
• Calculate IDnsat for each sampled combination of (tox, VTn).
Output Distribution
[Figure: resulting distribution f(x) of IDnsat, with cut points dividing the axis into bad, moderate, good, and best devices]
• The parametric yield for each category is the corresponding area under f(x), i.e., Yield = ∫ f(x) dx evaluated between that category's limits on IDnsat.
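A minimal Monte Carlo sketch of this parametric-yield procedure; every numerical value below (means, standard deviations, mobility, W/L, bias, and the spec window) is a hypothetical placeholder, not taken from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical input distributions: oxide thickness (m) and threshold voltage (V)
EPS_OX = 3.45e-11            # F/m, permittivity of SiO2
N_SAMPLES = 100_000
tox = rng.normal(loc=10e-9, scale=0.3e-9, size=N_SAMPLES)   # placeholder mean/sigma
vtn = rng.normal(loc=0.5, scale=0.02, size=N_SAMPLES)       # placeholder mean/sigma

# Device/bias assumptions (also placeholders): mobility, W/L, gate voltage
mu_n, w_over_l, vgs = 0.04, 10.0, 1.8     # m^2/V-s, ratio, V

cox = EPS_OX / tox                                        # oxide capacitance per area
id_sat = 0.5 * mu_n * cox * w_over_l * (vgs - vtn) ** 2   # saturation current (A)

# Hypothetical spec window on IDnsat: devices inside the window count toward yield
lo_spec, hi_spec = np.percentile(id_sat, [5, 95])         # placeholder spec limits
parametric_yield = np.mean((id_sat >= lo_spec) & (id_sat <= hi_spec))
print(f"Parametric yield: {parametric_yield:.1%}")
```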