Oxide Etch Rate Estimation Using Plasma Impedance Monitoring

SEMICONDUCTOR MANUFACTURING
Daniel Tsunami, Student Member, IEEE
Abstract— The oxide etch rate of plasma etch tools is estimated
from plasma impedance monitoring data. Linear statistical modelling and stepwise regression are used to generate predictions of
the mean and range of the oxide etch rate. The relationship of the mean
etch rate to yield is explored for one processing step. The potential
for advanced process control of the etch tools is discussed.
Index Terms— Etch Rate Estimation, Plasma Impedance Monitoring
I. INTRODUCTION
Plasma impedance monitoring (PIM) is used for the
monitoring of reactive ion etching processes in order to
characterize their operation. The use of PIM to characterize
or “fingerprint” the chamber of a plasma etch tool is a
simple approach to ensuring that the tool characteristics are
unchanged from a known state. The state of the chamber is
evaluated during production as some metric representing the
distance between the current state and an ideal state. If this
distance becomes too large the tool is considered out of control
and a fault is signaled. The knowledge-based control paradigm
extends this fault detection by specifying possible modes of
failure when a fault is detected. These models of fault detection
usually require intervention to diagnose and correct the error
once it has been signaled.
Advanced process control (APC) uses real-time tool data
as the input to an adaptive controller which alters
tool settings to ensure correct operation. The etch rate of a
plasma etch tool chamber is the most important parameter
for the precise control of etch steps. Etch recipes are often
designed conservatively when real-time measurements of the
etch rate cannot be obtained. This is the case for oxide etches,
where the etch step clears the oxide layer, so that ideally no
material is left from which a thickness measurement could be
taken to estimate the etch rate. Alternatively, the etch
endpoint is actively detected and used to determine the etch
time. The standard methods of endpoint detection are often
not applicable due to the small proportion of material being etched,
which results in negligible changes in plasma chemistry that
are difficult to detect.
Oxide etch recipes are consequently designed conservatively
to overetch by enough of a margin that the variability of the
etch rate does not result in incomplete clearing of the oxide
and possible opens or high impedance connections. The etch
rate can be measured using special wafers and test procedures
at some regular interval to ensure that the etch remains
within some broad specifications. This approach provides a
reasonable level of control of the etch processes; however,
the variability of the etch rate does produce a yield impact
for sensitive steps such as local interconnect (LI). Ideally, a
real-time estimate of the etch rate could be used to control the
tool settings to remove this yield impact.
The hypothesis that the plasma impedance data collected
during the etching of production wafers contains some information about the etch rate of the chamber was investigated.
Little has been published on the use of plasma impedance
for the prediction or estimation of etch rate. Garvin and Grizzle
describe a broadband radio frequency sensing technology
used to estimate the etch rate both empirically and parametrically, with
multiple regression and a stepwise approach to
subset selection to reduce multicollinearity [2]. The biased
coefficient of multiple determination was used to quantify the
performance of the regression models. In a five-run experiment in
which the fifth run was estimated from the first four, they obtained
$R^2$ values of 0.997 and 0.962 for the empirical and parametric
regression models respectively. The broadband sensing gives
an estimate of the impedance over a range of frequencies and
is captured over a short time interval by actively driving the
plasma. The predictor variables used to generate the etch
rate estimates were the values of the impedance at various
frequencies. In contrast, the plasma impedance data used in this study
was collected throughout the entire etch step, and summary
statistics rather than raw values were used as candidate predictor
variables. In addition, the wafers used in [2] were test wafers rather than
product wafers, and the conditions were those of a controlled experiment rather than normal production. Another significant difference
is that the etch rates estimated in [2] are for polysilicon, not
silicon dioxide.
Other prior works that describe the estimation of etch
rate using plasma impedance measurement are less relevant.
Ently et al. describe their attempts to maximize the etch rate
in order to reduce CFC emissions [1]. Kim et al. used neural
networks to predict etch rate from flow rates, applied power,
and pressure [3].
This paper describes the application of standard linear statistical
models to the problem of estimating the oxide etch rate of
Lam Research Exlan etch tools using plasma impedance data.
The plasma impedance data was collected using Straatum
SmartPIM Plasma Impedance Monitoring devices.
II. METHODOLOGY
The plasma impedance data is recorded as several harmonics
of voltage, current and phase for two frequencies, 27 MHz
and 2 MHz. The frequency of the nominal 2 MHz supply is automatically
adjusted between 1.8 MHz and 2.2 MHz to minimize the
reflected power. The power is only applied at the fundamental
frequencies, but the plasma presents a nonlinear load which
generates power at several harmonics of each of the fundamental frequencies. The fundamental and the first four harmonics
are recorded for the 2 MHz signal and the fundamental and
the first two harmonics are recorded for the 27 MHz signal.
An example of the waveforms produced for several harmonics
of voltage and current is shown in Fig. 1.
Fig. 1. Voltage and current waveforms for the 1st through 4th harmonic
during a typical oxide etch. [Axes: plasma voltage (V) and plasma current (mA) versus time (s); legend: lower 1st to lower 4th.]
Fig. 2. PDF estimates of the etch rate measurements as obtained from test
wafers over the interval from July 2003 to July 2004. [Axes: estimated pdf versus etch rate (10^-10 m/min); legend: pm1 to pm4.]
Fig. 3. Sample locations for the etch rate measurements on test wafers. [Axes: vertical location (mm) versus horizontal location (mm), with 17 numbered sites and an etch rate scale (10^-10 m/min); panel title: Etch Rate measurements Date 27-Feb-2003 08:08:00.]
Fig. 4. Example SPC chart of the etch rate mean and range from January
to August 2004 on a single chamber. [Axes: mean etch rate and etch rate range (10^-10 m/min) versus time (days).]
The main etch step occurs during only a portion of the
etch recipe. The voltage, current, and phase waveforms were
extracted for the main etch step and the plasma strike step
which immediately precedes it. A total of 660 summary statistics were calculated from the extracted waveforms to provide
a pool of potential predictor variables. The statistics
calculated from the waveforms included the integral,
mean, variance, median, and interquartile range of the sampled
voltage, current, phase, frequency, real power, complex power,
impedance magnitude, resistance and reactance. The mean,
variance, median, and interquartile range of the derivatives of the
sampled signals were also calculated. In addition, the maximum of each extracted signal was included as a predictor
variable.
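As an illustration of how such a pool of predictors might be built, the following Python sketch computes the listed statistics for a single sampled signal. The function name `summarize`, the uniform sample spacing `dt`, the trapezoidal integral and the first-difference derivative are assumptions made for this example; the text does not state how the integral or the derivatives were evaluated.

```python
import statistics


def summarize(signal, dt):
    """Return summary statistics for one sampled signal.

    signal: list of samples (for example one voltage harmonic) from one step
    dt: sample spacing in seconds (assumed uniform)
    """
    def spread(values):
        # Quartiles use the default method of statistics.quantiles.
        q1, _, q3 = statistics.quantiles(values, n=4)
        return {
            "mean": statistics.fmean(values),
            "variance": statistics.variance(values),  # sample variance
            "median": statistics.median(values),
            "iqr": q3 - q1,
        }

    # Trapezoidal approximation of the time integral of the signal.
    integral = sum((a + b) * dt / 2.0 for a, b in zip(signal, signal[1:]))
    # First differences approximate the time derivative.
    derivative = [(b - a) / dt for a, b in zip(signal, signal[1:])]

    stats = {"integral": integral, "max": max(signal)}
    stats.update(spread(signal))
    stats.update({"d_" + key: val for key, val in spread(derivative).items()})
    return stats


# Hypothetical five-sample ramp, used only to exercise the function.
voltage = [0.0, 1.0, 2.0, 3.0, 4.0]
features = summarize(voltage, dt=1.0)
```

Applying a function of this kind to every recorded signal in both the plasma strike step and the main etch step would yield one row of candidate predictors per production wafer.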
The etch rates are measured nominally every seven days,
but much more often if the tool has undergone
maintenance or an out-of-control condition is known to
exist. The estimated probability density functions of the 1st
to 95th percentile range of the etch rate measurements taken
over a one-year interval from July 2003 to July 2004 in four
separate chambers of an etch tool are shown in Fig. 2. Traditionally,
the test wafer is sampled in 17 places and the mean and
range are plotted on a statistical process control (SPC) chart.
The sample locations and an example of the SPC chart
used are shown in Figs. 3 and 4. The specifications for the
mean and range are also shown in Fig. 4. Excursions outside
the specification limits have been observed to be indicative
of chamber conditions that produce poor yielding wafers.
Excursions of the range do not in all cases indicate that adverse
conditions exist. Consequently, only the results concerning the
mean etch rate will be emphasized.
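The mean and range plotted on the SPC chart can be sketched as follows. The function name `spc_point`, the site values and the specification limits in the example are hypothetical placeholders; they are not the limits shown in Fig. 4.

```python
def spc_point(site_rates, mean_limits, range_limit):
    """Reduce one test wafer to its SPC chart point.

    site_rates: etch rates at the sampled sites (17 in the text)
    mean_limits: (lower, upper) specification on the mean (hypothetical)
    range_limit: upper specification on the range (hypothetical)
    """
    mean_rate = sum(site_rates) / len(site_rates)
    rate_range = max(site_rates) - min(site_rates)
    in_spec = (mean_limits[0] <= mean_rate <= mean_limits[1]
               and rate_range <= range_limit)
    return mean_rate, rate_range, in_spec


# Seventeen made-up site readings in units of 10^-10 m/min.
sites = [6600 + 10 * i for i in range(17)]
mean_rate, rate_range, in_spec = spc_point(sites, (6300, 7000), 300)
```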
The mean over the seventeen sample locations of each
etch rate measurement was used to supervise the regression
modelling. The measurements were mapped to PIM data
obtained from production wafers that had been processed
shortly after the measurements were taken. The interval after
the etch rate measurement from which production PIM data was
taken was adjusted according to the sparsity of the data for
different tool/chamber/recipe/device combinations to enable
adequate estimation of the model parameters. The valid data
interval was either shrunk or expanded until 120 samples were
selected. The PIM data samples were then mapped to the etch
rate measurement which they most closely followed.
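One way this pairing could be carried out is sketched below. It assumes that the 120 samples are counted in total across all measurements of a given combination, and that times are expressed in days; neither detail is stated in the text. The function name `select_training_pairs` is hypothetical.

```python
from bisect import bisect_right


def select_training_pairs(measure_times, wafer_times, n_samples=120):
    """Pair production wafers with the etch rate measurement they follow.

    measure_times: sorted times of the test wafer etch rate measurements
    wafer_times: times at which production wafers were processed
    Returns (pairs, interval). pairs is a list of
    (wafer_index, measurement_index); interval is the longest lag kept.
    """
    candidates = []
    for w_idx, t in enumerate(wafer_times):
        # Index of the latest measurement taken at or before time t.
        m_idx = bisect_right(measure_times, t) - 1
        if m_idx >= 0:
            candidates.append((t - measure_times[m_idx], w_idx, m_idx))
    # Keeping the n_samples smallest lags is equivalent to widening or
    # narrowing a common interval until n_samples wafers fall inside it.
    candidates.sort()
    kept = candidates[:n_samples]
    interval = kept[-1][0] if kept else 0.0
    pairs = [(w_idx, m_idx) for _, w_idx, m_idx in kept]
    return pairs, interval


# Toy example: weekly measurements and seven production wafers.
measurements = [0.0, 7.0, 14.0]
wafers = [0.5, 1.0, 3.0, 7.2, 8.0, 13.0, 14.1]
pairs, interval = select_training_pairs(measurements, wafers, n_samples=4)
```

In the toy example the interval settles at one day, and wafers processed later than that after a measurement are left out of the training set.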
Fig. 5. PM1 etch rate mean and range predictions, $R^2$ = 0.96 and 0.93 for the
mean and range respectively. [Panels: measurements and predictions versus days since Jan 1 2004.]
Fig. 6. Relationship between etch rate estimates and yield for PM1. A
nonparametric Nadaraya-Watson kernel estimate of the conditional mean
of the yield given the etch rate, $\hat{E}(yld \mid ER)$, is plotted along with the 95%
confidence interval of the variance of the kernel estimator. The bias is
known to be relatively small for points of uniform sample density inside the
domain. [Panel title: 21609 PM1 LIG12LOW; axes: yield versus estimated etch rate; legend: data points, kernel regression, 95% confidence intervals.]
The normal assumptions were used and the form of the
standard linear statistical model shown in (1) was assumed.
$$ er = X\beta + \epsilon \tag{1} $$
That is, $er$ is the $n \times 1$ vector of etch rate measurements, $X$
is the $n \times p$ matrix of predictor variables, $\beta$ is a $p \times 1$ vector
of linear coefficients to be determined and $\epsilon$ is an $n \times 1$
vector of independent error terms that are normally distributed
with zero mean and constant variance $\sigma^2$.
Forward stepwise regression was applied because the amount of data was small relative to the 660 potential
predictor variables. Variables were added one at a time, selecting
at each stage the potential predictor with the largest F statistic
that also exceeded the acceptance threshold of the 95th
percentile of the F distribution with the degrees of freedom
corresponding to that stage. The F statistic can be expressed as
$$ F = \frac{RSS_0 - RSS_1}{RSS_1/(n - p_1)} \tag{2} $$
where $RSS_0$ is the residual sum of squares of the original
model without the additional variable, $RSS_1$ is the residual
sum of squares of the model with the additional variable, $p_1$
is the number of variables present in the larger model, and $n$
is the number of samples. Under the normal assumptions the
statistic (2) can be shown to have an F distribution with 1 and
$n - p_1$ degrees of freedom. Variables were dropped if they
exhibited a T statistic smaller than the 95th percentile of the
T distribution with the degrees of freedom corresponding to
that stage. The T statistic of the jth coefficient is given as
$$ T = \frac{\hat{\beta}_j}{\hat{\sigma}\sqrt{v_j}} \tag{3} $$
where $\hat{\beta}_j$ is the estimate of the jth coefficient, $\hat{\sigma}$ is the square root of the estimate of the
error variance shown in (4) and $v_j$ is the jth diagonal element
of $(X^T X)^{-1}$. Under the hypothesis $\beta_j = 0$, (3) has a T
distribution with $n - p$ degrees of freedom. The error variance
may be estimated with (4), where $\widehat{er}_k$ is the kth estimate of the
etch rate found using the estimated coefficients of the assumed
model (1) and $er_k$ is the kth true etch rate.
$$ \hat{\sigma}^2 = \frac{1}{n-p} \sum_{k=1}^{n} \left( er_k - \widehat{er}_k \right)^2 \tag{4} $$
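The forward selection step based on the F statistic (2) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the study: the predictors and the etch rates are assumed to be mean-centered so that no intercept is fitted, the critical value is supplied by a caller-provided function `f_threshold` (a hypothetical stand-in for a lookup of the 95th percentile of the F distribution), and the T statistic drop step is omitted. Candidate columns are kept orthogonal to the selected ones, so the reduction in residual sum of squares from adding a column can be computed without refitting the model.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def forward_select(candidates, er, f_threshold, max_vars):
    """Greedy forward selection using the partial F statistic.

    candidates: dict mapping a predictor name to its list of n values
    er: list of n etch rate measurements
    f_threshold: function df -> critical value of F(1, df) (hypothetical)
    max_vars: maximum number of predictors to select
    Predictors and er are assumed mean-centered (no intercept column).
    """
    n = len(er)
    resid = list(er)  # residual of the current model
    pool = {name: list(col) for name, col in candidates.items()}
    selected = []
    while pool and len(selected) < max_vars:
        rss0 = dot(resid, resid)
        p1 = len(selected) + 1  # variables in the larger model
        best_name, best_f = None, 0.0
        for name, col in pool.items():
            norm2 = dot(col, col)
            if norm2 < 1e-12:
                continue  # column already explained by the model
            # Pool columns are orthogonal to the selected ones, so adding
            # one lowers the RSS by (col . resid)^2 / (col . col).
            rss1 = rss0 - dot(col, resid) ** 2 / norm2
            if rss1 > 0:
                f_stat = (rss0 - rss1) / (rss1 / (n - p1))
            else:
                f_stat = float("inf")
            if f_stat > best_f:
                best_name, best_f = name, f_stat
        if best_name is None or best_f < f_threshold(n - p1):
            break
        q = pool.pop(best_name)
        qq = dot(q, q)
        # Remove the new direction from the residual and from the pool.
        coef = dot(q, resid) / qq
        resid = [r - coef * v for r, v in zip(resid, q)]
        pool = {name: [x - (dot(q, col) / qq) * v for x, v in zip(col, q)]
                for name, col in pool.items()}
        selected.append((best_name, best_f))
    return selected


# Toy data: er depends on x1 only; x2 is an unrelated alternating signal.
x1 = [-2.5, -1.5, -0.5, 0.5, 1.5, 2.5]
x2 = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
er = [-4.9, -3.1, -1.1, 1.1, 3.1, 4.9]
# The constant 7.0 is an illustrative cutoff, not a table lookup.
chosen = forward_select({"x1": x1, "x2": x2}, er, lambda df: 7.0, max_vars=2)
```

In the toy example only `x1` passes the cutoff; once it is in the model, `x2` no longer reduces the residual sum of squares enough to be accepted.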
The model coefficients are found using the QR decomposition
and back substitution to avoid numerical instabilities
that may be encountered while inverting matrices. The QR
decomposition of $X$ is a factorization of the form $X = QR$.
Here, $Q$ is an $n \times p$ matrix with orthonormal columns ($Q^T Q = I$), and $R$ is
a $p \times p$ upper triangular matrix. The coefficients are thus given
in (5).
$$ \hat{\beta} = R^{-1} Q^T er \tag{5} $$
Notice that we need not invert the matrix $R$ in (5) because it is
upper triangular, so the system $R\hat{\beta} = Q^T er$ may be solved by back substitution.
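A small Python sketch of this solution path follows. The text does not say how the factorization itself was computed; modified Gram-Schmidt is used here purely as an assumption to keep the example self-contained, and a library routine based on Householder reflections would normally be preferred. The function name `qr_least_squares` is hypothetical, and $X$ is assumed to have full column rank.

```python
def qr_least_squares(X, y):
    """Least squares coefficients via thin QR and back substitution.

    X: list of n rows, each a list of p values (full column rank assumed)
    y: list of n responses
    """
    n, p = len(X), len(X[0])
    cols = [[X[i][j] for i in range(n)] for j in range(p)]
    Q = []                              # orthonormal columns of Q
    R = [[0.0] * p for _ in range(p)]   # upper triangular factor
    for j in range(p):
        v = cols[j][:]
        # Modified Gram-Schmidt: project out each earlier direction in turn.
        for i, q in enumerate(Q):
            R[i][j] = sum(a * b for a, b in zip(q, v))
            v = [a - R[i][j] * b for a, b in zip(v, q)]
        R[j][j] = sum(a * a for a in v) ** 0.5
        Q.append([a / R[j][j] for a in v])
    # Right-hand side z = Q^T y.
    z = [sum(a * b for a, b in zip(q, y)) for q in Q]
    # Back substitution on R beta = z, from the last row upwards.
    beta = [0.0] * p
    for j in reversed(range(p)):
        s = z[j] - sum(R[j][k] * beta[k] for k in range(j + 1, p))
        beta[j] = s / R[j][j]
    return beta


# Toy check: y = 1 + 2 x, with an explicit column of ones for the intercept.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 3.0, 5.0, 7.0]
beta = qr_least_squares(X, y)
```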
III. RESULTS
Linear regression models were created for each
tool/chamber/step combination. The data from the first
180 days of 2004 was used to train the model. The forward
stepwise regression algorithm was used to select the 25
predictor variables with the largest F statistics. The models
created for three chambers of one tool are presented. This tool
is used to etch LI and this particular step is known to be
sensitive to etch rate. The model predictions for the mean
and uniformity (range) of one of the three chambers are shown in Fig.
5. The vertical axes on Fig. 5 were adjusted to exclude some
extreme predictions so that the majority of the predictions
could be seen. The relationship between the estimated etch rate and the yield of devices
processed by this chamber and step is shown in Fig. 6. The
hypothesized sensitivity of the yield to this particular etch
step may be observed. The plot of the killer defect density of
the same etch step/chamber combination is shown in Fig. 7.
Fig. 7. Relationship between etch rate estimates and defect density for PM1. [Panel title: 21609 PM1 LIG12LOW; axes: defect density versus estimated etch rate; legend: data points, kernel regression, 95% confidence intervals.]
Fig. 8. Adjusted $R^2$ and cross validation PRESS statistics vs the interval of
data before or after the etch rate measurement used to estimate model coefficients. The variables were chosen using the stepwise procedure previously
described and the adjusted $R^2$, $R_a^2$, was calculated for the coefficient estimates
produced from each data interval using the same set of predictor variables. [Panels: adjusted $R^2$ and PRESS versus data interval from etch rate measurement.]
The definition of killer defect density and its relationship to
yield is shown in (6), where $A$ is the die area.
$$ DD = -\ln(Yield)/A \tag{6} $$
This provides a normalization of yield and enables the accurate
comparison of the yields across multiple devices.
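The normalization can be illustrated with a short Python sketch. The function names and the numerical values are hypothetical; the die areas are in arbitrary consistent units and do not correspond to any device in the study.

```python
import math


def defect_density(yield_fraction, die_area):
    """Killer defect density DD = -ln(Yield) / A."""
    return -math.log(yield_fraction) / die_area


def yield_from_defect_density(dd, die_area):
    """Inverse relation: Yield = exp(-DD * A)."""
    return math.exp(-dd * die_area)


# Two hypothetical devices with the same defect density but different
# die areas show different yields; the defect density removes the area effect.
dd_small = defect_density(0.8, 1.0)
yield_large = yield_from_defect_density(dd_small, 2.0)
dd_large = defect_density(yield_large, 2.0)
```

Here the larger die yields 0.64 rather than 0.8, yet both devices map back to the same defect density, which is what allows yields to be compared across devices.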
IV. DISCUSSION
Figs. 6 and 7 indicate that the etch step may be
optimized to produce higher yielding devices. Ideally we
would like the curves on these plots to be flat; that is, we
want the yield or defect density to be independent of the etch rate at this
step. A possible design for an APC controller that could be used
to achieve this ideal is shown in Fig. 9. The controller would
use an adaptive model that updates the coefficient estimates
periodically using test wafer measurements and production
data, and is only slightly different from the static model presented
here.
Fig. 9. Possible APC controller to optimize the LI etch step. [Block diagram labels: etch tool, PIM production data, test wafer data, test wafer etch rate measurements, predictors, measured etch rates, adaptive linear model, estimated etch rates, APC controller, etch time.]
The model assumes that the relationship between the PIM
variables and the etch rate remains unchanged over time. This
assumption may be inaccurate. We know that the etch rate
itself changes over time as the chamber conditions drift. This
fact is displayed in Fig. 8 in which the adjusted R2 is plotted
against the interval over which data was collected to be paired
with a given etch rate measurement. The formula for the
adjusted $R^2$, $R_a^2$, is given in (7).
$$ R_a^2 = 1 - \frac{n-1}{n-p} \, \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i)^2} \tag{7} $$
The nature of the relationship between the PIM data and etch
rate is not known; however, if this relationship changes with
time, an adaptive model may produce superior results to the
static model presented here.
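For reference, the adjusted $R^2$ statistic can be computed as in the Python sketch below. The sketch uses the conventional form in which the total sum of squares is taken about the mean of the responses; this is an assumption, and it agrees with a denominator of plain squared responses only when the responses have been mean-centered. The function name `adjusted_r2` and the data are hypothetical.

```python
def adjusted_r2(y, y_hat, p):
    """Adjusted R^2 for n observations and p fitted coefficients.

    y: observed responses
    y_hat: model predictions for the same observations
    The total sum of squares is taken about the mean of y (assumption).
    """
    n = len(y)
    y_bar = sum(y) / n
    rss = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    tss = sum((a - y_bar) ** 2 for a in y)
    return 1.0 - (n - 1) / (n - p) * rss / tss


# Made-up observations and predictions, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.1, 1.9, 3.0, 4.1, 4.9]
r2_adj = adjusted_r2(observed, predicted, p=2)
```

The factor $(n-1)/(n-p)$ penalizes models with more coefficients, which matters when comparing fits obtained from data intervals that contain different numbers of samples.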
The use of SPC charts based on the production wafer etch rate
predictions, as in Fig. 5, rather than on test wafers could also reduce
the need for test wafers and the corresponding costs.
V. SUMMARY
A linear statistical model was presented for estimating the
etch rate of plasma etch tools from PIM data during normal
production. The model specifics were presented along with an
example of the relationship of one etch step to yield. This
technique allows the estimation of etch rate without the use
of test wafers and procedures during normal production. The
high frequency behavior of the etch rate may also be studied
along with the relationship that the etch rate has with the
ultimate metric in semiconductor manufacturing, yield. The
yield example shows that the step chosen exhibits a relatively
strong relationship with yield and that a yield optimization
is possible if the etch rate is controlled more precisely. The
same technique could also be used to identify steps without
any relationship to yield, and the etch rate control on these
steps could be relaxed to increase production and tool uptime.
Finally, etch rate predictions could be used to signal when
the tool is drifting out of specification and preventative
maintenance is required. These areas will be the focus of
further research.
REFERENCES
[1] William Ently, William Hennessy, and John G. Langan. C2F6/O2 and
C3F8/O2 plasmas SiO2 etch rates, impedance analysis, and discharge
emissions. Electrochemical and Solid-State Letters, 3(2):99–102, February 2000.
[2] Craig Garvin and J. W. Grizzle. Demonstration of broadband radio
frequency sensing: Empirical polysilicon etch rate estimation in a Lam
9400 etch tool. Journal of Vacuum Science and Technology, Part A:
Vacuum, Surfaces and Films, 18(4):1297–1302, 2000.
[3] Byungwhan Kim, Kwang-Ho Kwon, Sung-Ku Kwon, Jong-Moon Park,
Seong Wook Yoo, Kun-Sik Park, In-Kyu You, and Bo-Woo Kim. Modeling etch rate and uniformity of oxide via etching in a CHF3/CF4 plasma
using neural networks. Thin Solid Films, 426(1-2):8–15, February 2003.