SEMICONDUCTOR MANUFACTURING

Oxide Etch Rate Estimation Using Plasma Impedance Monitoring

Daniel Tsunami, Student Member, IEEE

Abstract— The oxide etch rate of plasma etch tools is estimated from plasma impedance monitoring data. Linear statistical modelling and stepwise regression are used to generate predictions of the mean and range of the oxide etch rate. The relationship of the mean etch rate to yield is explored for one processing step. Possibilities for advanced process control of the etch tools are presented.

Index Terms— Etch Rate Estimation, Plasma Impedance Monitoring

I. INTRODUCTION

Plasma impedance monitoring (PIM) is used to monitor reactive ion etching processes in order to characterize their operation. Using PIM to characterize, or "fingerprint", the chamber of a plasma etch tool is a simple approach to ensuring that the tool characteristics are unchanged from a known state. The state of the chamber is evaluated during production as some metric representing the distance between the current state and an ideal state. If this distance becomes too large, the tool is considered out of control and a fault is signaled. The knowledge-based control paradigm extends this fault detection by specifying possible modes of failure when a fault is detected. These models of fault detection usually require intervention to diagnose and correct the error once it has been signaled. Advanced process control (APC) uses real-time tool data as the input to an adaptive controller, which alters tool settings to ensure correct operation.

The etch rate of a plasma etch tool chamber is the most important parameter for the precise control of etch steps. Etch recipes are often designed conservatively when real-time measurements of the etch rate cannot be obtained. This is the case for oxide etches, where the etch step clears the oxide layer, so ideally no material is left from which a thickness measurement could be obtained to estimate the etch rate.
Alternatively, the etch endpoint is actively detected and used to determine the etch time. The standard methods of endpoint detection are often not applicable because of the small ratio of material being etched, which results in negligible changes in plasma chemistry that are difficult to detect. Oxide etch recipes are consequently designed conservatively, overetching by enough of a margin that the variability of the etch rate does not result in incomplete clearing of the oxide and possible opens or high-impedance connections. The etch rate can be measured using special wafers and test procedures at some regular interval to ensure that the etch remains within some broad specifications. This approach provides a reasonable level of control of the etch processes; however, the variability of the etch rate does produce a yield impact for sensitive steps such as local interconnect (LI). Ideally, a real-time estimate of the etch rate could be used to control the tool settings to remove this yield impact. This work investigates the hypothesis that the plasma impedance data collected during the etching of production wafers contains some information about the etch rate of the chamber.

Little has been published on the use of plasma impedance for the prediction or estimation of etch rate. Garvin and Grizzle describe their broadband radio frequency sensing technology, which estimates the etch rate empirically and parametrically using a multiple regression technique and a stepwise approach to subset selection to reduce multicollinearity [2]. The biased coefficient of multiple determination was used to quantify the performance of the regression models. In a five-run experiment in which the fifth run was estimated from the first four, they obtained an $R^2$ of 0.997 and 0.962 for the empirical and parametric regression models, respectively. The broadband sensing gives an estimate of the impedance at a range of frequencies and is captured over a short time interval by actively driving the plasma.
The predictor variables used to generate the etch rate estimates in [2] were the values of the impedance at various frequencies. The plasma impedance data used in this study, by contrast, was collected throughout the entire etch step, and summary statistics rather than raw values were used as possible predictor variables. In addition, the wafers used in [2] were test wafers, not product wafers, and the conditions were those of a controlled experiment, not normal production. Another significant difference is that the etch rates estimated in [2] are for polysilicon, not silicon dioxide.

Other prior works that describe the estimation of etch rate using plasma impedance measurement are less relevant. Ently et al. describe their attempts to maximize the etch rate in order to reduce CFC emissions [1]. Kim et al. used neural networks to predict etch rate from flow rates, applied power and pressure [3].

This paper describes the application of standard linear statistical models to the problem of estimating the oxide etch rate of Lam Research Exlan etch tools using plasma impedance data. The plasma impedance data was collected using Straatum SmartPIM Plasma Impedance Monitoring devices.

II. METHODOLOGY

The plasma impedance data is recorded as several harmonics of voltage, current and phase for two frequencies, 27 MHz and 2 MHz. The frequency of the applied 2 MHz voltage is automatically adjusted between 1.8 MHz and 2.2 MHz to minimize the reflected power. Power is applied only at the fundamental frequencies, but the plasma presents a nonlinear load which generates power at several harmonics of each of the fundamental frequencies. The fundamental and the first four harmonics are recorded for the 2 MHz signal, and the fundamental and the first two harmonics are recorded for the 27 MHz signal. An example of the waveforms produced for several harmonics of voltage and current is shown in Fig. 1.

Fig. 1. Voltage and current waveforms for the 1st through 4th harmonic during a typical oxide etch. (Axes: plasma voltage in V and plasma current in mA versus time in s; traces: lower-frequency 1st to 4th harmonics.)

The main etch step occurs during only a portion of the etch recipe. The voltage, current, and phase waveforms were extracted for the main etch step and for the plasma strike step which immediately precedes it. A total of 660 different summary statistics were calculated from the extracted waveforms to provide a pool of potential predictor variables. The statistics calculated from the waveforms included the integral, mean, variance, median, and interquartile range of the sampled voltage, current, phase, frequency, real power, complex power, impedance magnitude, resistance and reactance. The mean, variance, median, and interquartile range of the derivatives of the sampled signals were also calculated. In addition, the maximum of each extracted signal was included as a predictor variable.

Fig. 2. PDF estimates of the etch rate measurements as obtained from test wafers over the interval from July 2003 to July 2004. (Axes: estimated pdf versus etch rate in $10^{-10}$ m/min; traces: pm1, pm2, pm3, pm4.)

Fig. 3. Sample locations for the etch rate measurements on test wafers. (Panel: etch rate measurements dated 27-Feb-2003 08:08:00, 17 numbered sites; axes: vertical versus horizontal location in mm; color scale: etch rate in $10^{-10}$ m/min.)

Fig. 4. Example SPC chart of the etch rate mean and range from January to August 2004 on a single chamber. (Axes: mean etch rate and etch rate range in $10^{-10}$ m/min versus time in days.)

The etch rates are measured nominally every seven days, but possibly much more often if the tool has undergone maintenance or if an out-of-control condition is known to exist.
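The waveform summary statistics described above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the helper names `summarize`, `derivative` and `waveform_features`, the uniform sample spacing `dt`, the trapezoidal rule for the integral, the first-difference derivative, and the quartile convention of `statistics.quantiles` are all choices made here, not details taken from the study.

```python
import statistics


def summarize(samples, dt=1.0):
    """Summary statistics of one sampled waveform (hypothetical helper).

    samples: sequence of readings taken during the extracted step.
    dt: assumed uniform sample spacing in seconds.
    """
    q1, _, q3 = statistics.quantiles(samples, n=4)
    # Trapezoidal rule, used here as a stand-in for the integral.
    integral = dt * sum((a + b) / 2.0 for a, b in zip(samples, samples[1:]))
    return {
        "integral": integral,
        "mean": statistics.fmean(samples),
        "variance": statistics.variance(samples),
        "median": statistics.median(samples),
        "iqr": q3 - q1,
        "max": max(samples),
    }


def derivative(samples, dt=1.0):
    """First differences divided by dt, a simple derivative estimate."""
    return [(b - a) / dt for a, b in zip(samples, samples[1:])]


def waveform_features(name, samples, dt=1.0):
    """Flatten signal and derivative statistics into named predictors."""
    feats = {f"{name}_{k}": v for k, v in summarize(samples, dt).items()}
    d_stats = summarize(derivative(samples, dt), dt)
    # Only these four statistics are listed for the derivatives.
    for k in ("mean", "variance", "median", "iqr"):
        feats[f"{name}_d_{k}"] = d_stats[k]
    return feats
```

Calling `waveform_features` once per recorded signal and step, and merging the resulting dictionaries, would give one row of candidate predictors per wafer.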
The estimated probability density functions of the 1st to 95th percentile range of the etch rate measurements, taken over a one-year interval from July 2003 to July 2004 in 4 separate chambers of an etch tool, are shown in Fig. 2. Traditionally the test wafer is sampled in 17 places and the mean and range are plotted in a statistical process control (SPC) chart. The sample locations and an example of the SPC chart used are shown in Figs. 3 and 4. The specifications for the mean and range are also shown in Fig. 4. Excursions outside the specification limits have been observed to be indicative of chamber conditions that produce poor-yielding wafers. Excursions of the range do not in all cases indicate that adverse conditions exist. Consequently, only the results concerning the mean etch rate will be emphasized.

The mean over the seventeen sample locations of each etch rate measurement was used to supervise the regression modelling. The measurements were mapped to PIM data obtained from production wafers that had been processed shortly after the measurements were taken. The interval after the etch rate measurement in which production PIM data was taken was adjusted according to the sparsity of the data for different tool/chamber/recipe/device combinations to enable adequate estimation of the model parameters. The valid data interval was either shrunk or expanded until 120 samples were selected. Each PIM data sample was then mapped to the etch rate measurement that it most closely followed.

Fig. 5. PM1 etch rate mean and range predictions, $R^2 = 0.96, 0.93$ for the mean and range respectively. (Traces: measurements and predictions; horizontal axis: days since Jan 1 2004.)

Fig. 6. Relationship between etch rate estimates and yield for PM1. A nonparametric Nadaraya-Watson kernel estimate of the conditional mean, $\hat{E}(yld \mid ER)$, of the yield given the etch rate is plotted along with the 95% confidence interval of the variance of the kernel estimator. The bias is known to be relatively small for points of uniform sample density inside the domain. (Panel: 21609 PM1 LIG12LOW; traces: data points, kernel regression, 95% confidence intervals; axes: yield versus estimated etch rate.)

The normal assumptions were used and the standard linear statistical model shown in (1) was assumed.

$$er = X\beta + \epsilon \tag{1}$$

That is, $er$ is the $n \times 1$ vector of etch rate measurements, $X$ is the $n \times p$ matrix of predictor variables, $\beta$ is a $p \times 1$ vector of linear coefficients to be determined, and $\epsilon$ is an $n \times 1$ vector of independent error terms that are normally distributed with zero mean and constant variance $\sigma^2$.

Forward stepwise regression was applied because the amount of data was small relative to the 660 potential predictor variables. Variables were added one at a time, selecting at each stage the potential predictor with the largest F statistic, provided that it also exceeded the acceptance threshold of the 95th percentile of the F distribution with the degrees of freedom corresponding to that stage. The F statistic can be expressed as

$$F = \frac{RSS_0 - RSS_1}{RSS_1 / (n - p_1)} \tag{2}$$

where $RSS_0$ is the residual sum of squares of the original model without the additional variable, $RSS_1$ is the residual sum of squares of the model with the additional variable, $p_1$ is the number of variables present in the larger model, and $n$ is the number of samples. Under the normal assumptions the statistic (2) can be shown to have an F distribution with 1 and $n - p_1$ degrees of freedom.

A variable was dropped if it exhibited a T statistic smaller than the 95th percentile of the T distribution with the degrees of freedom corresponding to that stage. The T statistic of the $j$th coefficient is given as

$$T = \frac{\hat{\beta}_j}{\hat{\sigma}\sqrt{v_j}} \tag{3}$$

where $\hat{\beta}_j$ is the estimate of the $j$th coefficient, $\hat{\sigma}$ is the square root of the error variance estimate given in (4), and $v_j$ is the $j$th diagonal element of $(X^T X)^{-1}$.
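The forward selection step built on the F statistic in (2) can be sketched in Python as follows. This is a simplified illustration, not the implementation used in the study: the function names `rss` and `forward_stepwise` are hypothetical, an intercept column is assumed to be always present and is counted in $p_1$, the threshold `f_crit` is passed in as a single fixed number rather than recomputed at each stage, and the T-statistic drop step is omitted.

```python
import numpy as np


def rss(X, y):
    """Residual sum of squares of the least-squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)


def forward_stepwise(X, y, f_crit, max_vars=25):
    """Greedy forward selection using the partial F statistic.

    X: (n, m) array of candidate predictors; y: (n,) etch rates.
    f_crit: acceptance threshold, assumed to be supplied by the caller.
    Returns the selected column indices in order of entry.
    """
    n, m = X.shape
    design = np.ones((n, 1))  # intercept only to start (assumption)
    rss0 = rss(design, y)
    selected = []
    while len(selected) < max_vars:
        p1 = design.shape[1] + 1  # columns in the larger model
        if n - p1 <= 0:
            break  # no residual degrees of freedom left
        best_f, best_j, best_rss = -np.inf, None, None
        for j in range(m):
            if j in selected:
                continue
            rss1 = rss(np.column_stack([design, X[:, j]]), y)
            # Guard against a perfect fit before dividing.
            f = np.inf if rss1 <= 0.0 else (rss0 - rss1) / (rss1 / (n - p1))
            if f > best_f:
                best_f, best_j, best_rss = f, j, rss1
        if best_j is None or best_f < f_crit:
            break  # no remaining candidate passes the threshold
        selected.append(best_j)
        design = np.column_stack([design, X[:, best_j]])
        rss0 = best_rss
    return selected
```

If SciPy is available, a stage-specific threshold could be obtained with `scipy.stats.f.ppf(0.95, 1, n - p1)` in place of the fixed `f_crit` used in this sketch.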
Under the hypothesis $\beta_j = 0$, the statistic (3) has a T distribution with $n - p$ degrees of freedom. The error variance may be estimated with (4), where $\widehat{er}_k$ is the $k$th estimate of the etch rate found using the estimated coefficients of the assumed model (1) and $er_k$ is the $k$th true etch rate.

$$\hat{\sigma}^2 = \frac{1}{n - p} \sum_{k=1}^{n} \left( er_k - \widehat{er}_k \right)^2 \tag{4}$$

The model coefficients are found using the QR decomposition and back substitution to avoid numerical instabilities that may be encountered while inverting matrices. The QR decomposition of $X$ is a factorization of the form $X = QR$. Here, $Q$ is an $n \times p$ matrix with orthonormal columns ($Q^T Q = I$), and $R$ is a $p \times p$ upper triangular matrix. The coefficients are thus given in (5).

$$\hat{\beta} = R^{-1} Q^T er \tag{5}$$

Notice that the matrix $R$ in (5) need not be inverted explicitly: because it is upper triangular, the system $R\hat{\beta} = Q^T er$ may be solved by back substitution.

III. RESULTS

Linear regression models were created for each tool/chamber/step combination. The data from the first 180 days of 2004 was used to train the model. The forward stepwise regression algorithm was used to select the 25 predictor variables with the largest F statistics. The models created for 3 chambers of one tool are presented. This tool is used to etch LI, and this particular step is known to be sensitive to etch rate. The model predictions for the mean and range of one of the three chambers are shown in Fig. 5. The vertical axes in Fig. 5 were adjusted to exclude some extreme predictions so that the majority of the predictions could be seen. The relationship between the estimated etch rate and the yield of devices processed by this chamber and step is shown in Fig. 6. The hypothesized sensitivity of the yield to this particular etch step may be observed.
The plot of the killer defect density of the same etch step/chamber combination is shown in Fig. 7. The definition of killer defect density and its relationship to yield is shown in (6), where $A$ is the die area.

$$DD = -\ln(Yield)/A \tag{6}$$

This provides a normalization of yield and enables the accurate comparison of yields across multiple devices.

Fig. 7. Relationship between etch rate estimates and defect density for PM1. (Panel: 21609 PM1 LIG12LOW; traces: data points, kernel regression, 95% confidence intervals; axes: defect density versus estimated etch rate.)

Fig. 8. Adjusted $R^2$ and cross validation PRESS statistics vs the interval of data before or after the etch rate measurement used to estimate model coefficients. The variables were chosen using the stepwise procedure previously described, and the adjusted $R^2$, $R_a^2$, was calculated for the coefficient estimates produced from each data interval using the same set of predictor variables. (Horizontal axis: data interval from etch rate measurement.)

Fig. 9. Possible APC controller to optimize the LI etch step. (Block diagram: etch tool, production data, PIM, test wafer data, test wafer etch rate measurements, predictors, measured etch rates, adaptive linear model, estimated etch rates, APC controller, etch time.)

IV. DISCUSSION

Figs. 6 and 7 indicate that the etch step may be optimized to produce higher-yielding devices. Ideally the curves on these plots would be flat; that is, the yield or defect density would be independent of the etch rate at this step. A possible design for an APC controller that could be used to achieve this ideal is shown in Fig. 9. The controller would use an adaptive model that updates the coefficient estimates periodically using test wafer measurements and production data; it is only slightly different from the static model presented here.
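One way the periodic coefficient update could look is sketched below in Python. This is purely illustrative and not a design taken from this work: the class name `WindowedLinearModel`, the sliding-window refit strategy, the `window` size, and the always-present intercept term are all assumptions made for the example.

```python
from collections import deque

import numpy as np


class WindowedLinearModel:
    """Linear etch-rate model refit on the most recent calibration pairs.

    `window` (number of retained pairs) is a hypothetical tuning parameter.
    """

    def __init__(self, window=120):
        self.pairs = deque(maxlen=window)  # oldest pairs fall out automatically
        self.beta = None

    @staticmethod
    def _row(predictors):
        # Prepend a constant 1.0 so the model carries an intercept (assumption).
        return np.concatenate(([1.0], np.asarray(predictors, dtype=float)))

    def update(self, predictors, measured_rate):
        """Store one (PIM predictors, measured etch rate) pair and refit."""
        self.pairs.append((self._row(predictors), float(measured_rate)))
        X = np.array([row for row, _ in self.pairs])
        y = np.array([rate for _, rate in self.pairs])
        self.beta, *_ = np.linalg.lstsq(X, y, rcond=None)

    def predict(self, predictors):
        """Estimate the etch rate for one production wafer."""
        return float(self._row(predictors) @ self.beta)
```

Because old pairs are discarded, the coefficients follow a slowly changing relationship between the predictors and the etch rate, at the cost of fitting on fewer samples than a static model would use.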
The model assumes that the relationship between the PIM variables and the etch rate remains unchanged over time. This assumption may be inaccurate. The etch rate itself is known to change over time as the chamber conditions drift. This is displayed in Fig. 8, in which the adjusted $R^2$ is plotted against the interval over which data was collected to be paired with a given etch rate measurement. The formula for the adjusted $R^2$, $R_a^2$, is given in (7).

$$R_a^2 = 1 - \frac{n - 1}{n - p} \, \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i)^2} \tag{7}$$

The nature of the relationship between the PIM data and the etch rate is not known. If this relationship changes with time, however, an adaptive model may produce results superior to those of the static model presented here. Basing SPC charts on the production wafer etch rate predictions, as in Fig. 5, rather than on test wafers could also reduce the need for test wafers and the corresponding costs.

V. SUMMARY

A linear statistical model was presented for estimating the etch rate of plasma etch tools from PIM data during normal production. The model specifics were presented along with an example of the relationship of one etch step to yield. This technique allows the etch rate to be estimated during normal production without the use of test wafers and test procedures. The high-frequency behavior of the etch rate may also be studied, along with the relationship between the etch rate and the ultimate metric in semiconductor manufacturing, yield. The yield example shows that the step chosen exhibits a relatively strong relationship with yield and that yield optimization is possible if the etch rate is controlled more precisely. The same technique could also be used to identify steps without any relationship to yield, and the etch rate control on these steps could be relaxed to increase production and tool uptime. Finally, etch rate predictions could be used to signal when the tool is drifting out of specification and preventative maintenance is required.
These areas will be the focus of further research.

REFERENCES

[1] William Ently, William Hennessy, and John G. Langan. C2F6/O2 and C3F8/O2 plasmas: SiO2 etch rates, impedance analysis, and discharge emissions. Electrochemical and Solid-State Letters, 3(2):99–102, February 2000.

[2] Craig Garvin and J. W. Grizzle. Demonstration of broadband radio frequency sensing: Empirical polysilicon etch rate estimation in a Lam 9400 etch tool. Journal of Vacuum Science and Technology, Part A: Vacuum, Surfaces and Films, 18(4):1297–1302, 2000.

[3] Byungwhan Kim, Kwang-Ho Kwon, Sung-Ku Kwon, Jong-Moon Park, Seong Wook Yoo, Kun-Sik Park, In-Kyu You, and Bo-Woo Kim. Modeling etch rate and uniformity of oxide via etching in a CHF3/CF4 plasma using neural networks. Thin Solid Films, 426(1-2):8–15, February 2003.