Design of Virtual Metrology Models by Machine Learning: A Matlab Prototype

A. Ferreira (1), G. Pages (1), Y. Oussar (2)
(1) Ecole Nat. Sup. des Mines de Saint-Etienne, (2) ESPCI ParisTech

Motivation

We propose a methodology for building Virtual Metrology (VM) models using machine learning techniques. After standard data pre-processing, a variable ranking and selection procedure is applied to determine the most relevant variables for predicting the metrology variables. Different techniques from statistical learning theory are implemented, such as PLS regression and Least Squares Support Vector Machine (LS-SVM) regression. For each technique, the model parameters are estimated using a training algorithm. A k-fold cross-validation (or leave-one-out) procedure is used to select the model that exhibits the best generalization capabilities. Its performance is then estimated using a test dataset.

Matlab Prototype

The EMSE-CMP methodology was implemented in a Matlab prototype dedicated to the design of Virtual Metrology models.

[Figure: Basic Diagram for the Design of VM Models based on Machine Learning Techniques]

Ranking and Selection of Variables

The main scientific contributions of EMSE-CMP are the development of filter and wrapper methods for variable ranking and selection. Some manufacturing processes have a very large number of input variables. The result is complex predictive models with poor generalization capabilities: confidence in a model is higher when it uses a small number of adjusted parameters. In addition, taking irrelevant variables into account introduces noisy data, which leads to overfitting and therefore to poor generalization capabilities. The goal of variable ranking and selection is to determine the smallest subset of variables that carries as much information as possible to explain the dependent variable, while discarding redundant and irrelevant (i.e., poorly informative) variables.
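As an illustration of how irrelevant variables can be discarded, the sketch below ranks variables by their mutual information with the target and keeps those that score above random "probe" variables, the general idea behind a probe-feature filter. The poster does not describe the estimator or the decision rule used in the prototype, so everything here is an assumption made for illustration: the histogram estimator, the bin count, the permutation-based probes, the `risk` threshold, the function names, and the synthetic data. It is written in Python rather than Matlab.

```python
import math
import random
from collections import Counter

def bin_index(values, n_bins):
    """Map each value to an equal-width bin index in [0, n_bins - 1]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # guard against a constant column
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

def mutual_information(x, y, n_bins=8):
    """Plug-in histogram estimate of I(X; Y), in nats."""
    bx, by = bin_index(x, n_bins), bin_index(y, n_bins)
    n = len(x)
    pxy, px, py = Counter(zip(bx, by)), Counter(bx), Counter(by)
    return sum(c / n * math.log(c * n / (px[i] * py[j]))
               for (i, j), c in pxy.items())

def select_with_probe(columns, y, n_probes=100, risk=0.05, seed=0):
    """Keep variables whose MI with y exceeds the (1 - risk) quantile of probe MI.

    Each probe is a random permutation of a real column: it keeps a realistic
    marginal distribution but carries no information about y.
    """
    rng = random.Random(seed)
    names = list(columns)
    probe_scores = []
    for _ in range(n_probes):
        probe = columns[rng.choice(names)][:]
        rng.shuffle(probe)
        probe_scores.append(mutual_information(probe, y))
    probe_scores.sort()
    threshold = probe_scores[min(int((1 - risk) * n_probes), n_probes - 1)]
    scores = {name: mutual_information(col, y) for name, col in columns.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    selected = [name for name in ranked if scores[name] > threshold]
    return selected, scores, threshold

# Synthetic example (not data from the poster): y depends on x1 and x2 only.
rng = random.Random(1)
n = 400
columns = {name: [rng.uniform(-1, 1) for _ in range(n)]
           for name in ("x1", "x2", "noise1", "noise2", "noise3")}
y = [a + 2 * b * b + rng.gauss(0, 0.05)
     for a, b in zip(columns["x1"], columns["x2"])]

selected, scores, threshold = select_with_probe(columns, y)
print("selected:", selected)
print("probe threshold (nats):", round(threshold, 3))
```

Because the threshold is a quantile of the probe scores, an irrelevant variable can still pass with probability of roughly `risk`. Lowering `risk` makes the filter stricter, at the cost of possibly rejecting weakly informative variables.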
EMSE-CMP has two main contributions in variable ranking and selection:

1) Contribution to the filter method: Mutual Information-based Variable Selection using a Probe Feature.
2) Contribution to the wrapper method: a wrapper with a meta-heuristic approach, namely a Tabu search algorithm (TabuWrap).

Filter and Wrapper Approaches

Approach   Pros                                  Cons
Filter     Model free; low computational cost    Fairly irregular; may degrade performance
Wrapper    Consistent; high accuracy;            Computational burden
           improves performance

Case Study

STMicroelectronics Rousset site: prediction of the overlay of the photolithography process. 43 variables out of 169 have been selected.

Conclusion

Two original EMSE-CMP contributions in variable ranking and selection were implemented with the LS-SVM regression method in a Matlab prototype for the design of VM models. The Matlab prototype was validated using real data from two case studies:

• Austriamicrosystems: prediction of PECVD (Plasma Enhanced Chemical Vapor Deposition) oxide thickness for Inter Metal Dielectric (IMD) layers.
• STMicroelectronics Rousset: prediction of the overlay of the photolithography process.
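To make the modelling step concrete, here is a minimal sketch of LS-SVM regression with an RBF kernel, with its two hyperparameters chosen by k-fold cross-validation and the final model scored on a held-out test set. It uses the standard LS-SVM linear system. It is not the prototype's code: the Python language, the kernel choice, the hyperparameter grid, the function names, and the synthetic data are all assumptions made for illustration.

```python
import math
import random

def rbf(a, b, sigma):
    """Gaussian (RBF) kernel between two feature vectors."""
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-d2 / (2.0 * sigma ** 2))

def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def lssvm_fit(X, y, gamma, sigma):
    """Train by solving [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]."""
    n = len(X)
    A = [[0.0] + [1.0] * n]
    for i in range(n):
        row = [1.0] + [rbf(X[i], X[j], sigma) for j in range(n)]
        row[i + 1] += 1.0 / gamma  # regularization on the kernel diagonal
        A.append(row)
    sol = solve(A, [0.0] + list(y))
    return sol[0], sol[1:]  # bias b, dual weights alpha

def lssvm_predict(X_train, b, alpha, sigma, x):
    """Prediction f(x) = b + sum_i alpha_i * k(x_i, x)."""
    return b + sum(a * rbf(xi, x, sigma) for a, xi in zip(alpha, X_train))

def kfold_mse(X, y, gamma, sigma, k=5):
    """Mean squared validation error over k folds (k = len(X) is leave-one-out)."""
    n = len(X)
    sq_err = 0.0
    for fold in range(k):
        val = set(range(fold, n, k))
        tr = [i for i in range(n) if i not in val]
        Xt = [X[i] for i in tr]
        b, alpha = lssvm_fit(Xt, [y[i] for i in tr], gamma, sigma)
        for i in val:
            sq_err += (lssvm_predict(Xt, b, alpha, sigma, X[i]) - y[i]) ** 2
    return sq_err / n

# Synthetic data (not from the case studies): noisy sine, 45 training / 15 test points.
rng = random.Random(0)
X = [[rng.uniform(-3, 3)] for _ in range(60)]
y = [math.sin(x[0]) + rng.gauss(0, 0.1) for x in X]
X_train, y_train, X_test, y_test = X[:45], y[:45], X[45:], y[45:]

# Model selection: pick the (gamma, sigma) pair with the lowest cross-validation error.
grid = [(g, s) for g in (0.1, 10.0, 1000.0) for s in (0.1, 1.0, 10.0)]
best_gamma, best_sigma = min(
    grid, key=lambda p: kfold_mse(X_train, y_train, p[0], p[1]))

# Refit on all training data and estimate performance on the untouched test set.
b, alpha = lssvm_fit(X_train, y_train, best_gamma, best_sigma)
test_mse = sum((lssvm_predict(X_train, b, alpha, best_sigma, x) - t) ** 2
               for x, t in zip(X_test, y_test)) / len(X_test)
print("selected gamma, sigma:", best_gamma, best_sigma)
print("test MSE:", test_mse)
```

The test set plays no part in choosing `gamma` and `sigma`, so the reported error is an estimate of generalization performance and not of fit quality. A wrapper method would call a routine like `kfold_mse` repeatedly on candidate variable subsets, which is where its computational burden comes from.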