Learning Theory Applications to Product Design Modeling

by

JUAN C. DENIZ

B.S., Mechanical Engineering
Massachusetts Institute of Technology, 1998

Submitted to the Department of Mechanical Engineering in Partial Fulfillment of the Requirements for the Degree of
MASTER OF SCIENCE IN MECHANICAL ENGINEERING
at the
MASSACHUSETTS INSTITUTE OF TECHNOLOGY
June 2000

© 1999 Massachusetts Institute of Technology, All Rights Reserved

Signature of Author: Department of Mechanical Engineering

Certified by: David Wallace, Esther and Harold E. Edgerton Associate Professor of Mechanical Engineering, Thesis Supervisor

Accepted by: Ain A. Sonin, Chairman, Department Committee on Graduate Students

Learning Theory Applications to Product Design Modeling

by JUAN C. DENIZ

Submitted to the Department of Mechanical Engineering on May 5, 2000 in Partial Fulfillment of the Requirements for the Degree of Master of Science in Mechanical Engineering

ABSTRACT

Integrated product development increasingly relies upon simulations to predict design characteristics. However, several problems arise in integrated modeling. First, the computational requirements of many sub-system models can prohibit system-level optimization. Further, when several models are chained in a distributed environment, overall simulation robustness hinges upon the reliability of each of the individual simulations and the communications infrastructure. A solution to these problems might lie in the application of fast surrogate models, which can emulate slow systems or mirror unreliable simulations. Also, in many cases sub-system behavior is described through empirical data only. In such cases, surrogate modeling techniques may be applicable as well.

This thesis explores the use of Artificial Neural Network (ANN) algorithms in integrated design simulation. In summary, the thesis focuses on generating a quick-to-set-up, easy-to-use, and reliable neural network tool for use within DOME (Distributed Object-based Modeling Environment). The work was divided into four areas: understanding the system to be emulated, generating a good exemplar set from the system, designing the neural network, and finally demonstrating the usefulness of the application through case studies. The ANN object is validated on a variety of test problems, from a surrogate LCA to aiding dynamic Monte Carlo simulation. Finally, the ANN is applied to a design application within a Ford Motor Company door model, where the neural object trained an ANN to emulate an FEA model of a car door-seal sub-system.

Thesis Supervisor: David Wallace
Title: Esther and Harold E. Edgerton Associate Professor of Mechanical Engineering

ACKNOWLEDGEMENTS

I appreciate the support of the Ford Motor Company for partial research funding. I give special thanks to the Graduate Engineering for Minorities (GEM) fellowship for partial support of my graduate stipend.
Thanks to my family of the CAD Laboratory, especially Professor David Wallace for his guidance, Nick Borland for his computer skills, Bill Liteplo, Priscilla Wang, and Stephen Smyth for the eventful thesis all-nighters, and to Chris Tan, Jeff Lyons, Julie Eisenhard, and Ines Sousa for using and debugging my work. Without this working family, graduate school would not have been as joyful.

I want to give special thanks to my parents (all four of them); without their support, understanding, and advice I would not have reached or traveled so far in my life.

I would like to dedicate this thesis to Liliana, whom I will always carry in my memories.

TABLE OF CONTENTS

Learning Theory Applications to Product Design Modeling
Abstract
Acknowledgements
Table of Contents
List of Figures
List of Tables
1 Introduction
2 Background
2.1 A Learning Theory Approach: The Artificial Neural Network
2.2 Training the Neural Model
2.3 A Quick Comparison with Other Techniques
2.4 Applications of ANN
2.5 Feed Forward Neural Network Tradeoffs
2.6 Distributed Object-based Modeling Environment (DOME)
2.7 Surrogate Modeling
2.8 Basic Network Design
3 Research Scope
3.1 Describing Systems
3.2 Case Studies: Design applications
4 Model Complexity Assessment
4.1 System Function Decomposition
4.2 Metrics: a quick assessment of the input/output relationship
4.3 Assessment of System Test
4.4 Possible Alternate Methods
5 Set of Exemplar Generation
5.1 Collecting the Optimal Set of Exemplars
5.1.1 Listening for changes: a parasite analogy
5.1.2 Evenly distributed Generation
5.1.3 Random Generation
5.2 Ranked Function Data Generation Method
5.3 Comparing Data Collection Methods, and Training the Network
5.3.1 Testing the Ranked Method Data Collection Scheme
5.3.2 Training the Hilly Function using DOME's Neural Network Module
6 Creating and Benchmarking the Neural Module
6.1 The Designer's Neural Network
6.2 Benchmarking against the MATLAB Neural Network Toolbox
7 Neural Applications Within DOME
7.1 Life Cycle Assessment Surrogate Model
7.2 Dynamic Monte Carlo Tool (DMCT)
7.3 Finite Element Analysis emulation
8 Conclusions and Future Work
8.1 Summary
8.2 Future Work
Appendix A - Collected Neural Network Data
References

LIST OF FIGURES

Figure 2-1 A simple Feed-Forward 2 Layer Neural Network
Figure 2-2 The Neural Network Module within DOME
Figure 2-3 Surrogate Modeling Concept
Figure 3-1 Thesis' Work Flow Chart
Figure 3-2 2-Dimensional Functions
Figure 3-3 3D Ramp function
Figure 3-4 The Unimodal function
Figure 3-5 Himmelbau's function
Figure 3-6 The Hilly Function
Figure 4-1 System Decomposition
Figure 4-2 Linear Regression Assessment Concept
Figure 5-1 Manual Data Generation
Figure 5-2 Even Distribution of Training Samples
Figure 5-3 Random Data Generation
Figure 5-4 Ranked Input Data Generation
Figure 5-5 Data Generation Scheme Network Training Comparison
Figure 5-6 Training the Hilly with 144 points
Figure 5-7 Training the Hilly with 625 exemplars
Figure 5-8 A Surrogate Hilly Function
Figure 5-9 Mapping the Tallest Peak of the Hilly
Figure 6-1 Slope Shifts vs. Hidden Neurons
Figure 6-2 The Hilly, the MATLAB, and the ANN
Figure 6-3 A MATLAB Radial Basis Function Hilly Approach
Figure 7-1 Comparison of Life-Cycle energy consumption
Figure 7-2 Addition of two Uniform Distributions within DOME
Figure 7-3 A door-seal Marc Model
Figure 7-4 Comparison between ANN-Generated plot and FEA plot

LIST OF TABLES

Table 4-1 2-Dimensional Function Complexity Test results
Table 4-2 3-Dimensional Function Complexity Analysis results
Table 5-1 Comparison Study for all Data Collection Schemes for the Ramp function
Table 5-2 Dividing and Conquering the Hilly
Table 6-1 Training the 3-D systems
Table 6-2 MATLAB Neural Network Toolbox vs. ANN Module Data
Table Appendix 1 Data for the Set of Exemplar Generation chapter
Table Appendix 2 Data for the Set of Exemplar Generation chapter
Table Appendix 3 Training Comparison Data of four Exemplar Collection Schemes
Table Appendix 4 Hilly Function Progress Training
Table Appendix 5 Dynamic Monte Carlo Simulation Training session
Table Appendix 6 Hilly Training Data
Table Appendix 7 Hidden Neurons vs. Training Error Data
1 INTRODUCTION

Internet-based computer-aided design environments are gaining more and more recognition for their importance in product development practice. Their potential to reduce product cycle time makes them extremely desirable. In general, these environments impact the product development process in several ways. By improving communication between design participants there is potential for more rapid product design iteration, which in turn leads to an improvement in the quality of the product (Pahng and Wallace, 1998). An example of such an environment is the DOME project (Abrahamson et al., 2000), which allows distributed and heterogeneous engineering models to be linked to form integrated system simulations.

These integrated design environments predict the performance of very complex products using many individual sub-models. Unfortunately, it is common that many of the detailed engineering simulations involve time-expensive computation. For example, a product like the seal element of an automobile movable glass system (MGS), or door glass system, involves well over 20 simulations, including several FEA (Finite Element Analysis) models. The combination of several of these models within an integrated system simulation may render the exploration of many tradeoffs or optimization impractical. This is especially problematic from a system optimization viewpoint. Consequently, techniques are needed for rapid approximate simulations that replace detailed simulations when appropriate; this is referred to as surrogate modeling.

There are many approaches for emulating systems, some of which include: Linear Regression (LR), Response Surface Methodology (Osborne and Armacost, 1997; Meghabghab and Nasr, 1999), and Learning Theory techniques like Neural Networks and Support Vector Machines (Vapnik, 1993). In this work Neural Networks are explored, as they have the potential to be a very flexible approach for creating surrogate design models.

This research describes the development of a generic Artificial Neural Network (ANN) object for use within the DOME (Distributed Object-based Modeling Environment) project (Abrahamson et al., 2000). The work attempts to establish an automated method for evaluating the complexity of a model and designing a neural network accordingly. The range of models and applications in product design varies widely, and thus it is important that the artificial network be flexible and robust. With the use of surrogate models, it is hoped that an integrated system model will be able to perform rapidly and robustly. For example, running a genetic optimization of an integrated system model can be slow and time consuming in the presence of many detailed sub-system models (Deniz, 1998). Previous studies have shown that surrogate neural networks dramatically decrease the time needed to optimize the model of a complicated system.

This thesis will also explore the use of ANN algorithms for other design applications, including: system and data analysis, function approximation, and time series prediction. With an integrated modeling environment like DOME, the possibilities for neural network application are broad. Ultimately, the goal of the work is to develop a software object that makes the application of artificial neural networks accessible to product designers with little experience in learning theory.
2 BACKGROUND

2.1 A Learning Theory Approach: The Artificial Neural Network

Learning Theory refers to the study of mathematical algorithms that can, metaphorically speaking, be trained to assist with a numerical task. The theory has gained relevance in the field of engineering with revolutionary techniques like Artificial Neural Networks and Support Vector Machines. These methods, based on the simple notion of learning from examples, may be applied to many useful problems, including: classification, time series prediction, optimization, association, system control, and computation solving (Bose, 1996; Masters, 1993; Cichocki, 1993).

In general, Artificial Neural Networks (ANN) are simplifications of biological neuron networks. The way the brain processes information inspires these networks: they emulate basic properties of biological nervous systems in an attempt to replicate the brain's ability to learn adaptively from experience. An ANN is a mathematical model for information processing with a high learning potential (Bandy, 1998). The neural model contains many parameters that control its behavior. Some of these include: activation functions, connection weights, neuron biases, and the total number of neurons in its architecture. Some of these parameters can be seen in figure 2-1 below. Consequently, the values of these parameters directly influence the performance of neural nets.

Figure 2-1 A simple Feed-Forward 2 Layer Neural Network. The architecture displays its inputs (x1, x2, ..., xi), outputs (y1, ..., yk), activation functions (Φ), weights (w_ij, w_jk), and connections. Each hidden-layer neuron computes u_j = Φ( Σ_i x_i·w_ij ), and each output-layer neuron computes y_k = Φ( Σ_j u_j·w_jk ).

2.2 Training the Neural Model

The tremendous capability of acquiring knowledge through a learning process makes neural nets attractive for computation and nonlinear system problems (Cichocki and Unbehauen, 1993). Generally, a learning algorithm trains the artificial network to perform a desired task. The training of an ANN can be either supervised or unsupervised, depending on the application. Supervised learning requires a period of training in which the ANN is given a set of inputs and is taught the respective outputs through an external error-feedback algorithm. In contrast, a neural network using unsupervised learning trains while it is being executed, by establishing similarities and regularities in the input data sequence (Hung and Jan). Thus, the longer it runs, the more experience it gains and the better it should perform.

Neural networks may be trained using several learning techniques. This work uses two supervised training algorithms: back propagation (BP) and the radial basis function (RBF). For reasons of flexibility, the integrated modeling environment application requires a robust neural model capable of approximating multiple-input multiple-output (MIMO) systems. The most widely used network for system approximation is the Back Propagation Network (BPN); more research has been performed on the BPN than on any other neural model.
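To make the supervised training procedure concrete, the following is a minimal sketch (in Python with NumPy, not the DOME implementation) of a forward pass and a single backpropagation weight update for a two-layer feed-forward network like the one in figure 2-1. The layer sizes, sigmoid activation, learning rate, and omission of bias terms are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)

# Illustrative sizes: 2 inputs, 15 hidden neurons, 1 output (biases omitted for brevity).
n_in, n_hid, n_out = 2, 15, 1
W_ih = rng.normal(scale=0.5, size=(n_in, n_hid))   # input-to-hidden weights w_ij
W_ho = rng.normal(scale=0.5, size=(n_hid, n_out))  # hidden-to-output weights w_jk
eta = 0.002                                        # training rate

def forward(x):
    """Forward pass: u_j = f(sum_i x_i*w_ij), y_k = f(sum_j u_j*w_jk)."""
    u = sigmoid(x @ W_ih)
    y = sigmoid(u @ W_ho)
    return u, y

def train_step(x, target):
    """One backpropagation (gradient-descent) update on a single exemplar."""
    global W_ih, W_ho
    u, y = forward(x)
    # Output-layer error signal (sigmoid derivative is y*(1-y)).
    delta_out = (y - target) * y * (1.0 - y)
    # Hidden-layer error signal propagated back through the output weights.
    delta_hid = (delta_out @ W_ho.T) * u * (1.0 - u)
    # Weight updates.
    W_ho -= eta * np.outer(u, delta_out)
    W_ih -= eta * np.outer(x, delta_hid)
    return 0.5 * float(np.sum((y - target) ** 2))   # squared error for monitoring

# Example: one update on an exemplar of a 2-input, 1-output system.
err = train_step(np.array([0.3, 0.7]), np.array([0.5]))
```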
The Radial Basis Function is also a good training candidate for several reasons. The RBF network is designed for function approximation. It is significantly faster than the BPN (Jacob, 1998), and the sigmoid function used in multilayer feed-forward networks trained with back propagation does not match the approximation capabilities of RBF networks (Bose and Liang, 1996).

2.3 A Quick Comparison with Other Techniques

Neural Networks are not the only effective technique for surrogate modeling. Other popularly applied theories include Response Surface Methodology (RSM) and Support Vector Machines (SVM). SVM theory is relatively new, unlike RSM, which was developed almost fifty years ago. Both theories have been implemented successfully and are used in many engineering applications. However, neural models have become increasingly popular over the past several years. Compared to SVM, the ANN research body is expansive: documentation is readily available and implementations are straightforward. Neural networks have been applied successfully in many areas, while SVM are still in a development phase and have only been applied thoroughly to a few applications, such as classification. On the other hand, even though RSM has been applied successfully many times, most of its applications are limited to parametric techniques (Meghabghab and Nasr, 1999). Therefore, the neural network's robustness, wide applicability, and nonparametric character make it an ideal candidate for our general surrogate modeling needs.

Neural models have an advantage in that they are less sensitive to errors or distortions in the input they receive. Due to their non-linear design, many researchers favor them for solving problems that are too complex for other techniques to handle. For example, they are used for problems for which algorithmic solutions do not exist (Bose and Liang, 1996).

2.4 Applications of ANN

In the CAD Laboratory at MIT, the potential of using neural networks in application to genetic optimization (Deniz, 1998) has been demonstrated. In this previous work, a neural network was trained to emulate a simple 3D model outlined in Pro-Engineer, a CAD modeling system. Each time the 3D model was called, it took 11 seconds to yield results. In a genetic optimization this model is called at least 8,000 times, making an optimization last almost 30 hours. However, once the neural net had been trained (3 hours), it emulated the system with a modest degree of inaccuracy (15%). With the ANN acting as the 3D model, the GA optimization time was reduced from 29 hours to 12 minutes. The results encouraged the development of a more reliable, robust, and easy-to-use artificial network.

In addition to surrogate modeling, the neural module design is flexible enough to be used for several other applications within the web-based modeling environment. These applications may include: classification, data fitting, data and system analysis, function approximation, and time series prediction. For example, one possible system analysis application is using a neural model to analyze a system's behavior: the neural net gains insight into the relationship between the inputs and the outputs of the system. This is very important in areas like data mining.

2.5 Feed Forward Neural Network Tradeoffs

Neural models are not the best technique for every problem. Neural networks may be ineffective for certain numerical and symbolic manipulations.
For example, when one of the inputs of the network is a discrete value like a word, this value has to be indexed into a number in order for the network to interpret it. In many cases this is undesirable, since by indexing this input it may lose the actual influence it has on the system, or for that matter gain an unnecessary one. Another disadvantage of this network architecture is that there is no fast and reliable training algorithm that will guarantee convergence to a global minimum (Masters, 1993). However, neural networks are very flexible techniques, which may develop intuitive concepts where the nature of the computations required in a task is not well understood (Bose and Liang, 1996). For example, a poorly understood system may be better modeled with a neural net than with a rule-based approach.

Another issue presents itself when the numerical accuracy of the system is crucial. A neural model will have limited accuracy; however, a properly trained neural network will learn the system's trends and emulate them well. Accuracy depends greatly on the training and architecture of the network. Many factors influence the inaccuracy of the neural model, including: too little training time, an incorrect setup of the network, or an inadequate training set of exemplars. In the case of surrogate modeling, accuracy is traded off against the model's execution speed. Therefore, it is up to the modeler to prioritize between speed and accuracy.

2.6 Distributed Object-based Modeling Environment (DOME)

DOME is a web-based design-modeling framework that allows the integration of models. Within this environment, these models may be treated as systems with input and output variables. The system enables collaborative and concurrent product development through the integration of existing and user-designed software programs (Senin et al., 1997). Some examples of existing software include Excel, MATLAB, Pro Engineer, Solid Works, IDEAS, and TEAM (a life-cycle assessment program). The integration of software provides a common environment for all participants involved in the same project. This permits a fast propagation of information to all disciplines in the product model. The software also allows the evaluation and optimization of the design according to selected factors and parameters (Senin et al., 1997).

Figure 2-2 The Neural Network Module within DOME. (The figure summarizes DOME as a web-based design-modeling framework that enables collaborative and concurrent product development, eases information propagation, and allows design evaluation and optimization.)

DOME allows designers to integrate many mathematical models through its modular capability (Senin et al., 1997). The package decomposes the design problem into different sections called modules, which are interrelated according to their dependence upon variables from other modules (Senin et al., 1997). The package will only move as fast as its slowest module. Some modules require extensive computation in third-party applications, which can make them very slow. As a result, these modules slow the whole process. The basis of this thesis lies in finding optimal Artificial Neural Networks that can be trained to emulate slow modules effectively and, as a result, speed up the process.
2.7 Surrogate Modeling

It is important to stress that a computationally expensive model slows down the flow of information within a web-based integrated environment. In a way, these slow models are analogous to bottlenecks in manufacturing processes. The substitution of these bottleneck models with alternate or surrogate models is referred to as surrogate modeling. The concept is illustrated in figure 2-3. The idea is that once a slow model is identified, a neural network monitors the input/output activity of the intricate model/system. Every time the bottleneck executes, the network stores the inputs and outputs in a database. When an adequate number of input/output points has been collected, the neural network trains with the points in the database and then eventually replaces the slow system.

Figure 2-3 Surrogate Modeling Concept

Surrogate modeling implies that the neural network module has other applications as well. These may include linear regression, classification, time series prediction, and so on. The neural network will train as long as the data is reliable and coherent.

2.8 Basic Network Design

When designing a neural network, several issues must be considered. Part of the architecture of the network depends on the number of inputs and outputs of the system to be emulated. The input signal is an independent variable and the output is a dependent variable. In general, designing a network with exactly one output neuron proves to be favorable, even though a single network may be designed to predict more than one variable (Bandy, 1998). Therefore, emphasis must be placed on what is to be predicted and what will be used to make the prediction.

Before using the data for training, the input variables must be preprocessed; a great deal of effort in the design of a neural model is devoted to this problem (Bandy, 1998). The input variables should be presented in dimensionless terms, if possible. A neural network rarely processes raw data (Bandy, 1998). To avoid feeding it raw data, scaling, averaging, and normalizing are used to smooth the input for better training results. The preprocessing of the data increases the usefulness of the information.

Generally, two to three layers are used for a feed-forward network. The standard neuron ratio for a 3-layer network is I : 2·√I : 1 for the input, hidden, and output layers, respectively, where I is the number of input neurons. However, intricate simulations require many more neurons in the hidden layer. The art of an optimal design is using just the right number of neurons; the time to train the neural net will vary exponentially with the number of hidden neurons.

The neural model trains with a set of history data called exemplars. A good approach is to partition the data as follows: 60% for training the model, 20% for testing the model periodically during training, and 20% for verifying the model with out-of-sample data (Bandy, 1998). In the training session the network will learn the relationship between the input (independent) and output (dependent) variables. If after training the network fails to predict the sample data, then further adjustments to the net need to be made.
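The guidelines above can be summarized in a short sketch: normalizing raw exemplars, partitioning them 60/20/20, and sizing the hidden layer by the I : 2·√I : 1 rule. The function names and the use of NumPy are illustrative assumptions and do not reflect the DOME module's actual interface.

```python
import numpy as np

def size_hidden_layer(n_inputs):
    """Rule of thumb from above: roughly 2 * sqrt(I) hidden neurons for I input neurons."""
    return max(1, int(round(2 * np.sqrt(n_inputs))))

def normalize(data):
    """Scale each column of raw exemplars to [0, 1] so the net never sees raw data."""
    lo, hi = data.min(axis=0), data.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)
    return (data - lo) / span

def split_exemplars(exemplars, rng=np.random.default_rng(0)):
    """Partition exemplars 60% / 20% / 20% for training, periodic testing during
    training, and out-of-sample verification, as suggested by Bandy (1998)."""
    idx = rng.permutation(len(exemplars))
    n_train = int(0.6 * len(exemplars))
    n_test = int(0.2 * len(exemplars))
    return (exemplars[idx[:n_train]],
            exemplars[idx[n_train:n_train + n_test]],
            exemplars[idx[n_train + n_test:]])

# Example: 100 exemplars of a 2-input / 1-output system (3 columns of raw data).
raw = np.random.default_rng(1).uniform(0, 10, size=(100, 3))
train, test, verify = split_exemplars(normalize(raw))
n_hidden = size_hidden_layer(n_inputs=2)   # about 3 for a 2-input system
```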
3 RESEARCH SCOPE

The scope of this work is to provide a neural network tool within the DOME framework. With an integrated modeling environment like DOME, many possibilities for neural network application arise. However, creating a neural net able to satisfy all of these potential applications successfully may be a challenge. This thesis intends to introduce a robust and solid paradigm for designing neural networks for distinct models and applications within a web-based integrated modeling environment.

There are many issues to be considered. The scope of the work can be decomposed into three phases. First, the complexity of the model needs to be determined. The next issue lies in finding an optimal set of exemplars from the model; these in turn will be useful for designing and training the best neural network possible. Tradeoffs between net speed and the accuracy of the model can be established at the later stages.

Figure 3-1 Thesis' Work Flow Chart

Figure 3-1 illustrates how the different sections of the thesis relate to each other. The work begins with analyzing the system to be emulated, then a good training data set is generated, and finally the neural network is trained using the acquired information.

As pointed out previously, the neural network tool aims at aiding the design process not just through system emulation; it may also be used for quick assessment of system complexity and for exemplar generation. The tool's interface is designed as a user-friendly environment, so that expertise in neural networks is not required to operate it.

3.1 Describing Systems

The following section describes some of the functions that will be used throughout the next chapters for experimentation and testing purposes. Note that these functions will be used within DOME as systems in order to simulate a specific behavior, from very simple (e.g., a line) to intricate. The simplicity of these systems will serve to illustrate the simple function complexity assessment. Most of the three-dimensional functions were taken from genetic algorithm testing (Gruninger, 1996). Also note that in the later chapters, real problem applications will be discussed as case studies.

Two-Dimensional Functions

Figure 3-2 illustrates different types of single-input functions: a line, a cubic, a sinusoid, and a 5th-degree function. By examining these functions it is possible to test each one of them and compare their results. As we can see from figure 3-2, it is expected that the sinusoidal function will give a higher complexity value, followed by the cubic, and finally the line. These functions are described by equations (3.1), (3.2), (3.3), and (3.4):

y = 1.6·x − 0.8   (3.1)
y = 1000·x^3   (3.2)
y = sin(2π·x)   (3.3)
y = x^5   (3.4)

Figure 3-2 2-Dimensional Functions. These include a line (3.1), a cubic function (3.2), a sinusoidal function (3.3), and a 5th-degree function (3.4).

A 3D Ramp Function

This function is defined over the [0; 10] x [0; 10] space and is described by equation (3.5):

y(x1, x2) = 50·x1 + x2^3   (3.5)

Figure 3-3 3D Ramp function.

A Unimodal Function

This test function, as shown in figure 3-4, is defined over [-10; 10] x [-10; 10] and has its global maximum at (0, 0). In this three-dimensional function we can see that x1 and x2 are symmetrical. Thus we would expect the x1/y and x2/y relations to have the same variance or complexity metric score. Equation (3.6) is used to generate this test function:

y(x1, x2) = (600 − (x1 + x2)^2) / (60 + x1^2 + x2^2)   (3.6)

Figure 3-4 The Unimodal function.
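For concreteness, the simpler test systems above can be written out directly. The sketch below follows equations (3.1) through (3.5) as reconstructed here; in particular, the exact form of the ramp (a line in x1 plus a cubic in x2) is an assumption consistent with its decomposition in figure 4-1 and with its range over the [0; 10] x [0; 10] space.

```python
import numpy as np

# 2-D (single-input / single-output) test functions, equations (3.1)-(3.4).
def line(x):         return 1.6 * x - 0.8
def cubic(x):        return 1000.0 * x ** 3
def sinusoid(x):     return np.sin(2.0 * np.pi * x)
def fifth_degree(x): return x ** 5

# 3-D ramp over [0, 10] x [0, 10], equation (3.5) as assumed here:
# a line in x1 plus a cubic in x2, matching its decomposition in figure 4-1.
def ramp(x1, x2):
    return 50.0 * x1 + x2 ** 3

# Probing the ramp on an even 10 x 10 grid, as the evenly distributed
# collection scheme of Section 5.1.2 would do.
grid = np.linspace(0.0, 10.0, 10)
X1, X2 = np.meshgrid(grid, grid)
samples = ramp(X1, X2)          # peaks at ramp(10, 10) = 1500
```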
Himmelbau's Function

This function is shown below in figure 3-5. Graphed over the [-5; 5] x [-5; 5] space, it contains four maxima, which are unevenly placed, providing an unsymmetrical shape unlike the unimodal function (figure 3-4) above. Even though the figure is not symmetrical, judging from the positions of the maxima we would expect similar variance for both the x1/y and x2/y relationships. Equation (3.7) defines the function:

y(x1, x2) = 5 − ((x1^2 + x2 − 11)^2 + (x1 + x2^2 − 7)^2) / 200   (3.7)

Figure 3-5 Himmelbau's function.

The Hilly Function

This function was primarily designed for genetic optimization purposes (Gruninger, 1996). However, due to its great complexity it makes a good subject for our neural network. We know from previous work (Deniz, 1998) that this function is very hard to emulate; in that work, however, the system was trained using evenly distributed data generation, and we would like to see whether the net can now be trained using the ranked function method developed in this thesis. The function is defined over a large space of [-100; 100] x [-100; 100]. It contains 36 maxima non-uniformly distributed across the system's space. It is described by equation (3.8):

y(x1, x2) = 10·[ e^(−|x1|/50)·(1 − cos(6π·|x1|^(3/4)/100^(3/4)))
              + e^(−|x2|/250)·(1 − cos(6π·|x2|^(3/4)/100^(3/4)))
              + 2·e^(−((b−x1)^2 + (b−x2)^2)/50) ],  with b = ((5/6)·100^(3/4))^(4/3)   (3.8)

Figure 3-6 The Hilly Function

3.2 Case Studies: Design applications

This section briefly discusses the design applications used later to test the ANN. These case studies will serve to demonstrate the flexibility of the neural module to perform seamlessly in different applications within the DOME framework. The neural module will be used for function regression, system emulation, and data analysis problems.

The first case study deals with the use of the neural network module as a surrogate for a Life Cycle Assessment (LCA) model. LCA systematically assesses the environmental impact of a product throughout its entire life (Sousa et al., 1996). However, this assessment may take a long time to accomplish, and can be very difficult to perform as well. With a neural network as a surrogate LCA, real-time approximate LCA estimates are feasible (Sousa et al., 1999).

Within the DOME framework there are many tools for facilitating and easing interaction with models. These tools range from system optimization, 3rd-party software integration, and system structure assessment to system confidence tools. They are designed to work within a collaborative framework that allows swift interaction with the information in system models. However, some of these tools may be computationally expensive, requiring minutes to hours of execution, which makes them hard to interact with. In general this amount of time is acceptable; however, if the modeler would like to iterate through the model more quickly, even at a slight cost in accuracy, a neural network could be trained to emulate such a tool once it is set up. As a second case study, the neural network will be trained to emulate one of these tools.

Finally, the third case deals with the problems that arise when running a genetic optimization. In an optimization, a model is generally called thousands of times.
Some of these models are bottlenecks in the overall optimization process, which in many cases causes the optimization to take many hours to finish. For this case study, we will use a neural network to emulate an FEA model of an automotive door seal, in order to reduce the execution time of a genetic optimization.

4 MODEL COMPLEXITY ASSESSMENT

In this work, studying the system model is a very important first step in the neural network development. With a design environment like DOME we can quickly query the system to get an insight into how complex or hard it will be to emulate. This information will be used to determine how much effort is necessary to generate a good set of exemplars, as well as how big the architecture of the neural network should be. In general, a very intricate model needs a large number of neurons; the complexity of the system to be assessed relates directly to the number of neurons in the neural network. If the model turns out to be too complex, several neural networks may be used to emulate it instead of one.

Model complexity in this work refers to the level of difficulty the neural network will have in training properly for the system. The complexity is generally influenced by the presence of certain characteristics in the model. These may include: the number of minima and maxima in the system space, the steepness of the function derivatives, as well as the sensitivity of each of the input variables of the system. These characteristics not only give a sense of the model's intricacy but also make the systems containing them genuinely difficult for neural networks to map and emulate. Thus, the more pronounced these features are in a system, the harder it is to emulate. In summary, the first part of this thesis explores a fast and simple way to evaluate these factors in order to obtain a quick insight into the complexity of the model/system.

4.1 System Function Decomposition

The approach is based upon the premise that an output is more sensitive to some input changes than to others. Therefore, each input/output relationship is assumed to be a different function (e.g., three inputs and two outputs decompose into six functions). At this point every function is assessed individually. By evaluating every input/output relationship, the complexity of each function can be determined and ranked against the rest. It is expected that those functions with higher sensitivity will require that more exemplars be taken from these relationships, in order to map their behavior more effectively. In figure 4-1 below, we show a simple decomposition of the 3D ramp function discussed above (equation 3.5).

Figure 4-1 System Decomposition. This figure illustrates the decomposition of a 3D ramp function into a line and a cubic function.

It is important to point out that this method will not be entirely accurate, as its maximum performance relies upon a complete decoupling of the system. We understand the tradeoffs: when an output depends on more than one input, decoupling the model into independent functions will in many cases be impossible, making the assessment less effective.
However, the purpose of assessing the system in this manner is simply to get a quick idea of how intricate the model is, not to give a detailed assessment of the system behavior. We decouple the system by probing it with changes in one input while holding the rest of the inputs at constant values. We then acquire the corresponding points for each of the outputs' changes, and follow the same procedure for each of the inputs in the system. With the information gathered from each analysis, we can determine which of the input/output relationships need more data in order to map their surfaces correctly.

4.2 Metrics: a quick assessment of the input/output relationship

In order to evaluate an input/output relationship and compare it effectively with another, it is necessary to measure them with the same scheme. Therefore, in this section we define a simple metric that we relate to function complexity. In no way does this metric provide a detailed assessment of function complexity.

The assessment of each function follows a series of steps. First, a user-predetermined number of points is collected for each input/output relationship while holding the other inputs at random values. This is done several times to gather several functions for each relation. To ensure that every function is measured equally, each one is normalized to a [0; 10] x [0; 10] frame, as seen in figure 4-2b below. Then, the module performs a linear regression on the function and calculates the regression error variance; an average error variance is calculated for each relation. Finally, the module also counts the number of times each function shifts slope sign, which will be used later to size the architecture of the neural network.

Figure 4-2 Linear Regression Assessment Concept. This figure illustrates the concept of assessing a function: (a) shows the function, (b) shows the function normalized to a [0; 10] x [0; 10] frame, and (c) shows the regressed line.

Neural networks emulate straight lines more easily than sinusoidal curves. Thus, the analysis rests on the premise that the more complex the function is, the harder it will be to fit it with a linear regression, the higher the resulting error variance will be, and the more slope shifts will be found.

4.3 Assessment of System Test

In order to develop the complexity metrics, linear regression variances were first calculated for several continuous SISO systems. These include: a line, a cubic function, a sine function, and a polynomial of the 5th degree. The 2-dimensional results are shown below in table 4-1. As expected, the linear regression was able to fit the line without any difficulty, yielding an error variance of zero; however, as the systems increased in complexity it became harder to fit the curves, yielding higher variances. It is important to point out that the linear regressions were performed with various numbers of points so as to reach a stable variance value.

By comparing the functions in table 4-1 below we can infer that the higher the intricacy of the system, the higher the variance of the regression. However, the variance alone was not enough to assess the intricacy of the system. In order to corroborate our hypothesis, we took a neural network with the same training parameters and applied it to each of these systems.
According to the results of the training error, the lower the variance, the fewer the points needed to train the neural net. With these results, the next step was to develop the system complexity metrics.

A system metric for variance, correlated with the information gathered from the experiments (see Appendix A), was developed. Since we envision the use of the metric as a way to relate the system's inputs and outputs, we determined that it was best to start with a metric value of 1 when the normalized error variance of the regression is zero (a line), which contains no slope shifts. Also, since the variance can reach very high values, the metric had to be scaled down considerably in such cases while remaining sensitive to small variance values. These factors led to a logarithmic scaling of the variance error. After experimenting with the logarithmic relation we arrived at the equation below, where a variance of zero yields a metric equal to one:

metric_variance = ln(errVar + 1) + 1   (4.1)

Table 4-1 2-Dimensional Function Complexity Test results. All neural networks were trained using the same parameters: η = 0.002 (training rate), 80,000 epochs, and a 1:15:1 network architecture. The detailed data collection may be found in Appendix A.

2D Function           Normalized Variance Error   Error (%), 5 points   Error (%), 30 points   Metric
Line                  0.000                       12.0                  8.9                    1.00
Sine                  2.093                       71.0                  16.5                   2.12
Cube                  1.063                       24.2                  16.2                   1.72
5th Degree Function   1.245                       24.7                  20.4                   1.81

The metric was tested using the different functions (two inputs / one output) described in section 3.1. As we can see in Table 4-2, by decoupling the functions into individual relations we were able to differentiate the two inputs and identify, relative to each other, their influence on the system. As expected, the symmetry of the Unimodal function gives both decoupled relationships equal metric values, since each input has exactly the same influence on the system. Also, in the case of the Hilly function, its complexity is so thorough that we obtained almost the same results by analyzing just a quarter [0; 100] x [0; 100] of the system instead of the complete model [-100; 100] x [-100; 100].

Table 4-2 3-Dimensional Function Complexity Analysis results. The variances were calculated using n = 20 exemplars. Note: Appendix A contains a more detailed record of the gathered data.

3D Function   Variance x1/y   Variance x2/y   Metric x1/y   Metric x2/y   Δslope x1/y   Δslope x2/y
Ramp          0.00            1.06            1.0           1.7           0             2
Unimodal      2.15            2.32            2.2           2.2           0             0
Himmelbau     2.06            1.72            2.1           2.0           1             2
Hilly (sub)   1.71            2.78            2.0           2.3           4             5
Hilly         2.52            2.93            2.3           2.4           8             9
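To make the assessment procedure of sections 4.2 and 4.3 concrete, the following is a minimal sketch of how one input/output relationship might be probed, normalized, regressed, and scored with the metric of equation (4.1) while counting slope-sign shifts. The probing routine and its names are illustrative assumptions, not the neural module's actual interface.

```python
import numpy as np

def assess_relationship(system, input_index, bounds, n_points=20, n_inputs=2,
                        rng=np.random.default_rng(0)):
    """Return (variance metric, slope-shift count) for one input/output relationship."""
    lo, hi = bounds
    x = np.linspace(lo, hi, n_points)
    # Hold the remaining inputs at random constant values while probing input_index.
    fixed = rng.uniform(lo, hi, size=n_inputs)
    y = np.array([system(*[xi if j == input_index else fixed[j]
                           for j in range(n_inputs)]) for xi in x])

    # Normalize both axes to a [0, 10] x [0, 10] frame so functions compare equally.
    xn = 10.0 * (x - x.min()) / (x.max() - x.min())
    yspan = y.max() - y.min()
    yn = 10.0 * (y - y.min()) / yspan if yspan > 0 else np.zeros_like(y)

    # Linear regression and its error variance.
    slope, intercept = np.polyfit(xn, yn, 1)
    err_var = np.var(yn - (slope * xn + intercept))
    metric = np.log(err_var + 1.0) + 1.0           # equation (4.1)

    # Count how many times the local slope changes sign.
    signs = np.sign(np.diff(yn))
    signs = signs[signs != 0]
    slope_shifts = int(np.sum(signs[1:] != signs[:-1]))
    return metric, slope_shifts

# Example with the ramp of equation (3.5): the x1 relationship is a line,
# so the metric is 1 and there are no slope shifts.
ramp = lambda x1, x2: 50.0 * x1 + x2 ** 3
print(assess_relationship(ramp, input_index=0, bounds=(0.0, 10.0)))
```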
With the use of the system variance metrics, and using a predetermined number of exemplars, we can start training for the system. In the next chapter, we test the developed metrics for collecting data and compare the efficiency of this method with other data-generating techniques by training a neural network with equal parameters.

4.4 Possible Alternate Methods

There are other methods that can provide better insights into model complexity. Approaches like Genetic Algorithms may be used to determine the number of minima and maxima of the system (Goldberg, 1989), and with this knowledge the exemplar generation can be concentrated in these areas. However, this method requires running an actual optimization, which may be slow and require an unaffordable amount of time.

5 SET OF EXEMPLAR GENERATION

This section describes the importance of data collection prior to training a neural net. We describe the different methods by which exemplars may be gathered, with a brief description of each. Finally, this chapter also presents a comparative study in which the data from all of the generating methods is used to train the same neural network. Note that exemplar generation only applies to the implementation of the neural network for system emulation purposes. DOME's neural network module can also be fed data through a file; in this case data generation may not be necessary, and in most cases it is not possible.

5.1 Collecting the Optimal Set of Exemplars

A neural network can only be as good as the data it is trained with. Finding a good set of exemplars is key to coming up with a good surrogate model. Therefore, a good part of this thesis was dedicated to developing and implementing a mechanism by which, given a system analysis (Section 4), the network module is able to generate a set of exemplars for model training. For this work, we define a good set of exemplars as the collection of the minimum number of data points necessary to get a satisfactorily trained neural network. These types of algorithms may be referred to as data compression, where a small number of exemplars makes a good representation of the system's space (Plutowski and White, 1993).

It is intended that when the neural net is hooked to the system, the network will study it and then determine the best set of training exemplars. By identifying which input changes have more influence on the system behavior, we are able to take a closer look at these inputs. This is especially important when generating the exemplar set, because we will then be able to get more exemplars for these inputs. It is important to point out that the key to exemplar generation is collecting just the right amount of data without taxing the network with unnecessary points. For a given number of neurons, the smaller the number of exemplars, the faster the neural network will train. Also, there is a point beyond which adding more exemplars will not make a difference in the quality of the training.

There are several ways of collecting data within DOME. These include: listening for changes, collecting a uniformly distributed number of points for each input, random data generation, and finally ranked input data generation. All of these methods are implemented within the neural network package; however, only the ranked input method was created for this thesis.

5.1.1 Listening for changes: a parasite analogy

Once the neural network is added to the modeling environment and connected to the system, listening for changes is the simplest collection scheme to use; however, it is also the most inefficient. The idea is that, as the model is used normally within DOME, the network simply lives attached to the model like a parasite. It listens to all input and output changes, collecting data without interfering with or stimulating the system. Once the neural network is activated, it detaches from the system and trains from the data it has just collected. In this scheme the network is completely dependent on the user. The best-case scenario is that the system is simple and easy to emulate, requiring very few exemplars to train the network. Even then, the network is dependent on the expertise of the user. Thus this method is risky, since important points might never show up while the network is listening, and therefore these points are never collected.

Figure 5-1 shows 99 points collected manually by consciously trying to map the whole system space [0; 10] x [0; 10]. From the figure we can see that certain areas have been left untouched by the collecting scheme; therefore, the neural network will not be able to emulate these areas accurately.
The best-case scenario is that the system is simple and easy to emulate requiring very few exemplars to train the network. Even then, the network is dependant on the expertise of the user. Thus this method becomes risky as for important points might have never shown up while the network was listening, and therefore these points were never collected. Figure 5-1 shows 99 points collected manually by consciously trying to map the whole system space [0; 10] x [0; 10]. From the figure we can see that certain areas has been untouched by the collecting scheme, therefore, the neural network will not be able to emulated accurately this areas. MassachusettsInstitute of Technology - Computer Aided Design Laboratory 37 1500 1000- -10 5 6 2 4 x2 0 0 X1 Figure 5-1 Manual Data Generation. Probing the system manually we gathered 99 points. Many areas lack sample points. 5.1.2 Evenly distributed Generation By collecting data evenly across the sample space, we are able to cover systematically the space within establish parameters as seen in the figure 5-2 below. This is very effective when dealing with complex systems, where all inputs have similar influence on the system and thus every input/output relationship has similar complexity. However, in many cases when there is disparity of influence within the inputs then this method becomes ineffective due to the high collection of data points when it is not necessary to have many points. MassachusettsInstitute of Technology - ComputerAided Design Laboratory 38 1500 100. 5 4 2 x2 0 0 X Figure 5-2 Even Distribution of Training Samples. The neural network module probes the system and collected an equal number of points for every input/output relationship. In total, the network gathered 100 points. 5.1.3 Random Generation Random data collection is a very popular scheme. It is widely use for generating a population in Genetic Optimizations (Gruninger, 1996). However, for neural network purposes, the method proves to be somewhat inefficient. As seen in the figure 5-3 below, the collection scheme leaves many areas unexplored, which in turn the neural network will not be a able to generalize accurately. Unlike GA's, which have mutation and eventually produce these areas, a neural network will only work with the data that was given from the beginning. Below we can see an example of random data generation within the same specified scope as the previous figures. MassachusettsInstitute of Technology - ComputerAided Design Laboratory 39 1000 -- 00, 10 10 5 6 2 x2 0 0 A Figure 5-3 Random Data Generation. 100 points collected. Some areas contain more points than the minimum necessary amount to train the neural network, while other areas lack this minimum requirement. 5.2 Ranked Function Data Generation Method Using the metric explained in the previous chapter, we are able to identify which input/ output relationships are more intricate than the others. With this information, then the neural network module can prioritize for these complex relations and collect more points for them. Once the network module obtains the values of the metrics for each input/output relationship in the system, the number of exemplar points for each dimension can be determined using the calculated percentage of influence by each input. Equation 5.1 describes this calculation, nInputs nExemplars. = metric totalExemplars nInputs [ (5.1) metric, where i is the input index, and totalExemplarsis the maximum amount of data points that the user desires to collect. 
MassachusettsInstitute of Technology - ComputerAided Design Laboratory 40 Figure 5-4 shows the generation of 100 data points throughout the system space. As expected, the network module identified the dimension x2 as the most intricate or harder to map of the two inputs, therefore, it collected more points for this relationship. The next section contains a detailed comparison of all the collection schemes presented in the previous sections. 1500 - 10 - -- -.- 10 5 2 x2 0 0 4 X Figure 5-4 Ranked Input Data Generation. 100 points collected within the system space. Notice that there are more points in the x 2 dimension than the first. 5.3 Comparing Data Collection Methods, and Training the Network Using the neural network module, we collected data from the Ramp function using all four cases mentioned above. After collecting the data, in order to insure an unbiased comparison the net module trained for all cases with the exact same parameters. In this first exercise, we were not looking to get a good trained neural network instead we focused in understanding how the neural network trained for the different fed data. In this test, some of the results of the test were expected while others came as a surprise. MassachusettsInstitute of Technology - Computer Aided Design Laboratory 41 5.3.1 Testing the Ranked Method Data Collection Scheme As seen in the table 5-1 below, training the network with the listening method proves to be very inefficient. We can see that even in a small and simple sample space, the acquired data with this method is not very useful. Even though the neural network trained to within a respectable percentage, when the network module tested the network with never unseen exemplars, the just listening method yielded a high percentage error. The evenly distributed method, and the ranked method performed as expected. The neural network trained within a good test error. However, because the ranked method puts more emphasis on the more complex input / output relationship, the collected data holds a slight advantage over the evenly distributed data. Table 5-1 Comparison Study for all Data Collection Schemes for Ramp function. The neural network trained using the same parameters for all collecting cases (Ti = 0.0015, and 15 hidden neurons). Data Generation Just Listening Evenly Distributed Random Generated Ranked Method Test % # Exemplars Epochs Training % 37 40,000 22.07 42.27 60 50,000 14.74 32.30 36 40,000 17.40 20.31 64 40,000 5.44 7.20 40 40,000 5.03 10.05 60 40,000 11.05 9.99 40 40,000 6.25 5.20 60 30,000 8.37 6.75 MassachusettsInstitute of Technology - ComputerAided Design Laboratory 42 The random generated data proved to be better than expected, enabling the neural net to train within a fair testing error. In this case, the good results are probably due to the fact that clustering the data in certain areas gave the network a good basis for generalizing for the untested ones. In this example, since the surface is flat, the neural network was able to generalize with ease. However, in the cases where the random collection scheme fails to collect data in areas where the complexity is high, the network will not be able to generalize accurately for thee areas. The figure 5-5 below describes the behavior between the Test Error % and the number of epochs the neural network trains for. The reason behind the listening test error erratic behavior lies in that the test error percentage is actually calculated every time with new unseen data randomly generated. 
These plots show that the manual listening method fails to create the data a neural network needs to learn to generalize accurately, while the other three, automated methods do a good job.

Figure 5-5 Data Generation Scheme Network Training Comparison. (a) Forty points collected for training. (b) Sixty points collected for training.

Figure 5-5 shows two graphs displaying how the test error percentage fluctuates with the number of epochs the neural network trains on the Ramp function. In the plot on the left (5-5a), the network module collected around 40 exemplars using the different mechanisms described above. The plot on the right (5-5b) shows the results for the network trained with 60 exemplars instead.

5.3.2 Training the Hilly Function Using DOME's Neural Network Module

The Hilly function's complexity, as may be seen in Figure 3-6, demands respect: it is a very difficult function for the neural network to handle. Hilly comprises 36 maxima, all of them unequally spaced peaks of non-uniform height. To get a sense of the training difficulty, we started by plotting the function with only a few points; empirically we concluded that the minimum number of points needed to map it within an acceptable 10% would be around 576.

Figure 5-6 Training the Hilly with 144 points. (a) The Hilly function plotted with only 144 points. (b) The neural surface emulation after training for 3 hours and 16 minutes. Parameters used: η = 0.005 and 15 hidden neurons. The test error at this point was 38%.

The first attempt at training the Hilly function proved fairly successful (Figure 5-6b). This net was trained with only 144 points covering the system space; the same 144 points were used to plot the Hilly in Figure 5-6a. The figure shows that the neural net did emulate the system it was given: the 144-point Hilly on the left has only 16 maxima (instead of its usual 36), and the trained network contains the same number of peaks. The training results suggested that the network could be trained even better under more detailed conditions. In other words, given more points, more time, and a slightly bigger network, the author's understanding was that the neural module was capable of emulating a function as complex as the Hilly.

In the next attempt, the neural network module trained with 625 points, taking 5 hours and 43 minutes (1,300,000 epochs) to reach a 23% training error and a 30.77% test error. Beyond this point the network was unable to decrease the error further. Figure 5-7 shows the plotted neural network simulation (b) alongside the Hilly function plotted (a) with the same 625 exemplars used for training.

Figure 5-7 Training the Hilly with 625 exemplars. (a) The Hilly function plotted with the 625 exemplars.
(b) The surrogate Hilly function, with ±30% test error. Parameters used: η = 0.0070 and 18 hidden neurons.

The second attempt at training the network proved very good. Even though the neural network did not train to the desired 10% testing error, the shape of the simulated surface resembles the Hilly function much more closely. Based on these experiments, we concluded that training a network for the Hilly function would require either more exemplars or a larger neural network architecture. Increasing the amount of data should be avoided where possible, since it sharply increases the training time. Recognizing the limitations of a single neural network presented with a large system space, we decided to tackle the training of the Hilly function by dividing the space into four equal sectors, I through IV, described in Table 5-2 below. Each sector contained the same number of training points, 576/4 = 144.

Table 5-2 Dividing and Conquering the Hilly. The Hilly function was divided into four equal sections: [0; 100] x [0; 100] (QI), [-100; 0] x [0; 100] (QII), [-100; 0] x [-100; 0] (QIII), and [0; 100] x [-100; 0] (QIV). The neural network trained on all sections with the same parameters: η = 0.005 and 15 hidden neurons.

Sector   Epochs     Time (sec)   MS Error   Training %   Testing %
I        700,000      2,247      0.00069      11.01         7.5
II       800,000      2,585      0.00015       9.83         5.4
III      900,000      2,899      0.00018      10.12         3.4
IV       600,000      1,921      0.00019      11.31         2.1

Figure 5-8 A Surrogate Hilly Function. (a) The Hilly function. (b) The neural network trained to emulate the Hilly surface. Notice the striking similarity, and also the absence of the tallest peak in the surrogate model.

Dividing the Hilly space yielded striking results. Not only was the average test error below 10%, but the total time to train the four neural networks was only 2 hours and 40 minutes. Figure 5-8 above shows how closely the neural model emulated the Hilly surface. Notice the absence of the tallest peak in the generated surface: with only a few points describing the peak, the neural net failed to generalize it. To map the tall peak, more points would be required around that area to emphasize it.

To recover the global maximum missing from the simulation, sector I was trained again with the same 144 exemplars used originally plus 10 more points taken from the sub-space around the maximum at (79, 79). Figure 5-9 below shows the new sector I neural network surface emulation. By adding more points in that area, the neural network was able to map the region more accurately, to within a 14% test error.

Figure 5-9 Mapping the Tallest Peak of the Hilly. Training parameters: η = 0.0035, 16 hidden neurons.

With the original data analysis proposed in Section 4, locating this global maximum would be impossible, since the analysis only detects a slope change in that area and not how wide or steep the peak is. This is good ground for continuing research: with a more accurate understanding of the system space, the neural module could generate more exemplars in the areas that are harder to map, such as the tallest peak of the Hilly function.
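As a rough illustration of this kind of targeted exemplar generation, the sketch below samples a sector on an even grid and then clusters a handful of extra points around a region flagged as hard to map; the grid size, focus region, and sampling rule are illustrative assumptions, not the module's actual scheme:

    import numpy as np

    def sector_exemplars(system, lo, hi, n_grid=12, focus=None, n_focus=10,
                         focus_radius=10.0, seed=None):
        """Evenly grid a square sector [lo, hi] x [lo, hi], then add extra
        samples clustered around a point identified as hard to map."""
        rng = np.random.default_rng(seed)
        g = np.linspace(lo, hi, n_grid)
        xs = np.array([(x1, x2) for x1 in g for x2 in g])     # 12 x 12 = 144 base points
        if focus is not None:
            extra = np.asarray(focus) + rng.uniform(-focus_radius, focus_radius,
                                                    size=(n_focus, 2))
            xs = np.vstack([xs, np.clip(extra, lo, hi)])
        ys = np.array([system(x1, x2) for x1, x2 in xs])
        return xs, ys

    # Sector I of the Hilly space, with 10 extra points near the tallest peak at (79, 79):
    # xs, ys = sector_exemplars(hilly, lo=0.0, hi=100.0, focus=(79.0, 79.0))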
6 CREATING AND BENCHMARKING THE NEURAL MODULE

With the knowledge gained from the model complexity analysis, the architecture of the neural network can be designed. Finding a good neural network means determining which parameters of the net to modify. This part of the thesis aims at discovering how the parameters of the network relate to the complexity of the system. The idea is that, with the information obtained from the prior analysis, the neural network module will automatically create a customized network, generate good exemplars, and train the neural network with a single click.

From the ranked/metric function analysis we can gather an idea of how complex some or all of the relations in a system may be. Using an empirical method, we now try to determine a good size for the network and a good parameter setting from that analysis. Even though a neural net has several modifiable parameters, for simplicity the neural module automates only the number of neurons in the hidden layer; the remaining values are left at their defaults. The user can still override any parameter (including the training rate and the number of hidden neurons) manually in the neural module.

6.1 The Designer's Neural Network

The key goal of this section is to automate much of the process of creating, designing, and training a good neural network. To accomplish this, the variance metric plus the number of slope shifts calculated in the analysis are used to suggest the number of hidden neurons and the rate at which the network should be trained. All functions presented in Section 3.1 are used to calibrate the calculation of these parameters. Using the neural module, each function was assessed and its combined metric and slope-shift count were calculated; then the best neural network was developed manually for each function individually. The results are shown in Table 6-1 below. From these results it can be seen that the slope shifts relate to the network size (number of hidden neurons). Note that the number of exemplars, except for the Hilly, exceeds the desired minimums; this was done on purpose to diminish the influence of this parameter and focus on the number of hidden neurons. All nets were trained with the same training rate of 0.0035.

Table 6-1 Training the 3-D systems. The table illustrates that the number of slope shifts has more influence on the network architecture than the combined system metric developed in Section 4.

Try         Exemplars   Combined Metric   Slope Shifts   Hidden Neurons   Epochs      Training %
Unimodal       100            4.4               0               5           22,000        1.9
Ramp           100            2.7               2              15           40,000        5.4
Himelbau       100            4.1               3              16           60,000        6.6
Hilly          625            4.7              17              20        4,000,000       22.2

A linear relationship between the number of slope shifts and the number of hidden neurons was established using the Unimodal, Ramp, and Himelbau functions, since the neural network trained to a good error for these cases. As seen in Figure 6-1 below, a line was fitted to these three points; the calculated line is given in Equation 6.1. Notice that the fit would suggest about 70 hidden neurons to train the Hilly function properly. This is neither recommended nor desired, since training such a neural net would take days. In that case the neural module instead suggests its maximum of 20 hidden neurons.

    nHiddenNeurons = 3.81 * nSlopeShifts + 5.57        (6.1)

Figure 6-1 Slope Shifts vs. Hidden Neurons. The figure illustrates a possible linear relation that suggests a good number of hidden neurons given the number of slope shifts on a curve. The line parameters are m = 3.81 and b = 5.57. The figure was created with the three points gathered from the Ramp, Himelbau, and Unimodal functions.
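A minimal sketch of this sizing heuristic; the function name and the hard cap of 20 neurons are written here as explicit assumptions drawn from the discussion above:

    def suggest_hidden_neurons(n_slope_shifts, slope=3.81, intercept=5.57, cap=20):
        """Suggest a hidden-layer size from the slope-shift count (Equation 6.1),
        capped so that training times stay practical."""
        return min(int(round(slope * n_slope_shifts + intercept)), cap)

    # Ramp (2 shifts) -> 13, Himelbau (3 shifts) -> 17, Hilly (17 shifts) -> 70, capped to 20.
    print(suggest_hidden_neurons(2), suggest_hidden_neurons(3), suggest_hidden_neurons(17))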
6.2 Benchmarking against the MATLAB Neural Network Toolbox

In this section we benchmark the neural net module against one of the most readily available neural network packages, the MATLAB Neural Network Toolbox. The goal is to get an idea of where the neural network module stands in comparison with commercially available tools. The networks were compared on key elements: setup time, training parameters, and test results. Both networks trained to emulate the Himelbau function (Figures 3-5 and 6-2) and the Hilly function. To ensure a fair comparison, the MATLAB feed-forward neural network trained with the same training algorithm, back propagation, and the same training variables (training parameters and network architecture). Table 6-2 below shows the results for both functions.

Table 6-2 MATLAB Neural Network Toolbox vs. ANN Module Data Comparison

Elements                MATLAB Himelbau   ANN Himelbau   MATLAB Hilly   ANN Hilly
Setup Time (min)              90               5               90            5
Hidden Neurons                10              10               15           18
Training Time (min)            0.99            2.21           360          346
Computer Speed (MHz)         300             400              550          400
Epochs                       200          20,000        1,400,000    1,300,000
Mean Square Error              0.00180         0.0007          48.0          0.0050
Test %                        ±1.0            ±1.0          ±1000          ±30

Figure 6-2 The Himelbau, the MATLAB and the ANN. (a) The top figure is the Himelbau function. (b) The left figure is the MATLAB simulation. (c) The figure on the right is the ANN module simulation. Both neural networks trained with the same number of points (441 exemplars).

The first test involved training on the Himelbau function. Figure 6-2 above shows the performance of both neural networks. The test results (Table 6-2) indicate that even though the MATLAB network outperformed the ANN, the latter still emulates the function well. On the other hand, because it lives inside the DOME environment, the ANN's setup time was much lower than MATLAB's. Setup time here refers to the time it takes to load the data, prepare the neural network, and begin training. The second test aimed at training the MATLAB network to emulate the Hilly function. The attempt failed: the training mean square error never dropped below 48, and the plotted MATLAB simulation bore little resemblance to the Hilly. A further attempt with the MATLAB back propagation algorithm divided the Hilly function into four sectors (I, II, III, and IV), as in the previous section.
Even though the MATLAB network trained better on the smaller sectors, yielding an average MSE of 32, the plotted surfaces poorly resembled the Hilly function. Thus we failed to train the Hilly function with the MATLAB back propagation method. In a final attempt to train the Hilly function with MATLAB, we used a different algorithm, the radial basis function network. Although this is not the same network, since it has a slightly different parameter configuration, we wanted to obtain a successful MATLAB emulation of the Hilly. As seen in Figure 6-3, although a little distorted, the radial network mapped the Hilly function successfully, to within a 5% test error.

Figure 6-3 A MATLAB Radial Basis Function Hilly Approach. The radial network was trained with a spread of 5.0. MATLAB took 3.06 hours to train.

7 NEURAL APPLICATIONS WITHIN DOME

As mentioned in the previous sections, the possibilities for using the neural network module within DOME are broad. This section presents some of these applications. Even though they may be simple, they effectively illustrate some of the most likely uses.

7.1 Life Cycle Assessment Surrogate Model

The integration of LCA models into a design framework like DOME gives designers an effective way of determining environmental tradeoffs between design alternatives (Borland and Wallace, 1998). As mentioned previously, a life cycle assessment of a product is a hard and complicated task; sometimes months are required to build these models (Sousa et al, 1996). During conceptual design a product evolves rapidly, requiring this assessment to be built in days rather than months. This work concentrates on developing surrogate LCA models using learning techniques such as neural networks in order to provide quick LCA assessments as the product design changes.

Sousa's recent work makes use of the neural module described in this thesis (Sousa et al, 1996). From a product LCA database, certain product descriptors were collected for each product. The goal of the study was to train a neural network to predict the life cycle energy consumption of each product. The neural network was trained with the collected data (product descriptors with their corresponding results). Since the data were limited, 10 neurons were used in the hidden layer, and the network trained for 5,000 epochs. As a result, the life-cycle energy prediction fell within ±35%, whereas a real LCA is typically around ±30%. After testing the neural network with unseen data (test data), Sousa concluded that the surrogate model was able to predict the energy accurately, as seen in Figure 7-1 below.

Figure 7-1 Comparison of life-cycle energy consumption. This figure was taken from (Sousa et al, 1996). Note that the four products shown were not used to train the neural network.

7.2 Dynamic Monte Carlo Tool (DMCT)

The Dynamic Monte Carlo simulation tool was developed by Jeff Lyons at the MIT CAD Laboratory; the tool provides seamless, dynamic propagation of probability distributions within the modeling environment (Lyons, 2000).
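As a rough, self-contained illustration of the kind of calculation the DMCT propagates, and that the neural module is later asked to emulate, the case study below adds two uniform distributions; assuming uniforms parameterized by their mean and standard deviation, a plain Monte Carlo version might look like this:

    import numpy as np

    def uniform_from_mean_std(mu, sigma, n, rng):
        """Uniform distribution written in terms of its mean and standard deviation:
        U(a, b) with a = mu - sqrt(3)*sigma and b = mu + sqrt(3)*sigma."""
        half_width = np.sqrt(3.0) * sigma
        return rng.uniform(mu - half_width, mu + half_width, n)

    def add_distributions(mu_a, sig_a, mu_b, sig_b, n=100_000, seed=0):
        """Monte Carlo estimate of the mean and standard deviation of A + B."""
        rng = np.random.default_rng(seed)
        total = (uniform_from_mean_std(mu_a, sig_a, n, rng)
                 + uniform_from_mean_std(mu_b, sig_b, n, rng))
        return total.mean(), total.std()

    # The surrogate network is trained to map (mu_a, sig_a, mu_b, sig_b) directly to
    # the output pair, skipping the sampling loop entirely.
    print(add_distributions(10.0, 1.0, 5.0, 2.0))    # roughly (15.0, 2.24)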
Depending on the desired accuracy, a Monte Carlo simulation may take on the order of 2 minutes, since it needs to iterate through a process many times. This simple case study demonstrates how quickly a neural network can be set up for use as a surrogate model. DOME uses the Dynamic Monte Carlo tool to evaluate the mathematical interactions between probability distributions. For this case study, the author adds two uniform probability distributions and calculates the corresponding output using the DMCT, as seen in Figure 7-2. Each uniform distribution is described by a mean (μ) and a standard deviation (σ) that are used as inputs to the simulation, which in turn outputs the mean and standard deviation of the resulting distribution. The calculation of A+B takes about 1 minute 40 seconds each time the simulation is run. The goal of the neural module is therefore to learn, from the given variables of the two probability distributions (their means and deviations), the corresponding variables of the new distribution.

Figure 7-2 Addition of two uniform distributions within DOME.

The setup of the neural network module plus the training time came to about 27 minutes. This includes gathering the 16 evenly distributed exemplars that the neural network collected automatically before starting to train. Due to the simplicity of the model, the network trained in less than a minute, with a 0.20% training error and less than 1% test error. It is important to point out that the neural network will only be this accurate within the range of values it was trained on; outside these boundaries, the output of the network will not be valid.

7.3 Finite Element Analysis Emulation

The Marc FEA package (Figure 7-3) is one of the third-party tools that can interact with DOME seamlessly. In concurrent DOME research (Abrahamson et al., 2000), it is used to calculate the force, as a function of position, exerted by the glass window on a car's door seal, which belongs to the Movable Glass System (MGS). Each time the analysis is called it takes 15 to 20 minutes to return values. Suppose that in a genetic optimization of the subsystem this package were called 8,000 times; multiplying 8,000 by the time it takes the FEA to calculate the force graph would stretch the GA optimization to 44 days. Thus, as a final case study, the neural module is used to train a surrogate for the part of the FEA analysis that calculates the force graph.

Figure 7-3 A door-seal Marc model. The figure illustrates a door-seal FEA analysis in the Marc software.

To calculate this force plot, the FEA is fed four inputs: the displacement of the window, the friction, and the specific thicknesses G and H of the seal. The neural module gathered two hundred exemplars, on which it trained for 3 hours and 2 minutes. The training ended with a mean square error of 6.5e-5. As seen in Figure 7-4, the neural net simulation (dotted lines) on unseen data closely resembles that of the real FEA.

Figure 7-4 Comparison between the ANN-generated plot and the FEA plot. (a) First 8 exemplars represented. (b) 8 exemplars. In both plots, the dotted lines represent the neural network simulation, while the solid lines are the FEA results. Parameters used: η = 0.003 and 15 hidden neurons.
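To make the intended use concrete, here is a minimal sketch of swapping the slow FEA call for the trained surrogate inside an optimization loop; the random-search loop, the four-input signature, the objective, and all names are illustrative assumptions rather than the actual DOME, Marc, or GA interfaces:

    import numpy as np

    def optimize_seal(evaluate, lower, upper, n_iter=8000, seed=0):
        """Toy random-search optimizer. `evaluate` is either the slow FEA call
        (minutes per evaluation) or the trained neural surrogate (milliseconds),
        taking (displacement, friction, thickness_g, thickness_h)."""
        rng = np.random.default_rng(seed)
        best_x, best_y = None, np.inf
        for _ in range(n_iter):
            x = rng.uniform(lower, upper)
            y = evaluate(*x)                 # e.g. a scalar summary of the force curve
            if y < best_y:
                best_x, best_y = x, y
        return best_x, best_y

    # With the real FEA, 8,000 evaluations are impractical; with the surrogate, the
    # same loop finishes in seconds:
    # best_x, best_y = optimize_seal(surrogate_force, lower=[-10, 0, 1, 1], upper=[0, 1, 5, 5])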
8 CONCLUSIONS AND FUTURE WORK

8.1 Summary

In summary, this thesis focused on generating a quick-to-set-up, easy-to-use, and reliable neural network tool for use within DOME (Distributed Object Modeling Environment). The work was divided into four areas: understanding the system to be emulated, generating a good exemplar set from the system, designing the neural network, and finally demonstrating the usefulness of the application through case studies.

A quick way to analyze the system was developed using linear regression techniques. With these analyses, a quick and insightful estimate of the system complexity could be discerned. The data gathered were then used both for generating a good exemplar set and for suggesting a good size for the neural network architecture.

The neural network module has several ways of automating the collection of data: random collection, evenly distributed collection, and ranked-relationship collection. Of these, the last was designed to use the information gathered in the system analysis. It was demonstrated that the ranked method holds a slight advantage over the other two automated methods (random and even data generation).

The information gathered from the system analysis also proves useful to the neural module in suggesting a good size for the neural network architecture. With this at hand, the whole process of creating, designing, and training the neural network can be automated into a one-click neural network.

Benchmarking showed that the neural network module is capable of competing with commercially available products. A comparison study with the MATLAB Neural Network Toolbox gave very positive results: not only did the ANN perform similarly, but because it lies within a modeling environment (DOME), its setup was substantially faster (5 minutes versus 90 minutes in the benchmark of Table 6-2).

Finally, three case studies were presented to demonstrate the usefulness of the neural network module within DOME. An optimization of the MGS model with the Marc FEA analysis was shown to be infeasible; this obstacle was bypassed by using the neural network module as a surrogate for the FEA. The neural model also showed how the iteration speed of a model can be greatly improved by using the neural network to emulate slow systems.

8.2 Future Work

Although the system metrics developed prove very useful, a more accurate and information-rich system analysis needs to be developed. With more information, a more detailed analysis would open the way for more efficient types of exemplar generation. These methods would not have to be limited to input/output relationships, but could identify more accurately those areas in the system space where more exemplars are needed. An example is the tallest peak of the Hilly function, which was absent from the surrogate model.
Even though a GA method may prove slow, a proposed methodology could include its partial use to search for the minima and maxima where exemplars are most needed. A divide-and-conquer automation would be ideal in cases where the system is too complex to train as a whole. A sample of the implications of such a method was demonstrated when the Hilly function was divided into four sections, each with its own neural network, and the results were very promising. It is therefore suggested that valuable work could be done to determine when dividing a very large system space among several neural networks might improve training speed and overall network accuracy.

Finally, given the success of the MATLAB radial basis function in training the Hilly function, it would be ideal for the ANN module to implement this neural algorithm as well.

APPENDIX A - COLLECTED NEURAL NETWORK DATA

Table Appendix 1 Data for the exemplar generation chapter. The table shows the training values for each of the functions used to test the system complexity metric developed.

Try                  # Expls.   Training Time   Epochs   MS Error   Training Error (%)   Hidden Neurons   Training Rate
Line                     5           1           2,000    0.0190         49.00                 15             0.002
Line                     5           5          10,000    0.0006         11.00                 15             0.002
Line                     5          16          40,000    0.0005         12.00                 15             0.002
Line                     5          33          80,000    0.0005         12.00                 15             0.002
Line                    30           2           2,000    0.0030          7.46                 15             0.002
Line                    30          13          10,000    0.0030          9.29                 15             0.002
Line                    30         106          80,000    0.0020          8.90                 15             0.002
Cube                     5           1           2,000    0.0531        125.70                 15             0.002
Cube                     5           4          10,000    0.0073         66.18                 15             0.002
Cube                     5          12          40,000    0.0044         25.20                 15             0.002
Cube                     5          25          80,000    0.0040         24.20                 15             0.002
Cube                    30           2           2,000    0.0034         52.97                 15             0.002
Cube                    30          14          10,000    0.0025         49.62                 15             0.002
Cube                    30          98          80,000    0.002          16.25                 15             0.002
Sine                     5           1           2,000    0.0560         85.00                 15             0.002
Sine                     5           4          10,000    0.0520         72.00                 15             0.002
Sine                     5          13          40,000    0.0510         71.00                 15             0.002
Sine                     5          26          80,000    0.0510         71.00                 15             0.002
Sine                    30           3           2,000    0.0304         67.00                 15             0.002
Sine                    30          10          10,000    0.0290         67.90                 15             0.002
Sine                    30          97          80,000    0.0015         16.53                 15             0.002
5th Degree Function      5           1           2,000    0.0400        113.54                 15             0.002
5th Degree Function      5           4          10,000    0.0110         58.18                 15             0.002
5th Degree Function      5          16          40,000    0.0064         24.60                 15             0.002
5th Degree Function      5          31          80,000    0.0056         24.70                 15             0.002
5th Degree Function     30           3           2,000    0.0090        126.00                 15             0.002
5th Degree Function     30          15          10,000    0.0044         67.00                 15             0.002
5th Degree Function     30         120          80,000    0.0003         20.40                 15             0.002

Table Appendix 2 Data for the exemplar generation chapter. The table shows how the values of the variance and the metric change with the number of data points collected from the system.

Try (# points)    Variance x1/y   Variance x2/y   Metric x1/y   Metric x2/y
Ramp (10)             0.00            1.17             1            1.77
Ramp (20)             0.00            1.03             1            1.71
Ramp (50)             0.00            0.97             1            1.67
Himel (10)            1.41            1.13             1.87         1.75
Himel (20)            1.34            0.95             1.80         1.67
Himel (50)            1.23            0.93             1.80         1.65
Modal (10)            3.13            3.39             2.40         2.47
Modal (20)            2.50            2.59             2.25         2.25
Modal (50)            3.72            3.72             2.55         2.55
Hilly-Sub (10)        3.10            3.76             2.41         2.50
Hilly-Sub (20)        1.95            2.88             2.08         2.35
Hilly-Sub (50)        1.73            2.79             2.00         2.33
Hilly (10)            2.99            4.75             2.38         2.74
Hilly (20)            1.90            2.93             2.06         2.36
Hilly (50)            1.98            2.83             2.09         2.12

Table Appendix 3 Training comparison data for the four exemplar collection schemes.
Try              Epochs    Time    MSE       Train %   Test %
Listening (37)   10,000     16     0.0041     23.29     31.21
Listening (37)   20,000     32     0.0038     22.97     50.44
Listening (37)   30,000     48     0.0036     22.53    122.26
Listening (37)   40,000     64     0.0035     22.07     42.27
Listening (60)   10,000     25     0.0028     18.99     22.60
Listening (60)   20,000     50     0.0025     18.06     26.36
Listening (60)   30,000     75     0.0021     16.87     27.65
Listening (60)   40,000    100     0.0017     15.81     57.45
Listening (60)   50,000    125     0.0015     14.74     32.30
Random (40)      10,000     17     0.0033     25.39     24.50
Random (40)      20,000     34     0.0017     20.13     25.00
Random (40)      30,000     52     0.0007     11.96     18.99
Random (40)      40,000     70     0.0003      5.03     10.05
Random (60)      10,000     29     0.004      22.46     28.24
Random (60)      20,000     58     0.003      19.97     15.96
Random (60)      30,000     88     0.0022     15.72     19.04
Random (60)      40,000    117     0.0012     11.05      9.99
Random (60)      50,000    146     0.0007      7.99     24.30
Even (36)        10,000     16     0.004      27.28     20.83
Even (36)        20,000     32     0.003      38.17     21.21
Even (36)        30,000     48     0.002      28.58     22.10
Even (36)        40,000     65     0.001      17.40     20.31
Even (64)        10,000     27     0.0037     29.96     31.78
Even (64)        20,000     54     0.0019     21.23     32.40
Even (64)        30,000     82     0.0005     10.41     17.49
Even (64)        40,000    109     0.0002      5.44      7.2
Even (64)        50,000    137     0.0001      5.26     14.09
Ranked (40)      10,000     17     0.0033     34.80     31.9
Ranked (40)      20,000     38     0.0013     22.10     29.77
Ranked (40)      30,000     58     0.0005      9.66     15.96
Ranked (40)      40,000     79     0.0003      6.25      5.20
Ranked (60)      10,000     25     0.0022     23.65     32.88
Ranked (60)      20,000     50     0.005      12.04     16.86
Ranked (60)      30,000     76     0.002       8.37      6.75
Ranked (60)      40,000    102     0.00016     8.9      12.93
Ranked (60)      50,000    128     0.00015     9.5      15.28

Table Appendix 4 Hilly function training progress. 144 data points; parameters used: η = 0.005 and 15 hidden neurons. Note that the test error at this point was 38%.

Epochs       Time (sec)   MS Error   Training %
10,000            63       0.023        44.0
100,000          601       0.0208       39.4
500,000        2,943       0.0054       16.9
1,200,000      7,057       0.0009        7.3
2,000,000     11,750       0.0003        3.8

Table Appendix 5 Dynamic Monte Carlo simulation training session. 16 training points, 4 testing points.

Epochs    Time   MSE     Training %   Test %
2,000       2    0.022      7.0         -
4,000       3    0.019      0.6         -
10,000      7    0.014      0.5         -
20,000     14    0.013      0.2         0.1

Table Appendix 6 Hilly training data. Training parameters: nExemplars = 625, η = 0.007, and 18 hidden neurons.

Epochs       Time     MSE      Training %   Test %
2,000           32    0.0276     58.92        -
100,000      1,546    0.0213     46.94       63.9
200,000      3,126    0.0144     37.81     2299.4
300,000      4,720    0.0138     36.92       79.6
400,000      6,310    0.0074     30.03       44.8
500,000      7,902    0.0063     27.21       86.4
600,000      9,482    0.0058     25.12       37.7
700,000     11,065    0.0057     24.22       43.3
800,000     12,708    0.0055     24.28       30.2
900,000     14,290    0.0054     24.19       38.6
1,000,000   15,871    0.0053     23.63       19.8
1,100,000   17,469    0.0053     23.09      187.8
1,200,000   19,049    0.0053     22.89       30.8
1,300,000   20,631    0.0052     22.79       29.1
1,400,000   22,213    0.0052     22.74       24.0
2,000,000   31,742    0.0052     22.59      571.9
3,000,000   47,613    0.0051     23.06       35.3
4,000,000   63,484    0.0050     23.1       111.7

Table Appendix 7 Hidden neurons vs. training error data. The table contains data aimed at finding the right number of hidden neurons for the Himelbau function. The training rate was held constant at η = 0.0035 and the neural network trained with 100 evenly distributed points.
Hidden Neurons   Epochs    Time   MSE      Training %
3                120,000    201   0.0051      9.03
5                120,000    273   0.0039      7.77
8                120,000    327   0.0036      7.33
10               120,000    405   0.0038      7.90
13               120,000    478   0.0018      4.85
15               120,000    531   0.0019      4.10
16               120,000    576   0.003       2.32
18               120,000    636   0.007       3.33
20               120,000    726   0.003       7.36

REFERENCES

Abrahamson, S., Wallace, D., Senin, N., Sferro, P., "Integrated Design in a Service Marketplace", Computer-Aided Design, 32(2), 2000, pp. 97-107.

Bandy, Howard, Developing a Neural Network System, American Association of Individual Investors: http://www.aaii.com/, 1998.

Borland, N., and D. Wallace, "Integrating Environmental Impact Assessment into Product Design", Proceedings of DETC98, Atlanta, GA, 1998.

Bose, N. K., and Liang, P., Neural Network Fundamentals with Graphs, Algorithms, and Applications, McGraw-Hill, Inc., 1996.

Cichocki, A., Unbehauen, R., Neural Networks for Optimization and Signal Processing, John Wiley and Sons, Ltd., 1993.

Deniz, J. C., Application of Surrogate Neural Networks to Reduce the Computation Time of Genetic Algorithms, Undergraduate Thesis, MIT Mechanical Engineering, Cambridge, MA, June 1998.

Goldberg, D. E., Genetic Algorithms in Search, Optimization & Machine Learning, Reading, MA: Addison Wesley Longman, Inc., 1989.

Gruninger, T., Multimodal Optimization Using Genetic Algorithms, Master of Science Thesis, Universitat Stuttgart, Germany, July 1996.

Hung, S. L., and J. C. Jan, "Machine Learning in Engineering Design - An Unsupervised Fuzzy Neural Network Case-Based Learning Model", Department of Civil Engineering, National Chiao Tung University, Taiwan, R.O.C.

Jacob, P. J., "User-Guide to the Manchester Radial Basis Function Network", Division of Mechanical and Nuclear Engineering, University of Manchester, May 1998.

Lyons, J., Using Designer Confidence and a Dynamic Monte Carlo Simulation Tool to Evaluate Uncertainty in System Models, Master of Science Thesis, Massachusetts Institute of Technology, Cambridge, MA, 2000.

Masters, T., Practical Neural Network Recipes in C++, Academic Press, 1993.

Meghabghab, G., Nasr, G., "Iterative RBF Networks as Meta-models of Stochastic Simulations", IEEE, 1999.

Osborne, D., Armacost, R., "State of the Art in Multiple Response Surface Methodology", IEEE, 1997.

Pahng, F., and Wallace, D., "Web-Based Collaborative Design Modeling and Decision Support", Proceedings of 1998 ASME DETC, Atlanta, Georgia.

Plutowski, M., and White, H., "Selecting Concise Training Sets from Clean Data", IEEE Transactions on Neural Networks, Vol. 4, March 1993.

Senin, N., Borland, N., and D. Wallace, 1997, "Distributed Modeling of Product Design Problems in a Collaborative Design Environment", CIRP 1997 International Design Seminar Proceedings: Multimedia Technologies for Collaborative Design and Manufacturing, pp. 192-197.

Sousa, I., Wallace, D., Borland, N., and J. Deniz (1999), "A Learning Surrogate LCA Model for Integrated Product Design", Proceedings of the 6th International Seminar on Life Cycle Engineering, Kingston, Ontario, Canada: Queen's University, 1999, pp. 209-219.

Vapnik, V., The Nature of Statistical Learning Theory, Springer, 1993.