Automatic Creation of Neural Nets for Use in Process Control Applications

Sai Ganesamoorthi, Entry-Level Software Engineer, Fisher-Rosemount Systems, Austin, TX 78759
Jay Colclazier, Product Manager, Fisher-Rosemount Systems, Austin, TX 78759
Vasiliki Tzovla, Senior Software Engineer, Fisher-Rosemount Systems, Austin, TX 78759

KEYWORDS

Neural Networks, Pre-Processing, Training, Delayed Inputs, Software, Function Blocks, Process Control

ABSTRACT

Neural networks are often used where a soft sensor is required to cross-validate the performance of a physical sensor, or where the output must be determined by a series of lab analyses. Traditional approaches involve a stand-alone application with little or no graphical help or development tools, often confusing the typical user. This paper describes an approach that structures the neural network as a function block and automates the complete process of collecting data, pre-processing, training, verifying the model, and downloading the generated model to a real-time controller, all from a GUI application. In simple terms, the application offers a 'single-button' approach that works as part of a structured system, enabling users with no rigorous knowledge of neural networks to configure and run a neural network strategy for the process in question. The paper also details novel convenience features such as automatic collection of data, a two-stage pre-processing algorithm for identifying process input delays, automatic design of hidden nodes, automatic correction of errors and delays encountered in verifying samples by lab tests for online use, and options for fine-tuning the resultant models.

INTRODUCTION

An intelligent sensor, more popularly known as a 'soft sensor', uses software techniques to determine the value of a process variable, in contrast to a physical sensor that measures the value directly.
These sensors open a whole new world of possibilities, helping to circumvent issues such as the maintenance, cost and online availability of physical sensors. A soft sensor is a highly trained neural network that processes inputs and produces current values of the process variable online, and can also be used to predict future values. Classical approaches to designing and training a neural network for actual process conditions involved a sequence of complex steps that was often a grueling experience for the typical user; more often than not, a highly trained professional was required to create the models. These models also had the drawback of not being able to adapt to drifts in process inputs and other process conditions.

Copyright 2000 Instrument Society of America. All rights reserved.

This paper presents a novel approach that seeks to eliminate the drawbacks of earlier methods of neural network creation. The idea is to model a neural network as a function block (capable of execution in a controller) to which trained models can be downloaded. Models are trained in a user-friendly GUI application designed to simplify the process for the typical user. The salient features of the approach, such as automatic collection and archiving of input/output data, a two-stage algorithm for identifying the key input variables influencing the output, automatic training with suitable values for key network parameters, automatic correction of errors and delays encountered during online use, and model modification, are explained below.

SOFT SENSORS WITH NEURAL NETWORK MODELS

Soft sensors find widespread use:
1. in situations where the mounting and use of a physical sensor is not practically feasible,
2. as inexpensive on-line predictors of lab test results,
3. as cross-checking sensors for their online physical counterparts,
4. in applications where time is at a premium, demanding immediate changes in process conditions, and
5. as an 'if-then' analyzer, when the soft sensor is used to predict values into the future.

Typical examples of soft sensor usage in the process control industry include predicting boiler emissions such as NOx, CO and particulates, and quality parameters such as the asphalt pen index, batch end time, viscosity, polymer melt index and headbox pH. In addition to determining values of process variables, soft sensors can also provide a non-linear model for control and identify the key input variables that influence the process output.

Soft sensors are neural networks: highly parallel, computationally intense, distributed information processing structures with a non-linear transfer function. While there are many architectures for designing neural networks, a simple neural network can be envisioned as an input layer of nodes connected by weights to a hidden layer of nodes, which in turn connects by weights to an output layer of nodes. Variations in the structure can be obtained by changing the number of nodes in the individual layers or by using multiple hidden layers. Typically the number of nodes in the input layer matches the number of inputs, and likewise for the output layer. The input and hidden layers each have a bias neuron with a constant source input of 1. The hidden layer nodes are modeled with a non-linear input-output transfer function.
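The layered structure just described, together with the gradient-descent form of error back-propagation that the paper contrasts with conjugate-gradient training, can be sketched as follows. This is a minimal illustration, not the product's implementation; the layer sizes, the tanh transfer function and the learning rate are assumptions:

```python
import numpy as np

def forward(x, W1, b1, W2, b2):
    """Feed-forward pass: inputs -> hidden layer (non-linear) -> output layer."""
    h = np.tanh(W1 @ x + b1)            # hidden nodes with non-linear transfer function
    return (W2 @ h + b2), h

def backprop_step(x, target, W1, b1, W2, b2, lr=0.05):
    """One gradient-descent update: distribute the output error backwards
    to every weight (the simple back-propagation variant; the product
    described here uses conjugate gradients instead)."""
    y, h = forward(x, W1, b1, W2, b2)
    err = y - target                     # network error at the output layer
    dW2 = np.outer(err, h); db2 = err
    dh  = (W2.T @ err) * (1.0 - h**2)    # error pushed back through the tanh derivative
    dW1 = np.outer(dh, x); db1 = dh
    return W1 - lr * dW1, b1 - lr * db1, W2 - lr * dW2, b2 - lr * db2

rng = np.random.default_rng(0)
W1, b1 = rng.standard_normal((4, 3)), np.zeros(4)   # 3 inputs, 4 hidden nodes
W2, b2 = rng.standard_normal((1, 4)), np.zeros(1)   # 1 output
x, t = np.array([0.1, -0.2, 0.3]), np.array([0.5])
y0, _ = forward(x, W1, b1, W2, b2)
W1, b1, W2, b2 = backprop_step(x, t, W1, b1, W2, b2)
y1, _ = forward(x, W1, b1, W2, b2)      # output error shrinks after the update
```

Each update moves the output toward the target; the fixed learning rate `lr` is exactly the parameter the conjugate-gradient method adapts automatically.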
Fig. 1: A simple neural network with two hidden layers (inputs and bias neurons feeding hidden neurons with non-linear transfer functions, which feed the output layer)

The above is an example of a feed-forward network: when fully connected, the output of each neuron in a layer acts as an input to all of the neurons in the next layer, and no neuron sends outputs to neurons in its own layer or any previous layer. Once the network performs the feed-forward propagation operation using an initial set of weights, output values are calculated and compared to the target values. The difference between the calculated and target values is the error of the network. To correct the network output, the training program distributes the output error backwards, starting with the output layer, to each individual weight. This approach is called the error back-propagation method.

A simple implementation of error back-propagation uses the gradient descent method, which always moves toward reducing the training error with a properly assigned learning rate or step size. However, this approach suffers from a number of problems, such as slow convergence and a fixed learning rate. Instead of plain gradient descent we use a modified algorithm, the conjugate gradient descent algorithm, which combines current gradient information with the gradient from the previous iteration to change the weights. The resulting conjugate gradient training effectively adapts the learning rate to the best possible value, which results in much faster training. Another advantage of this algorithm is that the user does not need to specify the learning rate.

MODES OF OPERATION

As stated earlier, neural networks are structured as function blocks capable of execution in a controller.
Once an engineer decides to apply a neural network for process output determination, he/she first creates a module containing a NN function block and 'soft' wires the inputs and outputs of the function block to the appropriate process variables. The function block has a second output, the predicted output variable: the output the block would produce if the inputs applied to it had no delays associated with them. The prediction horizon of the predicted output is the maximum delay associated with any of the inputs. These configured function blocks can be used in one of two possible configurations:

1. In cases where the process output is available only from lab analysis, the NN block is used in conjunction with a Manual Lab Entry (MLE) block. The MLE block takes the lab analysis value and sample time manually entered by the lab technician, ensures that the value is within a valid range, and provides it as input to the NN block.

Fig. 2: Example of NN with manual lab entry

2. In situations where an NN is used as a cross-validator for an unreliable analyzer, the NN output can be used as a backup to the analyzer for monitoring or control purposes. The output of the continuous analyzer is fed to the function block as shown in the block configuration below.

Fig. 3: Example of NN used with a continuous analyzer

STEPS IN GENERATING A NN MODEL

This section describes the steps a user goes through to generate, download and execute a neural network model in a completely automated fashion. Fig. 4 compares the traditional approach to generating a neural network model with the automated single-button approach, in which the whole process starts with one button push and creates a neural network model for the selected training data.
Our approach: push the Generate button, and the software performs automatic pre-processing, automatic design of inputs and delays, and automatic training; the model is then downloaded for execution with automatic error correction.

Traditional approach: obtain data from a lab data management system; format it into input and output files; graph the input and output data; select a strategy for missing data; view outliers; run pre-processing; guess the inclusion/exclusion of inputs; guess delays; calculate sensitivities; design the number of hidden neurons; train the network; work out how to export the model; and port it to the controller.

Fig. 4: Comparison of our approach and the traditional approach (3x faster than the traditional approach)

AUTOMATIC PRE-PROCESSING

This step involves collecting data and pre-processing it for outliers. The collection of real-time data for potential inputs and outputs is facilitated by trending the variables on a graph with a changeable scale. Input and output data can also be obtained from a data archive. The user can move selectable bars that appear on the graph to select the region of operation. Upon selecting a time slice for training, he/she can also deselect portions within the time slice deemed unsuited for training. This option lets the user train a network with time slices scattered over a wide range of operation.

Neural networks perform at their best when they encounter values that fall within the range of the data used for training, so it is necessary to limit the range of training data to specified outlier limits. The software automatically screens the training data for outlier values using the outlier limits configured on the function block. If an input variable has no preconfigured outlier limits, the software sets the maximum outlier limit to the mean of the variable plus 3 times its standard deviation, and the minimum limit to the mean minus 3 times its standard deviation.
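The default outlier rule described above (mean plus or minus 3 standard deviations when no limits are pre-configured) can be sketched as follows; the function names and data are illustrative, not the product's API:

```python
import numpy as np

def default_outlier_limits(x, n_sigma=3.0):
    """Limits used when none are configured on the function block:
    mean plus/minus n_sigma standard deviations of the variable."""
    m, s = float(np.mean(x)), float(np.std(x))
    return m - n_sigma * s, m + n_sigma * s

def screen_outliers(x, lo, hi):
    """Keep only the training samples that fall within the outlier limits."""
    return x[(x >= lo) & (x <= hi)]

rng = np.random.default_rng(1)
data = np.append(rng.normal(50.0, 2.0, 1000), 500.0)  # one gross outlier
lo, hi = default_outlier_limits(data)
clean = screen_outliers(data, lo, hi)                  # the 500.0 sample is dropped
```

Note that a single gross outlier inflates both the mean and the standard deviation, yet still falls well outside the 3-sigma band and is screened out.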
Fig. 5: Data selection and pre-processing screen

AUTOMATIC DESIGN OF INPUTS AND DELAYS

This stage selects the inputs and their delays. It might not be best to use all of the input variables configured on the function block; instead, it may be better to use some input variables with different time delays, because the output can depend on an input over time. The software provides a two-stage technique to determine the appropriate inputs and delays automatically.

It first calculates the correlation coefficients of each input to the output over a fixed window of candidate delays. This reveals any dependence of the output on the input over time: peaks in these values are strong indicators of a dependence of the output on the input at that particular time delay. The software parses these coefficients to detect peaks, determines the delays to be used in the next stage, and decides whether the input should be used at all. If this first stage selects only one input, the software starts an iterative process of adding inputs with zero delay one by one.

In the second stage, the sensitivity of the output to each input and its delay is computed. This sensitivity determination provides a picture of the overall contribution of each input and delay to the output. If the sensitivity analysis returns values below certain limits, the software iterates until it finds inputs with acceptable sensitivity values; if it is unsuccessful, it randomly chooses an input with zero delay. While the first stage examines the dependence of the output on each input one at a time, the sensitivity stage considers the dependence on all inputs taken at once. Thus this stage provides an accurate list of the inputs to be used along with their associated delays.
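A sketch of the first stage — correlating the output with each candidate input delay over a fixed window and reading off the peak — using assumed synthetic data rather than real process trends:

```python
import numpy as np

def delay_correlations(u, y, max_delay):
    """Correlation coefficient between output y[t] and input u[t-d]
    for each candidate delay d in a fixed window (first-stage screen)."""
    c = []
    for d in range(max_delay + 1):
        n = len(u) - d
        c.append(np.corrcoef(u[:n], y[d:d + n])[0, 1])
    return np.array(c)

# synthetic process: the output tracks the input delayed by 5 samples
rng = np.random.default_rng(0)
u = rng.standard_normal(500)
y = np.zeros(500)
y[5:] = 2.0 * u[:-5]

c = delay_correlations(u, y, max_delay=10)
best_delay = int(np.argmax(np.abs(c)))   # the peak marks the dominant delay
```

For this synthetic series the correlation is essentially zero at every delay except 5, where it peaks, so the parser would select that input with a 5-sample delay.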
Fig. 6: Sensitivity analysis diagrams for the automatic design of inputs and delays

AUTOMATIC TRAINING

Once the inputs and delays are determined, the next step is to choose a suitable neural network architecture for training. The number of hidden neurons is determined by training the network with an increasing number of hidden neurons; the program checks for a minimum 10% reduction in the training error for each additional node and stops iterating when that condition is no longer met. With this number of hidden layer nodes, the software proceeds to the training that creates the final neural network model.

The training data is first randomly split into training samples and testing samples to guard against over-training. The goal of training is for the neural network model to learn to predict, not to memorize the data presented; many repetitions of running the training data through the network without testing are not adequate. The software therefore uses a cross-validation (training and testing) scheme to find the least test error, without over-training or under-training.

Fig. 7: Automatic training

AUTOMATIC CORRECTION OF ERRORS

A property of a process output stream predicted from a neural network and measured upstream conditions may be automatically corrected for error introduced by unmeasured disturbances and measurement drift. The correction factor may be calculated from a continuous or sampled measurement of the stream, provided by an analyzer or by lab analysis of a grab sample. Two approaches may be used to calculate the correction factor applied to the NN prediction; both are based on the predicted error, the time-coincident difference between the uncorrected predicted value and the corresponding measurement value.
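One concrete form of the bias-style correction, as a minimal sketch; the filter constant and correction limit below are illustrative values, not the product's defaults:

```python
def update_bias_correction(bias, measurement, prediction, alpha=0.05, limit=5.0):
    """Update the filtered bias correction from one time-coincident
    measurement/prediction pair.  The raw predicted error is heavily
    filtered and clamped; an at-limit flag is returned so the condition
    can be indicated to the operator."""
    raw = measurement - prediction            # predicted error
    bias += alpha * (raw - bias)              # heavy first-order filter
    bias = max(-limit, min(limit, bias))      # limit the correction factor
    return bias, abs(bias) >= limit

# a persistent +2.0 offset between measurement and uncorrected prediction
bias = 0.0
for _ in range(200):
    bias, at_limit = update_bias_correction(bias, measurement=62.0, prediction=60.0)
corrected = 60.0 + bias                       # corrected prediction approaches 62.0
```

Between output measurements the last filtered `bias` would simply be held, as the text below describes.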
Depending on the source of the error, a bias or a gain change in the predicted value may be appropriate. To avoid making corrections based on noise or short-term variations in the process, the calculated correction factor should be limited and heavily filtered, e.g. with a filter time equal to 2x the response horizon for a change in a process input. During times when a new process output measurement is not available, the last filtered correction factor is maintained. An indication should be provided if the correction factor is at its limit value. A configurable filter is also provided on the corrected prediction value, to allow the customer to filter noise in the input measurements.

OUTPUT PREDICTION

A feature of this product available nowhere else is the ability to use the trained neural network to predict the output into the future. Whenever a neural network function block is configured, it always provides a 'Predicted Output' along with the normal output. The ability to predict outputs into the future allows the user to perform 'what-if' analysis on the input variables and make corrections immediately if the result is not what he wants. This is of great use in situations where the process delay time is very long and an error would cost a lot of time, effort and money.

While the normal output is determined with all input delays included, the predicted output is calculated with the assumption that all the delays associated with the inputs are zero. In other words, the predicted output has a prediction horizon equal to the maximum delay associated with the inputs.

FEATURES FOR THE ADVANCED USER

Though the software is basically designed with the typical user in mind, it also caters to the needs of advanced users who have the know-how to deal with neural network models. There is also a small chance that automatic model generation might fail.
This can be attributed to reasons such as bad data, high correlation between the inputs, and/or slow error convergence. In these cases the user is advised to work with the software in the 'Advanced User' mode, which gives a lot of flexibility in achieving the desired results. Some of the parameters the user can tweak to obtain models of his choice are:

1. The maximum and minimum outlier limits, thereby opting for manual pre-processing.
2. Viewing the time slices being excluded because of outliers.
3. Viewing correlation coefficients and changing the input/delay selection that the software comes up with.
4. Manually obtaining sensitivities by tweaking the inputs and delays.
5. Viewing graphs of both the correlation values and the sensitivity values.
6. The maximum number of training epochs.
7. A bar chart of the train and test errors over a range of hidden layer neurons.
8. Manual entry of the number of nodes in the hidden layer.
9. The training error limit the software uses as the stopping criterion for training.
10. Overriding the random selection of weights for training by feeding in initial weights.
11. The percentage split of raw data into training and testing data.
12. A random or sequential approach to the train/test data split.
13. The percentage improvement that justifies using an extra hidden node.
14. Verifying the model and viewing graphs of verification results.
15. Re-training the network with different parameters.

EXAMPLE

This section details an example of a thermo-chemical process, the continuous digester. The neural network in this example calculates the Kappa number of the outlet stream. On-line measurements of Kappa are highly unreliable and inaccurate, and off-line feedback analysis usually takes an hour or two. It should also be noted that the time delay of the process itself is about 4 hours. In such a process, the ability to look into the future is a handy tool for making corrections immediately.
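For a process like this, the predicted output described under Output Prediction amounts to evaluating the same model twice: once with each input at its identified delay (normal output) and once with zero delays (predicted output, whose horizon is the largest delay). The model, input names and delays below are illustrative stand-ins, not the digester model:

```python
import numpy as np

def model(inputs):
    """Stand-in for a trained NN model (illustrative only)."""
    return float(np.tanh(0.1 * sum(inputs)))

def block_outputs(history, delays):
    """history[name]: samples, newest last; delays[name]: delay in samples.
    The normal output applies each input at its identified delay; the
    predicted output uses zero delays, so its horizon is the maximum delay."""
    normal    = model([history[k][-1 - d] for k, d in delays.items()])
    predicted = model([history[k][-1] for k in delays])
    horizon   = max(delays.values())
    return normal, predicted, horizon

# hypothetical rising trends for two inputs with 4- and 2-sample delays
hist = {"chip_feed": list(range(10)), "steam_flow": list(range(10, 20))}
normal, predicted, horizon = block_outputs(hist, {"chip_feed": 4, "steam_flow": 2})
```

With rising inputs the zero-delay evaluation sees the newer, larger values, so the predicted output leads the normal output by the 4-sample horizon.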
Fig. 8: Continuous digester process, showing the measurements used in constructing the NN and the Kappa prediction for the outlet stream

The figure shows the continuous digester process in diagrammatic detail. Potentially this system has more than 40 inputs, but by using the application's automatic design of inputs, the number of inputs and their delays can be reduced to eight. Typically the whole process can be collapsed into a model like the one shown in Fig. 2.

CONCLUSION

The idea and feasibility of using soft sensors created with neural networks was presented, with a detailed look at the structure and working of neural networks. A comparison of traditional approaches with the one presented in this paper points to the overwhelming advantage of the new approach in terms of ease of use and functionality. The novel method of generating a neural network model was explained as a sequence of steps that can be either automated or run in a manual mode. To summarize, this paper can be seen as the forerunner of many more innovations to follow in the field of automatic creation of soft sensors.