
Automatic Creation of Neural Nets for
use in Process Control Applications
Sai Ganesamoorthi
Entry level Software Eng.
Fisher-Rosemount Systems
Austin, TX 78759
Jay Colclazier
Product Manager
Fisher-Rosemount Systems
Austin, TX 78759
Vasiliki Tzovla
Senior Software Eng.
Fisher-Rosemount Systems
Austin, TX 78759
KEYWORDS
Neural Networks, Pre-Processing, Training, Delayed Inputs, Software, function blocks, control process
ABSTRACT
Neural Networks are often used where a soft sensor is required to cross-validate the performance of a
physical sensor or when the output must be determined by a series of lab analyses. Traditional
approaches involve the use of a stand-alone application with little or no graphical help or development
tools, often confusing the typical user.
This paper describes an approach to structure the Neural Network as a function block and automate the
complete process of collecting data, pre-processing, training, verifying the model and downloading the
generated model to a real-time controller, with a GUI application. In simple terms, the application provides a 'single-button' approach that works as part of a structured system,
enabling users with no rigorous knowledge of Neural Networks to configure and run a Neural Network
strategy for the process in question. The paper also details features of convenience and novelty such as
automatic collection of data, a two stage pre-processing algorithm for identification of process input
delays, automatic design of hidden nodes, automatic correction of errors/delays encountered in verifying
samples by lab tests for online use, and options for fine tuning the resultant models.
INTRODUCTION
An intelligent sensor, more popularly known as a ‘Soft Sensor’, is based on the use of software
techniques to determine the value of a process variable in contrast to a physical sensor that directly
measures the process variable’s value. These sensors open a whole new world of possibilities and
options that help to circumvent issues such as maintenance, cost and online use of physical sensors. A
soft sensor is a highly trained neural network that processes inputs and produces current values of the
process variable online, and it can also be used to predict future values.
Classical approaches to the problem of designing and training a neural network for the actual process
conditions involved a sequence of complex steps that was often a grueling experience for the typical
user. More often than not, a highly trained professional was required to create models. These models also
had the drawback of not being able to adapt continuously to drifts in process inputs and other process
conditions.
Copyright 2000 Instrument Society of America. All rights reserved
This paper presents a novel approach that seeks to eliminate the drawbacks of earlier methods of neural
network creation. The approach models a neural network as a function block (capable of execution in a
controller) to which trained models can be downloaded. Models are trained in a highly user-friendly GUI
application designed with the intent of simplifying the task for the typical user.
Salient features of the entire approach, such as automatic collection and archiving of input/output data, a
two-stage algorithm for identifying the key input variables influencing the output, automatic training with
suitable values for key network parameters, automatic correction of errors and delays encountered
during online use, and options for modifying models, are explained in the sections that follow.
SOFT SENSORS WITH NEURAL NETWORK MODELS
Soft sensors find widespread use:
1. in situations where mounting and using a physical sensor is not practically feasible,
2. as inexpensive on-line predictors of lab test results,
3. as cross-checking sensors for their online physical counterparts,
4. in applications where time is at a premium, demanding immediate changes in process conditions, and
5. as a 'what-if' analyzer when the soft sensor is used to predict values into the future.
Typical examples of soft sensor usage in the process control industry include predicting boiler emissions
such as NOx, CO and particulates, and quality parameters such as the asphalt pen index, batch end time,
viscosity, polymer melt index and headbox pH. In addition to determining values of process
variables, soft sensors can also be used to provide a non-linear model for control and to identify the key
input variables that influence the process output. At their core, soft sensors are Neural Networks: highly
parallel, computationally intense, distributed information processing structures with a non-linear
transfer function.
While there are plenty of architectures for designing Neural Networks to serve a common purpose, a
simple neural network can be envisioned as an input layer of nodes connected with weights to a
hidden layer of nodes, which in turn connects with weights to an output layer of nodes. Variations in the
structure can be obtained by changing the number of nodes in the individual layers or by using multiple
layers of nodes in the hidden layer. Typically the number of nodes in the input layer matches the number of
inputs, and the number of nodes in the output layer matches the number of outputs. The input and the
hidden layer each have a bias neuron with a constant source input of 1. The hidden layer nodes are
modeled with a non-linear input-output transfer function.
[Figure: a feed-forward network with an input layer plus bias neuron, hidden layers of neurons with non-linear transfer functions plus bias neuron, and an output layer, fully connected by weights from inputs to outputs.]
Fig 1: A simple Neural Network with two Hidden Layers
The above is an example of a feed-forward network: when fully connected, the output of each
neuron in a layer acts as an input to all of the neurons in the next layer, and no neuron sends outputs to
neurons in its own layer or any previous layer. Once the network performs the feed-forward
propagation operation using an initial set of weights, output values are calculated and compared to the
target values. The difference between the calculated values and the target values is the error of the
network. To correct the network output, the training program distributes the output error in a backward
sense, starting with the output layer, to each individual weight. This approach is called the error back-propagation method.
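As a concrete illustration, the feed-forward pass and the backward distribution of the output error can be sketched as below. This is a minimal Python sketch, not the product's implementation; the layer sizes, toy data and learning rate are assumptions.

```python
# Minimal sketch: one-hidden-layer feed-forward network trained with
# plain error back-propagation (gradient descent). Illustrative only.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Toy data: learn y = mean of two inputs (an assumed example target).
X = rng.uniform(0, 1, size=(200, 2))
y = X.mean(axis=1, keepdims=True)

# Weight matrices include a bias row, matching the bias neurons in the text.
W1 = rng.normal(0, 0.5, size=(3, 4))   # 2 inputs + bias -> 4 hidden nodes
W2 = rng.normal(0, 0.5, size=(5, 1))   # 4 hidden + bias -> 1 output

lr = 0.5                               # fixed learning rate (step size)
Xb = np.hstack([X, np.ones((len(X), 1))])   # constant bias input of 1
for epoch in range(3000):
    H = sigmoid(Xb @ W1)                    # hidden layer, non-linear transfer
    Hb = np.hstack([H, np.ones((len(H), 1))])
    out = Hb @ W2                           # output layer
    err = out - y                           # network error vs. target values
    # distribute the output error backwards to each individual weight
    dW2 = Hb.T @ err / len(X)
    dH = (err @ W2[:-1].T) * H * (1.0 - H)  # chain rule through the sigmoid
    dW1 = Xb.T @ dH / len(X)
    W1 -= lr * dW1
    W2 -= lr * dW2

mse = float(np.mean(err ** 2))
```

With the fixed learning rate the training error falls only gradually, which is exactly the slow-convergence drawback discussed next.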
A simple implementation of the error back-propagation network uses the gradient descent method,
which always moves toward reducing the training error with a properly assigned learning rate or step
size. However, this approach suffers from a number of problems such as slow convergence and a fixed
learning rate. Instead of using the gradient descent method we use a modified algorithm, the
conjugate gradient algorithm. The conjugate gradient method essentially combines current
information about the gradient with the gradient from the previous iteration to change the weights.
The resulting conjugate gradient training in effect adapts the learning rate to the best possible value,
which results in much faster training. Another advantage of this algorithm is that the user does not need
to worry about specifying the learning rate.
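A minimal way to see the difference is to hand the same kind of network to a general-purpose conjugate gradient optimizer, which picks its own step size. SciPy's "CG" method stands in here for the paper's algorithm; the network size and toy data are assumptions, not the product's code.

```python
# Sketch: training a one-hidden-layer network with a conjugate gradient
# optimizer instead of fixed-step gradient descent. No learning rate is
# specified anywhere, as the text notes.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
X = rng.uniform(0, 1, size=(200, 2))
y = X.mean(axis=1)                     # assumed toy target

n_hidden = 4

def unpack(w):
    """Split the flat weight vector into the two layer matrices."""
    W1 = w[:3 * n_hidden].reshape(3, n_hidden)        # 2 inputs + bias
    W2 = w[3 * n_hidden:].reshape(n_hidden + 1, 1)    # hidden + bias
    return W1, W2

def loss(w):
    """Mean squared training error of the network for weights w."""
    W1, W2 = unpack(w)
    Xb = np.hstack([X, np.ones((len(X), 1))])
    H = np.tanh(Xb @ W1)               # non-linear hidden transfer
    Hb = np.hstack([H, np.ones((len(H), 1))])
    out = (Hb @ W2).ravel()
    return float(np.mean((out - y) ** 2))

w0 = rng.normal(0, 0.5, size=3 * n_hidden + (n_hidden + 1))
res = minimize(loss, w0, method="CG")  # step size adapted internally
```

The optimizer combines the current gradient with the previous one to choose each search direction, so convergence is much faster than the fixed-rate loop above.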
MODES OF OPERATION
As stated earlier, Neural Networks are structured as function blocks capable of execution in a controller.
Once an engineer decides to apply a NN for process output determination, he/she first creates a module
containing a NN function block and 'soft' wires the inputs and outputs of the function block to
appropriate process variables. The function block has a second output, the predicted output variable.
This is the output the block would produce if the inputs applied to it had no delays associated with them. The
prediction horizon of the predicted output is the maximum delay associated with any of the inputs.
These configured function blocks can be used in one of two possible configurations:
1. In cases where the process output is available only from lab analysis, the NN block is used in
conjunction with a Manual Lab Entry (MLE) block. The MLE block takes the lab analysis value and the
sample time manually entered by the lab technician, ensures that the value is within a valid
range, and provides it as input to the NN block.
Fig. 2: Example of NN with manual lab entry
2. In situations where an NN is used as a cross-validator for an unreliable analyzer, the NN output can
be used as a backup to the analyzer for monitoring or control purposes. The output of the continuous
analyzer is fed to the function block as shown in the block configuration below.
Fig. 3: Example of NN used with a continuous analyzer
STEPS IN GENERATING A NN MODEL
This section describes the various steps a user goes through to generate, download and execute a Neural
Network Model in a completely automated fashion. Fig. 4 compares the traditional approach to
generating a Neural Network model with the automated single-button approach, in which one button
press starts the whole process and creates a Neural Network model for the selected training data.
[Figure: flowchart comparing the two workflows.]
Our approach: push the Generate button; automatic pre-processing; automatic design of inputs and delays; automatic training; download; execution; automatic error correction.
Traditional approach: obtain data from a lab data management system; format it into input and output files; graph input and output data; select a strategy for missing data; view outliers; run pre-processing; guess inclusion/exclusion of inputs; guess delays; calculate sensitivities; design the number of hidden neurons; train the network; work on ways to export the model; port it to the controller.
Fig. 4: Comparison of our approach and the traditional approach: 3x faster than the traditional approach
AUTOMATIC PRE-PROCESSING
This step involves collection of data and pre-processing the data for outliers. The collection of real time
data for potential inputs and outputs is facilitated by trending the variables on a scale changeable graph.
Input and output data can also be obtained from a data archive. The user can move selectable bars that
appear on the graph to select the region of operation. Upon selecting a time slice for training, the user
can also deselect portions within the time slice deemed unsuited for training. This option lets
the user train a network with time slices scattered over a wide range of operation.
Neural Networks perform at their best when they encounter values that fall within the
range of the data that was used for training, so it is necessary to limit the range of training data to
specified outlier limits. The software automatically screens the training data for outlier values by using
the outlier limits configured on the function block. If an input variable does not have pre-configured
outlier limits, the software determines the maximum outlier to be the mean of the variable plus 3
times its standard deviation, and the minimum outlier to be the mean minus 3 times its standard deviation.
Fig. 5: Data Selection and Pre-Processing screen
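The default outlier rule described above can be sketched as follows. The function names are illustrative, not taken from the product.

```python
# Sketch of automatic pre-processing: when no outlier limits are
# pre-configured for a variable, use mean +/- 3 standard deviations.
import numpy as np

def outlier_limits(x, n_sigma=3.0):
    """Return (low, high) screening limits for one input variable."""
    mu, sigma = float(np.mean(x)), float(np.std(x))
    return mu - n_sigma * sigma, mu + n_sigma * sigma

def screen(x, limits):
    """Keep only the training samples that fall within the outlier limits."""
    low, high = limits
    return x[(x >= low) & (x <= high)]
```

A sample far outside the bulk of the data inflates the standard deviation but still falls outside the 3-sigma band, so it is excluded from training.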
AUTOMATIC DESIGN OF INPUTS AND DELAYS
This stage consists of selecting inputs and delays. It might not be best to use all of the input variables
configured on the function block; instead, some input variables may be better used with different time
delays, because the output can depend on an input over time. The Neural software provides a two-stage
technique to determine the appropriate inputs and delays automatically. It first calculates the
correlation coefficients of each input to the output over a fixed window of delays. This reveals the
dependence, if there is any, of the output on the input over time. Peaks in these values are definite
indicators of strong dependence of the output on the input at that particular time delay. The software
parses these numbers to detect peaks and determines the delays to be used in the next stage. It also
determines whether the input should be used at all.
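The first stage can be sketched as below: correlate the output with each candidate input over a window of delays and look for a peak. The window size and peak threshold are illustrative assumptions, not the product's settings.

```python
# Sketch of stage one: correlation of the output with one input at each
# candidate delay, plus simple peak detection.
import numpy as np

def delayed_correlations(u, y, max_delay=20):
    """Correlation of output y with input u delayed by 0..max_delay samples."""
    corrs = []
    for d in range(max_delay + 1):
        ud = u[:len(u) - d]            # input shifted back by d samples
        yd = y[d:]                     # time-aligned output
        corrs.append(float(np.corrcoef(ud, yd)[0, 1]))
    return np.array(corrs)

def best_delay(corrs, threshold=0.5):
    """Return the peak delay, or None if the input shows no strong dependence."""
    d = int(np.argmax(np.abs(corrs)))
    return d if abs(corrs[d]) >= threshold else None
```

An input whose correlation never crosses the threshold is dropped; one with a clear peak is carried forward at the peak delay.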
If the previous stage determines that only one input should be used, the software
starts an iterative process of adding inputs with zero delay one by one. Each time, the sensitivities of
each input and its delay to the output are computed. This sensitivity determination stage provides
a picture of the overall participation of each input and delay in the output. If the sensitivity analysis
returns values below certain limits, the software iterates until it finds an input with acceptable sensitivity
values; if it is unsuccessful, it randomly chooses an input with zero delay. While the first stage examines
the dependence of each input on the output one at a time, the sensitivity determination stage presents the
dependence of the output on all inputs taken at once. Thus this stage provides an accurate list of inputs to be
used along with their associated delays.
Fig. 6: Sensitivity analysis diagrams for automatic design Inputs and Delays
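The sensitivity computation of the second stage can be approximated by perturbing one input at a time and measuring the change in the model output. This is a hedged sketch with the model treated as an arbitrary callable; the perturbation size is an assumption.

```python
# Sketch of stage two: estimate each input's sensitivity with all inputs
# in the model at once, by finite-difference perturbation.
import numpy as np

def sensitivities(model, X, eps=1e-3):
    """Average absolute output change per unit change of each input column."""
    base = model(X)
    sens = []
    for j in range(X.shape[1]):
        Xp = X.copy()
        Xp[:, j] += eps                # perturb one input (or delayed input)
        sens.append(float(np.mean(np.abs(model(Xp) - base)) / eps))
    return np.array(sens)
```

Inputs whose sensitivity falls below a chosen limit are the ones the text describes as being iterated over or dropped.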
AUTOMATIC TRAINING
Once the inputs and delays are determined, the next step is to use a suitable Neural Network architecture
for training. The number of hidden neurons is determined by training the network with an
increasing number of hidden neurons. As the program proceeds it checks whether each additional node
yields at least a 10% reduction in the training error, and it stops iterating when an additional node no
longer does. With this number of hidden layer nodes the software proceeds to the training
that actually creates the final neural net model.
The training data is first randomly split into training samples and testing samples to guard against
over-training. The goal of training is for the neural network model to learn to predict, not to memorize the
data presented; simply running the training data through the network many times without testing is not
adequate. Therefore, the Neural software uses a cross-validation (both training and testing) scheme to
find the least test error, without over-training or under-training.
Fig. 7: Automatic training
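The stopping rule for the hidden-node search can be sketched as below, with `train_error` standing in for actually training the network at each size. The 10% threshold matches the text; everything else is illustrative.

```python
# Sketch of automatic hidden-node design: add nodes while each extra node
# cuts the training error by at least 10%.

def pick_hidden_nodes(train_error, max_nodes=30, min_reduction=0.10):
    """Return the largest node count whose addition was still worthwhile."""
    best_n, best_err = 1, train_error(1)
    for n in range(2, max_nodes + 1):
        err = train_error(n)
        if err > best_err * (1.0 - min_reduction):
            break                      # under 10% improvement: stop iterating
        best_n, best_err = n, err
    return best_n
```

For example, with training errors of 1.0, 0.5, 0.3 and 0.29 for one through four nodes, the fourth node's improvement is under 10% and the search settles on three.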
AUTOMATIC CORRECTION OF ERRORS
A process output stream property predicted by a Neural Network from measured upstream
conditions may be automatically corrected for error introduced by unmeasured disturbances and
measurement drift. The correction factor may be calculated from a continuous or sampled
measurement of the stream provided by an analyzer or by lab analysis of a grab sample. Two
approaches may be used to calculate the correction factor applied to the NN prediction.
Both are based on calculating the prediction error as the time-coincident difference between the
uncorrected predicted value and the corresponding measurement value. Depending on the source of the
error, a bias or a gain change in the predicted value may be appropriate.
To avoid making corrections based on noise or short-term variations in the process, the calculated
correction factor should be limited and heavily filtered, e.g. with a filter time equal to 2x the response
horizon for a change in a process input. During times when a new process output measurement is not
available, the last filtered correction factor is maintained. An indication should be provided if the
correction factor is at its limit value. A configurable filter is also provided on the corrected prediction
value to allow a customer to filter noise in the input measurements.
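A bias-style correction with limiting and heavy filtering might look like the following sketch. The filter factor and limit values are illustrative assumptions, not product defaults.

```python
# Sketch of bias-style automatic error correction: clamp the time-coincident
# prediction error and fold it into a heavily filtered correction factor.

class BiasCorrector:
    def __init__(self, limit=5.0, filter_factor=0.05):
        self.limit = limit             # maximum magnitude of the correction
        self.alpha = filter_factor     # small alpha = heavy filtering
        self.correction = 0.0

    def update(self, predicted, measured):
        """Call only when a new process output measurement is available."""
        error = measured - predicted                     # time-coincident error
        error = max(-self.limit, min(self.limit, error)) # limit the correction
        self.correction += self.alpha * (error - self.correction)
        return self.correction

    def correct(self, predicted):
        """Between measurements the last filtered correction is maintained."""
        return predicted + self.correction
```

With a persistent offset between prediction and measurement, the filtered correction converges to that offset; a one-sample spike barely moves it.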
OUTPUT PREDICTION
A feature of this product available nowhere else is the ability to use the trained Neural Network to
predict the output into the future. Whenever a function block for a Neural Network is configured, it
always comes with a 'Predicted Output' along with the normal output. The ability to predict outputs into
the future allows the user to perform 'what-if' analysis on the input variables and make corrections
immediately if the result is not what is wanted. This is of great use in situations where the process delay
time is very long and an error would cost a lot of time, effort and money.
While the normal output is determined with all input delays included, the calculation for the predicted
output is done with the assumption that all the delays associated with the inputs are zero. In other words,
the predicted output has a prediction horizon equal to the maximum delay associated with the inputs.
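The relationship between the two outputs can be sketched with a stand-in linear model: the normal output reads each input at its identified delay, while the predicted output reads all inputs at the current time. The delays and weights below are illustrative assumptions, not a trained NN.

```python
# Sketch of the normal vs. predicted output calculation. The "model" is a
# simple weighted sum standing in for the trained neural network.
import numpy as np

delays = [0, 3, 7]                     # identified input delays, in samples
weights = np.array([0.5, 1.0, -0.2])   # stand-in for the trained model

def normal_output(history, t):
    """history[i] is the full time series of input i; evaluate at time t
    with each input applied at its identified delay."""
    x = np.array([history[i][t - d] for i, d in enumerate(delays)])
    return float(weights @ x)

def predicted_output(history, t):
    """Same model with all delays taken as zero: the value expected
    max(delays) samples into the future."""
    x = np.array([history[i][t] for i in range(len(delays))])
    return float(weights @ x)
```

Here the predicted output has a prediction horizon of 7 samples, the largest delay in the list.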
FEATURES FOR THE ADVANCED USER
Though the Neural software is designed primarily with the typical user in mind, it also caters to the
needs of advanced users who have the know-how to deal with Neural Network models. There is also a
small chance that automatic model generation might fail, for reasons such as bad data, high correlation
between the inputs and/or slow error convergence. In these cases the user is advised to work with the
software in the 'Advanced User' mode, which gives a lot of flexibility in achieving the desired results.
Some of the capabilities the user can use to obtain models of his choice are:
1. Change the maximum and minimum outlier limits, thereby opting for manual pre-processing.
2. View the time slices that are being excluded because of outliers.
3. View correlation coefficients and change the input/delay selection that the software comes up with.
4. Manually obtain sensitivities by tweaking the inputs and delays.
5. View graphs of both the correlation values and the sensitivity values.
6. Set the maximum number of training epochs.
7. Obtain a bar chart of the train and test errors over a range of hidden layer neurons.
8. Manually enter the number of nodes in the hidden layer.
9. Set the training error limit the software uses as the stop criterion for training.
10. Override random selection of weights for training by feeding in initial weights.
11. Set the percentage split of raw data into training and testing data.
12. Choose a random or sequential approach to the train/test data split.
13. Set the percentage improvement that makes an extra hidden node worth using.
14. Verify the model and view graphs of the verification results.
15. Re-train the network with different parameters.
EXAMPLE
The following section details an example with a thermo-chemical process, the continuous digester. The
Neural Network in this example is used to calculate Kappa for the outlet stream. On-line measurements
of Kappa are highly unreliable and inaccurate, and off-line feedback analysis usually takes an hour or
two. It should also be noted that the time delay of the process is about 4 hours. In such a process, the
ability to look into the future is a handy tool for making corrections immediately.
[Figure: continuous digester process diagram showing the temperature and flow measurements used in constructing the NN and the Kappa prediction for the outlet stream.]
Fig. 8: Continuous Digester Process
The figure above shows the continuous digester process in diagrammatic detail. Potentially this system
has more than 40 inputs, but by using the Neural application's automatic design of inputs, the number of
inputs and their delays can be reduced to eight. Typically the whole process can be collapsed into the
model shown in Fig. 2.
CONCLUSION
The idea and feasibility of using soft sensors created with Neural Networks was presented, with a
detailed insight into the structure and working of Neural Networks. A comparison of traditional
approaches with the one presented in this paper points to the overwhelming advantage of the new
approach in terms of ease of use and functionality. The novel method of generating a Neural Network
model was explained as a sequence of steps, which can be either automated or run in a manual mode. To
summarize, this paper can be seen as the forerunner of many more innovations to follow in the field
of automatic creation of soft sensors.