Automated Approach to Development and Online Operation of Intelligent Sensors

Vasiliki Tzovla and Ashish Mehta
Emerson Process Management
Vasiliki.tzovla@emersonprocess.com

KEYWORDS

Neural Networks, Soft Sensors, Prediction, Process Identification

ABSTRACT

In the recent past, neural networks have been used to create intelligent (or soft) sensors to predict key process variables that cannot be measured online, to pre-empt lab analysis delays, and to validate physical sensors. However, they have had limited application due to two significant drawbacks: first, traditional neural net development is a fairly complicated task requiring extensive third-party "expert" intervention; and second, due to their inherent nature, neural nets do not handle changes in process operation over time very well, often requiring retraining. In this paper we present an approach that addresses both problems. The neural net is represented as a function block in a structured system, and by using state-of-the-art techniques and intuitive GUI applications, the commissioning process becomes fast and easy. Automation of data collection and pre-processing, identification of relevant process inputs and their delays, network design, and training enables the process engineer to develop neural net strategies without the need for rigorous techniques. In online mode, the actual process variables (for example, those obtained as a result of lab analysis) are used as inputs to the neural net for automatic adaptation of its prediction in response to changes in the process. The paper details these and other features of convenience.

INTRODUCTION

An intelligent sensor, more popularly known as a 'soft sensor', is based on the use of software techniques to determine the value of a process variable, in contrast to a physical sensor that directly measures it.
These sensors open a whole new world of possibilities and options that help to circumvent issues such as maintenance, cost and online use of physical sensors. A soft sensor is a highly trained neural network based model that takes real process variables as inputs and predicts values of the process output variable online [1, 2, 3]. Classical approaches to designing and training a neural network for actual process conditions involved a sequence of complex steps that was often demanding for the typical plant engineer or operator: it usually required a highly trained professional to create models. These models also had the drawback of not being able to continually adapt to drifts in process inputs and other process conditions.

Figure 1: NN block predicting values otherwise obtained from A) Lab Analysis (left), and B) Sampled Analyzer (right)

Neural networks are essentially non-linear function approximators that automatically adjust parameters known as weights in an iterative procedure of learning the underlying function; this procedure is known as training. Training a soft sensor model consists of presenting information to the model, comparing the output to a target value, adjusting the network weights, and repeating until an acceptable output is achieved. What makes soft sensors useful is the ability of the underlying model to infer in real time a measurement that would otherwise be available only after significant delays from analyzers or lab tests. Urged by the industry's need for development tools, several soft sensor suppliers and control system manufacturers have expended considerable resources to create applications to support the process engineer and the plant operator. The ease with which a system may be implemented, commissioned and maintained, though, is influenced to a large extent by the user interface supplied by the manufacturer and its capability to integrate with the control system.
In many cases, the application and general acceptance of advanced control and monitoring tools within the process industry has been limited by ease-of-use issues. Commercial products have too often violated some of the most basic principles of good usability. As a result, typical process engineers and instrument technicians may have difficulty in adopting neural net based applications, while plant operators may be faced with increasingly complex user interfaces that provide minimal or no support. This paper explores how some commonly accepted practices in user interface design have been successfully applied in a neural network (NN) application. Examples show how this tool makes it easy and intuitive to configure and run an NN strategy, and how friendly and explanatory interfaces provide the plant operator with the support needed to increase the lifetime of the NN model. Simplicity, nonetheless, is achieved without any loss of functionality and sophistication. In particular, the needs of the expert user are addressed without sacrificing the ease with which a normal user may implement such advanced applications.

CONFIGURING THE NEURAL NET BLOCK

The implemented strategy uses what are commonly referred to as function blocks (in the Fieldbus control paradigm [4]), wherein each function block is a part (e.g., a subroutine) of an overall control system and operates in conjunction with other function blocks to implement control and monitoring loops within the process control system. Function blocks typically perform an input function (transmitter, sensor, etc.), a control and analysis function (PID, fuzzy logic, etc.), or an output function (e.g., a valve). It must, however, be noted that the control routines may take any form, including software, firmware, hardware, etc. Similarly, the control and analysis strategy could be designed and implemented using conventions other than function blocks, such as ladder logic, sequential function charts, etc.,
or using any desired programming language or paradigm. NN development is integrated as part of the control strategy, obviating the need for off-line mathematical data processing packages or for development of interfacing layers between the process control system and the NN application. The NN block is implemented as a drag-and-drop function block capable of execution in a controller of a scalable process control system [5]. As a soft sensor, the NN function block has one output: the process variable being predicted. The process variables that are expected to influence the output are 'soft' wired as block inputs from anywhere within the existing control system. Extensibility of the number of input parameters inherently handles the controller memory and performance requirements of widely varying multivariable processes. Properties such as the range and engineering units are automatically inherited from those of the referenced process variables. To create the NN model, a known set of input and output data must be presented for training purposes. Input values are obtained via the 'soft' wired referenced variables. Connecting the process variable to be predicted to the NN function block provides the output sample values. Later, this connection is used for online adaptation of the predicted value. Typically this takes one of the following configurations:

Lab Analysis: For cases where the output is available only through lab tests and measurements, the NN block is used in conjunction with a Lab Entry (LE) block as shown in Figure 1A. Since process lab analysis values are not available in real time, in addition to the sample value an additional parameter, shown as DELAY in the figure, is also provided to the NN block through the LE block. This parameter denotes the time delay between grabbing the sample and obtaining its measured value. When online, the NN block provides a real-time continuous output prediction, including for times in between lab grab sample results.
Continuous Measurement: When used along with analyzers for continuous measurement, cross-validation or backup, the analyzer output is fed to the NN as shown in Figure 1B, where an Analog Input (AI) block is used for the analyzer. In this case, the delay parameter of the NN block represents the response time of the analyzer measurement.

Once configured, the control module (the configured control/analysis strategy comprising the NN and interconnecting blocks such as the LE) is downloaded to the controlling device. In this manner, the NN becomes an integrated part of the existing control system with no additional requirements for third-party interfacing and integration layers. Furthermore, the configured process inputs, the output and the measurement delay parameter are automatically assigned to the control system's historian for archiving. The user-configurable sampling rate provides flexibility for handling processes with different response times and optimizes the memory requirements of the historian.

MODEL DEVELOPMENT

This section details the methodology and environment that allow the plant engineer to generate an NN model in an easy and intuitive fashion, with little or no understanding of its mathematical complexities.

Data Collection/Retrieval: To create an accurate NN soft sensor, it is imperative that the data used to generate the model be representative of the normal operation of the plant. Morrison [1] elucidates some of the key issues in data handling, such as inspection of data, removal of bad data, handling of missing data, defining data ranges and removing outliers (data values outside established control limits), and ensuring that there is sufficient variation in data over the desired region of operation.
Figure 2: Trend Display of NN block data (callouts: trended process variables, selected data region start and end markers, missing data, data marked for exclusion, and the outlier boundary for an input)

For the NN block, as noted above, data is already stored in the historian, and so available to any application that may connect to it. However, it is possible that some quality historical data is available in other external sources, and so retrieval of data from file (in standard file formats) is also provided. The application directly connects to the integrated control system, retrieves the data from the historian or other identified source and displays the process variables in a user-friendly graphical trend format, with features to scale, zoom, scroll and pan the trends for accurately defining the data to be selected. The data to be used for identification can be marked by dragging bars along the time scale of the trend, as well as by deselecting regions of data that are unsuitable for training. If data is found to be missing, rather than interpolating, the user is provided with a visual indication of data slices that should be excluded. Figure 2 graphically shows these features. The model created by a neural net is based entirely and only on the data it is trained on. That implies that the data should be limited to valid operating ranges. The application automatically applies statistical methods, such as the use of mean and standard deviation, to define the range of usable data. Users can also graphically mark the outlier limits as indicated in Figure 2. At the same time, expert users can view the results of pre-processing, including outlier limits, quantity of data analyzed, mean and standard deviation of the distribution, etc., and take appropriate action. Once the data to be used is marked out by the user, the application processes it, automatically taking into account the information provided on outliers, ranges, bad and missing data.
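The statistical screening described above can be sketched in a few lines. This is a minimal illustration of the idea, not the product's pre-processing code; the function names and the two-sigma multiple are assumptions for the example:

```python
# Sketch of statistical pre-processing: outlier limits derived from the
# mean and standard deviation, plus exclusion of missing (None) samples.
import statistics

def outlier_limits(values, n_sigma=2.0):
    """Return (low, high) control limits as mean +/- n_sigma * std dev."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return mean - n_sigma * std, mean + n_sigma * std

def select_usable(values, low, high):
    """Keep only samples that are present and inside the outlier limits."""
    return [v for v in values if v is not None and low <= v <= high]

data = [10.1, 9.8, 10.3, None, 10.0, 55.0, 9.9]   # None = missing, 55.0 = outlier
low, high = outlier_limits([v for v in data if v is not None])
clean = select_usable(data, low, high)
```

An expert user would inspect `low` and `high` (and the mean and standard deviation behind them) before accepting the automatically screened data, mirroring the review step described in the text.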
Another feature of convenience is automation of the time shifting of output values. As seen earlier, generally there is a time delay between the generation of the sample and the availability of its measured value. There being temporal relationships between process variables, the data needs to be realigned so that the values are time coincident. Conventionally, users have to manually shift the data based on time stamping, or use an average value for the time shift when individual sample information is not available. Instead, using the delay parameter, each output sample value is transparently shifted by its applicable delay to form the data set. The well-conditioned data after this vital pre-processing step is used in all subsequent operations.

Figure 3: Variation of correlation and sensitivity values with input delays for A) automatically calculated delay (left), and B) manually (by process expert) entered delay (right) (callouts: correlation at different delays; sensitivity at the delay with maximum correlation value; updated sensitivity at the specified delay)

Model Input Configuration: One of the key problems in NN based process prediction is the lack of accurate a priori information on the variables that have an influence on the output. The trained NN would be more accurate if only variables that affect the output are included. It is also well known that process outputs depend on input variables over time, and so inputs at certain delays rather than their instantaneous values should be used. In fact, it is the experience of the authors that the fundamental problem encountered by most intelligent sensor applications is the determination of time delays for inputs. Usually, this involves trial and error techniques with multiple passes of an algorithm that iteratively reduces the range of user-specified possible delays, thereby relying heavily on hard-to-acquire solid process knowledge.
In contrast, the approach presented here requires the user to provide only the approximate process response time, a value and concept that most plant engineers and operators are very comfortable with. The NN application then employs a two-stage technique to automatically determine the appropriate input delays and the relative importance of the configured inputs. Algorithmic details are provided in [6]. Briefly, the cross-correlation coefficients between the output and the time-shifted input are calculated for all input delays up to the response time. Peaks in these values are interpreted as indicators of strong dependence of the output on the input at that delay. Figure 3A shows such a cross-correlation plot for an input up to a delay horizon of 100 seconds, with the maximum occurring at 24 seconds, the identified delay for this input. In all subsequent processing, the data corresponding to this input is realigned in accordance with this delay. The application then computes the sensitivity of the output to the input at the determined delay. While the delay is calculated considering inputs individually, sensitivity analysis effectively compares the influence of the inputs on the output; in some sense it ranks the inputs in order of importance. It is not uncommon to find some variables that have significantly more or less influence than originally anticipated. Furthermore, usually only a few of the originally configured variables are key to the prediction of the output. Sensitivity values are expressed such that their sum total is one (1). The average sensitivity (mathematically, average sensitivity = 1/number of configured inputs) is a good indicator for comparison, and is used to eliminate inputs that are not significant. Figure 3A also plots a bar for the sensitivity value at the delay, and numerically compares it with the average.
A simple yet complete overview of the participation of all the inputs configured in the NN block is provided in Figure 4, which visually displays the individual as well as the average sensitivities. The length of each bar corresponds to the sensitivity of the particular input, and the dashed line represents the average sensitivity. Four inputs have been eliminated as irrelevant to the output prediction using this information. The delay and sensitivity information can also be viewed in tabular form. The detailed information for each input is navigable, in a Windows-like format, from the left pane of Figure 4.

Figure 4: Overall view of Configured Inputs Participation

Expert users can tweak the input delays and the inclusion/exclusion of inputs based on their knowledge of the process. Both the graphical and tabular (numerical) displays provide editing capability. The sensitivities are then recomputed with the updated information. This is illustrated in Figure 3B, where the identified delay is manually changed to 70 seconds. As expected, there is a reduction in the sensitivity value; in fact, it is smaller than the average in this case. In this manner the NN application determines the most suitable input-output configuration while allowing users to embed their process knowledge.

Training: Based on the delay and sensitivity analysis of the previous step, data is realigned to form the training set. At this stage the process variable values are still in engineering units. Some variables like temperature, for example, may have a range of a thousand degrees, while others like composition have a range of a few units. In a multivariable model, incremental changes from the two variables should be equalized so that learning is not biased by the engineering-unit (EU) range. Uniform scaling via normalization of input and output values equalizes the importance of inputs.
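The uniform scaling step above amounts to mapping each variable from its own engineering-unit range into a common interval. A minimal sketch, with illustrative ranges (the product inherits the actual EU ranges from the referenced variables):

```python
# Sketch of EU normalization: each variable is scaled by its own range
# so that a temperature spanning hundreds of degrees and a composition
# spanning a few units contribute comparably during training.
def normalize(values, low, high):
    """Scale engineering-unit values linearly into [0, 1]."""
    span = high - low
    return [(v - low) / span for v in values]

def denormalize(values, low, high):
    """Map normalized values back to engineering units."""
    span = high - low
    return [v * span + low for v in values]

temps = [300.0, 550.0, 800.0]   # EU range assumed 0..1000 degrees
comps = [1.0, 2.5, 4.0]         # EU range assumed 0..5 units
scaled_t = normalize(temps, 0.0, 1000.0)
scaled_c = normalize(comps, 0.0, 5.0)
```

After training, folding this scaling into the weights (de-normalization, described later in the paper) means the online block can work directly in engineering units.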
Conventional neural net training requires a deep understanding of the theory and behavior of these nonlinear engines, from determining how many hidden nodes to use, to establishing the convergence criteria that will stop training. Recognizing that this is knowledge that only NN experts have, not the plant engineer who has to use the NN, the training procedure has been greatly simplified while maintaining sound underlying core technology, as described in detail in [6].

Figure 5: Error vs. Epoch for automatic training of NN with incremental number of hidden nodes

A suitable NN architecture is determined by training with an incremental number of hidden nodes. Network weights are randomly initialized to small values, and for cross-validation purposes, data is automatically split into training and test sets. Training and test errors are computed at each pass of the data (epoch) through the network. Figure 5 shows the training progress, wherein for each hidden node count the best combination of error (test and train) is determined. Spikes in the error values correspond to re-initialization of the network when another hidden node is added. For clarity, the normalized values are converted back to engineering units and displayed, along with the minimum-error epoch number, in the adjoining table. The test set is used to prevent the NN from overfitting the training data, as the goal of the network is to learn to predict and not to memorize the training data. This dual approach establishes the most suitable network design by picking the optimal number of hidden neurons and the corresponding weights; this number is four in Figure 5, even though the NN was trained with 1-5 hidden nodes. Reference [6] also describes several enhancements that help realize a fast-learning, robust NN. It must be noted here that a single button click automates the last two steps of delay/sensitivity analysis and training for the normal user.
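The architecture search described above, train with an increasing number of hidden nodes, re-initializing each time, and keep the node count with the lowest test error, can be illustrated with a toy one-hidden-layer network. This is a simplified sketch of the idea only; the product's training engine and its enhancements are described in [6]:

```python
# Toy illustration of incremental-hidden-node training with a
# train/test split; not the product's training algorithm.
import math
import random

def train_net(data, hidden, epochs=500, lr=0.1):
    """Train y = sum_j w2[j]*tanh(w1[j]*x + b1[j]) + b2 by per-sample
    gradient descent on (x, y) pairs; return a predictor function."""
    rnd = random.Random(42 + hidden)      # fresh random weights per architecture
    w1 = [[rnd.uniform(-0.5, 0.5), rnd.uniform(-0.5, 0.5)] for _ in range(hidden)]
    w2 = [rnd.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = 0.0
    for _ in range(epochs):
        for x, y in data:
            h = [math.tanh(w * x + b) for w, b in w1]
            out = sum(v * hj for v, hj in zip(w2, h)) + b2
            err = out - y                  # gradient of 0.5*err^2 w.r.t. out
            for j, hj in enumerate(h):
                grad_h = err * w2[j] * (1.0 - hj * hj)   # tanh' = 1 - tanh^2
                w1[j][0] -= lr * grad_h * x
                w1[j][1] -= lr * grad_h
                w2[j] -= lr * err * hj
            b2 -= lr * err
    return lambda x: sum(v * math.tanh(w * x + b)
                         for v, (w, b) in zip(w2, w1)) + b2

def rmse(model, data):
    return math.sqrt(sum((model(x) - y) ** 2 for x, y in data) / len(data))

# Toy data: a smooth nonlinear curve, split alternately into train/test.
samples = [(i / 20.0, math.sin(3.0 * i / 20.0)) for i in range(21)]
train, test = samples[::2], samples[1::2]
results = {n: rmse(train_net(train, n), test) for n in range(1, 5)}
best = min(results, key=results.get)      # node count with lowest test error
```

Selecting `best` on the held-out test set, rather than the training set, is what guards against the overfitting the paper warns about.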
In numerous experiments, the automatically calculated parameters have proven to generate satisfactory models. As with sensitivity analysis, the knowledgeable expert has the flexibility to modify training parameters such as the minimum/maximum number of hidden neurons, the percentage of data split between test and train sets, random vs. sequential split of test and train data, etc. Once training is over, the weight parameters of the NN are de-normalized so that scaling of the process variables is not needed when the model is online. At this stage an intelligent sensor capable of predicting the process variable of interest is available. Since the model may not be satisfactory, it is not directly downloaded into the controlling device. Download has to be initiated by the operator, but is merely a click away, as the NN function block has been configured to run a downloaded model. Another desired functionality is the capability to compare and switch models generated over different data sets and/or with different parameters. Therefore, the process models for a particular NN block are automatically saved on creation, and remain in the database of the control system until they are explicitly deleted. Figure 4, for example, has three separate models for the same NN block, any of which may be put online. Pertinent details of the model can also be printed out for comparison/recording purposes.

Verification: Once a model is developed, it is validated by applying real input data to it and comparing its output with the actual process output. Even though a test set is used during training, verification should always be carried out if additional data is available. The validation/verification data set must be different from the training and test data sets but should represent the data region the NN is expected to operate on. Figure 6 shows an example of model validation results.

Figure 6: Verification A) Actual and Predicted vs. Sample (left), and B) Actual vs. Predicted (right) (callouts: Region I, Region II)
Graphical comparison as well as root mean square error information is available. If the root mean square error per scan is not satisfactory, the model should not be put online. The Actual vs. Predicted plot is at times very useful in determining the modes of the process operation. Clusters indicate regions of operation, so multiple clusters might imply variations in plant operation, such as those due to seasonal changes. From Figure 6B, it appears that the process has two regions of operation; the actual value plots in Figure 6A also corroborate this. For this example, it seems that the data for Region II is limited, which may warrant re-training with more data from that region. In some situations, separate models for the different modes would achieve better results.

ONLINE OPERATION

Once a suitable NN model is created, simply downloading it to the controller integrates it into the existing control strategy. The model becomes part of the NN block and operates with mode and status information similar to other blocks. In the online mode, real process variables act as inputs to the NN model, which generates a predicted output. This value may be used by an operator or in control to correct for changing conditions. Since the inputs are soft wired as references to the NN block, a clean and simple interface is maintained. The excluded inputs are not part of the model and online prediction, but they need not be removed, for possible later use in re-training. Maintaining inputs as references allows such flexibility while minimizing confusion about the inputs excluded/included in a particular model. The same NN block, therefore, may have models with different inputs being used (depending on changing process conditions), as long as they are originally configured as input references. The block also need not change if the input configuration of the model changes over time. The following features add to the lifetime and capability of the online NN model.
Automatic Correction: The process output stream, predicted using the neural network and measured upstream conditions, is automatically corrected for error introduced by unmeasured disturbances and measurement drift [7]. This correction factor is calculated based on a continuous measurement or sampled measurement of the stream provided by an analyzer or lab analysis of a grab sample, i.e., the configurations of Figure 1. Two approaches have been used to calculate the correction factor that must be applied to the NN prediction. Both are based on calculation of the prediction error using the time-coincident difference between the uncorrected predicted value and the corresponding measurement value. Depending on the source of error, a bias or gain (ratio) change in the predicted value is applied.

Figure 7: Automatic Correction mechanism (block diagram: externally referenced and scaled inputs feed the feedforward neural net model, whose output is corrected via delay, limit and filter elements; associated parameters include OUT_SCALE, CORR_ENABLE, MODE, CORR_LIM, SAMPLE, OUT, CORR_BIAS, DELAY, FUTURE, CORR_FILTER and FOLLOW)

To avoid making corrections based on noise or short-term variations in the process, the calculated correction factor is limited and heavily filtered, e.g., with a filter time equal to 2X the response horizon for a change in a process input. During those times when a new process output measurement is not available, the last filtered correction factor is maintained. An indication is provided if the correction factor is at the limit value. The correction can be turned on or off, and the filter and limit values are configurable online, providing added flexibility as process conditions change. Figure 7 shows how the automatic adaptation mechanism works in conjunction with the NN model prediction. This correction eliminates the need for re-training the NN in the case of drifts and unmeasured disturbances and greatly enhances the lifetime of a running model. Of course, if the process undergoes significant changes, a new model should be created.
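The bias variant of this correction can be sketched as follows. The parameter names are borrowed from Figure 7, but the code, including the choice to clamp the error before filtering, is an illustrative assumption rather than the product's implementation:

```python
# Sketch of bias-style automatic correction: the time-coincident
# prediction error updates a correction that is clamped to a limit and
# heavily filtered before being added to the raw NN prediction.
class BiasCorrector:
    def __init__(self, corr_lim=5.0, filter_factor=0.1):
        self.corr_lim = corr_lim            # CORR_LIM: max correction magnitude
        self.filter_factor = filter_factor  # heavy first-order filter gain (0..1]
        self.bias = 0.0                     # CORR_BIAS: current filtered correction
        self.at_limit = False               # indication that the limit is active

    def update(self, predicted_raw, measured):
        """Call when a new lab/analyzer value that is time coincident
        with a past raw prediction becomes available."""
        raw_error = measured - predicted_raw
        clamped = max(-self.corr_lim, min(self.corr_lim, raw_error))
        self.at_limit = clamped != raw_error
        self.bias += self.filter_factor * (clamped - self.bias)

    def correct(self, predicted_raw):
        """Apply the last filtered correction; used every scan, including
        scans where no new measurement has arrived."""
        return predicted_raw + self.bias

corr = BiasCorrector(corr_lim=2.0, filter_factor=0.5)
corr.update(predicted_raw=10.0, measured=11.0)   # error of 1.0, within limit
out = corr.correct(10.0)
```

Because `correct` simply reuses the last filtered bias, the prediction keeps being corrected between grab-sample results, matching the "last filtered correction factor is maintained" behavior above.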
Future Prediction: In typical NN applications, the process is characterized by large delay times. The plant operator needs to wait for the delay time to observe whether a change in an input achieves the desired effect on the output. In many cases, this may result in out-of-spec product and significant losses in terms of time, money and effort before corrective action can be taken. The NN block provides a FUTURE parameter (Figures 1 and 7) that gives the ability to predict outputs in the future. It is calculated by setting the delays associated with the inputs to zero, i.e., assuming the process is at steady state with the current set of input values, and predicting the neural net output. This ability to predict the output in the future allows the user to perform "what-if" analysis and make real-time corrections for input changes.

Range Handling: Unlike first-principles based parametric models, neural nets are approximators that attempt to form the best mapping for the I/O data seen during training. In essence, this implies that due to their nonlinear nature, they will not do a good job of prediction over regions not seen during training. In a real process plant environment, this can cause very poor results, as process variables tend to change over time. The training data may have been limited, so that certain regions of plant operation, for example seasonal changes, were not included. Again taking advantage of the coupling between the NN function block and the modeling application engine, the outlier limits that were either automatically calculated or defined by the process expert are invoked in the online mode. The user is informed if input ranges are being violated, while the inputs are clamped to the outlier limits and the predicted output is calculated. Heuristics based on the number of such range violations and the relative importance of inputs with values beyond the control limits determine whether the prediction is considered uncertain or bad.
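The range-handling behavior above can be sketched as clamping plus a status heuristic. The clamping mirrors the description in the text; the specific status rule and its threshold are assumptions for illustration, not the product's actual heuristics:

```python
# Sketch of online range handling: inputs are clamped to the outlier
# limits established during training, and a simple heuristic marks the
# prediction as uncertain or bad based on which inputs were violated.
def clamp_inputs(values, limits):
    """Clamp each input to its (low, high) training-range limits.
    Returns (clamped_values, indices of violated inputs)."""
    clamped, violations = [], []
    for i, (v, (low, high)) in enumerate(zip(values, limits)):
        if v < low or v > high:
            violations.append(i)
            v = min(max(v, low), high)
        clamped.append(v)
    return clamped, violations

def prediction_status(violations, sensitivities, bad_frac=0.5):
    """GOOD if no violations; BAD if the violating inputs carry most of
    the total sensitivity; UNCERTAIN otherwise (threshold is assumed)."""
    if not violations:
        return "GOOD"
    weight = sum(sensitivities[i] for i in violations)
    return "BAD" if weight >= bad_frac else "UNCERTAIN"

limits = [(0.0, 100.0), (0.0, 10.0), (300.0, 800.0)]
sens = [0.6, 0.3, 0.1]                  # sensitivities normalized to sum to one
vals, viol = clamp_inputs([120.0, 5.0, 500.0], limits)
status = prediction_status(viol, sens)
```

Weighting violations by sensitivity reflects the point in the text that an out-of-range value on an important input should degrade the status more than one on a marginal input.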
In normal operation, all values are within the training range and the predicted output has a good status. Operators can make use of this status information when applying the predicted output elsewhere in the system.

CONCLUSIONS

The design and implementation of a new and simplified approach to the commissioning and development of neural net based intelligent sensors has been presented. By embedding the NN function block in the process control system and automating the model development, the benefits of neural net techniques can be easily applied to a variety of processes without the overheads incurred in traditional implementations. An intuitive and user-friendly GUI minimizes the engineering and implementation effort, while maintaining the underlying NN technology. Several enhancements to the online operation of the model increase its lifetime and maintainability even in the presence of unmeasured disturbances and process drift. This approach is instrumental in the development of the next generation of easy-to-use, integrated NN applications implemented in a scalable process control system [5].

REFERENCES

[1] Morrison, S. and Qin, J., "Neural Networks for Process Prediction", Proceedings of the 50th Annual ISA Conference, pp. 443-450, New Orleans, 1994.

[2] Ganesamoorthi, S., Colclazier, J., and Tzovla, V., "Automatic Creation of Neural Nets for Use in Process Control Applications", Proceedings of ISA Expo/2000, 21-24 August 2000, New Orleans, LA.

[3] Fisher-Rosemount Systems, "Installing and Using the Intelligent Sensor Toolkit", User Manual for the ISTK on PROVOX Instrumentation, 1997.

[4] US Patent Application, Fisher File No. 59-11211, Blevins, T., Wojsznis, W., Tzovla, V. and Thiele, D., "Integrated Advanced Control Blocks in Process Control Systems".

[5] DeltaV Home Page: http://www.easydeltav.com

[6] Mehta, A., Ganesamoorthi, S. and Wojsznis, W., "Identification of Process Dynamics and Prediction using Feedforward Neural Networks", submitted for ISA Expo/2001, 10-13 September 2001, Houston, TX.

[7] US Patent Application, Fisher File No. 59-11243, Blevins, T., Tzovla, V., Wojsznis, W., Ganesamoorthi, S. and Mehta, A., "Adaptive Process Prediction".