Continuous Estimation of Kerosene Cold Filter Plugging Point Using Soft Sensors

Ivan Mohler1, Mirjana Novak1, Marjan Golob2, Željka Ujević Andirjić1, Nenad Bolf1

1 University of Zagreb, Faculty of Chemical Engineering and Technology, Department for Measurement and Process Control, Savska c. 16/5a, 10 000 Zagreb, Croatia.
2 University of Maribor, Faculty of Electrical Engineering and Computer Science, Smetanova 17, Maribor, Slovenia.

Soft sensor models for the estimation of the kerosene cold filter plugging point (CFPP) have been developed on the basis of continuous measurements of temperatures and flows of appropriate process streams. Data preprocessing included outlier detection and removal, generation of additional output data by the Multivariate Adaptive Regression Splines algorithm, detrending and filtering. The soft sensors were developed using linear and nonlinear identification methods. Model structures were optimized by a Genetic Algorithm and the ANFIS (Adaptive Neuro-Fuzzy Inference System) algorithm. Results of the Output Error (OE) model, the Hammerstein-Wiener (HW) model and the neuro-fuzzy model are shown. The best results were achieved with the neuro-fuzzy model.

1. Introduction

In industrial plants there is a need for continuous measurement and more effective process control, which requires monitoring a large number of process variables with appropriate measuring devices. A frequent problem in process monitoring and control is that key process parameters, such as the compositions of process streams and product properties, cannot be measured and analysed continuously. However, the states of variables that are difficult to measure can be inferred by determining their functional relationships with easily measurable secondary variables. Soft sensors, as a part of virtual instrumentation, assess system state variables and product quality, thus replacing physical sensors and laboratory analysis.
The application of soft sensors for the estimation of unavailable or hard-to-measure process variables is of great interest in the process industry. Usually, a large number of continuously measured variables is available to serve as inputs for the soft sensor (Bolf et al., 2008). Different model structures can be used to model real systems. In industrial applications, the focus is on parametric (polynomial) structures in both linear and nonlinear versions (Fortuna et al., 2007). A dynamic soft sensor based on the Output Error method has been developed to predict the immeasurable outputs of quality variables (Chen et al., 2009). Soft sensor models based on the Adaptive Neuro-Fuzzy Inference System (ANFIS) can be used to infer important process variables that are hard or impossible to measure (Jassar et al., 2010; Tie et al., 2005). In the last decade, soft sensor applications for distillation unit product properties have been studied extensively (Badhe et al., 2004; Chatterjee et al., 2003; Dam et al., 2006; Mao et al., 2005; Napoli et al., 2010; Xuefeng, 2008).

In this paper, soft sensor models are derived from experimental data obtained from a refinery crude distillation unit (CDU). Based on the available continuous measurements of temperatures and flows of appropriate process streams, soft sensors have been developed to estimate the cold filter plugging point (CFPP) of the diesel fuel. Model structures are optimized by a Genetic Algorithm (GA) and the ANFIS algorithm. An Output Error (OE) model, a Hammerstein-Wiener (HW) model and a neuro-fuzzy model are developed and presented.

2. Data preprocessing and model development

Since the development of dynamic models demands an equal number of input and output data, additional output data (CFPP) were generated by the Multivariate Adaptive Regression Splines algorithm (MARSpline).
It operates as a multiple piecewise linear regression, where each breakpoint estimated from the data defines the "region of application" for a particular linear equation (Matlab, 2009).

A frequently used linear model for on-line estimation is the OE model:

\[ \hat{y}(k) = \sum_{i=1}^{n_u} \left\{ B_i(q)\, u_i(k - n_k) + \left[ I - F_i(q) \right] \hat{y}_i(k) \right\} \tag{1} \]

where q is the time-shift operator, ŷ(k) is the output at time k, u(k) is the input at time k, nu is the number of model inputs and nk is the input delay expressed as a number of samples.

\[ B_i(q) = 1 + b_1 q^{-1} + b_2 q^{-2} + \dots + b_{n_b} q^{-n_b} \tag{2} \]

is a polynomial matrix over q^-1, where Bi is a matrix of dimensions n(ŷ) × n(u), b are the coefficients of the polynomial matrix Bi(q) and nb is the number of past input samples.

\[ F_i(q) = 1 + f_1 q^{-1} + f_2 q^{-2} + \dots + f_{n_f} q^{-n_f} \tag{3} \]

Fi is a matrix of dimensions n(ŷ) × n(u), f are the coefficients of the polynomial matrix Fi(q) and nf is the number of past model output samples.

Although linear dynamic models are in many cases sufficient for real-life applications, industrial processes, especially those in distillation columns, are often highly nonlinear. While the linear model structure is fully defined by the chosen regressors, the nonlinear model structure additionally depends on the characteristic of the nonlinear function. It is quite common that the dynamics itself can be well described by a linear system, while there are static nonlinearities at the input and/or at the output. A model with a static nonlinearity at the input is called a Hammerstein model, and a model with a static nonlinearity at the output is called a Wiener model (Ljung, 1999). Fig. 1 shows a block diagram of the HW model structure.

Figure 1. Structure of the Hammerstein-Wiener model.

The blocks of the HW model are defined as follows:

w(k) = f(u(k)) is a nonlinear function transforming the input data u(k); w(k) has the same dimension as u(k).

x(k) = (Bi(q)/Fi(q)) w(k) is a linear transfer function, where Bi and Fi are the polynomial matrices of the linear Output Error model; x(k) has the same dimension as y(k).
y(k) = h(x(k)) is a nonlinear function that maps the output x(k) of the linear block to the system output.

The nonlinear functions of the HW model can be described by piecewise linear functions parameterized by breakpoint locations (Matlab, 2009). The OE and HW model order parameters (nb, nf, nk) and the number of nonlinear units of the HW model (parameter n) were optimized using the GA in the MATLAB Global Optimization Toolbox. The coefficients of the polynomial matrices Bi(q) and Fi(q) were determined by the optimization methods integrated within the MATLAB System Identification Toolbox (Gauss-Newton, adaptive Gauss-Newton, Levenberg-Marquardt or gradient search).

Models were evaluated on the basis of the absolute error (AE), the root mean square error (RMSE) and FIT, defined by:

\[ AE = \frac{1}{N} \sum_{i=1}^{N} \left| \hat{y}_i - y_i \right| \tag{5} \]

\[ RMSE = \sqrt{\frac{\sum_{i=1}^{N} \left( \hat{y}_i - y_i \right)^2}{N}} \tag{6} \]

\[ FIT = \left( 1 - \frac{\sqrt{\sum_{i=1}^{N} \left( \hat{y}_i - y_i \right)^2}}{\sqrt{\sum_{i=1}^{N} \left( y_i - \bar{y} \right)^2}} \right) \cdot 100 \tag{7} \]

where y is the measured output, ŷ is the model output, N is the number of data points and ȳ is the mean of y.

The nonlinear neuro-fuzzy model was developed as a Sugeno-type fuzzy system implemented in the framework of a five-layer neural network structure, i.e. a combination of a fuzzy inference system (FIS) and a neural network (NN). The structure of a Sugeno FIS with n inputs, m input membership functions, r rules and one output, formulated as a feed-forward NN, consists of 5 layers (Jang, 1993). A typical first-order Sugeno fuzzy rule i is expressed in the following form:

\[ \text{If } x_1 \text{ is } A_1^i \text{ and } x_2 \text{ is } A_2^i \dots \text{ and } x_n \text{ is } A_n^i \text{ then } y^i = a_1^i x_1 + a_2^i x_2 + \dots + a_n^i x_n + b^i \tag{8} \]

where $x_1, x_2, \dots, x_n$ are the n input variables, $A_1^i, A_2^i, \dots, A_n^i$ are the corresponding fuzzy sets in the i-th fuzzy rule and $y^i = a_1^i x_1 + a_2^i x_2 + \dots + a_n^i x_n + b^i$ is the first-order polynomial output of the i-th rule.
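A rule base of the form (8) can be evaluated as in the following minimal sketch. It assumes Gaussian membership functions, a product "and" operator and a weighted average of the rule outputs. The function names, the dictionary layout of a rule and all numeric values are illustrative assumptions and are not taken from the models developed in this paper.

```python
import math


def gaussian_mf(x, c, sigma):
    """Membership grade of x in a Gaussian fuzzy set (centre c, width sigma)."""
    return math.exp(-((x - c) ** 2) / (2.0 * sigma ** 2))


def sugeno_predict(x, rules):
    """Evaluate a first-order Sugeno rule base for one input vector x.

    Each rule is a dict with hypothetical keys:
      'c', 'sigma' : per-input Gaussian centres and widths (premise part)
      'a', 'b'     : linear coefficients and offset (consequent part)
    """
    weights, outputs = [], []
    for rule in rules:
        # Firing strength: product of the membership grades (fuzzy "and").
        w = 1.0
        for xi, c, s in zip(x, rule["c"], rule["sigma"]):
            w *= gaussian_mf(xi, c, s)
        # Rule output: first-order polynomial of the inputs.
        y = sum(a * xi for a, xi in zip(rule["a"], x)) + rule["b"]
        weights.append(w)
        outputs.append(y)
    # Weighted average of the rule outputs (normalised firing strengths).
    total = sum(weights)
    return sum(w * y for w, y in zip(weights, outputs)) / total


# Illustrative two-input, two-rule system (made-up parameters).
demo_rules = [
    {"c": [0.0, 0.0], "sigma": [1.0, 1.0], "a": [1.0, 0.0], "b": 0.0},
    {"c": [2.0, 2.0], "sigma": [1.0, 1.0], "a": [0.0, 1.0], "b": 1.0},
]
demo_output = sugeno_predict([1.0, 1.0], demo_rules)
```

In this example the input lies midway between the two rule centres, so both rules fire equally and the output is the plain average of the two rule outputs.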
The output of the 1st layer is the Gaussian membership function, whose two parameters $c_i$ and $\sigma_i$ determine the position and the shape of the membership function:

\[ \mu(x_i) = e^{-\frac{(x_i - c_i)^2}{2\sigma_i^2}} \tag{9} \]

In the 2nd layer, each node corresponds to one fuzzy inference rule and performs a fuzzy "and" operation (product) on the membership grades, where r is the number of fuzzy rules and $c_i$, $\sigma_i$ are taken from the fuzzy sets of the j-th rule:

\[ w_j = \prod_{i=1}^{n} e^{-\frac{(x_i - c_i)^2}{2\sigma_i^2}}, \quad j = 1, \dots, r \tag{10} \]

The 3rd layer normalizes the firing strengths of the fuzzy rules:

\[ \bar{w}_j = \frac{w_j}{\sum_{i=1}^{r} w_i} \tag{11} \]

In the 4th layer, the consequent parts of the fuzzy rules are evaluated as linear combinations of the input variables:

\[ y_j = a_{1j} x_1 + a_{2j} x_2 + \dots + a_{nj} x_n + b_j \tag{12} \]

Finally, in the 5th layer the output is calculated as a weighted average:

\[ y = \bar{w}_1 y_1 + \bar{w}_2 y_2 + \dots + \bar{w}_r y_r \tag{13} \]

For the initial FIS generation, the grid partition (GP) and subtractive clustering (SC) methods are used. GP divides each input variable into several classes to build up the fuzzy rules, and it is suitable only for models with a small number of inputs. SC is a one-pass algorithm that estimates the number of clusters and the cluster centers in the data set (Chiu, 1994). To train the ANFIS models, the optimization technique, the error tolerance and the number of epochs are chosen; training stops when the maximum number of epochs is reached or the training error goal is achieved. The consequent parameters are trained by the least squares method, and the approximation error is back-propagated through every layer to update the premise parameters.

3. Process description

Since the crude distillation unit (CDU) is the first unit in the sequence of refinery processing, it is crucial that the quality of the fractionation products (unstabilized naphtha, heavy naphtha, kerosene, light gas oil, heavy gas oil) is monitored and controlled. The heavy naphtha, kerosene and light gas oil fractions are used for blending of diesel fuel.
One of its important quality properties is the cold filter plugging point, defined as the temperature at which the fuel filter plugs due to crystallization of mostly higher linear paraffinic hydrocarbons. Kerosene CFPP is determined by laboratory assays 4 times a day according to the EN 116 standard. Figure 2 shows the part of the distillation column with the diesel fuel components.

Figure 2. Section of the column with diesel fuel production components.

The kerosene properties depend on the outlet temperatures and flow rates of its neighbouring fractions. Based on the process analysis and the experience of process engineers, the following variables were selected as inputs for the estimation of CFPP: column top temperature, TR-6104 (TTOP); kerosene temperature, TR-6197 (TK); light gas oil temperature, TR-6198 (TLGO); heavy gas oil temperature, TR-6199 (THGO); top pumparound temperature, TR-6103 (TPA); and top pumparound flow, FI-6130 (FPA).

4. Results and discussion

The soft sensor models were developed on the basis of process measurements and laboratory analysis. The laboratory assays of CFPP were carried out four times a day. Input variables were measured and recorded continuously. Representative input data, i.e. data with a wide dynamic range, were chosen for the model development. A sampling time of 5 minutes was chosen in accordance with Shannon's sampling theorem. Outliers, means and trends were removed from the data. The data were divided into two sets: 7000 samples for model estimation and 3000 samples for model validation.

Estimation of the time delays (nk) between the inputs and the output can be crucial for the model accuracy. It is also necessary to predetermine the size of the regression vector of the OE and HW models, i.e. the parameters nb and nf. The model order parameters (nb, nf and nk) were determined using the GA with the minimum of (100 - FIT) as the objective function.
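The objective used in this order selection can be illustrated with a minimal pure-Python sketch for a single input. It simulates an OE model for given coefficients, computes AE, RMSE and FIT as defined in Section 2, and returns 100 - FIT. FIT is taken here in the normalised root-mean-square form, which is an assumption about the exact definition. The function names, the test signal, the coefficient values and the exhaustive search over nk (standing in for the GA) are likewise assumptions for illustration, not the procedure actually used in this work.

```python
import math


def simulate_oe(u, b, f, nk):
    """Simulate a single-input OE model F(q) y(k) = B(q) u(k - nk).

    b = [b0, b1, ...] are the coefficients of B(q) in powers of q^-1
    (b0 = 1 reproduces the polynomial form with a leading 1);
    f = [f1, f2, ...] are the coefficients of F(q) = 1 + f1 q^-1 + ...
    Samples before k = 0 are taken as zero.
    """
    y = []
    for k in range(len(u)):
        acc = 0.0
        for j, bj in enumerate(b):
            idx = k - nk - j
            if idx >= 0:
                acc += bj * u[idx]
        for j, fj in enumerate(f, start=1):
            if k - j >= 0:
                acc -= fj * y[k - j]
        y.append(acc)
    return y


def ae(y, y_hat):
    """Mean absolute error."""
    return sum(abs(p - m) for p, m in zip(y_hat, y)) / len(y)


def rmse(y, y_hat):
    """Root mean square error."""
    return math.sqrt(sum((p - m) ** 2 for p, m in zip(y_hat, y)) / len(y))


def fit(y, y_hat):
    """FIT in percent: 100 * (1 - ||y_hat - y|| / ||y - mean(y)||)."""
    mean_y = sum(y) / len(y)
    num = math.sqrt(sum((p - m) ** 2 for p, m in zip(y_hat, y)))
    den = math.sqrt(sum((m - mean_y) ** 2 for m in y))
    return (1.0 - num / den) * 100.0


def objective(u, y, b, f, nk):
    """Quantity to be minimised during order selection: 100 - FIT."""
    return 100.0 - fit(y, simulate_oe(u, b, f, nk))


# Synthetic data from a known model with a delay of 2 samples (made-up values).
u = [math.sin(0.3 * k) + 0.5 * math.sin(0.05 * k) for k in range(200)]
y_meas = simulate_oe(u, [1.0, 0.5], [-0.8], 2)

# Exhaustive search over candidate delays, used here instead of a GA.
best_nk = min(range(5), key=lambda nk: objective(u, y_meas, [1.0, 0.5], [-0.8], nk))
```

With several inputs, the same objective would be evaluated over all combinations of nb, nf and nk per input, which is where a global search such as a GA becomes useful.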
4.1 OE (Output-Error) model

Table 1 shows the number of past input samples (nb), input delays (nk) and past predicted outputs (nf) for the OE model. The values FIT = 69.31 %, RMSE = 0.5693 and AE = 0.4443 °C show that the OE model agrees relatively well with the splined data. Fig. 3 shows the comparison between the OE model output and the splined output for the validation data. The ordinate refers to the detrended splined output values (CFPPdetr.).

4.2 Hammerstein-Wiener model with piecewise linear function

The linear block of the HW model is a matrix of transfer functions, which contains the previous input values and the previous model output values. The linear block of the HW model is in fact an OE model. Model order parameters are given in Table 1. The input static nonlinearities of the HW model are represented by piecewise linear functions with different numbers of breakpoints (n), given in Table 1. The output nonlinearity is a piecewise linear function with 9 breakpoints. The values FIT = 77.15 %, RMSE = 0.4239 and AE = 0.3029 °C indicate very good agreement between the HW model output and the experimental data. Fig. 3 shows the comparison between the HW model output and the splined output on the validation data set. Compared to the OE model, the HW model gives better results, as expected, since the HW model is nonlinear and more complex.

Table 1. OE and HW model order parameters.

                 OE               HW
Variable         nb   nk   nf     nb   nk   nf   n
TTOP / °C        5    1    4      8    0    3    11
TK / °C          8    0    5      2    1    2    10
TLGO / °C        1    1    1      6    1    6    8
THGO / °C        4    0    2      8    0    8    8
TPA / °C         5    1    3      2    1    1    11
FPA / t h⁻¹      7    2    5      2    0    8    7
CFPP / °C        -    -    -      -    -    -    9

Figure 3. Comparison between measured and a) OE model data, b) HW model data.

4.3 ANFIS model

The ANFIS model was generated by the subtractive clustering algorithm, which extracts a set of rules that models the data behaviour.
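The idea behind subtractive clustering can be illustrated with a simplified pure-Python sketch. Each data point receives a potential that measures the density of its neighbours, the point with the highest potential becomes a cluster centre, and the potential around that centre is then reduced before the next centre is selected. The sketch assumes data scaled to the unit hypercube and uses a single rejection threshold rather than the full acceptance and rejection test of Chiu (1994). The squash factor of 1.5, the threshold of 0.15, the function name and the sample points are assumptions for illustration only.

```python
import math


def subtractive_clustering(points, ra=0.35, squash=1.5, reject_ratio=0.15):
    """Simplified subtractive clustering on data scaled to [0, 1].

    ra           : neighbourhood radius of a cluster centre
    squash       : ratio rb / ra of the suppression radius to ra (assumed value)
    reject_ratio : stop when the best remaining potential falls below this
                   fraction of the first peak (simplified stopping rule)
    Returns the selected cluster centres, each one being a data point.
    """
    alpha = 4.0 / ra ** 2
    beta = 4.0 / (squash * ra) ** 2

    def sqdist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    # Initial potential of each point: a measure of the local data density.
    potential = [
        sum(math.exp(-alpha * sqdist(p, q)) for q in points) for p in points
    ]
    first_peak = max(potential)
    centres = []
    while True:
        k = max(range(len(points)), key=potential.__getitem__)
        peak = potential[k]
        if peak < reject_ratio * first_peak:
            break
        centres.append(points[k])
        # Reduce the potential of all points near the newly selected centre.
        potential = [
            pot - peak * math.exp(-beta * sqdist(p, points[k]))
            for p, pot in zip(points, potential)
        ]
    return centres


# Two well-separated groups of made-up points in the unit square.
sample = [
    (0.10, 0.10), (0.12, 0.10), (0.10, 0.12), (0.11, 0.11),
    (0.90, 0.90), (0.88, 0.90), (0.90, 0.88),
]
centres = subtractive_clustering(sample, ra=0.35)
```

In a fuzzy model built this way, each cluster centre would give rise to one rule, so the number of rules follows from the data and the chosen radius rather than from a grid over the inputs.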
One of the advantages of subtractive clustering is that, unlike grid partitioning, it does not suffer from the curse of dimensionality, since the number of fuzzy rules produced depends on the number of data clusters, which in turn depends on how close the data points are in the input-output space. As a result, this technique can be used with a larger number of inputs. Subtractive clustering with a neighbourhood radius of 0.35 partitions the training data and generates an FIS structure with two to the power of seven (2^7) rules and seven membership functions for each input. For the ANFIS model, the number of training epochs was varied between 10 and 240 and the error tolerance was set to zero. The consequent parameters of the FIS were tuned by the least squares (LS) method and the premise parameters by the backpropagation algorithm. After 240 epochs, the performance of the ANFIS model with SC on the validation data set was FIT = 91.54 %. The difference between the desired and predicted values for the validation data set can be seen in Fig. 4.

Figure 4. Validation of ANFIS model based on SC FIS initialization.

5. Conclusion

Soft sensor models for the estimation of the cold filter plugging point of kerosene were developed. The models are based on continuous measurements of temperatures and flows of process streams of the crude distillation unit and on laboratory CFPP assays. Results obtained by the OE and HW models show good agreement with the splined data (FIT = 69.31 % and 77.15 %, respectively), so these models can be used as soft sensors in the real plant. Results obtained by the neuro-fuzzy ANFIS model show excellent agreement with only minor deviations (FIT = 91.54 %). The ANFIS model is therefore the best of the developed models, and it can be used for inferential process control.

6. References

N. Bolf, M. Ivandić, G. Galinec, Soft sensors for crude distillation unit product properties estimation and control, in: R.J. Petton, D. Maquin (Eds.), 16th Mediterranean Conference on Control and Automation, IEEE, Ajaccio, 2008, pp. 1804-1809.
L. Fortuna, S. Graziani, A. Rizzo, M.G. Xibilia, Soft Sensors for Monitoring and Control of Industrial Processes (Advances in Industrial Control), Springer-Verlag, London, 2007.
C. Chen, S. Mo, X. Chen, Dynamic soft-sensor based on finite impulse response model for dual-rate system, in: CCDC'09 Proceedings of the 21st annual international conference on Chinese control and decision conference, 2009, pp. 2221-2226.
S. Jassar, Z. Liao, L. Zhao, Machine learning and systems engineering, Lecture Notes in Electrical Engineering 68 (2010) 143-155.
M. Tie, H. Yue, T. Chai, A hybrid intelligent soft-sensor model for dynamic particle size estimation in grinding circuits, in: J. Wang, X.F. Liao, Z. Yi (Eds.), Proceedings of the Second international conference on Advances in Neural Networks, Vol. 3, Springer, Heidelberg, 2005, pp. 871-876.
Y. Badhe, J. Lonari, U. Sridevi, B.S. Rao, S.S. Tambe, B.D. Kulkarni, Hybrid process modeling and optimization strategies integrating neural networks/support vector regression and genetic algorithms: study of benzene isopropylation on Hbeta catalyst, Chem. Eng. J. 97 (2004) 115-129.
T. Chatterjee, D.N. Saraf, On-line estimation of product properties for crude distillation units, J. Process Contr. 14 (2003) 61-77.
M. Dam, D.N. Saraf, Design of neural networks using genetic algorithm for on-line property estimation of crude fractionator products, Comput. Chem. Eng. 30 (2006) 722-729.
S. Mao, Z.-H. Xiong, Y.-M. Xu, A.-X. Zhuang, H.-L. Huang, L.-Q. Wang, Research and application of soft sensor of the diesel oil solidifying point based on the neural network on a crude distillation unit, Control and Instruments in Chemical Industry 32 (2005) 11-14.
G. Napoli, M.G. Xibilia, Soft Sensor design for a Topping process in the case of small data sets, Comput. Chem. Eng. 35 (2010) 2447-2456.
Y. Xuefeng, Modified nonlinear generalized ridge regression and its application to develop naphtha cut point soft sensor, Comput. Chem. Eng. 32 (2008) 608-621.
T. Hastie, R. Tibshirani, J.H. Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Springer, New York, 2001.
L. Ljung, System Identification: Theory for the User, second ed., Prentice Hall, New Jersey, 1999.
Matlab, 2009. The Language of Technical Computing, www.mathworks.com
J.S.R. Jang, ANFIS: adaptive-network-based fuzzy inference systems, IEEE Transactions on Systems, Man, and Cybernetics 23 (1993) 665-685.
S. Chiu, Fuzzy model identification based on cluster estimation, J. Intell. Fuzzy Syst. 2 (1994) 267-278.