Computers and Chemical Engineering 26 (2002) 1201–1211
www.elsevier.com/locate/compchemeng

Measurement bias detection in linear dynamic systems

Derrick K. Rollins *, Sriram Devanathan, Ma. Victoria B. Bascuñana

Department of Chemical Engineering and Statistics, Iowa State University, Ames, IA 50011, USA

Received 22 May 2000; received in revised form 28 February 2002

Abstract

A new method to detect the existence of biased measured variables in dynamic processes is presented. Specifically, this work presents a new Dynamic Global Test (DGT) and test procedure for dynamic gross error detection (GED), and brings to light certain attributes of the test that, to our knowledge, have not previously been presented in the GED literature. Recognition of these attributes leads to a scheme that enables identification of the type of biased measurement (e.g. flow or level). This approach is not computationally intensive and is applicable in the case of process leaks and multiple biased variables. Simulation results for the identification of the type of biased measurement and the estimation of the time of occurrence (ETOC) are given. The performance study varied the size of the measurement bias ($\delta$), the bias location, the true time of occurrence of the bias (TTOC), the significance level ($\alpha$), and the sample size ($N$). This study shows the proposed approach to be accurate in identifying the type of biased variable and its TTOC. The performance of the proposed scheme improves as $N$ and $\delta$ increase. © 2002 Elsevier Science Ltd. All rights reserved.
Keywords: Dynamic systems; Gross error detection; Fault detection; Sensor validation

* Corresponding author. Tel.: +1-515-294-7642; fax: +1-515-294-2689.
0098-1354/02/$ - see front matter © 2002 Elsevier Science Ltd. All rights reserved.
PII: S0098-1354(02)00036-4

Nomenclature

$0$             an $n \times v$ null matrix
AVGD            average difference of the estimated TOC from the true TOC
ETOC            estimated TOC of the bias
TTOC            true TOC of the bias
$D$, $B$        constraint matrices in generalized dynamic system
$E[\theta]$     the expected value of estimator $\theta$
$F$             modified incidence matrix
$I$             an $n \times n$ identity matrix
$k$             the current time instant
$M$             incidence matrix
$m$             number of time instants in each moving window ($m = 4$ in this study)
$N$             sample size
$n$             number of nodes
$p$             number of trials in which the type of bias is correctly identified
$q$             number of successive windows in which $H_0$ must be rejected twice or more for conclusion of flow bias ($q = 5$ in this study)
$q_i$           vector of measurement errors in $Q_i$
$Q_i$           vector of flow measurements at time instant $i$
$Q_i^*$         vector of true values at time instant $i$
$R_i$           vector obtained by transformation of measurement model
STDD            standard error of AVGD
$V$             covariance matrix of measurement errors
$V_Q$           covariance matrix of measurement errors on flows $Q$
$V_W$           covariance matrix of measurement errors on volumes $W$
$v$             number of flows
$w_i$           vector of measurement errors on $W_i$
$W_i^*$         vector of true total mass values at time instant $i$
$x_i^*$         an $(n+v) \times 1$ vector of true values of unknown variables at time instant $i$
$X_i$           a $2(n+v) \times 1$ vector of measured values at time instant $i$
$X_i^*$         a $2(n+v) \times 1$ vector of true values

Greek letters
$\alpha$              level of significance
$\beta_i$             probability of a Type II error (the power function is $1 - \beta_i$)
$\delta_{W,j}$        vector of measurement biases at time instant $j$ (total mass variables)
$\delta_{Q,j}$        vector of measurement biases at time instant $j$ (flow variables)
$\delta_j$            vector of measurement biases at time instant $j$
$\varepsilon_i$       vector of measurement errors at time instant $i$
$\varepsilon_{W,i}$   vector of measurement errors at time instant $i$ for total mass variables
$\varepsilon_{Q,i}$   vector of measurement errors at time instant $i$ for total flow variables
$Err_i$               vector of measurement errors
$L_i$                 vector of process leaks at time instant $i$
$\mu_{R_i}$           vector comprising the elements of the expected value of $R_i$
$\lambda$             noncentrality parameter
$\Sigma_{R_i}$        the variance–covariance matrix of $R_i$
$\Sigma$              the variance matrix
$F$                   constraint matrix
$\chi^2_{n,\alpha}$   the upper $(100\alpha)$th percentile of the $\chi^2$ distribution

Other symbols
$\sim$                'is distributed'
$N_{2(n+v)}$          a $2(n+v)$ variate normal distribution

Superscript
$T$                   transpose

1. Introduction

A gross measurement error is made when the measurement of a variable deviates far from its true value. This article addresses situations in which large systematic deviations (i.e. measurement biases) under dynamic conditions are the cause of the gross errors. Causes of biased measurement include instrument malfunction and miscalibration.
When measurements are significantly biased, data reconciliation (DR), i.e. the adjustment of process variables to improve their accuracy so that they satisfy material and energy balance constraints, may give estimates of process variables that are less accurate than the measured values. Hence, detecting the presence of biased measurements and determining their true time of occurrence (TTOC) are important steps in obtaining accurate reconciled values.

Historically, dynamic DR can be traced back to the early 1960s and 1970s (Kalman, 1990; Gertler & Almasy, 1973; Willsky & Jones, 1974). Some of the most widely used DR and gross error detection (GED) techniques for dynamic processes are based on the Kalman filtering (KF) technique (Kalman). KF has been used for data smoothing and parameter estimation (Bellingham & Lee, 1977; Newman, 1982; Watanabe & Himmelblau, 1982). These techniques have been developed for linear dynamic systems using a weighted least-squares objective function. In these approaches, the parameters and measured inputs are estimated as hypothetical states, and so it is difficult to obtain suitable values for such a system. In addition, sensor malfunction detection methods based upon KF have been used (Bellingham & Lee; Newman; Watanabe & Himmelblau) for linear dynamic systems. However, these approaches have not been extended to handle measurement biases in a general (unrestrictive) fashion.

A significant contribution to current methods in GED is the generalized likelihood ratio (GLR) approach (Willsky & Jones, 1974). The GLR method has been used for GED in steady state (Narasimhan & Mah, 1987) and dynamic processes (Narasimhan & Mah, 1988). Narasimhan and Mah used their GLR method to identify various causes of gross errors. However, their work was restricted to pseudo steady state conditions (i.e.
variation around a nominal point) and does not seem to be applicable during a transition state (i.e. a period of change from one steady state to a new steady state).

Kao, Tamhane, and Mah (1992) have presented a composite procedure for detecting and identifying gross errors in serially correlated data for pseudo steady state processes. Measurements are serially correlated when their errors are related to past values. This phenomenon is common in modeling chemical processes because of feedback loops, material recycling, etc. They proposed a prewhitening step to validate the assumptions of Gaussian measurement and process noises, followed by statistical process control chart techniques. The procedure uses statistical tests to detect the presence of gross errors, followed by application of the GLR method to identify the gross errors and estimate their magnitudes. However, the method is restricted to pseudo steady state and its performance in the presence of multiple biases is unclear. It also appears that the autocorrelation structures of the data, which can be different for each variable, must be known.

The measurement error reconciliation method (Khuen & Davidson, 1961; Swenker, 1964) has been generalized to transient conditions by Almasy (1990). This method, which has been used for dynamic DR, has been termed dynamic balancing and is based upon linear conservation equations to reconcile the measured states. Estimates are obtained for flow and inventory variables by applying the KF to the balance model. Almasy, however, does not address the problem of GED.

Darouach and Zasadzinski (1991) developed a recursive optimal solution technique in weighted least-squares for DR of transient systems characterized by a generalized linear dynamic model (i.e. singular model). A unique feature of this modeling approach is that KF is not applicable.
Rollins and Devanathan (1993) showed that the Darouach and Zasadzinski estimates could be very accurate but computationally intensive. In addition, Darouach and Zasadzinski do not address the topic of GED, and their approach does not appear to be extendable to this situation. Motivated by these limitations, Rollins and Devanathan developed a constrained least-squares DR approach that provides accurate and unbiased estimates for process variables and is computationally less intensive than the Darouach and Zasadzinski approach.

Ramamurthi, Sistu, and Bequette (1991), Kim, Leibman, and Edgar (1991), and Leibman, Edgar, and Lasdon (1992) have presented schemes for nonlinear dynamic DR (NDDR). Ramamurthi et al. presented a successively-linearized horizon-based estimation (SLHE) strategy for the estimation of variables and physical parameters in transient systems. They showed the SLHE strategy to be significantly more accurate than the extended KF and computationally more efficient than the nonlinear programming (NLP) approach of Leibman et al. Kim et al. developed a sequential error-in-variables method for DR and parameter estimation. They compared their nonlinear dynamic error-in-variables method (NDEVM) to other conventional least-squares techniques and to estimation using orthogonal collocation and showed improved performance. The NDEVM does not appear to be rigorous in estimating process variables when multiple biases are present, and it does not address detection and identification of biased measurements.

Leibman et al. (1992) have presented a new method for NDDR based on NLP techniques. They showed this method to be superior to KF, both in its ability to cope with inequality constraints and in nonlinear situations. Leibman et al. commented that the main disadvantages of the approach are the requirement of an accurate process model and the problem of intensive computations. In their approach, biased measurements are treated as parameters to be estimated.
In addition, the approach does not seem to be designed to handle multiple biases. Ramamurthi et al. (1991), Kim et al. (1991) and Leibman et al. seem to have made significant progress in DR for transient processes but only moderate progress in GED.

While it is important, from a practical point of view, to develop GED strategies for transient processes, steady state techniques are perhaps the foundation from which to build methods suitable under a wider range of operating conditions. Traditionally, methods used to detect biases in steady state systems have involved a Global Test (GT). A GT conclusion that no biases are present will obviate location identification. A GT is not designed to make conclusions about the location of any biased measurement that is present. However, a GT can also serve as a final verification of global closure after process variables are corrected. Thus, there is strong motivation to have a GT technique for accurate detection when conditions are dynamic.

Specifically, this article presents a GT to determine the bias type (flow or inventory variable) and the TTOC of the bias in dynamic systems. To our knowledge there has been no prior work dealing with a GT for dynamic systems. For simplicity, this article deals with situations where the assumptions of no process leaks and no drifting biases are valid, although the proposed technique is not restricted by these assumptions. We also assume in this particular situation that all relevant variables are measured. The purpose of this article is to present a novel scheme for accurate detection of measurement biases. We are hopeful that this scheme can be extended to enable identification of the specific locations of measurement biases. We introduce the proposed technique by first presenting the process models when measurement biases are present and the equations used.
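Following this discussion, we present the results of a Monte Carlo simulation study to evaluate the proposed approach.

2. Process models

The physical model, presented first, represents material balance constraints. This is followed by the measurement model, a statistical expression for the measurements. A detailed development of these models is given in Appendix A for a simple process network containing 2 nodes (interconnecting units) and 5 flows. At the $i$th time instant, a total mass balance on each of the $n$ nodes (Rollins & Devanathan, 1993) gives

$$-W_i^* + W_{i-1}^* + MQ_i^* = -\begin{bmatrix} I_{n\times n} & -M_{n\times v}\end{bmatrix}\begin{bmatrix} W_i^* \\ Q_i^* \end{bmatrix} + \begin{bmatrix} I_{n\times n} & 0_{n\times v}\end{bmatrix}\begin{bmatrix} W_{i-1}^* \\ Q_{i-1}^* \end{bmatrix} = -Dx_i^* + Bx_{i-1}^* = \begin{bmatrix} -D & B\end{bmatrix}\begin{bmatrix} x_i^* \\ x_{i-1}^* \end{bmatrix} = FX_i^* = L_i, \tag{1}$$

where $M$ is the process constraint matrix, $i = 2, \dots, k$ and

$$D^{\,n\times(n+v)} = \begin{bmatrix} I & -M \end{bmatrix}, \tag{2}$$

$$B^{\,n\times(n+v)} = \begin{bmatrix} I & 0 \end{bmatrix}, \tag{3}$$

$$x_i^{*\,(n+v)\times 1} = \begin{bmatrix} W_i^* \\ Q_i^* \end{bmatrix}, \tag{4}$$

$$F^{\,n\times 2(n+v)} = \begin{bmatrix} -D & B \end{bmatrix}, \tag{5}$$

$$X_i^{*\,2(n+v)\times 1} = \begin{bmatrix} x_i^* \\ x_{i-1}^* \end{bmatrix}. \tag{6}$$

$W_i^*$ is an $n \times 1$ vector of true and unknown total mass in the $n$ nodes at time instant $i$, $Q_i^*$ is a $v \times 1$ vector of true and unknown total mass flow rates for the $v$ streams at time instant $i$, $L_i$ is an $n \times 1$ vector of process leaks, and $k$ is the current time instant. The measurement model that applies to Eq. (1) is

$$X_i^{\,2(n+v)\times 1} = X_i^* + Err_i, \tag{7}$$

where $i = 2, \dots, k$ and

$$Err_i = \begin{bmatrix} \varepsilon_i \\ \varepsilon_{i-1} \end{bmatrix} = \begin{bmatrix} \varepsilon_{W,i} \\ \varepsilon_{Q,i} \\ \varepsilon_{W,i-1} \\ \varepsilon_{Q,i-1} \end{bmatrix}, \tag{8}$$

$$\mathrm{Var}(Err_i) = \Sigma^{\,2(n+v)\times 2(n+v)}, \tag{9}$$

$$\varepsilon_j \sim N_{(n+v)}(\delta_j, V), \tag{10}$$

$$V^{\,(n+v)\times(n+v)} = \begin{bmatrix} V_W & 0 \\ 0 & V_Q \end{bmatrix}, \tag{11}$$

$$\mathrm{Var}(\varepsilon_{W,j}) = V_W, \tag{12}$$

$$\mathrm{Var}(\varepsilon_{Q,j}) = V_Q, \tag{13}$$

$$\delta_j^{\,(n+v)\times 1} = \begin{bmatrix} \delta_{W,j}^{\,n\times 1} \\ \delta_{Q,j}^{\,v\times 1} \end{bmatrix}, \tag{14}$$

where $j = 1, \dots, k$. Note that $V_W$ and $V_Q$ are assumed to be known, although this is not a restrictive assumption. When $V_W$ and $V_Q$ are unknown, the sample estimates can be used (if the sample size is large). However, the distribution of the test statistic for the GT (given below) would be different (see Rollins & Davis, 1993). Also, the sample size is assumed to be one for convenience.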
If the sample size is greater than one, the measurement vector contains measurement means and the measurement variance–covariance matrix is taken to be $V/N$. Note that

$$X_i \sim N_{2(n+v)}(X_i^* + \delta_i, \Sigma), \tag{15}$$

and

$$\delta_i^{\,2(n+v)\times 1} = \begin{bmatrix} \delta_{W,i} \\ \delta_{Q,i} \\ \delta_{W,i-1} \\ \delta_{Q,i-1} \end{bmatrix}, \tag{16}$$

where $N_{2(n+v)}$ is used to represent a $2(n+v)$ variate normal distribution.

3. The transformed measurement model

The vector $R_i$ given below (also known as the transformed vector of measurements) plays a central role in the formulation of the Dynamic Global Test (DGT), and this section gives its distributional properties. Let

$$R_i^{\,n\times 1} = FX_i = FX_i^* + F\,Err_i = L_i + F\,Err_i, \tag{17}$$

with $FX_i^* = L_i$ by Eq. (1). Therefore,

$$E[R_i] = \mu_{R_i} = L_i + F\delta_i, \tag{18}$$

and

$$\mathrm{Var}(R_i) = F\Sigma F^T = \Sigma_{R_i}. \tag{19}$$

Thus,

$$R_i \sim N_n(\mu_{R_i}, \Sigma_{R_i}). \tag{20}$$

4. The dynamic global test scheme (DGTS)

The GT is a test designed to determine if any $\delta_j$, $j = 1, \dots, n+v$, is nonzero and is developed by using the equations in the transformed measurement model. For a null hypothesis of $H_0$: $\mu_{R_i} = 0$ and an alternate hypothesis of $H_a$: $\mu_{R_i} \ne 0$, the following test is used: reject $H_0$ in favor of $H_a$ if and only if (see Mardia et al., 1979)

$$R_i^T \Sigma_{R_i}^{-1} R_i \ge \chi^2_{n,\alpha}, \tag{21}$$

where $i$ denotes the time instant, $n$ is the number of nodes and $\chi^2_{n,\alpha}$ is the upper $(100\alpha)$th percentile of the $\chi^2_n$ distribution. The probability of rejecting $H_0$ when $H_0$ is false (i.e. the power function for the test) is given by

$$1 - \beta_i = P[R_i^T \Sigma_{R_i}^{-1} R_i \ge \chi^2_{n,\alpha} \mid \mu_{R_i}] = P[\text{noncentral } \chi^2_n \ge \chi^2_{n,\alpha} \mid \lambda_i^2], \tag{22}$$

with

$$\lambda_i^2 = \mu_{R_i}^T \Sigma_{R_i}^{-1} \mu_{R_i}, \tag{23}$$

where $\lambda_i$ is called the noncentrality parameter. From Eq. (18), we see that the DGT can also be used to detect the presence of process leaks. However, in this article, we restrict our attention to situations where there are no process leaks. Note that when $L_i = 0$,

$$\mu_{R_i} = F\delta_i = -\delta_{W,i} + M\delta_{Q,i} + \delta_{W,i-1}, \tag{24}$$

by Eqs. (5), (16) and (18).
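The cancelling behaviour implied by Eq. (24) can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: it assumes the 2-node, 5-stream network of Appendix A, unit measurement variances with errors independent across variables and time instants (so that the covariance of Err_i is the identity), and it evaluates the noise-free mean of R_i rather than simulated data. All function names are hypothetical. Because n = 2 here, the chi-square critical value has the closed form -2 ln(alpha).

```python
import math

# Incidence matrix M of the 2-node, 5-stream network in Appendix A, Eq. (A5).
M = [[1, 1, -1, 0, 0],
     [0, 0, 1, 1, -1]]
N_NODES, N_FLOWS = 2, 5


def mean_of_R(dW_now, dQ_now, dW_prev):
    """Eq. (24) with no leaks: mu_R = -delta_W,i + M delta_Q,i + delta_W,i-1."""
    return [
        -dW_now[r] + sum(M[r][c] * dQ_now[c] for c in range(N_FLOWS)) + dW_prev[r]
        for r in range(N_NODES)
    ]


def sigma_R_unit_variance():
    """F Sigma F^T for Sigma = I (assumed): with F = [-I, M, I, 0] this is 2I + M M^T."""
    return [
        [sum(M[r][c] * M[s][c] for c in range(N_FLOWS)) + (2 if r == s else 0)
         for s in range(N_NODES)]
        for r in range(N_NODES)
    ]


def quadratic_form(R, S):
    """R^T S^{-1} R for a 2x2 matrix S, the form used in Eqs. (21) and (23)."""
    (a, b), (c, d) = S
    det = a * d - b * c
    return (d * R[0] ** 2 - (b + c) * R[0] * R[1] + a * R[1] ** 2) / det


def chi2_critical_2dof(alpha):
    """Upper-alpha point of chi-square with 2 d.o.f. (survival function exp(-x/2))."""
    return -2.0 * math.log(alpha)


S_R = sigma_R_unit_variance()
no_flow_bias = [0] * N_FLOWS

# Level bias of size 6 on node 1 that starts at some instant and then persists.
level_onset = mean_of_R([6, 0], no_flow_bias, [0, 0])   # instant of onset
level_after = mean_of_R([6, 0], no_flow_bias, [6, 0])   # every later instant

# Flow bias of size 6 on stream 2: the same mean at onset and at later instants.
flow_any = mean_of_R([0, 0], [0, 6, 0, 0, 0], [0, 0])

print(level_onset, level_after, flow_any)
print(quadratic_form(level_onset, S_R), chi2_critical_2dof(0.10))
```

In this noise-free illustration a persistent level bias shifts the mean of R_i only at its onset, whereas a persistent flow bias shifts it at every later instant; the quadratic form evaluated at the mean is the squared noncentrality parameter of Eq. (23).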
Thus, it appears that, under dynamic conditions, this test can be used to determine the time at which measurements become biased. Eq. (24) also indicates that it may be possible to distinguish biases in $W$ from biases in $Q$. This distinction appears to be possible because $\delta_{W,i}$ can cancel out of Eq. (24): $-\delta_{W,i} + \delta_{W,i-1}$ is zero if $\delta_W$ remains constant over two consecutive time instants. Also, note from Eq. (24) that while the procedure addresses biases in only flow and inventory variables (and when these two types of biases do not occur simultaneously), there is no limit to the number of biases of each type that can be detected. To our knowledge, a GT for dynamic conditions has not been presented before this work. Furthermore, an additional significant contribution of this DGT is that it has the potential to indicate the type of biased variables: flow or inventory.

We exploited the canceling attribute of Eq. (24) and developed the following procedure to differentiate between a nonzero $\delta_W$ and a nonzero $\delta_Q$ (this study assumes that only one measured variable can become biased at any time instant). This procedure tests the null hypothesis (at each time instant) in a moving 'window' of $m$ successive time instants, starting with the first $m$ time instants. That is, each window is of size $m$ and contains the results of the hypothesis tests at the past $m-1$ time instants and the current one. If the null hypothesis is rejected at least twice in each of $q$ consecutive windows, the conclusion is that there is a bias in a flow measurement. The estimated time of occurrence (ETOC) is then the time instant of the first rejection in the first of the $q$ windows that meet this condition. On the other hand, if $H_0$ is rejected only once in two consecutive time windows, then the conclusion is a bias in a level measurement. If no rejection of $H_0$ occurs, then the conclusion is that no measured variable is biased.
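The window rule described above can be sketched in a few lines. This is an illustrative reading of the procedure rather than the authors' code: the function name and the input format (one boolean per time instant, True when H0 is rejected) are hypothetical, and the level-bias branch is simplified to "at least one rejection, but the flow criterion is never met", with the ETOC taken as the first rejection.

```python
def classify_bias(rejections, m=4, q=5):
    """Classify a sequence of test outcomes as flow bias, level bias, or none.

    rejections[t - 1] is True when H0 was rejected at time instant t.
    Returns (bias_type, etoc), where etoc is a 1-based time instant or None.
    """
    T = len(rejections)
    # Number of rejections in the window of m instants ending at t = m, ..., T.
    counts = [(t, sum(rejections[t - m:t])) for t in range(m, T + 1)]

    run = 0  # length of the current run of windows with two or more rejections
    for idx, (t, c) in enumerate(counts):
        run = run + 1 if c >= 2 else 0
        if run == q:
            # First of the q qualifying windows, and its first rejection.
            first_window_end = counts[idx - q + 1][0]
            for s in range(first_window_end - m, first_window_end):
                if rejections[s]:
                    return ("flow", s + 1)

    if any(rejections):
        # Simplifying assumption: isolated rejections are read as a level bias.
        return ("level", rejections.index(True) + 1)
    return ("none", None)


A, R = False, True
flow_case = [A, A, A, R, R, A, R, R, A, R]    # repeated rejections from instant 4
level_case = [A, A, A, R, A, A, A, A, A, A]   # a single rejection at instant 4
quiet_case = [A] * 10

print(classify_bias(flow_case))
print(classify_bias(level_case))
print(classify_bias(quiet_case))
```

The flow check runs first and the level check only if it fails, so a run of closely spaced rejections is never reported as a level bias.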
For each analysis, the algorithm first tests for flow biases; if none are found, it evaluates the presence of a level bias. From preliminary studies (not shown for space considerations), we selected $m = 4$ and $q = 5$ and held these values constant throughout this simulation study. In these preliminary studies the parameters $m$, $q$, and the number of rejections required within a window were all varied in a systematic manner, and the effect of the changes on the ability of the DGTS to detect the bias type and estimate the TOC was noted. Thus, these are the parameter values for which (for the given process) the performance appeared to be the best. Appendix B explains how an investigator may determine the parameter values for his/her problem.

To illustrate this approach (with $m = 4$ and $q = 5$), consider the hypothetical example given in Table 1. First, suppose that a bias occurred in a flow variable at TTOC = 4. Since there are two or more rejections in each window from time instant 5 onwards, the conclusion is that a flow variable is biased and that the bias occurred at time instant 4 (since this is the time instant of the first rejection in the windows from time instants 5 to 9). Hence, for this case, the ETOC and the TTOC agree. Now consider a second case. Suppose there is a rejection of $H_0$ at time instant 4 (denoted by R*) and that none of the windows had two or more rejections; the conclusion for this case would be that one of the level variables is biased and that the bias occurred at time instant 4. Thus, as in the first case, ETOC and TTOC are the same.

This study assumes that once a measurement bias has occurred, it will continue to exist (with the same magnitude) until it is identified and removed. Thus, this procedure may distinguish biases in inventory variables (i.e. in $W$'s) from biases in flow variables (i.e. in $Q$'s), since $\delta_{W,i}$ can cancel out of Eq.
(24) after one time instant has elapsed. Therefore, allowing for incorrect conclusions by hypothesis testing, the scheme assumes that two rejections (of $H_0$) separated by several time instants indicate a bias in $W$, and that several rejections close together (in time) indicate a bias in $Q$. We will refer to this technique and its use of our DGT as the DGTS.

Table 1
Example illustrating how the TOC is determined

Time instant    Result in each window    Conclusion
1               N/A                      None
2               N/A                      None
3               N/A                      None
4               (A A A R*)               Continue
5               (A A R R)                Continue
6               (A R R A)                Continue
7               (R R A R)                Continue
8               (R R R R)                Continue
9               (R R R A)                Bias in flow (TOC = 4)
10              (R R A R)                Stop

This table is used to illustrate how the ETOC is determined for biased flow and inventory variables. First, a flow bias occurs at time instant 4; note that there are two or more rejections of $H_0$ in each of 5 ($= q$) successive windows starting at the window at time instant 5. Thus, the estimated TOC is the time instant of the first rejection in the first window that had two or more rejections, which is time instant 4. The second case is for a biased level variable. When the DGTS finds that the criterion for a biased flow is not met, it will mark the first time $H_0$ is rejected as the estimated TOC. For example, suppose that the first rejection occurred at time instant 4 (denoted by R*) and that none of the windows had two or more rejections. Then the conclusion would be that a level bias occurred at time instant 4. Here, A represents the case when the hypothesis that $\delta_W$ and $\delta_Q$ are equal to zero is not rejected, and R denotes the case when this hypothesis is rejected.

5. Performance studies

This section examines the accuracy of the DGTS in determining the type of biased measurement and the TOC. Below we define measures of performance in the simulation study. The performance measure for false conclusion for unbiased variables is the average type I error (AVTI). It is defined as,
$$\mathrm{AVTI} = \frac{\text{\# of trials identifying the wrong type of bias}}{\text{\# of simulation trials}}. \tag{25}$$

One simulation run (see footnote 1) consisted of 1000 simulation trials, where each trial consisted of a complete set of artificially (stochastically) generated data from the same set of conditions (1000 trials were chosen because we found this number to be large enough to give high accuracy). In this study, either a flow or level measurement becomes biased at the same time during each simulation trial. For each trial, one of the following three conclusions was made: a level bias was present, a flow bias was present, or neither a flow nor a level bias was present. Thus, the AVTI can range from 0 to 1. The performance measure for correct detection of biased variables is the overall power (OP) and is defined as

$$\mathrm{OP} = \frac{\text{\# of biased variables correctly identified}}{\text{\# of biased variables simulated}}. \tag{26}$$

Note that, since either a level or flow variable will be biased for 1000 trials in each run, when the type for the run is always identified as either flow or level bias, OP and AVTI will add to one for that run.

The third and fourth quantities that we determine in this study give a measure of the DGTS's ability to estimate accurately the TOC of the bias. The mean difference between the estimated TOC and the true TOC is determined by the simulation study. This quantity is called the average difference (AVGD; see footnote 2) and is given by

$$\mathrm{AVGD} = \frac{\sum_{i=1}^{p} (\mathrm{ETOC}_i - \mathrm{TTOC})}{p}, \tag{27}$$

where the $\mathrm{ETOC}_i$'s are the estimated TOC's, and $p$ is the number of trials in which the type of bias has been identified correctly. The standard error of AVGD is called STDD and is determined by

$$\mathrm{STDD} = \sqrt{\frac{\sum_{i=1}^{p} (\mathrm{ETOC}_i - \mathrm{TTOC})^2}{p}}. \tag{28}$$

As discussed earlier using the example in Table 1, the best value of AVGD and STDD is 0.0 for both flow and level bias, i.e. when $\mathrm{ETOC}_i = \mathrm{TTOC}$ for all $i$.

To illustrate how the AVTI and the OP are determined, consider the following hypothetical example with conditions: number of simulation trials is 100; type of bias simulated is flow. Suppose that these 100 simulation trials resulted in the following conclusions: 80 trials identified as flow bias; 15 trials identified as level bias; five trials with no bias detected. Thus, for this example, AVTI = 15/100 = 0.15 and OP = 80/100 = 0.80.

We now discuss the study to evaluate the DGTS. With the conditions held constant within each run, this study created data by varying the following conditions from run to run: (1) the type of bias; (2) the magnitude of the bias; (3) the TTOC; (4) the bias location; (5) the level of significance ($\alpha$); and (6) the sample size ($N$). (As stated previously, each simulation run consisted of an evaluation of 1000 trials of simulated data.) This study used true values from Darouach and Zasadzinski (1991) for the inventory (i.e. level) variables and set all the measurement variances for $W$'s and $Q$'s to 1.0. Note that increasing $N$ has the same effect as decreasing the measurement variance. Finally, it examined flow and level measurement biases separately and determined AVTI, OP, AVGD and STDD for each run. Fig. 1 shows the process network used in the study, taken from Darouach and Zasadzinski and used by Rollins and Devanathan (1993) in a previous study.

Footnote 1. For example, one such trial could be generating values for the measured variables under the following conditions: (1) flow measurement bias; (2) $\alpha = 0$; (3) sample size of one; (4) bias located in stream #2; (5) magnitude of bias = 6. These values of the measured variables are generated 1000 times under these exact conditions and that constitutes one run.

Footnote 2. The AVGD may be positive or negative. Suppose TTOC = 45 and ETOC = 35. A large number of such ETOC values would lead to a negative AVGD. This does not mean that the DGTS predicts in advance the occurrence of the bias. This simply means that the DGTS leads us to believe that the bias occurred at time instant 35, whereas the bias actually occurred at time instant 45.

Fig. 1. Process network.

6. Results of the simulation study

This section presents results of the study to evaluate the DGTS. Tables 2 and 3 contain the results for varying measurement bias in a flow and level variable, respectively. Three values for $\delta_W$ and $\delta_Q$ were used: 6, 8 and 10. Table 2 shows that AVGD is close to 0.0 and that STDD decreases with increasing $\delta_Q$, as expected. Table 3, for the case with varying $\delta_W$, shows similar performance characteristics. Specifically, both AVGD and STDD decrease with increasing $\delta_W$. Comparing Tables 2 and 3, one sees that the cases with nonzero $\delta_Q$ perform better (compare the STDD's) than the cases with nonzero $\delta_W$.

Table 2
Effect of $\delta_Q$ (flow measurement is biased); $\alpha$ = 0.1, N = 1, location = 2

$\delta_Q$   TTOC   AVTI   OP   AVGD    STDD
6            5      0      1    −0.14   1.24
8            5      0      1    −0.36   0.97
10           5      0      1    −0.39   0.94

Table 3
Effect of $\delta_W$ (level variable is biased); $\alpha$ = 0.1, N = 1, location = 2

$\delta_W$   TTOC   AVTI   OP     AVGD    STDD
6            5      0.07   0.93   2.95    6.45
8            5      0.07   0.93   0.30    4.04
10           5      0.07   0.93   −0.39   2.94

Table 4
Effect of measurement bias location ($\delta_Q$, flow measurement biases), N = 1, TTOC = 5

$\delta_Q$   Location   $\alpha$   AVTI   OP   AVGD    STDD
6            1          0.10       0      1    −0.32   1.93
6            2          0.10       0      1    −0.14   1.24
6            3          0.10       0      1    −0.16   1.23
6            4          0.10       0      1    −0.19   1.13
6            5          0.10       0      1    −0.51   2.05
6            6          0.10       0      1    −0.21   1.14
6            7          0.10       0      1    −0.17   1.17
6            8          0.10       0      1    −1.12   2.52
10           1          0.05       0      1    −0.14   0.61
10           2          0.05       0      1    −0.17   0.56
10           3          0.05       0      1    −0.19   0.56
10           4          0.05       0      1    −0.20   0.62
10           5          0.05       0      1    −0.17   0.65
10           6          0.05       0      1    −0.21   0.66
10           7          0.05       0      1    −0.21   0.66
10           8          0.05       0      1    −0.13   0.73

Next, the effect of bias location is investigated. Tables 4 and 5 show the effect of measurement bias location for each variable.
When $\delta_Q = 6$, Table 4 reveals that STDD is higher when variable 1, 5 or 8 is biased; all three of these streams are associated with only one node, as opposed to the other streams, which are associated with two nodes. In contrast, for all other cases in Tables 4 and 5, the location does not seem to cause a significant difference in performance.

Table 5
Effect of measurement bias location ($\delta_W$, level measurement biases), N = 1, TTOC = 5

$\delta_W$   Location   $\alpha$   AVTI   OP     AVGD    STDD
6            1          0.10       0.07   0.92   1.62    6.14
6            2          0.10       0.07   0.93   1.95    6.45
6            3          0.10       0.07   0.93   1.34    5.70
6            4          0.10       0.07   0.93   1.44    5.80
10           1          0.10       0.08   0.92   −0.44   1.32
10           2          0.10       0.07   0.93   −0.39   1.94
10           3          0.10       0.08   0.92   −0.49   0.97
10           4          0.10       0.08   0.92   −0.46   1.27

The effect of TTOC is best revealed in Tables 6 and 7. As shown, both STDD and the absolute value of AVGD increase as TTOC increases. In other words, the ETOC's are closer to TTOC when TTOC is relatively small. Thus, it appears that the accuracy of the ETOC depends on the number of observations collected. The more data the analysis used, the more accurate the ETOC.

Table 6
Effect of measurement bias TTOC ($\delta_Q$, flow measurement biases), $\alpha$ = 0.1, N = 1, location = 2

$\delta_Q$   TTOC   AVTI   OP    AVGD    STDD
6            5      0.0    1.0   −0.14   1.24
6            25     0.0    1.0   −2.22   7.24
6            45     0.2    0.8   −5.32   14.46
10           5      0.0    1.0   −0.39   0.94
10           25     0.0    1.0   −2.05   6.50
10           45     0.0    1.0   −4.32   12.88

Table 7
Effect of measurement bias TTOC ($\delta_W$, level measurement biases), $\alpha$ = 0.1, N = 1, location = 2

$\delta_W$   TTOC   AVTI   OP     AVGD     STDD
6            5      0.07   0.93   1.93     6.45
6            25     0.08   0.92   −14.29   9.14
6            45     0.07   0.93   −33.81   13.21
10           5      0.08   0.92   −0.39    1.94
10           25     0.08   0.92   −14.54   8.66
10           45     0.07   0.93   −33.48   13.03

Table 8
Effect of $\alpha$-level ($\delta_Q$, flow measurement biases), TTOC = 5, N = 1, location = 2

$\delta_Q$   $\alpha$   AVTI   OP   AVGD    STDD
6            0.10       0      1    −0.14   1.24
6            0.05       0      1    −0.04   1.47
10           0.10       0      1    −0.39   0.94
10           0.05       0      1    −0.17   0.56

Table 9
Effect of $\alpha$-level ($\delta_W$, level measurement biases), TTOC = 5, N = 1, location = 2

$\delta_W$   $\alpha$   AVTI   OP     AVGD    STDD
6            0.10       0.06   0.93   1.95    6.45
6            0.05       0.08   0.95   5.30    10.45
10           0.10       0.07   0.93   −0.39   1.95
10           0.05       0.01   0.98   0.21    3.53

Tables 8 and 9 show the effect of $\alpha$ on AVTI, OP, AVGD, and STDD. When flow variables are biased, as $\alpha$ decreases, AVGD and STDD decrease. The result appears reasonable because, for a lower value of $\alpha$, there is a smaller probability of misidentifying situations with no biased measurement. However, when the magnitude of the bias is smaller ($\delta_Q = 6$) and thus more difficult to detect, larger values of $\alpha$ give better AVGD and STDD performance, as supported by Table 8. For a fixed bias in a level measurement (Table 9), AVTI appears to be unaffected or only weakly affected by $\alpha$ for small values of $\delta_W$ and more affected for larger values of $\delta_W$. The effect at large values appears to be valid since decreasing $\alpha$ decreases the probability of two (or more) false rejections (of $H_0$) occurring close together in time. In addition, Table 9 shows that the ETOC appears to be more variable (relative to the TTOC) for smaller $\alpha$.

Table 10
Effect of N ($\delta_Q$, flow measurement biases), $\alpha$ = 0.1, TTOC = 5, location = 2

$\delta_Q$   N    AVTI   OP   AVGD    STDD
6            1    0      1    −0.20   1.30
6            5    0      1    −0.13   0.38
6            10   0      1    −0.05   0.23
10           1    0      1    −0.46   1.06
10           5    0      1    0.00    0.00
10           10   0      1    0.00    0.00

The final effect that this study addressed was the sample size ($N$). The $N$ measurements for each time instant for each variable were averaged. As stated earlier, increasing $N$ has the same effect as lowering the sampling variance. The results are given in Tables 10 and 11. Table 10 shows that, when a flow bias is present, a larger sample size gives a better AVGD and a smaller STDD, as expected. When a level bias is present, Table 11 shows a quick approach to high power (OP = 1.0) and to an accurate ETOC (i.e. small AVGD and STDD) as $N$ increases. Thus, this analysis demonstrates that the technique can be very accurate for both types of variables when measurement error variances are low or $N$ is sufficiently large.
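The equivalence between a larger N and a smaller measurement variance can be made explicit with one worked step. The sketch below assumes that the N replicate measurements at each time instant are independent with a common covariance matrix, so that their mean has that covariance divided by N; the extra subscript N is notation introduced here for illustration only, and $\lambda_i$ denotes the noncentrality parameter of Eq. (23).

```latex
\Sigma_{R_i,N} \;=\; F\,\frac{\Sigma}{N}\,F^{T} \;=\; \frac{1}{N}\,\Sigma_{R_i},
\qquad
\lambda_{i,N}^{2} \;=\; \mu_{R_i}^{T}\,\Sigma_{R_i,N}^{-1}\,\mu_{R_i}
\;=\; N\,\mu_{R_i}^{T}\,\Sigma_{R_i}^{-1}\,\mu_{R_i}
\;=\; N\,\lambda_{i}^{2}.
```

Under this assumption the squared noncentrality parameter grows linearly with N for a fixed bias, which is consistent with the improvement with N reported in Tables 10 and 11.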
Additionally, the analysis presents evidence that the estimators used are reasonable ones (effect of large sample size). Based on this study, the DGTS appears to be an effective way to detect the presence of measurement biases and to distinguish level biases from flow biases for dynamic processes. Performance improves with larger bias magnitude $\delta$, larger N, and longer time periods after the bias has occurred.

Table 11
Effect of N (W, level measurement biases), α = 0.1, TTOC = 5, location = 2

W     N     AVTI    OP      AVGD     STDD
6     1     0.07    0.93     2.09    6.91
6     5     0.00    0.51     0.00    0.00
6     10    0.00    0.41     0.00    0.00
10    1     0.08    0.92    −0.32    2.07
10    5     0.00    1.00     0.00    0.00
10    10    0.00    1.00     0.00    0.00

7. Closing remarks

In this article, the DGTS, a technique for GED for dynamic processes, was developed for the case when flow and inventory measurement biases are present. The technique appears to have much promise in accurately detecting biased measurements, distinguishing biased flow variables from biased inventory variables, and determining the time at which variables become biased. Future challenges include extending this approach to the identification of the specific nodes and variables that are biased and to situations where the constraints are nonlinear. We are currently evaluating these conditions in research. In addition, since DR is an important task, we are also studying ways to accurately estimate process variables once the biases have been identified.

8. Uncited reference

Scheffe, 1959

Acknowledgements

We are grateful to the National Science Foundation for partial support of this research under Grant No. CTS-9310095.

Appendix A

To illustrate the derivation of the linear dynamic model, consider the simple process given above. A material balance around node 1 (i.e. tank 1) gives the following equation:

$$q_1^* + q_2^* - q_3^* = \frac{dw_1^*}{dt} = \frac{\Delta w_1^*}{\Delta t}. \tag{A1}$$

With $\Delta t = 1$ and $\Delta w_1^* = w_{1,i}^* - w_{1,i-1}^*$, we get two balance equations corresponding to the two nodes:

$$
\begin{aligned}
-w_{1,i}^* + w_{1,i-1}^* + q_{1,i}^* + q_{2,i}^* - q_{3,i}^* &= 0,\\
-w_{2,i}^* + w_{2,i-1}^* + q_{3,i}^* + q_{4,i}^* - q_{5,i}^* &= 0,
\end{aligned}
\tag{A2}
$$

where $w_{j,i}^*$ represents the true value of the total mass in tank j at time instant i, and $q_{j,i}^*$ represents the true value of the mass flow rate in stream j at time instant i. In the presence of process leaks, the above equations can be modified as

$$
\begin{aligned}
-w_{1,i}^* + w_{1,i-1}^* + q_{1,i}^* + q_{2,i}^* - q_{3,i}^* &= u_{1,i},\\
-w_{2,i}^* + w_{2,i-1}^* + q_{3,i}^* + q_{4,i}^* - q_{5,i}^* &= u_{2,i},
\end{aligned}
\tag{A3}
$$

where $u_{j,i}$ represents the magnitude of the leak in node j at time instant i. Then, with

$$
W_i^* = \begin{bmatrix} w_{1,i}^* \\ w_{2,i}^* \end{bmatrix}, \qquad
Q_i^* = \begin{bmatrix} q_{1,i}^* \\ q_{2,i}^* \\ q_{3,i}^* \\ q_{4,i}^* \\ q_{5,i}^* \end{bmatrix}, \qquad
L_i = \begin{bmatrix} u_{1,i} \\ u_{2,i} \end{bmatrix},
\tag{A4}
$$

and with

$$
M = \begin{bmatrix} 1 & 1 & -1 & 0 & 0 \\ 0 & 0 & 1 & 1 & -1 \end{bmatrix},
\tag{A5}
$$

we have

$$
-W_i^* + W_{i-1}^* + M Q_i^* = L_i
= -\begin{bmatrix} I_{2\times 2} & -M_{2\times 5} \end{bmatrix}
\begin{bmatrix} W_i^* \\ Q_i^* \end{bmatrix}
+ \begin{bmatrix} I_{2\times 2} & 0_{2\times 5} \end{bmatrix}
\begin{bmatrix} W_{i-1}^* \\ Q_{i-1}^* \end{bmatrix}.
\tag{A6}
$$

Now, using the notation

$$
D = \begin{bmatrix} I & -M \end{bmatrix}, \qquad
B = \begin{bmatrix} I & 0 \end{bmatrix}, \qquad
X_i^* = \begin{bmatrix} W_i^* \\ Q_i^* \end{bmatrix},
\tag{A7}
$$

Eq. (A6) can be written as

$$
-D X_i^* + B X_{i-1}^*
= \begin{bmatrix} -D & B \end{bmatrix}
\begin{bmatrix} X_i^* \\ X_{i-1}^* \end{bmatrix}
= F \mathbf{X}_i^* = L_i,
\tag{A8}
$$

where

$$
F = \begin{bmatrix} -D & B \end{bmatrix}, \qquad
\mathbf{X}_i^* = \begin{bmatrix} X_i^* \\ X_{i-1}^* \end{bmatrix}.
\tag{A9}
$$

Appendix B

The following illustrates how the relationship between the parameters (window size m, number of rejections in each window t, and number of windows q) and the performance of the DGTS can be obtained. We show the derivation for the case of t = 2. Thus, the DGTS concludes that there is a bias in one or more flow variables when (call this event Z) there are at least two rejections (of $H_0$: $\mu_{R_i} = 0$ in favor of $H_a$: $\mu_{R_i} \neq 0$) in each of q consecutive windows of size m. Let

$A_k$ = {at least two rejections of $H_0$ in window k}, k = 1, …, q,
$B_{1,k}$ = {no rejection of $H_0$ at any time within window k},
$B_{2,k}$ = {exactly one rejection in window k},
$C_j$ = {no rejection at time j}, j = 1, …, m.
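The decision rule behind event Z can be sketched in code. The sketch below is a hedged illustration rather than the authors' implementation: it takes a sequence of per-instant test outcomes (True meaning $H_0$ was rejected at that instant) and looks for q consecutive windows of size m that each contain at least t rejections. The function names, the boolean input format, and the `stride` parameter are hypothetical. In particular, the assumption that successive moving windows are offset by one time instant is ours; the defaults m = 4, q = 5, and t = 2 are the values quoted in the paper.

```python
from typing import Optional, Sequence


def count_rejections(rejections: Sequence[bool], start: int, m: int) -> int:
    """Count H0 rejections in the window of m instants beginning at `start`."""
    return sum(1 for r in rejections[start:start + m] if r)


def first_flow_bias_signal(
    rejections: Sequence[bool],
    m: int = 4,       # window size (paper: m = 4)
    q: int = 5,       # number of consecutive windows (paper: q = 5)
    t: int = 2,       # minimum rejections per window (Appendix B: t = 2)
    stride: int = 1,  # ASSUMPTION: offset between successive windows
) -> Optional[int]:
    """Return the index of the first instant of the first run of q
    consecutive windows that each hold at least t rejections (event Z),
    or None if event Z never occurs in the sequence."""
    n = len(rejections)
    span = (q - 1) * stride + m  # instants covered by the q windows
    for start in range(0, n - span + 1):
        if all(
            count_rejections(rejections, start + k * stride, m) >= t
            for k in range(q)
        ):
            return start
    return None


if __name__ == "__main__":
    # Five instants without rejection, then eight instants with rejection.
    outcomes = [False] * 5 + [True] * 8
    print(first_flow_bias_signal(outcomes))
```

With `stride=1` the windows overlap, which is one way to see why the events $A_1, \ldots, A_q$ share time instants; passing `stride=m` would give disjoint windows instead.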
Note that $C_j$ is independent of $C_k$ for all $j \neq k$; j, k = 1, …, m. Additionally, we use the conventional notation $\alpha = P[\text{Type I error}]$ and $\beta = P[\text{Type II error}]$. Thus, $P[C_j \mid H_0 \text{ is true}] = 1 - \alpha$, and $P[C_j \mid H_0 \text{ is false}] = \beta$.

Case 1 (Truth: No bias in any of the flow variables). The probability of concluding that there is at least one flow variable that is biased is given by $P[Z] = P[A_1 \cap A_2 \cap \cdots \cap A_q]$. We know that $P[A_k] = 1 - P[A_k^c]$. Here, $P[A_k^c] = P[B_{1,k} \cup B_{2,k}] = P[B_{1,k}] + P[B_{2,k}]$ (since $B_{1,k}$ and $B_{2,k}$ are mutually exclusive events). Now, note that

$$P[B_{1,k}] = P[C_1 \cap C_2 \cap \cdots \cap C_m] = P[C_1]P[C_2]\cdots P[C_m] = (1-\alpha)^m. \tag{B1}$$

Now, $B_{2,k}$ can occur in m ways because the one rejection can be in any one of the m slots (time instants) in the kth window. Thus,

$$P[B_{2,k}] = m\alpha(1-\alpha)^{m-1}. \tag{B2}$$

Hence,

$$P[A_k^c] = (1-\alpha)^m + m\alpha(1-\alpha)^{m-1}. \tag{B3}$$

$A_1, A_2, \ldots, A_q$ are not independent events and so for convenience we use Bonferroni's inequality (see Bain & Engelhardt, 1992) to get

$$P[Z] \geq 1 - \{P[A_1^c] + P[A_2^c] + \cdots + P[A_q^c]\} = 1 - q\{(1-\alpha)^m + m\alpha(1-\alpha)^{m-1}\}. \tag{B4}$$

Case 2 (Truth: Bias in one of the flow variables at the time instant g). Since there are numerous possible ways in which Z can occur, we show the calculation of $P[Z]$ for the most likely event, i.e. when there are at least two rejections of $H_0$ in each of q consecutive windows (of size m), where the first time instant in the first of these windows is time g. Under these conditions,

$$P[C_j \mid j \geq g] = \beta,$$

$$P[B_{1,k}] = P[C_1 \cap C_2 \cap \cdots \cap C_m] = P[C_1]P[C_2]\cdots P[C_m] = \beta^m, \qquad P[B_{2,k}] = m(1-\beta)\beta^{m-1}, \tag{B5}$$

and $P[A_k^c] = \beta^m + m(1-\beta)\beta^{m-1}$. Therefore, using Bonferroni's inequality we get

$$P[Z] \geq 1 - \{P[A_1^c] + P[A_2^c] + \cdots + P[A_q^c]\} = 1 - q\{\beta^m + m(1-\beta)\beta^{m-1}\}. \tag{B6}$$

The relationship between $\beta$ and the magnitude of the bias ($\delta$) is given by

$$1 - \beta = P[R_i^T \Sigma_{R_i}^{-1} R_i \geq \chi^2_{n,\alpha} \mid \mu_{R_i}] = P[\text{noncentral } \chi^2_n \geq \chi^2_{n,\alpha} \mid \lambda_i^2], \tag{B7}$$

with

$$\lambda_i^2 = \mu_{R_i}^T \Sigma_{R_i}^{-1} \mu_{R_i}, \tag{B8}$$

where $\lambda_i$ is called the noncentrality parameter, and $\mu_{R_i}$ and $\delta$ are related as shown in Eq. (24). Thus, it is possible to study the effect of varying the parameters t, q, m, and $\delta$ on the performance of the DGTS. However, for any given process we suggest, if possible, that the investigator determine the 'best' values of these parameters by simulation study, as that would be simpler. This is what we have done, as described in the body of this paper.

References

Almasy, G. A. (1990). Principles of dynamic balancing. American Institute of Chemical Engineering Journal, 39, 9.
Bellingham, B., & Lee, F. P. (1977). The detection of malfunction using a process control computer: A Kalman filtering technique for general control loops. Transactions of IChemE, 55, 253.
Darouach, M., & Zasadzinski, M. (1991). Data reconciliation in general linear dynamic systems. American Institute of Chemical Engineering Journal, 37 (2), 193.
Kalman, R. E. (1990). New approach to linear filtering and prediction problems. J. Basic Eng. ASME, 82, 35.
Kao, C. S., Tamhane, A. C., & Mah, R. S. H. (1992). Gross error detection in serially-correlated process data 2: dynamic systems. Industrial and Engineering Chemistry Research, 31, 254.
Kim, I. W., Leibman, M. J., & Edgar, T. F. (1991). A sequential errors in variable method for nonlinear dynamic systems. Computers and Chemical Engineering, 15, 663.
Khuen, D. R., & Davidson, H. (1961). Computer control. Chemical Engineering Progress, 57 (6), 44.
Leibman, M. J., Edgar, T. F., & Lasdon, L. S. (1992). Efficient data reconciliation and estimation for dynamic processes using nonlinear programming techniques. Computers and Chemical Engineering, 16 (11–12), 963.
Narasimhan, S., & Mah, R. S. H. (1987). Generalized likelihood ratio methods for gross error identification. American Institute of Chemical Engineering Journal, 33 (9), 1514.
Narasimhan, S., & Mah, R. S. H. (1988). Generalized likelihood ratios for gross error identification in dynamic processes. American Institute of Chemical Engineering Journal, 34 (8), 1321.
Newman, R. S. (1982). Robustness of Kalman filter-based fault detection methods. Ph.D. Diss. Imperial College, London, UK.
Ramamurthi, Y., Sistu, P. B., & Bequette, B. W. (1991). Data reconciliation and gross error detection in dynamic processes. Los Angeles, CA: AIChE Annual Meeting.
Rollins, D. K., & Devanathan, S. (1993). Data reconciliation in dynamic systems with linear constraints. American Institute of Chemical Engineering Journal, 39 (8), 1330.
Scheffe, H. (1959). The analysis of variance. New York: Wiley.
Swenker, A. G. (1964). Ausgleichung von Messergebnissen in der Chemischen Industrie. Acta IMENKO (p. 29). Budapest.
Watanabe, K., & Himmelblau, D. M. (1982). Instrument fault detection in systems with uncertainties. Int. J. Systems Sci., 13 (2), 137.
Willsky, A. S., & Jones, H. L. (1974). A generalized likelihood ratio approach to state estimation in linear systems subject to abrupt changes. Proc. IEEE Conf. Decision and Control (p. 846).