FAULT DETECTION AND DIAGNOSIS VIA IMPROVED MULTIVARIATE STATISTICAL PROCESS CONTROL NOORLISA BINTI HARUN UNIVERSITI TEKNOLOGI MALAYSIA FAULT DETECTION AND DIAGNOSIS VIA IMPROVED MULTIVARIATE STATISTICAL PROCESS CONTROL NOORLISA BINTI HARUN A thesis submitted in fulfilment of the requirements for the award of the degree of Master of Engineering (Chemical) Faculty of Chemical and Natural Resources Engineering Universiti Teknologi Malaysia JULY 2005 ii iii In the Name of Allah, Most Gracious, Most Merciful. I humbly dedicate to… my beloved family members my best friend those who has shaped my life and those who has influenced my life on the right path. iv ACKNOWLEDGEMENT Firstly, I would like to take this opportunity to thank my respectful supervisor, Assoc. Prof. Dr. Kamarul Asri Ibrahim who has relentlessly given guidance, constructive ideas and invaluable advice, enabling me to achieve the objective of this dissertation. In addition I am very grateful to my family members, no amount of gratitude could repay their kindness of being there as well as the patience to put up with my whim. I also would like to thank my fellow course mates and friends for the assistances, advices and ideas in producing a resourceful, fruitful and practicable research. May God bless all of you. v ABSTRACT Multivariate Statistical Process Control (MSPC) technique has been widely used for fault detection and diagnosis (FDD). Currently, contribution plots are used as basic tools for fault diagnosis in MSPC approaches. This plot does not exactly diagnose the fault, it just provides greater insight into possible causes and thereby narrow down the search. Hence, the cause of the faults cannot be found in a straightforward manner. Therefore, this study is conducted to introduce a new approach for detecting and diagnosing fault via correlation technique. The correlation coefficient is determined using multivariate analysis techniques, namely Principal Component Analysis (PCA) and Partial Correlation Analysis (PCorrA). An industrial precut multicomponent distillation column is used as a unit operation in this research. The column model is developed using Matlab 6.1. Individual charting technique such as Shewhart, Exponential Weight Moving Average (EWMA) and Moving Average and Moving Range (MAMR) charts are used to facilitate the FDD. Based on the results obtained from this study, the efficiency of Shewhart chart in detecting faults for both quality variables (Oleic acid, xc8 and linoleic acid, xc9) are 100%, which is better than EWMA (75% for xc8 and 77.5% for xc9) and MAMR (63.8% for xc8 and 70% for xc9). The percentage of exact faults diagnoses using PCorrA technique in developing the control limits for Shewhart chart is 100% while using PCA is 87.5%. It shows that the implementation of PCorrA technique is better than PCA technique. Therefore, the usage of PCorrA technique in Shewhart chart for fault detection and diagnosis gives the best for it has the highest fault detection and diagnosis efficiency. vi ABSTRAK Proses Kawalan Statistik Multipembolehubah (MSPC) digunakan secara meluas untuk mengesan dan mengenalpasti punca kesilapan. Pada masa kini, carta penyumbang digunakan untuk mengenalpasti punca kesilapan dalam MSPC. Carta ini tidak dapat mengenalpasti punca kesilapan dengan tepat yang mana ia sekadar menunjukkan kemungkinan besar punca kesilapan dan membantu memudahkan pencarian punca kesilapan. Oleh itu, punca kesilapan tidak dapat ditentukan secara langsung. Justeru, kajian ini telah dijalankan bagi memperkenalkan satu pendekatan baru untuk mengesan dan mengenalpasti punca kesilapan melalui teknik korelasi. Pekali korelasi ditentukan melalui kaedah Analisis Statistik Multipembolehubah iaitu Analisis Komponen Utama (PCA) dan Analisis Korelasi Separa (PCorrA). Turus penyulingan multikomponen industri digunakan sebagai operasi unit. Model turus ini dibangunkan menggunakan perisian Matlab 6.1. Teknik carta individu yang terdiri daripada Shewhart, Exponential Weight Moving Average (EWMA), dan Moving Average dan Moving Range (MAMR) digunakan bagi mengesan dan mengenalpasti punca kesilapan. Berdasarkan kepada keputusan yang diperolehi, kecekapan carta Shewhart dalam mengesan kesilapan untuk kedua-dua pembolehubah kualiti (Asid oliek, xc8 dan asid linoleik, xc9) adalah 100% yang mana lebih baik berbanding dengan carta EWMA (75% bagi xc8 and 77.5% bagi xc9)dan carta MAMR (63.8% bagi xc8 and 70% bagi xc9). Peratusan mengenalpasti punca kesilapan secara tepat menggunakan teknik PCorrA bagi carta Shewhart adalah 100% manakala PCA adalah 87.5%. Ini menunjukkan bahawa penggunaan teknik PCorrA adalah lebih baik berbanding teknik PCA. Oleh itu, penggunaan teknik PCorrA dalam carta Shewhart bagi tujuan mengesan dan mengenalpasti punca kesilapan adalah yang terbaik dengan peratusan kecekapan yang paling tinggi. vii TABLE OF CONTENTS CHAPTER 1 TITLE PAGE TITLE PAGE i DECLARATION ii DEDICATION iii ACKNOWLEDGEMENT iv ABSTRACT v ABSTRAK vi TABLE OF CONTENTS vii LIST OF TABLES xii LIST OF FIGURES xiv LIST OF SYMBOLS xvi LIST OF ABBREVIATIONS xx LIST OF APPENDICES xxi INTRODUCTION 1.1 Introduction 1 1.2 Research Background 2 1.3 Objectives of the Research 4 1.4 Scopes of the Research 4 1.5 Contributions of the research 5 1.6 Chapter Summary 5 viii 2 LITERATURE REVIEW 2.1 Introduction 7 2.2 Definitions of Fault, Fault Detection and Fault Diagnosis 8 2.3 Multivariate Statistical Analysis, MSA 9 2.4 Review on Principal Component Analysis, PCA 10 2.4.1 Extension of Principal Component Analysis, PCA method 2.4.1.1 Multi-Way PCA 2.4.1.2 Multi-Block PCA 15 2.4.1.3 Moving PCA 16 2.4.1.4 17 Dissimilarity, DISSIM 2.4.1.5 Multi-Scale PCA 18 2.4.2 PCA Adopted in Industries 19 2.4.3 20 PCA as Controller 2.4.4 Theory of PCA 2.5 Review on Partial Correlation Analysis, PCorrA 2.5.1 Theory of PCorrA 2.6 Chapter Summary 3 14 21 23 26 27 IMPROVED STATISTICAL PROCESS CONTROL CHART 3.1 Statistical Process Control, SPC 29 3.2 Traditional Statistical Process Control, SPC chart 30 3.2.1 Shewhart Chart 31 3.2.2 Exponential Weight Moving Average, EWMA Chart 35 3.2.3 Moving Average and Moving Range, MAMR Chart 39 3.3 Improve Statistical Process Control Chart 43 3.4 Chapter Summary 45 ix 4 DYNAMIC SIMULATED DISTILLATION COLUMN 4.1 Introduction 46 4.2 Process Description of Precut Column 48 4.3 The Information of Pre-cut column 49 4.4 Degree of Freedom (DoF) Analysis 51 4.5 Formulations of the Distillation Column Models 56 4.5.1 4.5.2 Mass Balance Equations 56 4.5.1.1 Bottom column model 58 4.5.1.2 Trays model 59 4.5.1.3 Reflux drum model 61 4.5.1.4 Pumparound drum model 62 Equilibrium Equations 63 4.5.2.1 66 Bubble Point Calculations 4.5.3 Summation Equation 68 4.5.4 Heat Balance Equation 68 4.5.5 Hydraulic Equations 69 4.6 Dynamic Simulation of the Study Column 70 4.7 Controller Tuning 74 4.7.1 Bottom Liquid Level Controller 75 4.7.2 Bottom Temperature Controller 76 4.7.3 Top Column Pressure Controller 78 4.7.4 Sidedraw Tray Liquid Level Controller 79 4.7.5 Pumparound Temperature Controller 80 4.7.6 Pumparound Flow Rate Controller 81 4.7.7 Reflux Flow Rate Controller 82 4.7.8 Controller Sampling Time, TAPC 82 4.8 Dynamic simulation Program Performance Evaluation 83 4.9 Chapter Summary 84 x 5 METHODOLOGY 5.1 Introduction 86 5.2 Variable Selection for Process Monitoring 87 5.3 Statistical Process Control Sampling Time, TSPC 90 5.4 Normal Operating Condition (NOC) Data Selection and 92 Preparation 5.4.1 Data pretreatment 94 5.4.2 Correlation Coefficient via Multivariate Analysis 94 Techniques 5.4.3 5.4.2.1 Principal Component Analysis, PCA 95 5.4.2.2 Partial Correlation Analysis, PCorrA 101 Control limits of Improved Statistical Process 103 Control, ISPC chart 5.5 Generated Out of Control, OC data 105 5.6 Fault Coding Monitoring Framework 109 5.7 Performance Evaluation of the FDD Method 111 5.7.1 Fault Detection Efficiency 111 5.7.2 Fault Diagnosis Efficiency 112 5.8 Chapter Summary 6 113 RESULTS AND DISCUSSIONS 6.1 Statistical Process Control Sampling Time, TSPC 116 6.2 Normal Operating Condition, NOC Data 117 6.2.1 Correlation coefficient via Principal Component 119 Analysis, PCA 6.2.2 Correlation coefficient via Partial Correlation 123 Analysis, PCorrA 6.3 Parameters to Design Improved Statistical Process Control 124 chart 6.4 Fault Detection and Diagnosis Method Performance 6.4.1 False alarm Analysis 127 127 xi 6.4.2 Fault Detection Efficiency Results and Results 129 Discussions 6.4.3 Fault Diagnosis Efficiency Results and Results 132 Discussions 6.5 Chapter Summary 7 136 CONCLUSIONS AND RECOMMENDATIONS 7.1 Introduction 138 7.2 Conclusions of Improved SPC framework 139 7.3 Recommendations to Future Research 141 LIST OF REFERENCES 142 APPENDICES 154 xii LIST OF TABLES TABLE NO. TITLE PAGE 4.1 Specification of the pre-cut column 49 4.2 Components composition and conditions of the pre-cut column 50 4.3 Number of variables 52 4.4 Number of equations 53 4.5 Control loops properties 55 4.6 Cohen-Coon controller settings 75 4.7 The step change results for bottom liquid level controller 76 4.8 Properties of bottom liquid level P controller 76 4.9 The step change results for bottom temperature controller 77 4.10 Properties of bottom temperature PID controller 77 4.11 The step change results for top column pressure controller 78 4.12 Properties of top column pressure PI controller 78 4.13 The step change results for sidedraw tray liquid level controller 79 4.14 Properties of sidedraw tray liquid level P controller 79 4.15 The step change results for pumparound temperature controller 80 4.16 Properties of pumparound temperature PID controller 81 4.17 Properties of pumparound flow rate P controller 81 4.18 Properties of reflux flow rate P controller 82 5.1 APC variables for each control loops 87 5.2 Normal correlation coefficient for process variables 88 5.3 SPC variables 88 xiii 5.4 The control limits of Improved SPC chart 104 5.5 Parameters to design EWMA 105 5.6 Description of sensor fault, valve fault and controller fault 107 5.7 Improved SPC chart statistic for FDD 108 5.8 Coding system for fault detection 109 5.9 Coding system for fault diagnosis 109 5.10 Fault coding system for Shewhart Chart 110 5.11 Fault coding system for EWMA Chart 110 5.12 Fault coding system for MAMR Chart 110 6.1 Time constant for each control loops 116 6.2 The skewness and kurtosis value of NOC data 118 6.3 Summary of the NOC data 118 6.4 Eigenvectors Matrix 119 6.5 Eigenvalues, variation explained by each PC and cumulative 119 variation 6.6 The correlation coefficient using different cumulative variation 121 6.7 The percentage difference for two PCs retained and three PCs 122 retained compared to six PCs 6.8 Simple correlation matrix 123 6.9 The correlation coefficient result using PCorrA 124 6.10 Constant to design Shewhart range chart 124 6.11 FDD result using different L and β to design EWMA chart 125 6.12 FDD result using PCA with different window size to design 126 MAMR chart 6.13 FDD result using PCorrA with different window size to design 126 MAMR chart 6.14 False alarm result corresponding to oleic acid composition 128 6.15 False alarm result corresponding to linoleic acid composition 128 6.16 The known fault causes 129 6.17 The known out of control, OC conditions 130 xiv LIST OF FIGURES FIGURE NO. TITLE PAGE 2.1 Multi-block PCA monitoring framework 15 3.1 Shewhart range and Shewhart individual chart during normal 34 condition 3.2 Shewhart range and Shewhart individual chart during faulty 34 condition 3.3 EWMA control chart under normal condition 38 3.4 EWMA control chart under faulty condition 38 3.5 Moving average and range control chart under normal condition 42 3.6 Moving average and range control chart under faulty condition 42 3.7 The implementation of correlation coefficient, Cjk in SPC chart 44 4.1 Pre cut column 48 4.2 Distillation column control loop 54 4.3 Distillation column schematic diagram 57 4.4 Bubble point calculation procedure 67 5.1 The procedure to calculate and implement TSPC 92 5.2 Example Scree plot 97 5.3 The procedures to determine correlation coefficient, Cjk via PCA 100 method 5.4 The procedures to determine correlation coefficient, Cjk via 103 PCorrA metho 5.5 The outlined of FDD algorithm 115 xv 6.1 The autocorrelation plot 117 6.2 Histogram plot of NOC data 118 6.3 Scree plot 120 6.4 Fault detection efficiency using different Improved SPC chart 131 6.5 Percentage of faults occurs in different region detected by each 132 Improved SPC chart 6.6 Fault diagnosis situation using PCA for oleic acid composition 133 6.7 Fault diagnosis situation using PCorrA for oleic acid 134 composition 6.8 Fault diagnosis situation using PCA for linoleic acid 134 composition 6.9 Fault diagnosis situation using PCorrA for linoleic acid composition 135 xvi LIST OF SYMBOLS aij - Wilson constant n - Number of retained principal component A - Square matrix Ab - Bottom column area At - tray active area Ai,Bi,Ci - Antoine constant for component i A*,B*,C*,D* - Constant for liquid heat capacity A2, D.'001 , D.'999 - Constant for MA and MR control limit β - Weighting factor B - Bottom flow rate Cj,k - Correlation coefficient between variable j and variable k δ - Shifted value in standard deviation unit D - Distillate flowrate FDetect - Fault Detection FDiagnose - Fault Diagnosis ∆H vap - Heat of vaporization ∆H vap, n - Heat of vaporization at normal boiling point hN - Liquid enthalpy leaving each tray hb - Enthalpy for bottom stream how - Over weir height H, h - CuSum control limits HN - Vapor enthalpy leaving each tray xvii I - Identity matrix Kc - Controller Gain Kp - Static Gain, L28 - Liquid flowrate at tray 28 LH, P - Pumparound drum height Lf - Liquid feed flowrate LH, Re - Reflux drum liquid level LH, B - Bottom liquid level L - Width of the control limit M - Number of observations MB - Bottom molar hold-up MW - Molecular weight MRe - Reflux drum molar hold-up MP - Pumparound drum molar hold-up MN - Molar hold up at tray N N - Number of stages p - Number of observations P - Pumparound flowrate Pci - Critical pressure of component i Pi sat - Vapor pressure for component i Ptot - Total pressure of the system ρ - Liquid density QR - Reboiler heat duty Q - Number of variables for response variables R - Correlation matrix rj,k - Correlation value between variable j and variable k Re - Reflux flowrate R - Universal gas constant rk - Autocorrelation coefficient xviii S - Sidedraw flowrate S - Variance-Covariance matrix sj,k - Covariance value of variable j and variable k s - NOC standard deviation T - Temperature of the system Tci - Critical temperature of component i Tri - Reduced temperature of component i Tb,i - Boiling point for component i TAPC - APC sampling time TSPC - SPC sampling time t, u - Latent vectors τ - Time Constant τD - Derivative Time Constant τI - Integral Time Constant θ - Dead time V - Eigenvector matrix v - Eigenvector or loading vectors V - Vapor flowrate Vci - Critical volume for component i Vi - Liquid molar volume of component i WL - Weir length ω - Critical accentric factor of component i w - Un-normalized eigenvectors x - NOC mean x1 - Shifted mean xi - ith observation in the process xij - Value of the i-th row and j-th column matrix xijs - Standardized variable x - Liquid phase mole fraction xix xS - Sidedraw mole fraction xRe - Reflux mole fraction xP - pumparound liquid mole fraction xD - Distillate mole fraction γi - Liquid phase activity coefficient for component i y - Vapor phase mole fraction zi - The ith EWMA statistic Zci - Critical compressibility factor for component i φi - Vapor fugacity coefficient for component i Λij - Wilson binary interaction parameter for component i and j xx LIST OF ABBREVIATIONS APC - Automatic Process Control DoF - Degree of Freedom EWMA - Exponential Weight Moving Average FDD - Fault Detection and Diagnosis ISPC - Improved Statistical Process Control MA - Moving Average MESH - Mass, Equilibrium, Summation, Heat MR - Moving Range MSA - Multivariate Statistical Analysis MSPC - Multivariate Statistical Process Control NIPALS - Non-iterative Partial Least Squares NOC - Normal Operating Condition OC - Out of Control PCA - Principal Component Analysis PCorrA - Partial Correlation Analysis PV - Process Variable QV - Quality Variable SPC - Statistical Process Control USPC - Univariate Statistical Process Control xxi LIST OF APPENDICES APPENDIX TITLE PAGE A Column profile comparison 155 B Parameter selection to design EWMA chart 158 C Parameter selection to design MAMR chart 163 D Number of fault causes 170 CHAPTER 1 INTRODUCTION 1.1 Introduction A major technological challenge facing the processing industries is the need to produce consistently high quality product. This is particularly challenging in high demanding situations where processes are subject to varying raw material properties, changing market needs and fluctuating operating conditions due to equipment or process degradation. The need to provide industry with techniques, which enhance process performance, will require new methodologies to be adopted that are capable of being used across a spectrum of industrial processing operations and on a wide range of products (Simoglou et al., 2000). Modern industrial processes typically have a large number of operating variables under closed loop control. These loops used to compensate for many types of disturbances and to counteract the effect of set point change. This is necessary to achieve high product quality and to meet production standards. Although these controllers can handle many types of disturbances, there are faults in the process such as line blockage, line leakage, sensor fault, valve fault and controller fault that cannot be handled adequately. In order to ensure that process is operating at normal operating condition as required, faults must be detected, diagnosed and removed. These activities, and their management, are called as Statistical Process Control, SPC (Miletic et al., 2004). 2 The combination of Statistical Process Control (SPC) charts and multivariate analysis approach is used for Fault Detection and Diagnosis (FDD) in the chemical process. Principal Component Analysis (PCA) and Partial Correlation Analysis (PCorrA) are the techniques used in this study. A precut multicomponent distillation column that has been installed with controllers is used as the study unit operation. Improved Statistical Process Control method is implemented to detect and diagnose various kinds of faults, which occur in the process. 1.2 Research Background Statistical methods are used to monitor the performance of the process over time in order to detect any process shifts from the target. Most of Statistical Process Control (SPC) techniques involve operations on single response variables such as weight, pH, temperature, specific gravity, concentration and pressure. This traditional SPC is used to monitor and verify that the process remains in statistical control based on small number of variables. The state of statistical control means that the process or product variables remain close to the target. Normally the fault in the process is seek through the usage of SPC chart, i.e., the final product quality variables. Measuring quality variables alone are not enough to describe the process performance (Kourti et al., 1996). In this study, both quality variables and process variables are monitored. This can be done by developing the correlation coefficient between quality variables and process variables using multivariate analysis technique. In traditional SPC, once the quality variables detected out of statistical control signal, it is then left up to process operators and engineers to try to diagnose the cause of out of control using their process knowledge (MacGregor and Kourti, 1995). Chemical processes are becoming better instrumented with the advances of process sensors and data acquisition systems. There are massive amount of data being collected continuously and these variables are often correlated. Multivariate statistical analysis has been developed in order to extract useful information from process data and utilize it for process monitoring (Kresta et al., 1991). Normally, 3 multivariate statistical analysis technique is combined with SPC chart using Hotelling’ s T2 and Q statistics to plot the graph for fault detection and diagnosis. Even though this conventional method provides good results for fault detection, there is difficulty in applying this method to diagnose faults because it does not provide a causal description of the process that can be used as the basis for fault diagnosis. Currently, contribution plots are used as the basic tools for fault diagnosis in the multivariate statistical approaches. However, this plot cannot exactly diagnose the cause but it just provides greater insight into possible causes and thereby narrows the search. The cause of the faults cannot be found in a straightforward manner. To overcome the limitation of contribution plot using multivariate analysis technique to diagnose fault, Improved SPC chart is introduced in this study. Multivariate analysis approach is applied in the SPC realm procedure to detect and diagnose the faulty condition in different approach. The multivariate analysis method is used to develop the control limits of SPC charts. The correlation coefficient calculated from multivariate analysis technique is applied to improve the SPC chart. The quality variables data is incorporated in the control chart during the faulty condition for fault detection purpose, while the process variables which has been correlated with the quality variables is used for fault diagnosis. Therefore, the Improve SPC charts applied is not only for quality variables but also for process variables. By monitoring the process variables, the cause of faulty condition, which affects the quality variables, could be identified directly. This will help the operator to take proper action immediately after faults exist in the process. 4 1.3 Objectives of the Research 1. To determine the performance of the Improved Statistical Process Control charts; which are Shewhart individual and Shewhart range, Exponential Weight Moving Average (EWMA) and Moving Average and Moving Range (MAMR) chart for fault detection. 2. To utilize multivariate analysis technique, Principal Component Analysis (PCA), and Partial Correlation Analysis (PCorrA), for developing the relationship between the process variable(s) and the quality variable(s) for fault diagnosis. 1.4 Scopes of the Research The scopes of the research are: 1. A simulated precut multicomponent distillation column model from the literature is used as a case study. The model consists of controllers with various kinds of disturbances and operating conditions. 2. A set of Normal Operating Condition (NOC) data and a set of Out of Control (OC) data are generated using this simulated distillation column. 3. Normal operating statistical model is developed using the multivariate techniques, PCA and PCorrA via correlation coefficient approach between the quality variable(s) and the process variable(s). 4. The Shewhart individual and Shewhart range, Exponential Weight Moving Average (EWMA), and Moving Average and Moving Range (MAMR) charts are developed. 5. The faulty condition is incorporated into the process model in order to see the performance of the control charts to detect the fault(s) and to find the cause of the fault(s). 5 6. The results of fault detection and diagnosis are then discussed further. 1.5 Contributions of the Research In this research, a new approach known as Improved Statistical Process Control for fault detection and diagnosis is introduced. Multivariate analysis technique, Principal Component Analysis, PCA and Partial Correlation Analysis, PCorrA are used to develop the correlation coefficient between quality variables and process variables. This technique is incorporated in the Improved SPC chart as a fault diagnosis tools to find the cause of the faulty condition. Improved SPC charts are applied for both quality variables and process variables which have been correlated with quality variables of interest. 1.6 Chapter Summary This thesis contains seven chapters. The first chapter comprises of the introduction of the research, research background, objectives of the research, scopes of the research and contributions of this research. Chapter 2 reviews the multivariate analysis techniques and the concept of multivariate analysis technique used in this study, which are Principal Component Analysis (PCA) and Partial Correlation Analysis (PCorrA). Chapter 3 consists of the concept of Statistical Process Control, SPC chart, detail explanation on SPC charts; Shewhart, EWMA and MAMR and the way to construct these charts. This chapter also presents the implementation of multivariate analysis technique in SPC chart to develop Improve SPC chart. Chapter 4 expresses the reasons for choosing distillation column as the case study. This chapter also presents the dynamic modeling of the columns, formulation 6 of dynamic simulating algorithm, establishment of dynamic simulation program, controller tuning and explanation on Automatic Process Control, APC time sampling. The performances of the written simulation program are compared with HYSYS simulator. Chapter 5 mainly contains the procedure for fault detection and diagnosis. This chapter contains variables selection to perform process fault detection and diagnosis, determination of SPC sampling time, generating of normal operating condition data, out of control data and the development of correlation coefficient using PCA and PCorrA to relate the quality variables with process variables. Chapter 6 presents the result and discussion of the research. The results are systematically presented. Discussions are made on the performance of each Improved SPC chart to detect known fault in the process and the efficiency of the proposed method to conduct fault diagnosis with application of multivariate statistical analysis. Chapter 7 contains the conclusions of the research and suggestions to the future SPC research activities. CHAPTER 2 LITERATURE REVIEW 2.1 Introduction Malfunctions of plant equipment, instrumentation and degradation in process operation increase the operating costs of any process industries. More serious consequences are gross accident such as explosion. Although major catastrophes and disasters from chemical plant failures may be infrequent, minor accidents are very common, occurring on a day to day basis, resulting in many occupational injuries, illnesses, and costing the society billions of dollars. The petrochemical industry losses approximately $20 billion annually due to poor management in abnormal detection events (Venkatasubramanian et al., 2003a). Chen et al. (2004) also highlighted that the US-based petrochemical industry could save up to $10 billion annually if abnormal process behavior could be detected, diagnosed and appropriately dealt with. Effective monitoring strategy for early fault detection and diagnosis is very important not only from a safety and cost viewpoints, but also for the maintenance of yield and the product quality in a process. This area has also received considerable attention from industry and academia alike and many researches have been conducted and many industrial applications exist. 8 2.2 Definitions of Fault, Fault Detection and Fault Diagnosis In general, fault is deviations from the normal operating behavior in the plant that are not due to disturbance change or set point change in the process, which may cause performance deteriorations, malfunctions or breakdowns in the monitored plant or in its instrumentation. Himmelblau (1978) defines a fault as a process abnormality or symptom, such as high temperature in a reactor or low product quality. Faults can be categorized into the following categories (Gertler, 1998): 1. Additive process faults Unknown inputs acting on the plant, which are normally zero. They cause a change in the plant outputs independent of the known input. Such fault can be best described as plant leaks and load. 2. Multiplicative process faults These are gradual or abrupt changes in some plant parameters. They cause changes in the plant outputs, which also depend on the magnitude of the known inputs. Such faults can be best described as the deterioration of plant equipment, such as surface contamination, clogging, or the partial or total loss of power. 3. Sensor faults These are discrepancies between the measured and actual values of individual plant variables. These faults are usually considered additive (independent of the measured magnitude), though some sensor faults (such as sticking or complete failure) may be better characterized as multiplicative. 4. Actuator faults These are discrepancies between the input command of an actuator and its actual output. Actuator faults are usually handled as additive though, some kind (such as sticking or complete failure) may be described as multiplicative. While in many cases the classification of a particular fault as additive or multiplicative follows naturally from its nature, sometimes it may also be arbitrary. 9 Fault detection is a monitoring process to determine the occurrence of an abnormal event in a process, whereas fault diagnosis is to identify its reason or sources. The detection performance is characterized by a number of important and quantifiable benchmarks namely: 1. Fault sensitivity: the ability of the technique to detect faults of reasonably small size. 2. Reaction speed: the ability of the technique to detect faults with reasonably small delay after their occurence. 3. Robustness: the ability of the technique to operate in the presence of noise, disturbances and modeling errors, with few false alarms. 2.3 Multivariate Statistical Analysis, MSA Multivariate Statistical Analysis (MSA) is based on development of multivariate regression model in which multiple input variables are related to one or several outputs. Multivariate analysis technique is combined with statistical process control, SPC charts to form Multivariate Statistical Process Control (MSPC) (Wise and Gallagher, 1996; Ralston et al., 2001; Yoon et al., 2003). MSA is applied to extract useful information from process data and utilize it for process monitoring using control chart (Kano et al., 2000a). This method takes the interdependence and large volume of process characteristics into consideration. MSA may be seen as a response to the complexity and quantity of information available in a modern computerized installation for FDD on plant (Lennox et al., 1999). Chemical processes and associated control systems have become very complex due to product quality demand, economics, safety, and environmental constraints. As a result, these processes are heavily instrumented and more process data is readily available. Therefore, the application of MSA couple up with SPC chart for monitoring, fault detection and diagnosis is appropriate since this method is based on operating data that is readily available. In chemical processes, data based approaches rather than model-based approaches have been widely used for process 10 monitoring, because it is often difficult to develop detailed physical models (Kano et al., 2000a; Yoon and MacGregor, 2000). MSA only requires a good database of normal historical data, and the models are quickly and easily built from this. MSA has traditionally employed static linear analysis with normally distributed variables. Recent research showed that the application of MSA is not limited for linear process but also applicable for nonlinear and process transition (Duchesne et al., 2002). The application of this method in various fields has been widely published and many industrial applications exist. This chapter reviews the literature on the theory and application of Principal Component Analysis (PCA) and Partial Correlation Analysis (PCorrA). Both PCA and PCorrA are multivariate analysis techniques used in this study. 2.4 Review on Principal Component Analysis, PCA In this section, the overview of the multivariate statistical technique used, namely Principal Component Analysis (PCA) is presented. Principal Component Analysis is also known as projection to latent structures (Jackson, 1991; Daszykowski et al., 2003). Most of the important information from historical data collected during normal operating process was projected into low dimensional model that have the most variance. As new data comes available, they are compared with this normal low dimensional model to detect any abnormal behavior. PCA is best suited for analysis of steady-state data containing linear relationships between the variables. This method has since been used extensively in analytical chemistry, social and economic science, and more recently in engineering systems. The applications of this approach in various industrial processes have been reported in literature by several authors and becoming more common in published research. 11 PCA for process monitoring and fault detection and diagnosis has been widely studied for continuous, batch and semibatch processes. Ralston et al. (2001) applied PCA on the iron chloride waste stream to produce high purity sodium chloride. Q residual was used to detect abnormal variance of a test data set from the PCA model while the diagnosis was done using the Q residual and Q contribution plots. Instead of using overall residual to compensate masking errors, Ralston et al. (2001) used confidence limits on single residual. However, this method has a drawback in which it required great deal of data to sort out for each variable to calculate the confidence limits. PCA method is also applied for other continuous processes such as solvent recovery plant (Lennox et al., 1999) and reactors in a chloro-carbon production plant (Goulding et al., 2000). Both used Hotelling’s T2 statistic and Q statistic for fault detection and contribution plots for faults diagnosis. Yoon and MacGregor (2001) introduced fault signature to diagnose fault causes applied for Continuous Stirred Tank Reactor, CSTR system with feedback control. For semibatch processes, Simoglou et al. (2000) has applied PCA in industrial fluidised-bed catalytic reactor for process monitoring for fault detection and diagnosis. Other researchers used extension of PCA method that is multi-way PCA for semibatch and bath process. This method is discussed in details in Section 2.4.1. PCA method has been actively used in recent years for research on sensor fault detection and diagnosis. A data driven system model using PCA method was adopted for detecting and diagnosing sensor faults in nuclear power plants (Upadhyaya et al., 2003). The FDD approach has been applied to a steam generator system. PCA model is used for fault isolation. The first principal component of this model is used as a fault signature. For a given test case, the residual vector is computed using the data-driven model. This vector then projected on to the selected PC direction and the corresponding cosine of the angle is determined. If this measure is close to unity (> 0.9), then the fault is isolated. This procedure is repeated for all the fault directions. The drawback of this technique is that, it requires sufficient fault data for each case as a fault signature for fault diagnosis. 12 Jiji et al. (2003) used PCA to monitor networked Early Warning Fire Detection (EWFD) systems in Navy ship for both event detection and determination of source location and growth of fire rate. Typical fire detection systems monitor the individual fire detectors and display their responses. Since smoke detectors could not discriminate between actual fires and the smoke byproducts, Hotelling’s T2 statistics and the Q-statistics were employed for event detection. Subsequently, contribution plots were used to determine the source location, rate of growth and to discriminate between actual fires and their byproducts in adjacent departments. A key benefit of using MSPC method is its faster detection of smoldering fires while maintaining rapid detection of flaming fires. Reduction in detection times as much as 5 minute was observed for the five smoldering fires tested. Principal Component Analysis (PCA) method was developed to detect and diagnose sensor faults in Air Handling Units (AHU) (Wang and Xiao, 2004). Sensor faults were detected using the Q statistic (Squared Prediction Error, SPE). They were isolated using the Q statistic and Q contribution plot supplemented by simple expert rules. The PCA method in detecting and diagnosing sensor faults of the AHU were tested and evaluated by simulation test. Tests show that the PCA method is a valuable tool in AHU process monitoring, fault detection and isolation. PCA method can detect the faulty condition in various applications and can detect any change of relationship between variables. However, PCA method cannot directly indicate which variables are responsible since it does not provide a causal description of the process that can be used as the basis for fault diagnosis. Due to this, the PCA method is not very intuitive and does not always lend operators and engineers to easy interpretation (Singh and Gilbreath, 2002). Currently, contribution plots are used as the basic tools for fault diagnosis in the multivariate statistical approaches (Kourti and MacGregor, 1996; Kourti et al., 1996). These plots are typically very powerful in diagnosing simple faults, but much less successful at providing a clear isolation of complex faults. The variable making a large contribution to the Q statistic is indicated to be the potential fault source (Wang and Xiao, 2004). The contribution plot helps to reduce the possible fault sources, and thus focuses on a smaller range of measurements among all variables and thereby greatly narrows the research (MacGregor and Kourti, 1995; Yoon and MacGregor, 13 2000). Narrowing the possibilities of which variable is highly correlated with the fault using contribution plot does not equivocally diagnose the cause of the faults (Chiang et al., 2000). Several studies have been conducted to overcome the limitation using contribution plot for fault diagnosis. Yoon and MacGregor (2001) proposed fault signature developed from prior faults to diagnose faults. A fault signature consists of directions of movement of the process in both the PCA model space and in the orthogonal residual space during the period immediately following its detection of the fault. The fault signature bank or catalog would simply be the collection of these fault signatures that are available to date. Once a new fault occurs its signature is compared with those in the fault bank in order to identify the most likely cause. However, this method requires sufficient fault data to load in fault signature bank to cover all possible faults in a process. Wang et al. (2002) proposed an improved PCA that uses two new statistics of Principal-Component-Related Variable Residual, PVR and Common Variable Residual, CVR to replace Q statistics in conventional PCA. The process data vector in residual subspace is projected to another two subspace and the information is represented by PVR and CVR statistics. However, the drawback of this method is calculating control limits for the new statistic is similar with the Q statistic in conventional PCA. This result in the fault causes cannot be identified in a straightforward manner. PCA was also applied in data analysis to find the cause of the problem occurred in the process. Santen et al. (1997) used score and loading plot via PCA method to access the problem of hot spot in reactors of Shell petrochemical process. The outcome showed that maldistribution in catalyst bed the most important factor cause high temperature in some of adiabatic Shell reactors. Solving hot spot problems has improved the performance of the process and generate extra savings and benefits in the order of 2 millions dollars per annum. 14 2.4.1 Extension of Principal Component Analysis, PCA method 2.4.1.1 Multi-Way PCA Conventional PCA is best for analyzing a two-dimensional matrix of data collected from a steady state process, containing linear relationships between the variables. Since these conditions are often not satisfied in practice, several extensions of PCA have been developed. Nomikos and MacGregor (1994) proposed Multi-Way PCA, which allows the analysis of a multi-dimensional matrix. Multi way method organized the data into time ordered block which each represent a single sample or process run. Three dimensional array data (I, batch samples x J, process variables x K, time) is decomposed to two dimensional array (I x JK) data for easier analysis (Wise and Gallagher, 1996). Projection of these three dimensions data into two dimensions makes this method suitable and widely applied for batch processes. Nomikos and MacGregor (1994) used simulated data obtained from a semibatch reactor to monitor the process. Multi-Way PCA was applied to industrial batch polymerization reactor using Hotelling’s T2 chart for fault detection and contribution plots for fault diagnosis (MacGregor and Kourti, 1995; Nomikos, 1996; Kourti et al., 1996). Multi-Way PCA was also applied using process data collected from an industrial fed-batch fermentation process (Lennox et al., 2001). Lopes and Monezes (1998) applied this method for an industrial antibiotic production processes to detect faulty batches. Martin et al. (1999) used batch polymerization reactor to illustrate the implementation of Multi-Way PCA. Instead of using T2 statistic, M2 statistics was used to determine the confidence bound for data not normally distributed. Martin and Morris (1996) proposed M2 statistics that an empirical density based approach which the bounds calculated is based on density estimation. Under the Multi-Way PCA monitoring framework, Q statistic is used as the fault detection tool to detect abnormal batch variation whereas contribution plots are used as the fault diagnosis tool to isolate the faulty process variables that are responsible for the out-of-control situation. The disadvantage of contribution plot is that, they cannot isolate the causes automatically without the presence of confidence limit. The plant personnel need to decide whether the out-of-control situation is due to single or multiple faulty process variables. 15 2.4.1.2 Multi-Block PCA Extensions of basic PCA to handle very large processes via Multi-Block PCA were made by MacGregor et al. (1994). This method permits easier modeling and interpretation of a large matrix by decomposing it into smaller matrices or blocks. The Multi-Block PCA enables plant wide monitoring. The monitoring framework for fault detection and diagnosis can be viewed in Figure 2.1. Process Variables Unit Operation 1 Unit Operation 2 Block 1 Unit Operation 3 Block 2 Block T2 Block 3 • • • Unit Operation N • • • Block N Block SPE Out of control Contribution Plots Figure 2.1: Multi-block PCA monitoring framework The process variables are divided into several blocks with respective to specific unit operation. Block Hotellings’ T2 control chart and Q statistic control chart are used to detect out-of-control situation while contribution plots are used to isolate faulty process variables that cause the out-of-control situation. This approach relies on contribution plots for fault diagnosis; the shortcomings of this method still exist and could not offer complete fault isolation. 16 2.4.1.3 Moving PCA The concept of Moving Principal Components Analysis is based on the idea that a change of correlation in between process variables can be detected by monitoring the directions of principal components (Kano et al., 2000a; Kano et al., 2001b). In order to evaluate the change of direction of each principal component, an index based of the inner product between two principal components is defined. The index proposed in MPCA contained information on the current PCs directions with reference PCs directions to detect any non-conformance situation to the reference models. Previous known fault data sets are used to construct PCA models corresponded to various kinds of fault situation to diagnose the fault cause. The drawbacks of this method are as follow: a) There are a lot of PCA models, which representing various past fault situations are needed. b) Sufficient data have to be collected in order to construct PC models for each fault situation. 17 2.4.1.4 Dissimilarity, DISSIM Kano et al. (2000d) proposed monitoring method based on process data distribution known as DISSIM. DISSIM method is based on the idea that a change of operating condition can be detected by monitoring a distribution of time-series data, which reflects the corresponding operating condition. The degree of dissimilarity between data sets is determined in DISSIM method (Kano et al., 2000a; Kano et al., 2000b; Kano et al., 2001a). Dissimilarity index was defined to evaluate the difference between two data quantitatively. Dissimilarity index control chart is used for fault detection. This index contains the information of the current data distribution with reference data distribution. For fault diagnosis, each historical known fault data set is used to construct the known fault PCA models, which represents specific known fault data distribution. Besides, a similarity index is introduced to compare the current fault data distribution to each previous known fault PCA models. The proposed method is limited to PCs, which have similar variances because the index cannot function well if the PCs are changed. Other drawbacks of this method are sufficient data is required to construct every known fault PCA models and non-ability to isolate new fault. For fault diagnosis purpose, a contribution of each process variable to the dissimilarity index is introduced for identifying the variables that contribute significantly to an out-of-control value of the index (Kano et al., 2000c). However, there is difficulty to identify exact fault causes for the process, which has many feedback control loops and the process variables are complicatedly related to each other. 18 2.4.1.5Multi-Scale PCA Bakshi (1998) developed Multi-Scale Principal Components Analysis, MSPCA by combining PCA and wavelet analysis. PCA has the ability to extract the relationship between the process variables and de-correlate cross correlation while wavelet analysis has the ability to extract events at different scales, compress deterministic features in a small number of relatively large coefficients, and approximately decorrelate a variety of stochastic processes (Bakshi, 1999). MS-PCA methodology determines separate PCA models at each scale to identify the scales where significant events occur. MS-PCA method has been applied for fault detection in industrial Fluidized Catalytic Cracker Unit, FCCU. Results showed that MS-PCA detects the faulty condition faster than conventional PCA using Hotelling’s T2 statistic and Q statistic but the weakness of this method is that he didn’t propose fault diagnosis method. Kano et al. (2000a) applied MS-PCA to monitor problems of a simple two dimension matrix array data obtained from Tennessee Eastman Challenge process. Other researchers, Misra et al. (2002) proposed the combination of PCA and wavelet analysis. In essence, the MS-PCA approach is the same as proposed by Bakshi (1998). However, some differences have been introduced in their study such as multi-scale fault identification technique to identify the type of fault and sensor validation approach to serve as an early warning in case a fault of large magnitude is present. An industrial gas phase tubular reactor system used in this work for process fault diagnosis and sensor fault detection. The outcomes showed that the proposed method was able to detect and identify faults and abnormal events earlier than the conventional PCA approach. The disadvantage of this method is that, it requires basic understanding of the physical and chemical principles governing the process operation to help in clustering the highly correlated variables together before constructing the PCA model. Multi-scale fault identification does not provide the limits for contribution plot. 19 2.4.2 PCA Adopted in Industries The multivariate statistical analysis approach has been in widely accepted and utilized by industry. Yoon et al. (2003) discussed the application of MSPC to a continuous process at Honeywell, Geismar, LA. They illustrated the approach and its benefits, that is improved process understanding, improved product quality, and improved bottom line. MSPC was used for detecting of caking of HF furnaces and predicting fuel gas flow rate to HF furnace. Based on normal operation data, PC (Principal Component) model for furnace caking was developed. Furnace caking detected using Distance to the Model in the X Plane (DModX) and Hotelling’s T2 statistics and contribution plots used to diagnose the detected abnormal event. The caking model has been good tool used by operators and engineers as a quick monitoring method for each furnaces operation. The economic benefits were substantial. Return On Investment (ROI) was estimated to be less than two months in the first year. Miletic et al. (2004) discussed another application of PCA method in industry involving continuous and batch process. Multivariate statistical analysis is proven to be a useful technique for Dofasco and Tembec in many applications. Dofasco Inc. is a fully integrated steel producer, serving customers throughout North America with high quality of rolled and tubular steels and laser welded blanks from its facilities in Canada, the United States and Mexico. Tembec Inc. is a leading integrated Canadian forest products company principally involved in the production of wood products, market pulp and papers. PCA method has been applied for a continuous slab caster (Dofasco) for fault detection. The deployment of the PCA-based multivariate SPC monitoring system has had a significant impact on operating performance. Dofasco has enjoyed its highest productivity levels with an on-average 50% reduction in breakouts of its first caster, over the last six years. The used of PCA in batch cooks monitoring in sulfite pulp digester (Tembec) for fault detection has allowed operators to compensate quickly for small variations in the process. The Multivariate statistical applications developed at Dofasco and Tembec have delivered valuable business results that can be related to company profits. The Multivariate statistical applications is proven to be robust to many practical matters such as missing data, 20 and is proven to be amenable to various maintenance and updating strategies. Multivariate statistical analysis provides models and control chart schemes that are readily understood by production personnel once training are provided. 2.4.3 PCA as Controller Chen et al. (1997) designed multivariate statistical controller to decrease the variation in product quality without real time quality measurements. The cross correlation and autocorrelation among the variables are taken into consideration to model the dynamic PCA. Principal Component, PC scores are calculated at a new sampling time and the difference between previous data and current data is calculated to determine Squared Prediction Error, SPE. SPE control limit is determined using reference data. If the SPE is above its control limits, the statistical controller should be turned off because the structure of the process variables has changed. The multivariate statistical controller utilizes set point in a conventional control system to improve control performance. This method is applied in a continuous process, binary distillation column and Tennessee Eastman Challenge process. A multivariate statistical controller is built to replace the composition controller in binary distillation column case study. Results showed that statistical controller could detect and compensate disturbance faster than PI controller. A large variation due to the disturbances is shrunk using multivariate statistical controller and steady state offset is eliminated using the combination of statistical controller and feedback controller. Souza et al. (2004), proposed the joint use of two methodologies, Statistical Process Control, SPC and Engineering Process Control, EPC for controlling the quality of products and services. Statistical Process Control, SPC perform process monitoring using measurements obtained in the process without prescribing a control action. On the other hand, Engineering Process Control, EPC allows for the prescription of changes in variables involved in the process, in order to make them the closest possible to the desired target. Multivariate proportional feedback controller is proposed for ceramic materials company that produces several types of 21 tiles. The error that was made in forecasting the disturbance is modeled according to Exponentially Weighted Moving Average (EWMA) statistics. The value of the smooth constant (lambda) offers the smallest forecast error of the series of disturbance errors adjusted by EWMA statistics. The proposed method allows for the adjustment of the productive process to be made while the parts or products are still on the production line. 2.4.4 Theory of PCA Principal Components Analysis is a multivariate technique, which is used to explain the variance-covariance structure through the linear combinations of the original variables. The advantages of this method rely on data reduction and information extraction. Because of this reason, PCA is commonly classified as data reduction technique. Mathematically, PCA relies on eigenvector decomposition of the covariance or correlation matrix of the process variables. Let p is the number of variables with m measurements, which are collected and arranged in matrix form. The multivariate data set can be presented by m x p matrix where the rows represent observations and the columns represent variables. For notation of brevity, the bold capital letter signifies as matrix, and bold small letter is a vector and italic letter means scalar values. The following matrix data, X consist of m observations and p variables, ⎡ x11 ⎢ x21 Xm,p = ⎢ ⎢M ⎢ ⎢⎣ xm1 x12 K x1 p ⎤ ⎥ x22 K x2 p ⎥ ⎥ M M ⎥ xm 2 L xmp ⎥⎦ 2.1 A set of original variables x1, x2, …, xp is transformed to a set of new variables PC1, PC2, …, PCp. The newly formed variables are referred to as Principal Components (PCs). The mathematical representations that describe the transformation are shown as follow, 22 PC1 = x1v1,1 + x2 v1, 2 + K + x p v1, p PC2 = x1v2,1 + x2 v2, 2 + K + x p v 2, p 2.2 M PC p = x1v p ,1 + x2 v p , 2 + K + x p v p , p Equation 2.2 is simplified to: PC p , m = X m , p V p , p 2.3 V is the eigenvector matrix, which consists of eigenvector v1, v2, …, vp. Each eigenvector vi is a column vector which contains the arrangement of elements as vi = [vi,1 vi,2 … vi,p]. The eigenvectors matrix can be presented as following, Eigenvectors matrix, V =[v1 v2 ⎡ v1,1 ⎢v ⎢ 2,1 ⎢ . … vp] = ⎢ . ⎢ ⎢ . ⎢v1, p ⎣ v1, 2 v2, 2 . . . v2 , p ... v1, p ⎤ ... v2, p ⎥⎥ ... . ⎥ ... . ⎥ ⎥ ... . ⎥ ... v p , p ⎥⎦ 2.4 PC is the principal components scores matrix. A particular principal component score, pci is obtained via the product of data matrix, X (m,p) to the particular eigenvector vi, which is shown in the following equation: pc i ( mx1) = X ( mxp ) v i ( px1) 2.5 Equation 2.5 is used to determine the principal component score matrix PC, which contains m scores for each principal component: PC ( m , p ) = X ( m , p ) V( p , p ) 2.6 The numbers of PCs are equal to number of variables in the original data set. The first PC accounts for the greatest variation in the data while the second PC accounts for maximum variation that is not captured by the first PC, and so on. The 23 original variables can also be represented in terms of newly formed variables as follows X ( m, p ) = PC ( m , p ) V(Tp , p ) 2.7 In order to maintain the uniqueness of this technique for data reduction, only several Principal Components will be used to represent most of the original data variation. If A Principal Components are decided to retain with A < p, then Equation 2.7 can be written as: X ( m , A) = PC ( m , A) V(TA, A) + E ( m , p − A) 2.5 2.8 Review on Partial Correlation Analysis, PCorrA The purpose of PCorrA method is to determine the correlation between two variables while keeping the other variable constant. PCorrA has been applied in various research fields such as statistical analysis, food science, environmental issues, horticulture, cardiology and psychiatry. Ibrahim (1997) has introduced Partial Correlation Analysis (PCorrA) to be applied in chemical process data by developing Multivariate Statistical Process Control (MSPC) scheme known as Active SPC methodology. The method is similar to those used in the feedforward Automatic Process Control (APC). Active SPC was applied to nonlinear simulated Continuous Stirred Tank Reactor (CSTR). Foucart (2000) proposed a decision rule using PCorrA regression method to discard a principal component from the set of predictors variables (independent variables). The selection of Principal Components to retain depends on the small mean of the squared residuals. 24 Tsumoto et al. (2002) used PCorrA to analyze acid amino sequences. Partial correlation used to examine the contribution of the mutated sites (35 types of mutated amino-acid) on acid amino sequence i.e VH and VL chains of partially mutated antilysozyme antibody HyHEL-10 to the change of thermodynamic measurements (measured values: temperature, pH, combining coefficients Ka, free energy change DG, enthalpy change DH, entropy change TDS, and change of specific heat DCp). The relations between change of selected thermodynamic measurements and locations of mutated sites were determined by keeping some of the thermodynamic measurements constant. The results demonstrated that two or three sites on each of the VH and VL chains have high correlation to the change of thermodynamic condition. Shirley et al. (2003) investigated the sensitivity of the population dynamics model to the input parameters by analyzing the response of total population size and total number of infected individuals in simulated badger populations over 20 years to variations in the model inputs. An individual based spatially-explicit model was designed to investigate the dynamics of badger population and TB epidemiology in real landscape. The relationships between input parameters and model output were analyzed using partial correlation. Parameters with high partial correlation coefficients give large effect on the model output and that parameter is identified as key driving parameters at relatively low numbers of simulations. Quemerais et al. (1998) investigated the factors that had the most influence on the mercury concentrations in water using partial correlation. When the effects of dissolved iron and manganese removed, dissolved mercury shows no relation to dissolved organic carbon. On the contrary, when the effect of dissolved organic carbon, DOC removed, dissolved mercury concentrations are related to dissolve iron. The observed mercury partition in the waters of the St. Lawrence River enhanced by the partial correlation between the distributions coefficients. Partial correlation analyses indicate that mercury strongly related to iron and manganese, while organic carbon being indirectly related. 25 Wissemeier and Zuhlke (2002) conducted a study on the relationship between tipburn with growth data and climatic variables. From the regression coefficient results, tipburn severity highly correlated with the sum of irradiation, the head fresh weight, the sum of temperature, the sum of maximum daily irradiation and the day light length 23 days prior to harvest were obtained. However, sum of irradiation had the highest correlation coefficient with tipburn. Moreover, the sum of irradiation remained a highly significant predictor for tipburn if other variables were statistically eliminated by partial regression analysis. The intercorrelation between variables statistically eliminated using partial regression analysis and this result the significance of every independent variable predicting tipburn lost if the impact of the sum of irradiation is eliminated statistically. Hashimoto et al. (2003) study the effect of basic Fibroblast Growth Factor, bFGF on the pathophysiology of schizophrenia. bFGF is a multifunctional growth factor that has been implicated in a variety of neurodevelopmental processes. Analysis of partial correlation coefficients showed that the increased bFGF levels might not be attributable to antipsychotic medication for schizophrenic patients. Onat et al. (1999) investigated the relationship of waist circumference (WC) and waist-to-hip ratio (WHR) with a number of established risk factors and their relevance to cardiovascular morbidity. The study was conducted on random sample of Turkish general adult population. Partial correlation coefficients on univariate analysis was done to determine the correlation between four variables i.e blood pressure and plasma lipids and either WC or WHR, controlled for age. The results showed that the correlation was highly significant but moderately weak in genders, men and women. 26 2.5.1 Theory of PCorrA Partial correlation is the correlation of two variables while controlling for a third or more other variables. The partial correlation presents the incremental predictive effect of one process variable from the collective effect of all others and is used to identify the process variables that have the greatest incremental predictive power when a set of process variable is ready in the regression variate (DeVor et al., 1992). If the two variables of interest are y and x and the control variables are z1, z2 ... zn, then the corresponding partial correlation coefficient is ryx z1, z 2..., zn . The order of partial correlation depends on the number of variables that are being controlled for. The first order partials equation after removing the effect of variable z1 is r yx z1 = ryx − r yz1 rxz1 1− r 2 yz1 1− r 2.9 2 xz1 where, ryx z1 = Partial coefficient of y and x after controlled z1 ryx = Correlation coefficient between y and x ryz1 = Correlation coefficient between y and z1 rxz1 = Correlation coefficient between x and z1 The second order partial after controlled the effect of z1 and z2 is ryx z1z 2 = ryx z1 − ryz 2 z1rxz 2 z1 1 − ryz2 2 z1 1 − rxz2 2 z1 = ryx z 2 − ryz1 z 2 rxz1 z 2 1 − ryz2 1 z 2 1 − rxz2 1 z 2 2.10 Reapply the Equation 2.10 to compute the next order partial equation. The highest order possible for partial is n=N–1 2.11 27 where, n = The highest partial order N = Numbers of process variables and the general equation for higher order partial is ryx z1, z 2, z 3,K, zn = ryx z 2, z 3,K, zn − ryz1 z 2, z 3,K, zn rxz1 z 2, z 3,K, zn 1 − ryz2 1 z 2, z 3,K, zn 1 − rxz2 1 z 2, z 3,K, zn 2.12 As shown mathematically in Equation 2.12, this correlation is calculated by separating the group of process variables into subgroup in which one or more variables are held constant before determining the correlation among the other variables. 2.6 Chapter Summary Multivariate statistical analysis, MSA is based on development of multivariate regression model in which multiple input variables are related to one or several outputs. Multivariate analysis technique is combined with statistical process control, SPC charts to form Multivariate Statistical Process Control, MSPC. Multivariate analysis techniques like Principal Components Analysis, PCA and Partial Correlation Analysis, PCorrA are used in this study. Review on PCA method shows that this method has been widely used in various applications such as process monitoring, fault detection, fault diagnosis, process controllers and data analysis. PCA for process monitoring and fault detection and diagnosis has been widely studied for continuous processes and batch and semibatch processes. The advantage of this method relies on data reduction and information extraction from historical database. A set of original variables is transform to a set of new variables called as Principal Component, PC. The number 28 of PCs used is less than number of original variables. This multivariate statistical analysis approach has been in widely accepted and utilized by industry. Several companies have started utilizing this method. The extensions of PCA such as MultiWay PCA, Multi-Block PCA, Moving PCA, Dissimilarity and Multi-Scale PCA also have been discussed. PCorrA is another multivariate analysis technique. The purpose of PCorrA method is to determine the correlation between two variables while keeping constant the other variables. This correlation is done by separating the group of process variables into subgroup in which one or more variables are held constant before determining the correlation among the other variables. PCorrA has been applied in various research fields such as statistical analysis, food science, environmental issues, horticulture, cardiology and psychiatry. CHAPTER 3 IMPROVED STATISTICAL PROCESS CONTROL CHART 3.1 Statistical Process Control, SPC Statistical Process Control (SPC) monitors the performance of a process over time in order to verify the process whether it is in a state of statistical control or in a state out of control. A state of control exists when the process variables close to its desired values and only common cause variation exist. Meanwhile a state out of control is exists if process variables fall beyond the control limits and it shows that assignable causes of variation exist in a process. If the process is in control, it is assumed that the products produced from these processes will also be in control (Doty, 1996). Common causes are small random changes in the process that cannot be avoided. These small differences are due to the inherent variation present in all processes. They consistently affect the process and its performance. Variation of this type is only removable by making a change in the existing process. Assignable causes, on the other hand, are large variation in the process that can be identified as having specified cause. Assignable causes are causes that are not part of the process on a regular basis. This type of variation arises because of specific circumstances. Sources of variation can be found in the process itself, the material used, the operator’s actions, or the environment (Summer, 2000). 30 SPC is an excellent tool for process monitoring which can be applied to any process. There are many statistical tools that can be used and the most widely applicable are the “Seven Basic Tools”. They are histogram, check sheet, pareto chart, cause and effect diagram, defect concentration diagram, scatter diagram and control chart. 3.2 Traditional Statistical Process Control, SPC chart The function of Statistical Process Control (SPC) chart is to compare the current state of the process against Normal Operating Condition (NOC). The NOC condition exists when the process or product variables remain close to their desired values or in statistical control. In contrast, the Out of Control (OC) occurs when fault appears in the process. The fault or malfunction is designated when the process departs from an acceptable range of observed variables. This happens when the process is having degradation in performance and experiencing malfunction of the equipment due to a failure. Malfunction on a piece of equipment is usually relatively easy to detect compared to process degradation. Thus, early detection of faults either failure or degradation is needed to reduce the occurrence of sudden failure on equipment, to reduce personnel accidents and to assist in the operation of maintenance program. Control charting, the featured tool of SPC is used to distinguish between common cause variation and assignable cause variation in order to prevent overreaction and underreaction to the process. Control charts approach is based on the assumption that a process subjected to a common cause variation will remain in a state of statistical control under which process remain close to target. Therefore, by monitoring the performance of a process over time, OC events can be detected as soon as they occur. If the causes for such events can be diagnosed and the problem can be corrected, the process is driven back to its normal operation (Venkatasubramanian et al., 2003). 31 Individual Shewhart and Shewhart range, Exponential Weight Moving Average, EWMA, and Moving Average and Moving Range, MAMR are examples of SPC charts. They are used for individual data. The following section discussed the theory of each SPC chart and the method to construct these charts. 3.2.1 Shewhart Chart Shewhart chart represents the current observation data excluding the previous information of the process data. It is usually necessary to monitor both mean value of the variables and its variability. In this study, individual data for quality variables and process variables is used in designing Shewhart chart. Current data is plotted to monitor the process over the time. Both average and standard deviation were determined using historical data, which represent the desired condition for the process operation. Let x1, x2, …, xm denote individual data for variable with m observations, then the average of the variable x is, x= x1 + x2 + K + xm m 3.1 where, x = Individual data x = Arithmetic average m = Number of observation The center line and control limits for individual Shewhart chart was determined using Equation 3.2, 32 UCL = 3s CL = x 3.2 LCL = −3s where, UCL = Upper control limit LCL = Lower control limit CL = Centre line s = Standard deviation Shewhart charts used in this study have control limits ± 3-standard deviation. Three out of 1000 data will fall out of the limits or 99.73% of data fall within these limits. The probability of data to fall beyond this control limits is very small to occur. Process variability is monitored by plotting values of the sample range, R on a control chart. The sample range, R is the difference between the maximum value and the minimum value of the sample. In this study, sample range, R is calculated for every two observations. Let R2, R3,…, Rm be the ranges for m observation. Then, the average of sample range, R is R= R2 + R3 + K + Rm m where, R = Sample range R = Average of samples range m = Number of observations 3.3 33 The center line and control limits on the Shewhart range chart are as follows: UCL = D.'001 R CL = R 3.4 LCL = D ' .999 R ' D.'001 and D.999 are the constants that can be found in (Oakland, 1996). where UCL = Upper control limit LCL = Lower control limit CL = Centre line R = Average of samples range ' D.'001 , D.999 = Constant Figure 3.1 and Figure 3.2 show the Shewhart individual and Shewhart range chart during NOC and OC respectively. 34 Figure 3.1: Shewhart range and Shewhart individual chart during normal condition Out of control Out of control Figure 3.2: Shewhart range and Shewhart individual chart during faulty condition 35 3.2.2 Exponential Weight Moving Average, EWMA Chart The Exponential Weight Moving Average (EWMA) is a statistic tool for monitoring the process that averages the data in a way that gives less and less weight to data as they further removed in time. The decision regarding the state of control of the process at any time depends on the EWMA statistic, which is exponential weighted average of all prior data, including the most recent data. This type of chart has some very attractive properties such as (Bower, 2000): 1. All of the data collected over time may be used to determine the control status of a process. 2. EWMA schemes may be applied for monitoring standard deviations in addition to the process mean. 3. There exists the ability to use EWMA schemes to forecast values of a process mean. 4. The EWMA methodology is not sensitive to normality assumptions. The EWMA statistic, zi for each variables with m observations, i=1,2, … , m, is defined as zi = βxi + (1 − β ) zi −1 3.5 For i = 1, the EWMA statistics is z1 = βx1 + (1 − β ) z0 3.6 The values of zo can be the target values or average of the variable during normal operating condition. 36 where, zi = EWMA statistics zo = Average of normal data or target value x = Individual data β = Weighting factor m = Number of observations From Equation 3.5, it shows that the information from the past observation is combined with the current observation. The result make EWMA has the ability to detect a small shift in the process. The EWMA statistic, zi is a weighted of all previous data, substituted in the right–hand side of the equation above to give zi = βxi + (1 − β )[βxi −1 + (1 − β ) zi −2 ] = βxi + β (1 − β ) xi −1 + (1 − β ) 2 z i −2 3.7 The EWMA control procedure can be made sensitive to a small or gradual drift in the process by selecting appropriate value of weighting factor, β . In general, the values of β in the interval 0.05 ≤ β < 0.25 work well in practice, with 0.05, 0.1, and 0.2 being popular choices (Montgomery, 1996). A good rule of thumbs is to use smaller values of β to detect smaller shifts. It is found that the width of control limit, L=3 (the usual 3 standard deviation limit) works reasonably well, particularly with the larger value of β . Hunter (1989) suggested using β = 0.4 which the weight given to the current and previous observation data match closely to the weight given to observations by Shewhart chart. The advantage of using small value of β is it will reduce the width of the limits in order to give good performance of the control chart (Montgomery, 1996). 37 The variance of the EWMA statistics, s z2i is given by ⎛ β s z2i = s 2 ⎜⎜ ⎝2−β ⎞ ⎟⎟[1 − (1 − β ) 2i ] ⎠ 3.8 The center line and control limits for the EWMA control chart as follows: UCL = x + Ls β 2−β [1 − (1 − β ) 2i ] CL = x 3.9 LCL = x − Ls β 2−β [1 − (1 − β ) 2i ] where UCL = Upper control limit LCL = Lower control limit CL = Centre line x = Arithmetic average L = Width of the control limit s = Standard deviation Importantly, the EWMA structure is insensitive to normality. This makes the EWMA chart an attractive candidate in general when addressing small changes in a process. Figure 3.3 and Figure 3.4 show the pattern of EWMA chart under NOC and OC respectively. 38 Figure 3.3: EWMA control chart under normal condition Out of control Figure 3.4: EWMA control chart with faulty condition 39 3.2.3 Moving Average and Moving Range, MAMR Chart The Moving Average and Moving Range (MAMR) control charts are used in handling data, which are difficult, or time consuming to obtain. In the chemical industry, the nature of certain production processes and analytical method entails long time interval between consecutive results. The alternative of moving average and moving range control charts uses the data differently and is generally preferred for the following reasons (Oakland, 1996): By grouping the data together, reacting to individual results and over 1. control is less likely. It is more meaningful because it uses the latest piece of data. 2. The EWMA chart uses weighted averages as the chart statistic. But Moving Average, MA is another type of time-weighted control chart based on simple, unweighted moving average and assumes that the past and present data are equally important. Therefore, the MA for i = 1, 2, … m observation in a window of size w is defined as MA = xi + xi −1 + K + xi − w+1 w 3.10 where MA = Moving average statistics x = Individual data w = Moving window size m = Number of observations The Moving Range, MR chart is used in order to control process spread. This control chart is likely to be very sensitive to assumption of normality, independence and to the presence of autocorrelation (Wetherill and Brown, 1991). MR statistic is similar to MA statistic. 40 A range of observation data in every time window w is taken over instead of the average. MRi=max[xi-w+1]–min[xi-w+1] 3.11 At time period i= w+1, the oldest data in the moving average and moving range set is dropped and the newest one added to the set. The subgroup for every time moving is the same. In general, a large window size, w will result in a large detection delay, while a small w will result in a less sensitive fault detection statistic. Therefore, appropriate value of w must be chosen to suit with the normal operating condition, NOC data. If x denotes the target value or the arithmetic average used as the center line of the control chart, then the 3 standard deviation control limits for MA is, UCL = x + A2 R LCL = x − A2 R 3.12 and for the MR chart the control limits is defined as, UCL = D.'001 R LCL = D.'999 R 3.13 41 where UCL = Upper control limit LCL = Lower control limit CL = Centre line x = Arithmetic average R = Average of samples range A2 = Constant for moving average ' D.'001 , D.999 = Constant The average of sample range, R is determined using normal operating data. The sample range, R for moving chart is the difference between the maximum value and the minimum value of the sample in moving window. The way to calculate the sample range, R is the same with MR statistics. For every moving window, the sample range, R is calculated. A2, D.'001 and D.'999 are the constant depends on window size, w that can be found in (Oakland, 1996). In general, the magnitude of the shift of interest and w are inversely related. Smaller shifts would be guarded against more effectively by longer span moving averages, at the expense of quick response to large shift. Figure 3.5 and Figure 3.6 show the pattern of MAMR chart under NOC and OC respectively. 42 Figure 3.5: Moving average and range control chart under normal condition Out of control Out of control Figure 3.6: Moving average and range control chart with faulty condition 43 3.3 Improved Statistical Process Control Chart Traditional SPC chart is often applied by constructing separate control charts for each of the selected variables. These charts give signal out of statistical control if an assignable cause of variation exists in the process. SPC is implemented by monitoring either quality characteristics or process characteristics. Traditional SPC approach is deficient for at least three reasons. First, the fault in the process is seeked through the usage of SPC chart, i.e., the final product quality variables. Measuring quality variables alone is not enough to describe the process performance (Kourti et al., 1996). Second, observations of multiple process or product characteristics are assumed to be unrelated. Almost any measure of product quality is dependent on a sequence of preceding processes. Hence, measures of process characteristics are often correlated (Singh and Gilbreath, 2002). Third, in traditional SPC, once the quality variables showed out of statistical control signal, it is then left up to process operators and engineers to try to diagnose the cause of out of control using their process knowledge and one at a time inspection of process variables (MacGregor and Kourti, 1995). In order to improve the traditional SPC chart to detect and diagnose assignable cause occur in the process involving massive data, multivariate analysis technique such as Principal Component Analysis, PCA and Partial Correlation Analysis, PCorrA are used to incorporate all these data to be analyzed and included in constructing the SPC chart. PCA is modified and used to determine the cross correlation between quality variables and process variables while PCorrA is used to determine the correlation between a quality variable and selected process variable after removing the effect of other process variables. The correlation coefficient, Cjk is calculated using both multivariate techniques in order to improve traditional SPC charts. Let xj be the quality variables and xk be the process variables. The relationship between x j and x k can be written as, x j = Cjk x k 3.14 44 The control limit for quality variable, x j in general is LCL< x j <UCL 3.15 where UCL and LCL is upper control limit and lower control limit respectively. Substituting Equation 3.14 into Equation 3.15 and rearrange it gives, UCL LCL < xk < C jk C jk 3.16 Figure 3.7 shows the use of correlation coefficient corresponding to process variables. Once fault has been detected using quality variables, the correlation coefficient is used to construct the Improved SPC chart using process variables to identify the variables that cause the fault. Process variable Quality variable Process Process variable Quality variable Fault diagnose UCL/Cjk LCL/Cjk Fault detect UCL LCL Cjk Figure 3.7: The implementation of correlation coefficient, Cjk in SPC chart 45 The quality variables data is incorporated in the control chart during the faulty condition for fault detection purpose, while the process variables which have been correlated with the quality variables is used for fault diagnosis. For example, when statistical data for oleic acid, xc8 fall beyond the control limits, it shows that fault is detected. Then, the next task is to determine the cause of the fault. Process variables charts are monitored for this task. If statistical data for feed flowrate fall beyond the limits, then it is concluded that this variable cause the fault. The Improve SPC charts are applied not only for quality variables but also for process variables. Both quality variables and process variables are monitored continuously to detect and diagnose faults. By monitoring the process variables, which have been correlated with quality variables, the cause of faulty condition, which affects the quality variables, could be identified directly. This will help the operator to take proper action immediately after the faults exist in the process. 3.4 Chapter Summary Statistical Process Control (SPC) is an excellent tool for process monitoring which can be applied to any process. SPC only requires a good database of normal historical data, and the models are quickly and easily built from this. New data is then compared with the normal operation model to detect a change in system. The control chart is used as a Fault Detection and Diagnosis (FDD) tools in SPC. The function of SPC chart is to compare the current state of the process against Normal Operating Condition (NOC). The control chart will give signal out of statistical control if assignable cause variation or fault exists in the process. Shewhart individual, Shewhart range, Exponential Weight Moving Average (EWMA), and Moving Average and Moving Range (MAMR) are SPC charts to be used to detect and diagnose fault in the process. 46 Traditional SPC chart has some drawbacks in detecting and diagnosing fault in data rich environment because it ignores the correlation among the variables. Therefore, multivariate analysis techniques such as Principal Component Analysis (PCA) and Partial Correlation Analysis (PCorrA) are applied to build linear relationship between quality variables and process variables to improve traditional SPC chart. PCA is used to determine the cross correlation between quality variables and process variables while PCorrA is used to determine the correlation between a quality variable and selected process variable after removing the effect of other process variables. The correlation coefficient using these multivariate analysis techniques is applied on control charts for fault diagnosis purpose. Both quality variables and process variables are monitored in this study. The cause of faulty condition, which affects the quality variables, could be identified in a straightforward manner and help the operator to take proper action immediately after the faults occur in the process. CHAPTER 4 DYNAMIC SIMULATED DISTILLATION COLUMN 4.1 Introduction The proposed FDD algorithm can be applied to any kind of unit operations in chemical industry. However, distillation column is chosen in this study because of a few reasons. Distillation accounts for approximately 95% of the separation systems used by the refining and chemical industry (Hurowitz et al., 2003). It has a major impact upon the product quality, energy consumption, and plant throughout of these industries. Maintaining the product quality within specification is the usual policy in industry and it is very difficult to control the process at the desired target. The sensitivity of product quality to disturbances decreases as impurities are reduced. According to Shinskey (1984), 40 percent of the energy consumed in a plant allocated to distillation. Distillation control is a challenging endeavor due to (Hurowitz et al., 2003): 1. The inherent nonlinearity of distillation. 2. Severe coupling present for dual composition control. 3. Non-stationary behavior. 4. The severity of disturbances. 48 Since distillation column involves multivariable and has non-stationary behavior in nature, multivariate analysis is an appropriate technique for FDD in the process to avoid any degradation process and to maintain the product quality. A precut column from the Palm Oil Fractionation Plant is choosen as a study column. A simulated model is developed based on the information from this pre-cut column using Matlab. 4.2 Process Description of Precut Column The fatty acid mixtures, caproic acid (C6), caprylic acid (C8), capric acid (C10), lauric acid (C12), myristic acid (C14), palmitic acid (C16), stearic acid (C18), oleic acid (C18), and linoleic acid (C18) enter the precut column in liquid form at tray 14. The light components (C6-C10) are condensed by direct contact in the pumparound section and recovered as distillate. Then, the distillate product is pumped to storage. The heavy components (C12-C18) recovered at the bottom of pre cut column. These components will be passed to other system in the fractionation unit for further fractionation process. Figure 4.1 shows the schematic diagram of pre cut column. Figure 4.1: Pre cut column 49 4.3 The Information of Pre-cut column The summaries of the specification of the precut column, components composition and conditions are presented in Tables 4.1 and 4.2 respectively. Table 4.1: Specification of the pre-cut column Variable Specification Number of actual tray 28 Actual feed position Tray 14 from the top Plate spacing 50 cm Plate diameter 120 cm Plate type Sieve tray Type of reboiler Total Reflux rate 10 kmol/hr Pumparound rate 49.6 kmol/hr Sidedraw rate 63.6 kmol/hr Reboiler duty 2.158 x 106 kJ/hr Weir length 0.912 m Weir height 0.076 m 50 Table 4.2: Components composition and conditions of the pre-cut column Component Composition (kmol/kmol) Feed stream Distillate stream Bottom stream 0.002 0.023 0 0.051 0.528 0 0.043 0.438 0.001 0.518 0.006 0.574 0.156 0 0.173 0.068 0 0.075 Stearic acid, C-18 0.014 0 0.013 Oleic acid, C-18-1 0.121 0 0.134 Linoleic acid, C-18-2 0.020 0 0.023 Temperature 483.15 K 442.55 K 509.55 K Pressure 2.5 bar 0.11 bar 0.13 bar Hexanoic acid, C-6 (Caproic acid) Octanoic acid, C-8 (Caprylic acid) Decanoic acid, C-10 (Capric acid) Dodecanoic acid, C-12 (Lauric acid) Tetradecanoic acid, C-14 (Myristic acid) Hexadecanoic acid, C-16 (Palmitic acid) 51 4.4 Degree of Freedom (DoF) Analysis There is a set of equation called as MESH equations involved for the dynamic simulation, which consist of mass balance equation, M, equilibrium equation, E, summary equation, S and heat balance equation, H. Another equation is hydraulic equation to describe the behavior of the column bottom, normal tray, feed tray, reflux drum, pumparound drum, sidedraw tray and the reflux tray. The number of independent equation must be equal to the number of independent variables in order to get a unique solutions (Stephanopoulos, 1984). This rule must be followed to avoid either redundant equations or variable conditions because these conditions cause infinite number of solutions or no solution at all. The equation for DoF is DoF = Number of independent variables – Number of independent equations 4.1 For multicomponents distillation column, the number of variables and equation is determined to perform DoF. It is presented in Table 4.3 and Table 4.4 as in Luyben (1963). 52 Table 4.3: Number of variables Variables General Form No. of Variables 2MN 2 x 9 x 29 Tray liquid flow, L N 29 Tray vapor flow, V N 29 Tray hold up, Molar N 29 Reflux flow rate 1 1 Distillate flow rate 1 1 Sidedraw flow rate 1 1 Pumparound flow rate 1 1 Liquid stream flow rate 1 1 Pumparound drum hold up 1 1 Reflux drum hold up 1 1 2M 2x9 Bottom flow rate 1 1 Bottom hold up 1 1 Vapor flow from reboiler 1 1 2M 2x9 2(N+1) 2 x 29 2 2 Reflux drum temperature and pressure 2 2 Pumparound stream temperature 1 1 Reboiler duty 1 1 Cooler duty 1 1 Tray composition (vapor and liquid) Side draw composition Bottom composition Trays pressure and temperature Pumparound drum temperature and pressure Total 720 Notes: Number of trays, N = 28 +1 (total reboiler) = 29 Number of components, M = 9 53 Table 4.4: Number of equations Equations Total mass balance (tray, General Form No. of Equations N+3 31 (N + 3) x (M – 1) 31 x 8 2(N + 3) 2 x 31 N +3 31 M(N + 3) 9 x 31 N+3 31 reflux drum, pumparound drum, bottom) Component balance, liquid composition Summation equations for vapor and liquid composition Hydraulic (liquid flowrate) Equilibrium equation (vapor composition) Heat balance Total 682 Degrees of Freedom 720 – 682 = 38 There are seven control loops installed in the simulated column model to ensure the stable operation of the column. Figure 4.2 shows the schematic diagram of the study column with the control loops. The control variables, manipulated variables and measured disturbances for each control loops are listed in Table 4.5. 54 To vent CL 6 CL 3 Cooling Utility In Di TIC CL 5 Pumparound Point FIC PIC P LIC FIC HX-2 Reflux Point Sd Re CL 4 F Pump-2 CL 2 Hot Utility In HX-1 Hot Utility Out TIC LIC Pre-Cut Column B CL 1 Pump-1 Controller TIC: Temperature Indicator Controller FIC: Flow Indicator Controller LIC: Level Indicator Controller PIC: Pressure Indicator Controller Equipment HX: Heat Exchanger Stream B: Bottom Flow Rate Re: Reflux Flow Rate Sd: Sidedraw Flow Rate P: Pumparound Flow Rate F: Feed Flow Rate Di: Distillate Flow Rate Symbol CL: Control Loop Figure 4.2: Distillation column control loop CL 7 Cooling Utility Out 55 Table 4.5: Control loops properties Control Control Manipulated Measured loop variable variable disturbance 1 Bottom column Bottom Liquid flowrate at liquid level flowrate tray 28 Bottom Hot utility Liquid flowrate at temperature flowrate tray 28 Top column Vapor flowrate Vapor flowrate at pressure to vent tray 2 Reflux Reflux Sidedraw flowrate flowrate flowrate Pumparound Distillate Pumparound flowrate flowrate point Side draw tray Distillate Liquid flowrate at liquid level flowrate tray 1 Pumparound Cold utility Pumparound temperature flowrate point 2 3 4 5 6 7 Seven control loops is equivalent to seven independent equations. Therefore, DoF is equal to 31 (38 – 7 control loops). In order to define the system completely, 31 independent variables are needed to be specified. The defined variables are: 1. 28 trays pressure. Pressure varies linearly from top tray to bottom column 2. Reflux flowrate 3. Pumparound flowrate 4. Reboiler duty 56 4.5 Formulations of the Distillation Column Models Mathematical models for distillation columns can be classified into two categories (Hurowitz et al., 2003). The first category is ‘rigorous’ models which consider the details of composition, temperature and flow on each plate. The second category is based on some kind of interpolation which considers an overall description of the column using fewer variables. In this study, rigorous column model is chosen. The mathematical models used in dynamic simulation should be as general as possible so that it can handle a wide range of problems and processes. The model starts from basic mass and energy balances as well as the equilibrium relationship between phases. Various forms of MESH equations have appeared in the literature. The arrangement of the sets of equations depended on the methods adopted in solving the problems. In addition, the equations can be written in steady state form. Other terms like tray efficiency, pump-around system, chemical reaction and thermal efficiency can also be introduced in these MESH equations. Besides these MESH equations (Wang and Henke, 1966), correlations are also needed to determine equilibrium ratio, K’s, liquid enthalpy, h’s, vapor enthalpy, H’s, Francis weir equations and hydraulic equations. The following section systematically presents the distillation column dynamic modeling. 4.5.1 Mass Balance Equations The mass balance equations involve total flow rate balance and components flow rate balance for each tray. The equations are written in derivative form in order to reveal the dynamic behaviors of the column. In order to get a better picture of mass balance for reflux drum, pumparound drum, normal tray, sidedraw tray, reflux tray, feed tray and bottom column, a distillation column schematic diagram is presented in Figure 4.3. 57 To vent Tray 1 V2, y2,i Tray 2 V3, y3,i Tray 3 Pumparound P, xP,i Sidedraw S, xs,i Liquid to pumparound L2, x2,i L, xL,i Reflux Re, xRe,i Distillate, D Cooler Tray 13 Tray 14 Liquid feed, Lf, xf, i Inlet trays schematic Tray 15 Tray N-1 VN, yN, i LN-1, xN-1, i Tray N VN+1, yN+1, i LN, xN, i Tray N+1 Normal tray schematic Tray 28 L29 V29 Reboiler LH, B Bottom Column (Tray 29) Bottom flow rate, B Bottom molar hold up, MB Bottom tray schematic Figure 4.3: Distillation column schematic diagram 58 4.5.1.1 Bottom column model The bottom column models consist of the following equations: i) Bottom flow rate has a very close relationship with the bottom liquid level. Bottom flow rate, B = ii) 16.778ρL H , B 0.5 4.2 MW The bottom molar hold-up relates to the bottom liquid level. Bottom molar hold-up, MB = L H , B AB ρ 4.3 MW 9 Liquid density, ρ = ∑ x i ρ i 4.4 i =1 iii) Bottom height derivative expression dL H , B dt iv) = L N −1 − V N − B Aρ ( b ) MW 4.5 Components mass balance in term of mole fraction derivative expression dx N ,i dt = L N −1 x N −1,i − V N y N ,i − Bxi Aρ L H ,b ( b ) MW 4.6 where: ρ = liquid density N = number of trays MW = molecular weight yi = component vapor mole fraction LH, B = bottom liquid level MB = bottom molar hold up xi = component liquid mole AB = Bottom column cross sectional fraction VN = vapor flow rate at tray-N area B = bottom flow rate 59 4.5.1.2 Trays model The pumparound tray (N=1) model involves the following equations: i) Total mass balance dM 1 = P + V 2 − L1 − V1 dt ii) 4.7 Component mass balance dxi Px P ,i + V2 y 2,i − L1 x1,i − V1 x1,i = dt MN 4.8 The sidedraw tray (N = 2) models involve the following equations: i) Total mass balance for sidedraw dM 2 = L1 + V3 − S − L2 − V 2 dt ii) 4.9 Component mass balance for sidedraw dx i L1 x1,i + V3 y 3,i − Sx s − V 2 y 2,i − L2 x 2,i = dt MN 4.10 The reflux tray (N = 3) models involve the following equations: i) Total mass balance for reflux dM 3 = L2 + V 4 + Re − L3 − V3 dt ii) 4.11 Component mass balance for reflux dx i L2 x 2,i + V 4 y 4,i + Re x Re,i − L3 x 3,i − V3 y 3,i = dt MN 4.12 60 The normal trays (N = 4 to 13 and 15 to 28) models involve the following equations: i) Total mass balance dM N = LN −1 + VN +1 − LN − VN dt ii) 4.13 Component mass balance dxi LN −1 xN −1,i + VN +1 y N +1,i − VN y N ,i − LN xN ,i = dt MN 4.14 The feed trays (N = 14) models involve the following equations: i) Total mass balance for liquid feed dM N = LN −1 + VN +1 + L f − LN − VN dt ii) 4.15 Component mass balance for liquid feed tray dxN ,i dt = LN −1 xN −1,i + VN +1 y N +1,i + L f x f ,i − VN y N − LN xN ,i MN 4.16 where: Re = reflux flowrate P = pumparound liquid flowrate xRe = reflux mole fraction xP = pumparound liquid mole fraction MN = molar hold up at tray N LN = liquid flow rate at tray-N S = sidedraw flowrate xi = component liquid mole fraction xS = sidedraw mole fraction VN = vapor flow rate at tray-N Lf = liquid feed flow rate yi = component vapor mole fraction xf = liquid feed mole fraction N = number of trays 61 4.5.1.3 Reflux drum model The reflux drum models are similar to the bottom column model. The reflux drum models consist of the following equations: i) The liquid stream flow rate has a very close relationship with the reflux drum liquid level. Liquid flowrate, L = ii) 15.02 ρL H ,Re 0.5 4.17 MW The reflux drum molar hold-up relates to the reflux drum liquid level. Reflux drum molar hold-up, MRe = L H , Re ARe ρ 4.18 MW 9 Liquid density, ρ = ∑ x i ρ i 4.19 i =1 iii) Reflux drum height derivative expression dL H , Re dt iv) = S − Re− L A ρ ( Re ) MW 4.20 Components mass balance in term of mole fraction derivative expression dx N ,i dt = Sx s ,i − Re x Re,i − Lx L ,i A ρ L H , Re ( Re ) MW 4.21 where: ρ = liquid density Re = reflux flowrate MW = molecular weight xRe = reflux mole fraction S = sidedraw flowrate L = liquid flow rate xS = sidedraw mole fraction LH, Re = Reflux drum liquid level ARe = Reflux drum cross sectional area xi = component liquid mole fraction MRe = reflux drum molar hold up 62 4.5.1.4 Pumparound drum model The pumparound drum model consists of the following equations: ii) The distillate flow rate has a very close relationship with the pumparound drum liquid level. Distillate flowrate, L = ii) 1.12 ρL H , P 0.5 4.22 MW The pumparound drum molar hold-up relates to the pumparound drum liquid level. Pumparound drum molar hold-up, MP = L H , P AP ρ MW 4.23 9 Liquid density, ρ = ∑ x i ρ i 4.24 i =1 v) Pumparound drum height derivative expression dL H , P dt vi) = L−P−D A ρ ( P ) MW 4.25 Components mass balance in term of mole fraction derivative expression dx N ,i dt = Lx L ,i − Px P ,i − Dx D ,i A ρ LH , P ( P ) MW 4.26 where: ρ = liquid density P = pumparound liquid flowrate MW = molecular weight xP = pumparound liquid mole fraction LH, P = pumparound drum liquid level D = distillate flow rate AP = pumparound cross sectional area xD = distillate mole fraction MP = pumparound drum hold up 63 4.5.2 Equilibrium Equations Equilibrium equations involve using appropriate thermodynamic equations to determine the phase equilibrium between vapor and liquid phases. Indirectly, the Eequations are used to determine the liquid phase compositions, vapor phase compositions and temperature or bubble point of the tray. The thermodynamic equations used for describing phase equilibrium and bubble point calculations are based on the Wilson correlation (Walas, 1985) for the liquid phase and the Virial correlation (Smith et al., 1996) for the vapor phase. The Virial correlation is used to calculate the fugacity coefficient of the vapor phase because the system pressure is less than atmospheric pressure. The correlation gives a good approximation in this operating region (Smith et al., 1996). The Wilson correlation is used to determine the activity coefficient of the liquid phase since the system exhibits non-ideality behavior (Reid et al., 1987). The Virial correlation is determined using the following equations. ln φ k = ⎤ Ptot ⎡ 1 n n ⎢ B kk + ∑ ∑ y i y j (2δ ik − δ ij )⎥ RT ⎣ 2 i =1 j =1 ⎦ 4.27 δ ik = 2Bik − Bii − Bkk 4.28 δ ij = 2 Bij − Bii − B jj 4.29 where δii = 0, δkk = 0, δik = δki Bij = RTcij Pcij ( Bijo + ω ij Bij1 ) 4.30 where, Bijo = 0.083 − 0.422 Trij1.6 4.31 Bij1 = 0.139 − 0.172 Trij4.2 4.32 64 Trij = ω ij = T Tcij 4.33 ωi + ω j 4.34 2 Tcij = Tci Tcj Pcij = Z cij = Vcij 4.35 Z cij RTcij 4.36 Vcij Z ci + Z cj 4.37 2 ⎛ 3 Vci + 3 Vcj ⎞ ⎟ =⎜ ⎜ ⎟ 2 ⎝ ⎠ 3 4.38 Ptot = total pressure of the system Vci = critical vol. for component i T = temperature of the system Tri = reduced temperature of component i R = universal gas constant n = number of components Tci = critical temperature of component i yi = mole fraction of component i Pci = critical pressure of component i φi = vapor fugacity coefficient for component i Zci = critical compressibility factor for component i ω = critical accentric factor of component i 65 The Wilson correlation is determined using the following equations: n xk Λ ki n ln γ i = 1 − ln ∑ x j Λ ij − ∑ j =1 k =1 ∑x Λ j =1 Λ ij = ⎛ − aij exp⎜⎜ Vi ⎝ RT Vj 4.39 n j kj ⎞ ⎟⎟ ⎠ 4.40 where, Λij = 1 for i=j Λij ≠ Λji γi = liquid phase activity coefficient for Λij component i xi = liquid mole fraction for component i aij = Wilson constant = Wilson binary interaction parameter for component i and j Vi = liquid molar volume of component i The relationship between the liquid mole fraction and vapor mole fraction for component i is shown below yiφi Ptot = xi γ i Pi sat 4.41 where, Pi sat = vapor pressure for component i Equation 4.41 is the fundamental vapor liquid equilibrium that will be used in this study. The bubble point calculation is based on this equation. 66 4.5.2.1 Bubble Point Calculations The tray temperature depends on the tray pressure, liquid compositions and vapor compositions. The bubble point calculations by using thermodynamic methods are known as gamma-phi approach because the liquid and vapor phase coefficient are determined using two different equations. The purpose of the bubble point calculation is to determine the vapor phase compositions and temperature for a given pressure and liquid phase compositions. Bubble point calculations require initial guess of temperature because the calculations involve iterations. Good guess will cause fast convergence to the solution. The procedure of bubble point calculations is presented in Figure 4.4. 67 Fix: Pressure, P and liquid composition xi.Guess: Temperature Set: φ i = 1 γ i = 1 Identify: Component j Determine Psat for each component. Antoine Equation : Pi sat (mmHg ) = Ai − Bi T ( K ) − Ci 4.42 Calculate γ i using equation 4.39 and 4.40 Calculate : Pjsat = Ptot 4.43 ⎛ x i γ i ⎞⎛⎜ Pi ⎟⎟ sat ⎜ i =1 ⎝ φ i ⎠⎝ P j sat n ∑ ⎜⎜ T= Bj A j − ln Pjsat ⎞ ⎟ ⎟ ⎠ −Cj 4.44 Recalculate: Pi sat (equa. 4.42) and T (equa.4.44) Calculate: yi (equa. 4.41), φi (equa. 4.27 – 4.38), γ i (equa. 4.39– 4.40) Recalculate: Pjsat (equa. 4.43) and Tnew (equa.4.44) Check: Tnew ~ T Yes Solution reach Figure 4.4: Bubble point calculation procedure No 68 4.5.3 Summation Equation Summation Equations are used to determine the vapor phase and liquid phase compositions. 8 x9 = 1 − ∑ xi 4.45 i =1 8 y9 = 1 − ∑ yi 4.46 i =1 4.5.4 Heat Balance Equation The heat balance equations are used to determine the vapor flow rate leaving each tray. The reference temperature is at 273K. The liquid enthalpy and vapor enthalpy are calculated using the following equations: Liquid enthalpy for a component at tray-N, T hi, N = ∫A * i + Bi*T + Ci*T 2 + Di*T 3 dT 273 T ⎡ * Bi*T 2 C i*T 3 Di*T 4 ⎤ = ⎢ Ai T + + + ⎥ 2 3 4 ⎦ 273 ⎣ 4.47 Vapor enthalpy for a component at tray-N, T Hi, N = ∫A * i + Bi*T + Ci*T 2 + Di*T 3 dT + ∆H vap ,i 4.48 273 Heat of Vaporization, ∆H vap,i ⎡ T −T ⎤ = ∆H vap , n , i ⎢ c , i ⎥ ⎢⎣ Tc , i − Tb , i ⎥⎦ 0.38 4.49 9 Liquid enthalpy leaving each tray, hN = ∑x h i =1 i i, N 4.50 69 9 Vapor enthalpy leaving each tray, HN = ∑y H i =1 Vapor stream leaving each tray, VN = i, N LN −1hN −1 + VN +1H N +1 − LN hN HN Vapor stream leaving the reboiler, V29 = Where: i Q R + ( L28 − B)h B H 29 4.51 4.52 4.53 Ai* , Bi* , C i* , Di* = liquid heat capacity constant of component i ∆H vap = heat of vaporization ∆H vap, n = heat of vaporization at normal boiling point Tb,i = boiling point L28 = liquid flowrate at tray 28 hb = enthalpy for bottom stream QR = reboiler duty 4.5.5 Hydraulic Equations Francis Weir hydraulic equations are used to determine the liquid flow rate leaving each tray and tray molar hold-up. Liquid flow rate leaving each tray depends on the over weir height, how. The involved hydraulic equations are shown as follows: M N ( MW ) ρ ( At ) Liquid level, LH, N = Overwier height, how = LH − WL ⎛h ⎞ Tray liquid flow rate, LN = ⎜ ow ⎟ ⎝ 750 ⎠ 3/ 2 4.54 4.55 ⎛ ρ (3600) ⎞ ⎜⎜ ⎟⎟ ⎝ WL ( MW ) ⎠ 4.56 70 where: how = over weir height WL = weir length LH, N = liquid level at tray-N At = tray active area 4.6 Dynamic Simulation of the Study Column The development of dynamic simulated column involves three different stages, which are pre-simulation, algorithm formulation and program formulation. Pre-simulation activities consist of the following tasks: a. Understanding the assumptions that are used in simulating the column. The assumptions adopted in simulating the column are, 1. Liquid on the tray is perfectly mixed and incompressible. Compositions of the liquid stream leaving the tray will be the same as the tray molar hold up compositions. 2. Tray vapor hold ups are negligible because the column operating pressures are less than 10 bars. 3. Vapor and liquid are in thermal equilibrium. The temperature for both liquid vapor streams leaving the tray is similar. 4. Vapor and liquid are in the equilibrium phase. 5. Pressure is constant in each tray but varies linearly up the column from bottom pressure to the top column pressure or in other words, pressure drop for each tray is the same. 6. Coolant and steam dynamics are negligible. 7. Dynamic changes in internal energy on the trays are negligible compared with the latent heat effects. So, energy balance on each tray is just algebraic equations. 71 b. The next task is to acquire the inlet stream conditions. It is based on the data from Palm Oil Fractionation Plant. The data in Table 4.2 was sufficient enough for the inlet stream. c. The study column is simulated using Design II simulator to obtain a set of steady state results that comprises of the profile of the following column parameters: 1. Column temperature profile 2. Column liquid and vapor flow rate profile 3. Trays liquid and vapor composition profile 4. Liquid and vapor enthalpy profile 5. Bottom and distillate stream flow rate and compositions. 6. Sidedraw, pumparound and reflux stream flowrate and compositions Column temperature and liquid compositions profiles are used as the initial guess for the dynamic simulation. d. There are 31 DoF which means that 31 variables need to be fixed in order to achieve the desired bottom and distillate flowrate which is 37.5kmole/hr and 4kmole/hr respectively. The fix variables are: 1. 28 tray pressure with each tray encounters 0.00095 bar pressure drop. 2. Reflux flow rate (10 kmole/hr) 3. Pumparound flowrate (49.6 kmole/hr) 4. Reboiler duty (2.1576 x 106 kJ/h) 72 Algorithm formulation is to establish a proper solution sequence, which can reveal the column behaviors. MESH equations and hydraulic equations are used to model the column that describes the column behavior. The systematic procedures in formulating the simulated column are elaborated as follows, First: The algorithm for the bottom column to the 4th tray 1. Initialize the following derivatives variables. i. Bottom liquid level and compositions : 9 derivative variables ii. Each tray molar hold-up and compositions : 225 derivative variables There are 234 derivatives variables that have to be initialized since there are 234 derivative equations. 2. Calculate the bubble point from the bottom tray till 4th tray. The tray pressure is fixed but the tray temperature and liquid compositions are guessed. The bottom pressure is fixed at 0.133 bar and 0.00095 pressure drop at each tray. The guessing values are referred from the Design II steadystate simulation results. The bubble point calculations are used to determine the tray temperature and the vapor phase compositions. 3. Calculate the liquid and vapor enthalpy from the bottom tray till 4th tray. Based on the TN, xi, N and yi, N as calculated in the previous step, liquid and vapor enthalpy for each tray are determined via Equation 4.47 to 4.49. 4. Calculate the liquid flow rate using Francis Weir equations. From the initial bottom liquid level, trays molar hold-up and reflux drum liquid level, the following variables are determined from bottom stage till 4th stage: i) Bottom flow rate by Equation 4.2 ii) Liquid leaving each tray by Equation 4.54 – 4.56. 73 5. Calculate the vapor flow rate leaving each tray by energy balance. Based on the calculated liquid and vapor enthalpy, the vapor flow rate leaving each tray is determined by Equation 4.53 for reboiler stage and Equation 4.52 for other trays. 6. Evaluate the molar hold up and components composition derivatives variables. For every time increment, h = 0.01 hour, the evaluation for each derivative variable is started from the bottom stage till 4th stage using RungeKutta fourth order method. 7. Increase time step. After all derivative variables at time = t are evaluated from the first algorithm and the second algorithm is to continue the calculation steps until time = tend Second: The algorithm for the 3rd tray to reflux drum, pumparound and the top tray 1. Start at the reflux drum. Initialize the derivative variables as in step 1 in first algorithm. Perform bubble point calculation. Calculate the liquid and vapor enthalpy for all the streams. Determine the liquid stream flowrate via Equation 4.17. 2. Pumparound drum. The procedure as step 1 with the distillate flowrate calculated using Equation 4.22. 3. The step for 3rd tray to 1st tray is similar to first algorithm. 74 After the algorithm formulation, then the written program is performed and has the following features: 1. Process variables like feed stream flow rate, feed stream temperature, feed stream pressure, distillate pressure, bottom column pressure, reflux flow rate, pumparound flowrate and reboiler duty are having randomly generated values. In the commercial simulators, these variables can only accept fix value. 2. These process variables have autocorrelation effect, which is one of the industrial data characteristics. 3. The developed program has main program and sub-programs. The main program merely contains sub-programs executing sequences. The following calculations are written in sub-programs: Bubble point calculations. Thermodynamic model for vapor and liquid equilibrium calculations 4.7 Liquid and vapor enthalpy calculations. Runge Kutta fourth order calculations. Controller Tuning There are seven control loops in the study column. In order to be able to use these controllers, it must first be tuned to the system. This tuning synchronizes the controller with controlled variable, thus allowing the process to be kept at the desired operating condition. The empirical tuning method, which is Process reaction curve method developed by Cohen and Coon is used to tune the controllers. This method used the open loop in which the control loop is broken between the system measurement and the controller. A step change is introduced in the input variable at steady state during normal operating condition. The output is recorded and the parameters static gain, Kp, time constant, τ, and dead time, θ is determined and recorded. Dead time describes how long it takes for the process to start responding while the time constant is the time required to reach 63.2% of the full response to a 75 step change. These parameters are used to determine the controller settings using the equations in Table 4.6, Table 4.6: Cohen-Coon controller settings Type of controller Proportional, P Proportional-Integral, PI Proportional-IntegralDerivative, PID Controller Gain, Integral Time Derivative Time Kc Constant, τ I Constant, τ D 1 ⎛ τ ⎞⎛ θ ⎞ ⎜ ⎟⎜1+ ⎟ K p ⎝ θ ⎠⎝ 3τ ⎠ θ ⎞ 1 ⎛ τ ⎞⎛ ⎜ ⎟⎜ 0.9 + ⎟ K p ⎝ θ ⎠⎝ 12τ ⎠ θ⎜ 1 ⎛ τ ⎞⎛ 4 θ ⎞ ⎜ ⎟⎜ + ⎟ K p ⎝ θ ⎠⎝ 3 4τ ⎠ θ⎜ ⎛ 30 + 3θ / τ ⎞ ⎟ ⎝ 9 + 20θ / τ ⎠ ⎛ 32 + 6θ / τ ⎞ ⎟ ⎝ 13 + 8θ / τ ⎠ 4 ⎛ ⎞ ⎟ ⎝ 11 + 2θ / τ ⎠ θ⎜ Sections 4.7.1 to 4.7.7 show the controller properties of each control loops. The controller sampling time, TAPC also known as Automatic Process Control (APC) time sampling is determined using the guideline proposed by Ogunnaike and Ray (1994). The details is discussed in Section 4.7.8. 4.7.1 Bottom Liquid Level Controller The control variable for the bottom liquid level control loop is the bottom liquid level while the proposed manipulated variable is the bottom flow rate. The value of the bottom flow rate is varied (various step changes) and the parameters static gain, Kp, time constant, τ, and dead time, θ for each change of the value of the bottom flow rate is recorded and shown in Table 4.7. However, only small changes of the value of the bottom flow rate are carried out as the developed Matlab program will fail to converge for large changes in the value of the bottom flow rate. 76 Table 4.7: The step change results for bottom liquid level controller Change in flow Static Gain, Kp, Time Constant, τ, Dead Time, θ , rate of bottom m/(kmole/hr) hour hour +0.5% of SS value -1.048 0.1 0.01 +1.0% of SS value -1.047 0.1 0.01 -0.5% of SS value -1.048 0.1 0.01 -1.0% of SS value -1.048 0.1 0.01 Average Value -1.048 0.1 0.01 stream The controller used for controlling the bottom liquid level is Proportional (P) controller. Based on the average value of Kp, τ, and θ from Table 4.7, the controller parameter for this loop Controller Gain, Kc is determined. The calculated parameter and information of the bottom liquid level P controller are shown in Table 4.8. Table 4.8: Properties of bottom liquid level P controller Controller Control Manipulated Kc, Type Variable Variable (kmole/hr)/m P Bottom Bottom -9.86 liquid level, flow rate, B LH, B 4.7.2 Bottom Temperature Controller The reboiler of the study column is assumed to be a total reboiler with hot oil as the heating fluid for the reboiler system. The control variable in the bottom temperature control loop is the bottom temperature and the proposed manipulated variable is the flow rate of the hot oil in the reboiler system. The hot oil is assumed to have a heat capacity, Cp = 7.5 kJ/kg.K (Geankoplis, 1993) and a change in temperature, ∆T = 40K. These two parameters are assumed to be constant. The steady-state (SS) value of the hot oil flow rate is 7192 kg/hr. The flow rate of the hot 77 oil is varied (various step changes) and the static gain, Kp, time constant, τ, and dead time, θ , for each change of the value of the flow rate of the hot oil is recorded and shown in Table 4.9. Table 4.9: The step change results for bottom temperature controller Change in flow Static Gain, Kp, Time Constant, τ, Dead Time, θ , rate of hot oil K/(kg/hr) hour hour +5% of SS value 0.038 0.3 0.06 +10% of SS value 0.038 0.3 0.06 -5% of SS value 0.038 0.2 0.06 -10% of SS value 0.038 0.3 0.06 Average Value 0.038 0.3 0.06 The controller used for controlling the bottom temperature is ProportionalIntegral-Derivative (PID) controller. Using the average value of Kp, τ, and θ from Table 4.9, the controller parameters such as Controller Gain, Kc, Integral Time Constant, τI, and Derivative Time Constant, τD, are calculated. The calculated parameters and information of the bottom temperature PID controller are shown in Table 4.10. Table 4.10: Properties of bottom temperature PID controller Controller Control Manipulated Kc, Type Variable Variable (kg/hr)/K PID Bottom Hot oil flow 179.87 temperature, rate, Fhot Tbot τI, hour τD, hour 0.13 0.02 78 4.7.3 Top Column Pressure Controller The control variable for the top column pressure controller loop is the top column pressure while the proposed manipulated variable is the top vapor flow rate. The value of the top vapor flow rate is varied (various step changes) and Kp, τ, and θ for each change of the value of the top vapor flow rate is recorded and shown in Table 4.11. Table 4.11: The step change results for top column pressure controller Change in flow Static Gain, Kp, Time Constant, τ, Dead Time, θ , rate of top vapor bar/(kmole/hr) hour hour +5% of SS value -9.6 1.0 0.1 +10% of SS value -9.6 1.0 0.1 -5% of SS value -9.6 1.0 0.1 -10% of SS value -9.6 1.0 0.1 Average Value -9.6 1.0 0.1 stream The controller used for controlling the top column pressure is ProportionalIntegral (PI) controller. Using the average value of Kp, τ, and td from Table 4.11, the controller parameters such as Kc and τI are calculated. The calculated parameters and information of the top pressure PI controller are shown in Table 4.12. Table 4.12: Properties of top column pressure PI controller Controller Control Manipulated Kc, Type Variable Variable (kmole/hr)/Bar PI Top column Top vapor -0.95 pressure, PTop flow rate, Vtop τI, hour 0.28 79 4.7.4 Sidedraw Tray Liquid Level Controller The control variable for the sidedraw tray liquid level control loop is the sidedraw tray liquid level while the proposed manipulated variable is the distillate flow rate. The value of the distillate flow rate is varied (various step changes) and Kp, τ, and θ for each change of the value of the distillate flow rate is recorded and shown in Table 4.13. Table 4.13: The step change results for sidedraw tray liquid level controller Change in flow Static Gain, Kp, Time Constant, τ, Dead Time, θ , rate of bottom m/(kmole/hr) hour hour +5% of SS value -0.023 0.2 0.05 +10% of SS value -0.023 0.2 0.05 -5% of SS value -0.021 0.2 0.05 -10% of SS value -0.022 0.2 0.05 Average Value -0.022 0.2 0.05 stream The controller used for controlling the sidedraw liquid level is P controller. Using the average value of Kp, τ, and θ from Table 4.13, the Kc of this controller is calculated. The calculated parameter and information of the sidedraw tray liquid level P controller are shown in Table 4.14. Table 4.14: Properties of sidedraw tray liquid level P controller Controller Control Manipulated Kc, Type Variable Variable (kmole/hr)/m P Sidedraw Distillate -193.89 tray liquid flow rate, D level, LH, Sd 80 4.7.5 Pumparound Temperature Controller The control variable for the pumparound temperature controller is the temperature of the pumparound stream while the proposed manipulated variable is the flow rate of the cooling water in the pumparound cooler system. The water has a heat capacity Cp = 4.184 kJ/kg.K (Geankoplis, 1993) and a change in temperature, ∆T = 10K. These two parameters are assumed to be constant and the steady-state (SS) value of the cooling water flow rate is 411.57 kmole/hr. The value of the cooling water flow rate is varied (various step changes) and Kp, τ, and θ for each change of the value of the cooling water flow rate is recorded and shown in Table 4.15. Table 4.15: The step change results for pumparound temperature controller Change in flow Static Gain, Kp, Time Constant, τ, Dead Time, θ , rate of cooling K/(kmole/hr) hour hour +5% of SS value -0.303 1.1 0.1 +10% of SS value -0.303 1.1 0.1 -5% of SS value -0.303 1.1 0.1 -10% of SS value -0.303 1.1 0.1 Average Value -0.303 1.1 0.1 water The controller used for controlling the pumparound temperature is PID controller. Using the average value of Kp, τ, and θ from Table 4.15, the controller parameters such as Kc, τI and τD are calculated. The calculated parameters and information of the pumparound temperature PID controller are shown in Table 4.16. 81 Table 4.16: Properties of pumparound temperature PID controller Controller Control Manipulated Kc, Type Variable Variable (kmole/hr)/K PID Pumparound Cooling 22.71 temperature, water flow TP rate, Fcw 4.7.6 τI, hour τD, hour 0.46 0.07 Pumparound Flow Rate Controller The control variable for the pumparound flow rate control loop is the pumparound flow rate while the proposed manipulated variable is the liquid flow rate prior to the pumparound stream. Since this liquid stream directly affects the pumparound flow rate and is in the same streamline with the pumparound flow rate, no step changes were being carried out to determine the value of Kp, τ, and θ of this control loop. A simple P controller with a Kc = 1.0 is used for this control loop. The parameter and information of pumparound flow rate controller are shown in Table 4.17. Table 4.17: Properties of pumparound flow rate P controller Controller Control Manipulated Type Variable Variable P Pumparound Liquid flow rate flow rate, P prior to pumparound stream Kc 1.0 82 4.7.7 Reflux Flow Rate Controller The control variable for the reflux flow rate control loop is the reflux flow rate while the proposed manipulated variable is the liquid flow rate prior to the reflux stream. Since this liquid stream directly affects the reflux flow rate and is in the same streamline with the reflux flow rate, no step changes were being carried out to determine the value of Kp, τ, and θ of this control loop. A simple P controller with a Kc = 1.0 is used for this control loop. The parameter and information of reflux flow rate controller are shown in Table 4.18. Table 4.18: Properties of reflux flow rate P controller Controller Control Manipulated Type Variable Variable P Reflux flow Liquid flow rate rate, Re prior to reflux Kc 1.0 stream 4.7.8 Controller Sampling Time, TAPC The controller sampling time, TAPC, of the study column is determined using the guideline proposed by Ogunnaike and Ray (1994). They proposed using one tenth to one twentieth of smallest time constant of a process for its controller sampling time. From the controller tuning results obtained from Section 4.7.1 to Section 4.7.7, the bottom liquid level-bottom flow rate process section of the study column process has the smallest time constant, τ = 0.1 hour. Therefore, the controller sampling time of the study column is set at TAPC = 0.01 hour (one tenth of the smallest time constant of the study column process). 83 4.8 Dynamic Simulation Program Performance Evaluation The performances of the written algorithm and program are evaluated based on the ability of the developed algorithm and program to achieve steady state results that close to the steady state results from the HYSYS simulator. For each aspect comparison, the HYSYS simulator results are used as the basis. Percentage difference is calculated as follow: % Difference =[(HYSYS result – Matlab result)/ (HYSYS result)] x 100% The steady state results comparisons for the temperature, pressure, liquid flowrate, vapor flowrate, distillate flowrate and bottom flowrate are presented in Appendix A. The result obtained from the comparison of bottom and distillate composition, the maximum percent different was about 4%. It shows that the written program is able to achieve bottom flow rate and distillate flow rate that are close to the commercial simulator. On the other hand, the maximum percentage different for each temperature and pressure column profile is less than 10%. Therefore, it can be concluded that the developed dynamic simulation program is able to produce results that close to the commercial steady state simulators. Thus, the developed dynamic simulation program is ready to be used to generate a set of nominal operation condition (NOC) data and a set of disturbance data. 84 4.9 Chapter Summary Distillation column is chosen as unit operation for the proposed FDD method. Process description of the pre cut column from the Palm Oil Fractionation Plant and the information of this column is included. The column contains nine components. Prior to performing the dynamic modeling, DoF analysis was performed. There are 31 DoF for the study column. The fix variables are 28 tray pressures (pressure varies linearly from top tray to bottom column), reflux flowrate, pumparound flowrate and reboiler duty. Formulations of the distillation column model require MESH equation and hydraulic equations. The column model can be divided into eight sections i.e. reflux drum, pumparound drum, normal tray, sidedraw tray, reflux tray, feed tray and bottom column. The equilibrium equation is based on Wilson Correlation and Virial Correlation to determine activity coefficient and fugacity coefficient of the liquid and vapor respectively. The heat balance equations are used to determine the vapor flow rate leaving each tray. Francis Weir hydraulic equations are used to determine the liquid flow rate leaving each tray and tray molar hold-up. The development of dynamic simulated column involved three stages, which are pre-simulation, algorithm formulation and program formulation. Pre-simulation activities consist of understanding the assumptions that are used in simulating the column, acquiring the inlet stream conditions, simulating the study column using simulator Design II and specifying the desired separation flowrates and fixing the degree of freedom. Algorithm formulation is to establish a proper solution sequence, which can reveal the column behaviors. After completing the algorithm formulation, the written program performs using MATLAB software. The empirical tuning method, which is Process reaction curve method developed by Cohen and Coon is used to tune the controllers. The controller settings for various controller types for each control loop have been calculated. Controller 85 sampling time, TAPC = 0.01 hour was determined using one tenth of the smallest time constant of the study column process. CHAPTER 5 METHODOLOGY 5.1 Introduction The meaning of monitoring in this work refers to the condition where measurable variables are checked with the limits, and alarms are generated when out of control condition occur. The objective of monitoring in the study column is to produce the bottom composition of oleic acid and linoleic acid between 0.134 kmol/kmol to 0.135 kmol/kmol and 0.024 kmol/kmol to 0.025 kmol/kmol. Faulty conditions were detected when these variables exceeded the control limits. Once fault has been detected, the next step is to determine the cause of the out of control situation. Therefore, focusing the plant operators and engineers on the subsystem(s) would most likely know where the fault occurred. This assistance provided by the fault diagnosis scheme in locating the fault can effectively incorporate the operators and engineers in the monitoring process scheme and significantly reduce the time to recover in-control operations. The task of diagnosing the fault can be rather challenging when the number of process variables is large, and the process is highly integrated. Proper and immediate actions to variables which cause the fault can prevent the effect of the fault from propagating to other variables (Chiang and Braatz, 2003). In this research, a precut multi component distillation is used as the study unit operation. The monitoring proposed method also can be applied to other unit operations as well. This simulated column is developed using MATLAB. 87 5.2 Variable Selection for Process Monitoring There are seven control loops installed in the study column. These control loops controlled the process by counteracts the effects of disturbances change and set point change to maintain the process at the target. This concept is known as Automatic Process Control (APC), which uses continuous adjustment on manipulated variables to keep the process on target. APC variables are divided into three groups i.e controlled variable, manipulated variable, and measured disturbance. Table 5.1 lists APC variables for each control loop of the study column. Table 5.1: APC variables for each control loops Control variable Manipulated variable Measured disturbance Bottom column liquid level Bottom flowrate Liquid flowrate at tray 28 Bottom temperature Hot utility flowrate Liquid flowrate at tray 28 Top column pressure Vapor flowrate to vent Vapor flowrate at tray 2 Side draw tray liquid level Distillate flowrate Liquid flowrate at tray 1 Pumparound temperature Cold utility flowrate Pumparound point Pumparound flowrate Distillate flowrate Pumparound point Reflux flowrate Reflux flowrate Sidedraw flowrate In contrast, SPC does not control the process but rather performs a monitoring function and gives signals when maintenance is needed in the form of identification and removal the root causes. SPC seeks the deviation in the process by detecting the changes in monitoring variables when the assignable causes occur in the process. SPC variables can be categorized into two i.e quality variables and process variables. Quality variables consist of oleic acid composition, xc8 and linoleic acid composition, xc9 at bottom stream. A few variables were listed for process variables selection. These variables are feed flowrate, feed temperature, reflux flowrate, pumparound flowrate, pumparound temperature, reboiler heat duty, sidedraw flowrate and distillate flowrate. Normal correlation coefficient is calculated to analyze these variables quantitatively. Variables which have correlation coefficient greater than 0.05 were selected. Correlation coefficient which is less than 0.05 is considered small. Changes in process variables which have small correlation coefficient do not 88 affect quality variables significantly. Using small correlation coefficient can cause the control limits for process variables become wider. Based on the results obtained as shown in Table 5.2, five process variables have correlation coefficient greater than 0.5. Table 5.2: Normal correlation coefficient for process variables Process variables Correlation coefficient Feed flowrate 0.610 Feed temperature 0.340 Reflux flowrate 0.522 Pumparound flowrate 0.490 Reboiler heat duty 0.050 Pumparound temperature 0.030 Sidedraw flowrate 0.005 Distillate flowrate 0.001 These process variables were selected from five locations i.e feed stream, pumparound stream, reflux stream and bottom stream. Feed flowrate and feed temperature of the study column are the overall disturbances. Even though the study precut column has been installed with controllers, the probability of faults to occur is possible due to line blockage, line leakage and sensor fault. Pumparound and reflux flowrates are the controlled variables in APC system. These variables are maintained at set point values. The controllers will take action if any changes occur due to measured disturbances and set points. Both variables are set on target values as long as the controllers are functioning well or there is no fault exists. Even though these two variables have been controlled, the possibility of faults to occur is possible due to line blockage, line leakage, sensor faults, valve sticking and controller fault. Therefore, these variables are included in monitoring process to identify the causes of faults. 89 Reboiler heat duty of the study column is not measured directly and the value is calculated using Equation 5.1, Qr = FhotCp∆T 5.1 where, Fhot = Hot oil flowrate Cp = Specific heat capacity = 7.5 kJ/kg.K ∆T = Temperature difference = 40 K Hot oil flowrate, Fhot is the manipulated variable in APC for maintaining the bottom temperature at the desired value. This manipulated variable will change if the controller takes action to compensate the effect of disturbance (liquid flowrate at tray 28) and set point (bottom temperature) change. Otherwise it could be due to faults such as line blockage, line leakage, sensor faults, valve sticking and controller fault. Table 5.3 lists the selected process variables to be monitored in this study. Table 5.3: SPC variables Location Feed stream Process variable Feed flowrate, F Feed temperature, Tf Pumparound stream Pumparound flowrate, P Reflux stream Reflux flowrate, Re Bottom stream Reboiler heat duty, Qr During the monitoring process to detect and diagnose faulty conditions, quality variables acted as indicator variables to show that the process is in the state of statistical control or in the state of out of control. If any of the statistics value calculated from these variables falls out of control limits, it shows that the fault situation has taken place. On the other hand, process variables are used to find the causes of the fault situation. 90 Process monitoring for FDD involves both APC and SPC variables. If the disturbance or set point or both have changed, it is not considered as a fault. The controller in the process will take action by counteracts the effect of disturbance and set point change to keep the process on target. In order to detect the Out of Control (OC) condition, the Improved SPC control limits are determined and the Improved SPC charts are constructed. Process monitoring by the operator will be based on these charts. Any fault detected in the process will lead the operator to find cause(s) using the process variables data. 5.3 Statistical Process Control Sampling Time, TSPC In practice, the dynamics of a typical chemical or manufacturing process could cause the measurements to be autocorrelated (Bakshi, 1998). Therefore, the right data collected at appropriate rate with sufficient information content must be selected. Data for SPC analysis must be random with no autocorrelation. Statistical monitoring might not function well for autocorrelated data (Kano et al., 2001). Moreover, autocorrelation often reflects increased variability, which may cause difficulty to distinguish between common cause variation and assignable cause variation. Therefore, the data is collected at certain time interval called SPC sampling time, TSPC to ensure the data is random and without autocorrelation. TSPC is always greater than APC sampling time, TAPC. The measurement of the process variables is less likely to be correlated when the sampling time for SPC is greater than natural response time of the process, TAPC (Ogunnaike and Ray, 1994). The first step to determine TSPC is to select a variable with maximum time constant, τ . Then, autocorrelation coefficient is calculated using the observed data of this variable. Autocorrelation is a correlation coefficient between two values of the same variable at times ti and ti+k instead of correlation between two different variables. Given vector of univariate data consist of m observations, x1, x2, … xm, the lag k autocorrelation function, rk is defined as 91 ∑ = m− k rk i =1 ( xi − x )( xi + k − x ) ∑i=1 ( xi − x ) 2 m 5.2 where rk = Autocorrelation coefficient x = Individual data x = Arithmetic Average m = Number of observations k = Number of lags The autocorrelation coefficient at any lag k is zero for the truly random process. In practice, a computed autocorrelation coefficient rarely equals zero even the observations are truly independent since sampling variation will generally result in a small amount of correlation between observations (Alwan, 2000). The effect of autocorrelation on the control limits of control chart for individual data will not be significant until the lag one autocorrelation coefficient is 0.7 or higher (Woodall, 2000). The autocorrelation plot can be used for the following two purposes: 1. To detect non-randomness in data 2. To identify an appropriate time series model if the data are not random In order to compute TSPC , the autocorrelation coefficient of a univariate data is plotted with confidence bound, ± 2 / k (Wetherill and Brown, 1991). Any value greater than confidence bound shows the presence of autocorrelation. TSPC is determined right after the autocorrelation coefficient plot reach the confidence bound. Figure 5.1 summarizes the procedure to determine TSPC. 92 Select a variable with maximum time constant, τ Plot autocorrelation function and the confidence bound rk < confidence No bound Yes Calculate SPC time sampling, TSPC Use to take sample of all process variables using this TSPC value Figure 5.1: The procedure to calculate and implement TSPC 5.4 Normal Operating Condition (NOC) Data Selection and Preparation SPC methods rely on formation of a statistical model based on historical process data to establish normal operating behavior. The historical data is also known as NOC, where data is collected when the process is in statistical control or quality variables remain close to their desired values. This is the most important step in SPC methodology since it is used to predict the future behavior of the process. A set of quality variables and a set of process variables are collected and arranged in the matrix form, X = [F Tf Re P Qr xc8 xc9] where the measurements of a variable are arranged in the form of column vector. Some noises with zero mean were imbedded into the MATLAB simulation program. The values are considered as small random change in process variables such as feed flowrate, F, feed temperature, 93 Tf, reflux flowrate, Re, pumparound flowrate, P, and reboiler heat duty, Qr. The data was collected at SPC time sampling interval to ensure no autocorrelation among the measurements or to decorrelate the autocorrelation. This small random change is also known as common cause variation that creates random data for quality variables i.e oleic acid composition, xc8 and linoleic acid composition, xc9. To ensure a normal distribution of a set of NOC data, the number of samples used must be large. Based on the central limit theorem, if the sample size is large the theoretical sampling distribution of the mean can be approximated closely with normal distribution. The observation vectors are assumed to be independent and normally distributed. The normal distribution plays an important role in the analysis of random signals for a number of reasons (Gertler, 1998): 1. Many random physical processes can be rather accurately characterized by the normal distribution. 2. The behavior of variables representing the combined effect of a large number of phenomena approaches the normal distribution. 3. The normal distribution is relatively simple to handle mathematically. The visual appearance of the normal distribution is a symmetric or bell shaped curve. A histogram plot is used to show the distribution of data values. Other parameters such as skewness and kurtosis were determined to measure the symmetry of the data around the sample mean and the outlier-prone of a distribution respectively. If skewness is negative, the data are spread out more to the left of the mean than to the right and vice versa. The skewness of the normal distribution given data (or any perfectly symmetric distribution) is zero. The kurtosis of the normal distribution is 3 for distribution having 99.73% of its data fall between the limits defined by mean plus and minus three standard deviation. Distributions that are more outlier-prone than the normal distribution have kurtosis greater than 3 while distributions that are less outlier-prone have kurtosis less than 3. Once the NOC data set has been established, the data is treated before the correlation coefficient between quality variables and process variables and the confidence limits can be established for SPC charts. 94 5.4.1 Data Pretreatment The NOC data is standardized before further analysis is carried out since the variables have different units and wide range of data measurements. Each variable is adjusted to zero average by subtracting off the original average of each column and adjusted unit variance by dividing each column by its standard deviation. The standardized equation is xs = x−x s 5.3 where xs = Standardized variable x = Average s = Standard deviation After the standardization, each variable have equal weights with zero average and standard deviation equal to one (N (0, 1)). 5.4.2 Correlation Coefficient via Multivariate Analysis Techniques The standardized NOC data is used to build the linear relationship between quality variables i.e. oleic acid composition, xc8 and linoleic acid composition, xc9 and process variables i.e feed flowrate, F, feed temperature, Tf, reflux flowrate, Re, pumparound flowrate, P, and reboiler heat duty, Qr using multivariate analysis techniques, PCA and PCorrA. Let xj be the quality variables and xk be the process variables. The relationship between x j and x k can be written as, x j = Cjk x k 5.4 95 It has been argued that nonlinear model may be necessary to model continuous processes. Actually, the model depends on the application. If the model is used for monitoring, linear model is sufficient to describe process fluctuation around an operating point (MacGregor and Kourti, 1995). The following section discusses the detailed procedure to calculate the correlation coefficient using multivariate analysis method. 5.4.2.1 Principal Component Analysis, PCA The advantage of using PCA method is the developed NOC models can represent the major process variations in a lower dimension way. The first step in PCA methodology is to calculate the eigenvector matrix, V and the eigenvalue matrix, L1/2 of standardized data using Singular Value Decomposition (SVD) technique. Several authors have discussed this technique to determine the eigenvector matrix and eigenvalue matrix (Chiang et al., 2000; Chen et al., 2001; Ralston et al., 2001). This technique can be used to analyze the equation or matrices that are either singular or numerically very close to singular. Standardized data matrix, X (m,p), which has m variables with each variable has p measurements are arranged in the form of m x p, where the measurements of a variable are organized in the form of column vector. A standardized data matrix, X is decomposed into a product of the column-orthogonal matrix U (m,p), orthogonal matrix V(p,p) and a diagonal matrix L1/2. Equation 5.5 shows the fundamental identity of SVD. Bold capital letter alphabet is matrix while small letter alphabet is vector and italic is scalar. X(m,p) = U(m,p)L1/2 (p,p)VT (p,p) = u1 λ 1 v1T + u2 λ 2 v2T + … + up λ p vpT 5.5 96 where, U = Eigenvector matrix of XXT V = Eigenvector matrix of XTX L1/2 = Singular value = Diagonal matrix of positive square roots of the eigenvalue of XTX = λ λ = Eigenvalue The eigenvectors matrix, V(p,p), with square matrix contains eigenvectors v1, v2, …, vp. Each eigenvector is a column vector which contains the arrangement of elements [v1,1 v2,1 … vp,1], [v1,2 v2,2 … vp,2] … [v1,p v2,p … vp,p]. The eigenvectors matrix can be written as, ⎡ v1,1 ⎢v ⎢ 2,1 ⎢ . Eigenvectors matrix, V = [v1 v2 … vp] = ⎢ . ⎢ ⎢ . ⎢v p ,1 ⎣ v1, 2 v 2, 2 . . . v p,2 ... v1, p ⎤ ... v2, p ⎥⎥ ... . ⎥ ... . ⎥ ⎥ ... . ⎥ ... v p , p ⎥⎦ 5.6 The diagonal matrix L1/2 (p,p) with positive elements, λ 1, λ 2,…, λ p are called eigenvalues of X. The eigenvalues explained the percentage of variations corresponding to the eigenvectors. The eigenvalues are arranged in descending order λ 1> λ 2 > ,…, > λ p > 0. This matrix is shown as, ⎡ λ1 ⎢0 ⎢ Eigenvalue matrix, L1/2 = ⎢ 0 ⎢ ⎢0 ⎢0 ⎣ 0 λ2 0 0 0 0 0 0 0 . 0 0 0 . 0 0⎤ 0 ⎥⎥ 0⎥ ⎥ 0⎥ λ p ⎥⎦ 5.7 97 Matrices of U, V and L1/2 have the following properties (Nash and Lefkovitch, 1976): UTU = UUT =I VTV = VVT = I (L1/2)(L1/2) T = (L1/2) T (L1/2) =L 5.8 The next step is to determine the number of PCs to be retained. Reducing the number of PCs means to reduce the dimension of the variability in the data. Three rules of thumb are suggested to verify the number of PCs to be retained in the model (Ralston et al., 2001). The rules of thumb involved: 1. Determining the ‘knee’ in the eigenvalue plot. The plot is referred to scree plot, which is shown in Figure 5.2. 2. Determining the sudden jump in the eigenvalues. 3. Evaluating the percentage of variance to be retained. Figure 5.2: Example Scree plot 98 The percentage of the variation to be explained by the eigenvalues can be fixed by the user. Equation 5.9 shows the percent of variation that captured by ith eigenvalues. λi (%) = λi λ1 + λ2 + ... + λ p ×100% 5.9 In order to find the number of eigenvalues to be retained, the percentage of cumulative variation explained by j first eigenvalues is determined using equation 5.10. λ1 + λ2 + ... + λ j × 100% λ1 + λ2 + ... + λ j + ... + λ p 5.10 Let assume that n out of p eigenvalues are kept or retained. Based on theoretical study of PCA, the data matrix, X is related to PC after data reduction using Equation 5.11. X = PC n VnT + E 5.11 The retained principal components [PC1, PC2, …, PCn], which form the Pn VnT term, are associated with systematic variation in data while the residual principal components [PCn+1, PCn+2, …, PCp], which form the residual matrix E are considered of containing measurement errors (Seborg et al., 1996). The eigenvectors and eigenvalues according to number of PCs retained are used to determine the correlation coefficient between quality variables and process variables. The equation is derived from the basic theory of PCA method. Equation 5.11 is simplified for both quality variable vector, xj and process variable vector, xk and becomes, xj = PCv Tj 5.12 xk = PCv Tk 5.13 99 Since both variables in standardized form with mean zero and standard deviation is equal to one, the correlation coefficient, Cjk between xj and xk can be determined using covariance equation, Cjk = corr (xj, xk) = cov (xj, xk) = xjTxk 5.14 Substitute both equation 5.12 and equation 5.13 into equation 5.14. Therefore, the correlation coefficient can be determined using eigenvector via the equation 5.15 Cjk = cov (xj, xk) = xjTxk = (PCv Tj )T (PCv Tk ) = v j PCTPC v Tk 5.15 The next step is to combine the PCA equation in Equation 2.3 with the equation derived from SVD equation in Equation 5.5, PC = XV = UL1/2 VT V = UL1/2 5.16 Equation 5.16 is used to replace PC in equation 5.15 produce the following equation, Cjk = v j (UL1/2)T(UL1/2 )v Tk = v j UTU (L1/2)T(L1/2 )v Tk = v j Lv Tk 5.17 100 If n eigenvectors are involved, then the equation 5.17 can be rewritten as follows, n C jk = ∑ v jn vkn λn 5.18 i =1 where n = Number of retained eigenvectors vk = Eigenvectors of process variable vj = Eigenvectors of quality variable The PCA procedures to determine the correlation coefficient is summarized in figure 5.3. Standardized the NOC data Determine the eigenvector matrix via SVD technique Arrange the eigenvalues respective to eigenvector from the biggest to the smallest Determined the number of eigenvectors to be retained Calculate the correlation coefficient, Cjk Figure 5.3: The procedure to determine correlation coefficient, Cjk via PCA method 101 5.4.2.2 Partial Correlation Analysis, PCorrA Partial correlation coefficient is defined as a correlation of selected quality variable, xj and selected process variable, xk when the effects of other process variable(s) have been removed from xj and xk. The initial step in PCorrA methodology is to standardize NOC data. Then, the simple correlation coefficient between two variables is determined. The simple correlation coefficient, rj,k is a measure of linear association between the two variables. The simple correlation coefficient and correlation matrix are determined from equations 5.21 and 5.22 accordingly. Variance is a measure of a variable’s dispersion or variation while covariance is a measure of co-variation between two variables. The variance and covariance of each variable are determined respectively from equations 5.19 and 5.20. Variance-covariance matrix is carried out in the analysis if the involved data are in the same unit, same digit, and same magnitude order or the data are in percentage. But, if the data or measurements are in different units like pressure, temperature, flow rate, energy and so on, or the data are in different magnitude order, then the analysis shall perform on the correlation matrix (Marriott, 1974). ∑ (x m Variance, s jk = i − x j )∑ (xi ,k − xk ) m i, j k 5.19 (m − 1)(m − 1) ⎡ s1,1 ⎢s ⎢ 2,1 ⎢ . Covariance matrix, S = ⎢ . ⎢ ⎢ . ⎢ s p ,1 ⎣ s1, 2 s 2, 2 . . . s p,2 Simple correlation coefficient, r jk = s1, p ⎤ ... s 2, p ⎥⎥ . . ⎥ . . ⎥ ⎥ . . ⎥ ... s p , p ⎥⎦ ... s j ,k s j , j s k ,k 5.20 5.21 102 ⎡ r1,1 ⎢r ⎢ 2,1 ⎢ . Normal Correlation matrix, R = ⎢ . ⎢ ⎢ . ⎢r p ,1 ⎣ where: r1, 2 r2, 2 . . . rp,2 ... r1, p ⎤ ... r2, p ⎥⎥ . . ⎥ . . ⎥ ⎥ . . ⎥ ... r p , p ⎥⎦ i : ith measurement s 2j , j : Variance of variable xj xj : Average value for variable xj s j,k : Covariance of variable xj and xk 5.22 Since five process variables involved in this study, therefore the order of partial correlation is four. Correlation coefficient between selected quality variable, xj1 and selected process variable, xk1 after removing the effect of other process variables, xk2, xk3, xk4 and xk5 via PCorrA method is calculated using the Equation 5.23. The PCorrA procedures is summarized in Figure 5.4. C jk = C x j 1 , xk 1 xk 2 , xk 3 , xk 4 , xk 5 = rx j 1 , xk 1 xk 3 , xk 4 , xk 5 1 − rx2 , x j1 k2 − rx j 1 , xk 2 xk 3 , xk 4 , xk 5 r xk 3 , xk 4 , xk 5 xk 1 , xk 2 xk 3 , xk 4 , xk 5 1 − rx2k 1 , xk 2 xk 3 , xk 4 , xk 5 5.23 103 Standardized NOC data Determine the simple correlation coefficient via equation 5.21 Determine the order of partial correlation via equation 2.11 Calculate the correlation coefficient, Cjk via equation 5.23 Figure 5.4: The procedure to determine correlation coefficient, Cjk via PCorrA method 5.4.3 Control limits of Improved Statistical Process Control chart The control limits of Improved SPC chart is determined based on the correlation between quality variables and process variables. Each Improved SPC chart consists of two control limits, which are upper and lower limits. The upper line will be the limit for the point greater than target value and the lower line for the point less than target value. These limits are calculated based on the standardized NOC data, which represent the desired process operation. Standardized data results of all variables have zero mean and standard deviation of equal to one. The control limit is determined so that the number of samples outside the control limits is 0.27% of the entire samples. Many authors in literature used Shewhart chart with 3 standard deviation limits (Woodall, 2000; Kahn et al., 1996; Montgomery, 1996; Oakland, 1996). Experience shows 3 standard deviation limits to be the most effective scheme and seems to be an acceptable economic value (Woodall, 1996). The choice of size 3 standard deviation control limits gives a balance between statistical theory and practical experience (Kahn et al., 1996). Most of statistical process control books 104 provide tables of the constant used such as D.'001 , D.'999 and A2 . Table 5.4 shows the control limits of Improved SPC charts used in this study based on standardized data. Table 5.4: The control limits of Improved SPC chart Improved SPC Quality variable Process variable chart Control limit Control limit Shewhart individual UCL = 3 UCL = 3/Cjk , LCL = -3 LCL = -3/Cjk UCL = D.'001 R UCL = D.'001 R / C jk LCL = D.'999 R LCL = D.'999 R / C jk Shewhart range EWMA UCL = + L LCL = − L MA MR β [1 − (1 − β ) ] 2−β β 2m [1 − (1 − β ) ] 2−β 2m UCL = + L LCL = − L β 2−β β 2−β [1 − (1 − β ) ]/Cjk 2m [1 − (1 − β ) ]/Cjk UCL = + A2 R UCL = + A2 R / C jk LCL = − A2 R LCL = − A2 R / C jk UCL = D.'001 R UCL = D.'001 R / C jk LCL = D.'999 R LCL = D.'999 R / C jk 2m There are two important parameters to design EWMA control charts i.e the width of the control limit, L and the weighting factor, β . These two parameters affect the efficiency of EWMA charts for FDD. The selection of these parameters in this study is based on the performance of the chart to detect and diagnose small shift in the process. Theoretically, smaller values of β can detect smaller shift in the process. Several studies have been done and provide a range of value of L and β (Montgomery, 1996). Table 5.5 shows several values of EWMA parameters to be tested for sensitivity in this study. 105 Table 5.5: Parameters to design EWMA 5.5 Weighting factor, β Width of the control limit, L 0.05 2.615 0.10 2.814 0.20 2.962 0.25 2.998 0.40 3.054 Generated Out of Control, OC data In order to see the performance of Improved Statistical Process Control charts some faults which were detected and diagnosed had been imbedded into simulated distillation column. The model was run at steady state condition. The faulty conditions were introduced into the process by increasing or decreasing the process variables (F, Tf, P, Re, Qr) from the target, which caused the desired quality variables (xc8, xc9) statistics fell beyond the control limits. The Out of Control, OC data were collected during this condition and the data were arranged in matrix form and standardized in the same way described for NOC data. Fault causes were divided into three types of faults i.e sensor fault, valve fault and controller fault. The effect of these three types of fault in closed loop is different from those in the open loop operation. The differences are caused by the controller actions to remove the deviation on the controlled variable. For open loop variable, the fault will be of sensor fault or valve fault while for closed loop variable the fault can be of sensor fault, valve fault or controller fault. In the case of simple sensor fault, open loop and closed loops process give the same result on the measure variables and the fault diagnosis is simple and relatively easy. This type of fault occurs in a specific fault source and its effect is not propagated into other variables. For the second type of fault that is valve fault, open loop and closed loop processes give different result. Manipulated variables for closed loop will change and deviate from its steady state values for a period of time when the fault occurs. The control 106 variables deviate from its set point until the controller brings the variables back to its set points. But for the case of controller faults, both manipulated variables and control variables will continuously deviate from the steady state value and set point values respectively. Table 5.6 summarizes the effect of sensor fault, valve fault and controller fault on the disturbance, manipulated and control variables. 107 Table 5.6: Description of sensor fault, valve fault and controller fault Sensor Fault • • D For open • loop For Controller Fault open • loop closed loop variables, the variables, the value of of the variable changes value of the variable manipulated variable abnormally. For closed changes (MV) AND control loop variables, only the For value of the disturbance variables, (D) OR the manipulated manipulated variable (MV) OR the (MV) control variable (CV) variable (CV) changes changes abnormally. abnormally together. Control limit Fault Control limit Steady state value Steady state value Control D limit Control MV limit Fault only For variables, only the value Control limit Fault MV CV Valve Fault abnormally. closed loop variable (CV) both changes abnormally variable AND together. control D Steady state value Control MV limit Fault Steady state value CV Fault Set Point CV Control limit Fault Control limit Set Point Steady state value Steady state value Control limit Fault Set Point 108 Single and multiple faults were considered in this study. The OC data that were greater than ± 3standard deviation were divided into three regions. Region 1, region 2, and region 3 refer to mean ± 4standard deviation, mean ± 5standard deviation and over mean ± 5standard deviation respectively. The classification of these OC data can alert the operators to the level of failure occurred in the process. Improved SPC charts using quality variables were used to detect the faulty condition in the process. Samples are taken over time and values of statistics are plotted. The statistic for each Improved SPC charts is calculated using equation in Table 5.7. The process is considered to be in control if the plotted statistic of the quality variables falls within the control limits and out of control otherwise. On the other hand, control charts based on process variables data are used to diagnose the origin of the faults. Table 5.7: Improved SPC chart statistic for FDD Improved SPC chart Chart statistics Shewhart Individual x = xi Shewhart Range Ri = max [xi-1+1] – min [xi-1+1] EWMA zi= β xi + (1- β )zi-1 MA MAi =(xi + xi-1 + … + xi-w+1)/w MR MRi = max [xi-w+1] – min [xi-w+1] 109 5.6 Fault Coding Monitoring Framework Fault coding system is proposed to perform the process fault detection and diagnosis framework using Improved SPC chart. Using Improved SPC chart alone may cause difficulties in observing the process by looking through all the Improved SPC chart. The procedure for coding the control chart information are described as follows: i. Signal ‘1’ shows that: Statistical data for quality variables over the control limits Statistical data for process variables over the control limits ii. Signal ‘0’ shows that: Statistical data for quality variables within the control limits Statistical data for process variables within the control limits The faulty condition is detected when the quality variables chart give signal out of statistical control. Once fault has been detected, process variables charts are used to identify which variables caused the faulty condition. Table 5.8 and Table 5.9 summarize the application of coding system for fault detection and diagnosis. Table 5.8: Coding system for fault detection Observation 1 2 M m Oleic acid composition, xc8 1 1 M 0 Conclusion Out of Control Out of Control M In Control Table 5.9: Coding system for fault diagnosis Observation Process variable Re P 0 0 Conclusion 1 F 0 Tf 1 Qr 0 2 1 0 0 0 0 M m M 0 M 0 M 0 M 0 M 0 Feed temperature cause the fault Feed flowrate cause the fault In Control 110 Fault coding systems for each Improved SPC chart used in this study are shown in Table 5.10 to Table 5.12. Faulty condition is identified either Shewhart Individual or Shewhart Range or both Shewhart Individual and Shewhart Range give signal out of control situation. Moving Average and Moving Range chart used the same interpretation as Shewhart Individual and Shewhart Range. Table 5.10: Fault coding system for Shewhart Chart Observation Shewhart Individual Shewhart Range Conclusion 1 1 0 Out of control 2 0 1 Out of control 3 0 0 In control M M M M m 1 1 Out of control Table 5.11: Fault coding system for EWMA Chart Observation EWMA Conclusion 1 1 Out of control 2 0 In control M M M m 1 Out of control Table 5.12: Fault coding system for MAMR Chart Observation Moving Range Moving Average Conclusion 1 1 0 Out of control 2 0 1 Out of control 3 0 0 In control M M M M m 1 1 Out of control 111 5.7 Performance Evaluation of the FDD Method The performance of the FDD method using different Improved SPC chart is evaluated from two different perspectives. First, the number of false alarm generated during normal operating condition. Second, the successful of FDD method to detect the fault and identify the correct process variable as fault cause for each fault situation. The efficiency of the chart to detect and to diagnose the fault is discussed and the equation used is mentioned in section 5.7.1 and 5.7.2 respectively. 5.7.1 Fault Detection Efficiency The fault detection efficiency, ηFDetect. is determined using equation 5.24, η FDetect = Number of faults detected Number of faults generated in the process 5.24 112 5.7.2 Fault Diagnosis Efficiency The performance of a system in fault diagnosis is based on the ability of the system to identify the correct process variable as fault cause for each fault situation. The efficiency of the fault diagnosis system is evaluated based on the following situations: 1. Successfully diagnosed fault For every fault situation, the fault diagnosis tool is able to identify the correct process variables as fault causes. 2. Unsuccessfully diagnosed fault Unsuccessfully diagnosed faults can be categorized into three situations. i. Partially diagnosed fault cause situation Several of the known fault causes are identified by the fault diagnosis tool. However, there are some fault causes are not identified. ii. Over diagnosed fault cause situation Over-diagnosed fault cause refers to the situation where the system over-identifies other process variables that are not suppose to be the fault causes. iii. Undiagnosed fault cause situation This situation shows that the known fault causes are totally unidentified. Therefore, the fault diagnosis efficiency, ηFDiag is determined using Equation 5.25 η FDiag = Number of exact fault diagnosed Total diagnosed fault situation 5.25 113 5.8 Chapter Summary The methodology in this study is started with the variables selection for process monitoring. There are two types of variables involved i.e Automatic Process Contol (APC) variables and Statistical Process Control (SPC) variables. The APC variables is categorized as control variables, manipulated variables and measured disturbances while SPC variables is categorized as quality variables and process variables. Data for SPC analysis must be random with no existence of autocorrelation. In order to ensure the data generated is random with no autocorrelation exist among the variables, it must be collected at appropriate time. Therefore, SPC sampling time is determined using autocorrelation plot with confidence bound at ± 2 / k . Then, a set of historical data known as Normal Operating Condition (NOC) data is collected when the process is in statistical control or quality variables remained close to their desired values. The NOC data is subjected to small variation and normally distributed in region ± 3 standard deviation. Before further analysis, the NOC data is standardized to mean zero and unit variance since the variables have different units and wide range of data measurements. By doing so, it gives equal weight to all variables. The NOC data is collected for two purposes. The first one is to determine the correlation coefficient, which relates the quality variable with process variables via multivariate analysis techniques i.e Principal Component Analysis (PCA) and Partial Correlation Analysis (PCorrA). The second purpose is to calculate the control limits of Improved SPC charts for quality variables and process variables. The faulty condition is introduced into the process by increasing or decreasing the process variables from the target, which will cause the desired quality variables statistics fall beyond the control limits. This known fault causes are divided into three types of faults i.e sensor fault, valve fault and controller fault. Improved SPC charts using quality variables are used to detect the faulty condition in the process. On the other hand, control charts based on process variables data are used to diagnose the origin of the faults. The Improved SPC chart used to compare the current state of the process against NOC. If any points fall beyond the control limits, 114 the Improved SPC will give signal and it shows the Out of Control (OC) condition has taken place. Fault coding system is proposed for easier interpretation of the Improved SPC charts. Signal ‘1’ shows that the process is out of statistical control while signal ‘0’ shows that the process is in control. The efficiency of the Improved SPC chart is evaluated based on two aspects. First, the number of false alarm generated during normal operating condition. Second, the successful of FDD method to detect the fault and identify the correct process variable as fault cause for each fault situation. It is important to note that the implementation of the Improved SPC method is to detect and diagnose fault in the process when the set point and disturbance unchanged. This means that the controller in APC system are ready to counteracts the effect of disturbance change and set point change as long as the controller functions well. The methodology for developing the FDD systems for the simulated precut distillation column is outlined in Figure 5.5. 115 Identify APC variables and SPC variables for FDD purpose Calculate SPC sampling time, TSPC Generate data for both quality variables and process variables at TSPC interval Standardize data for each variable Determine the correlation coefficient between quality variables and process variables using PCA and PCorrA Calculate the control limits for each control charts 99.73% of data fall within control limits No Yes NOC data is representable Generate the OC data Construct the Improved SPC Performance evaluation of the charts developed FDD algorithm Figure 5.5: The outlined of FDD algorithm CHAPTER 6 RESULTS AND DISCUSSIONS 6.1 Statistical Process Control Sampling Time, TSPC Table 6.1 summarizes the time constant of the control variables for each control loops. Pumparound temperature has the highest time constant, so the autocorrelation plot is based on pumparound cooler model. Table 6.1: Time constant for each control loops Control variable Time constant, θ hr Bottom column liquid level 0.10 Bottom temperature 0.28 Top column pressure 1.00 Side draw tray liquid level 0.20 Pumparound temperature 1.45 Figure 6.1 displays the autocorrelation coefficient of pumparound temperature versus lag. As shown in this figure, the autocorrelation plot touched the upper bound (+2standard deviation) during 216th observations. The autocorrelation plot is based on data collected every 0.01 hr (APC sampling time, TAPC). Therefore, SPC sampling time is equal to 2.16 hrs (216 x 0.01hr). 117 Figure 6.1: The autocorrelation plot 6.2 Normal Operating Condition, NOC Data A set of historical data for both quality variables and process variables was generated from the simulated dynamic distillation column. Each variable was composed of 2000 observations during normal operating condition. The data was sampled every 2.16 hr. 30 data were identified to have values greater than ± 3standard deviations. MATLAB programming was used to remove any outlier that was greater than ± 3standard deviations. Therefore, after removing the outliers each variable consisted of 1970 data observations. The histogram plot as in figure 6.2 shows the distribution of the NOC data. Figure 6.2 shows the distribution of NOC data approximately normal with average at zeros and the data varies between ± 3standard deviations. The skewness and kurtosis in Table 6.2 show the value closed to zero and three respectively. Based on the histogram and the kurtosis and skewness values, it was decided that data were distributed normally and acceptable for further analysis to construct the control charts. Table 6.3 shows the summary of 118 the NOC data. The NOC data is standardized using Equation 5.3 before proceed to the next step in FDD methodology. Figure 6.2: Histogram plot of NOC data Table 6.2: The skewness and kurtosis value of NOC data F Tf Re P Qr xC8 xC9 Skewness 0.044 -0.020 0.021 -0.0057 -0.011 -0.0123 0.0192 Kurtosis 2.748 2.892 2.839 2.836 2.904 2.734 2.734 Table 6.3: Summary of the NOC data F Mean Standard Tf 41.558 483.15 Re P Qr xC8 xC9 10 49.6 2.1576x106 0.13422 0.0246 0.03 0.02 283.93 2.965x10-5 7.77x10-4 0.05 0.093 41.71 483.42 10.087 49.659 2.1584 x106 0.13431 0.02509 41.41 482.88 49.542 2.1568 x106 0.13414 0.0242 deviation Maximum value Minimum value 9.912 119 6.2.1 Correlation coefficient via Principal Component Analysis, PCA PCA is applied to the combined set of quality variables and process variables together in data matrix, X. In principle there is a significant relations between the two set of variables. Analysis of these variables together provides better detection of fault than looking at them separately (Lennox et al., 1999). Singular Value Decomposition, SVD was applied to determine the eigenvalues and eigenvectors of NOC data. Table 6.4 shows the eigenvectors, V obtained. The eigenvalue versus PC plot is shown in Figure 6.3 and a table of eigenvalues is shown in Table 6.5. Table 6.4: Eigenvectors Matrix P1 P2 P3 P4 P5 P6 P7 F -0.3471 -0.6298 0.0122 -0.4557 -0.1321 0.4638 0.2061 Tf 0.2044 -0.4614 0.0240 0.4035 -0.7180 -0.2356 -0.1047 -0.2998 0.4138 -0.5878 0.2325 -0.3943 0.3918 0.1741 -0.2844 -0.0053 0.5771 0.6358 0.1566 0.3625 0.1610 Qr 0.0466 -0.4674 -0.5658 0.4134 0.5355 -0.0090 -0.0384 xc8 -0.5763 -0.0278 -0.0244 0.0174 0.0099 -0.6611 0.4768 xc9 0.5767 0.0070 -0.0008 0.0010 0.0139 0.0852 0.8123 Re P Table 6.5: Eigenvalues, variation explained by each PC and cumulative variation Principal Eigenvalue Component Difference Variation Cumulative (∆= λ i+1- λ i) Explained Variation (%) (%) 42.9371 42.937 PC1 3.0056 PC2 1.0512 1.9544 15.0171 57.954 PC3 1.0097 0.0415 14.4242 72.378 PC4 0.9829 0.0268 14.0414 86.42 PC5 0.9506 0.0323 13.58 100 PC6 1.0314 x 10-7 0.9506 1.47 x 10-6 100 PC7 1.4219 x 10-9 0 2.03 x 10-8 100 120 After the eigenvectors and eigenvalues of the data matrix are obtained, the next step is to determine the number of PCs to be retained. According to the first rule of thumb, eigenvalues versus PC is plotted. From the scree plot in Figure 6.3, a “knee” is clearly occurs at PC number two and PC number six. This indicates that either one or six PCs should be retained in the model. Two PCs were preferred because if more than half number of total Principal Components are required to account for a reasonable variation, then the data cannot be summarized very well with a few components and shall stick to the original data itself (Cliff, 1987). Moreover, if six PCs are retained, this will violate the uniqueness of the PCA method for data reduction. Figure 6.3: Scree plot For the second rule of thumb, the value of eigenvalues is used as an indicator for the number of PC to be retained in the model. The value of eigenvalues was shown in Table 6.5. The difference between PCs is calculated and the biggest changes show the sudden jump in the eigenvalues. The average change in eigenvalue between PCs for this data set is 0.5. The largest changes occur between PCs number one and two (∆=1.9544) while the changes between PCs number five and six is 121 (∆=0.9506). Therefore, the best choice in this case is to select 2 PCs. In this data set, second PC accounts for as much variance as 1.0512 of original variables. For the third rule of thumb, the cumulative variation for the first ith PCs is calculated. As shown in Table 6.5, the first two PCs explained 58% of the total variance, which is about half of the total variance. Important information of original data might be lost if small cumulative variation is used. Moreover, greater eigenvalues usually describe the systematic variation and should be kept in the model and Sharma (1996) suggested retaining eigenvalues greater than one. This is referred to eigenvalues-greater-than-one rule. If two PCs retained were selected, important information from third PCs which has eigenvalues greater than one might lost. Therefore, three PCs retained are preferred which explained 72% of the total variance and has eigenvalues greater than one. In order to select the number of PCs to be retained, either two PCs or three PCs, the correlation coefficients using eigenvectors and eigenvalues captured by two PCs, three PCs and six PCs were determined. Six PCs, which explained 100% of the total variance is used as reference. Table 6.6 lists the result of correlation coefficient calculated using different cumulative variation. Table 6.6: The correlation coefficient using different cumulative variation Two PCs Three PCs Six PCs Variables xc8 xc9 xc8 xc9 xc8 xc9 F 0.6197 -0.6064 0.6194 -0.6064 0.6104 -0.6086 Tf -0.3406 0.3510 -0.3412 0.3510 -0.3411 0.3419 Re 0.5073 -0.5167 0.5217 -0.5162 0.5220 -0.5212 P 0.4927 -0.4930 0.4785 -0.4935 0.4909 -0.4908 Qr -0.0671 0.0774 -0.0532 0.0779 -0.0410 0.0853 The percentage difference for two PCs and three PCs retained compared to six PCs is calculated and the result is recorded in Table 6.7. The overall result obtained indicates that three PCs retained, which capture 72% of the total variance, give the correlation coefficient closer to 100% variation than two PCs. The percentage different for the correlation between quality variables and process 122 variables are small (less than 10%) except for correlation coefficient between reboiler duty and oleic acid compositions. The highest different was about 64% using two PCs while the percentage different was about 30% for three PCs retained. Table 6.7: The percentage difference for two PCs and three PCs retained compared to six PCs. Two PCs Three PCs ⎡ Two PCs - Six PCs ⎤ % ∆= ⎢ ⎥⎦ x100% Six PCs ⎣ ⎡ Three PCs - Six PCs ⎤ % ∆= ⎢ ⎥⎦ x100% Six PCs ⎣ Variables xc8 xc9 xc8 xc9 F 1.5236 -0.3615 1.4744 -0.3615 Tf -0.1466 2.6616 0.0293 2.6616 Re -2.8161 -0.8634 -0.0575 -0.9593 P 0.3667 0.4482 -2.5260 0.5501 Qr 63.6585 -9.2614 29.7561 -8.6753 Number of PCs retained will affect the fault detection and diagnosis results. The correlation coefficient calculated based on the number of PCs retained must close to the normal correlation using 100% variation. Three PCs retained has yielded a result close to 100% variation. Therefore, the correlation coefficient calculated using 3 PCs is used to develop the control charts for FDD. 123 6.2.2 Correlation coefficient via Partial Correlation Analysis, PCorrA The simple correlation coefficients, which show the associations between quality variables and process variables and among process variables, are presented in Table 6.8. Table 6.8: Simple correlation matrix F Tf Re P Qr xc8 xc9 F 1.0000 0.0018 -0.023 0.0029 0.0013 0.6104 -0.6086 Tf 0.0018 1.0000 -0.0379 -0.0129 0.0400 -0.3411 0.3419 Re -0.023 -0.0379 1.0000 -0.0019 -0.0157 0.5220 -0.5212 P 0.0029 -0.0129 -0.0019 1.0000 -0.0288 0.4909 -0.4908 Qr 0.0013 0.0400 -0.0157 -0.0288 1.0000 -0.0410 0.0853 xc8 0.6104 -0.3411 0.5220 0.4909 -0.0510 1.0000 -0.9990 xc9 -0.6086 0.3419 -0.5212 -0.4908 0.0853 -0.9990 1.0000 The correlation coefficient results show that the process variables have opposite correlation with oleic acid composition, xc8 and linoleic acid composition, xc9. Feed flowrate, reflux flowrate, and pumparound flowrate are positively correlated with oleic acid composition but negatively correlated with linoleic acid composition. Feed flowrate is highly correlated with quality variables while reboiler duty shows less correlation with quality variables. Partial correlation is calculated between one of the process variables with each quality variables while controlling for other process variables. The order of the partial correlation depends on the number of process variables controlled for. Fourth order partial correlation is calculated and the results are shown in Table 6.9. The result obtained as shown in Table 6.9 shows that process variables are highly correlated with the quality variables after the effect of other process variables has been removed. 124 Table 6.9: The correlation coefficient result using PCorrA 6.3 F Tf Re P Qr xc8 0.999 -0.999 0.999 0.999 -0.998 xc9 -0.999 0.999 -0.999 -0.999 0.999 Parameters to Design Improved Statistical Process Control chart As has been mentioned in Chapter 5, three standard deviation limits is used to construct the control limits for Shewhart individual for a few reason. In order to calculate Shewhart range, the size of the window used is two. The constant used to develop the control limits for Shewhart range using window size two is shown in Table 6.10. Table 6.10: Constant to design Shewhart range chart Constant Value D.'001 , 4.12 D.'999 0 A2 1.02 There are two important parameters to be selected to design EWMA chart i.e. width of the control limit, L and the weighting factor, β . For this reason, one fault case was introduced into the process. Pumparound flowrate was increased from 49.6 kmol/hr to 49.9 kmol/hr, which cause +0.6% deviation from target value. In order to maintain the sensitivity of EWMA chart to detect small shift, the weighting factor used starting from the smallest which is 0.5. This value was increased until the EWMA chart can detect and diagnose the known fault. The EWMA statistics was calculated and the chart was plotted for each parameters. Table 6.11 summarize the result obtained using various EWMA parameters. Appendix B provides the EWMA chart for each parameter. As shown in Table 6.11, EWMA chart using PCorrA technique can identify the fault causes using weighting factor 0.05, 0.10, 0.20, 0.25, and 0.4 but the chart can only detect the fault using weighting factor 0.4. On the 125 other hand, EWMA chart using PCA can only diagnose and detect fault using weighting factor 0.4. Therefore, width of control limit 3.054 and weighting factor 0.4 are selected which give the best outcome in fault detection and diagnosis using both techniques i.e PCorrA and PCA. Table 6.11: FDD result using different L and β to design EWMA chart Weighting Width of the Fault detection and diagnosis result factor, control limit, β L FDetect FDiagnose FDetect FDiagnose 0.05 2.615 Undetected Diagnosed Undetected Undiagnosed 0.10 2.814 Undetected Diagnosed Undetected Undiagnosed 0.20 2.962 Undetected Diagnosed Undetected Undiagnosed 0.25 2.998 Undetected Diagnosed Undetected Undiagnosed 0.40 3.054 Detected Diagnosed Detected Diagnosed PCorrA PCA To design the Moving Average and Moving Range, MAMR chart, the size of moving window need to be decided. For this reason, one fault case was introduced into the process. Feed flowrate was increased to 51 kmol/hr (+2.8% from the target value). MAMR chart using three different window sizes i.e 3, 4 and 5 was used to detect and diagnose this fault case. Larger window size may cause delay in detecting and diagnosing the fault in the process. For each successive group, earlier result is discarded and replaced by the latest. Tables 6.12 and 6.13 show the outcome of FDD using both multivariate techniques, PCA and PCorrA with different window size. MAMR charts for three different window sizes are shown in Appendix C. Fault situation is detected or diagnosed when MA or MR statistic greater than the control limits. It is also considered fault when both MA and MR statistic greater than control limits. Results shown in Tables 6.12 and 6.13 indicate that Moving Average chart have delay in detecting fault using all window sizes. Moving Average chart for quality variables exceed the control limits one points later using window size three, two points later using window size four and three points later using window size five after the fault occurs. However, both MA and MR chart with window size three can 126 detect and diagnose known fault using PCA and PCorrA. Therefore, window size three is selected to design MAMR chart for fault detection and diagnosis. Table 6.12: FDD result using PCA with different window size to design MAMR chart Window Fault detection and diagnosis result size 3 MR MA FDetect FDiagnose Remarks FDetect FDiagnose Remarks Detected Diagnosed - Detected Diagnosed One point delay 4 Detected Diagnosed - Detected Undiagnosed Two points delay 5 Detected Undiagnosed - Detected Undiagnosed Three points delay Table 6.13: FDD result using PCorrA with different window size to design MAMR chart Window Fault detection and diagnosis result size MR FDetect 3 FDiagnose Detected Diagnosed MA Remarks FDetect FDiagnose Remarks - Detected Diagnosed One point delay 4 Detected Diagnosed - Detected Diagnosed Two points delay 5 Detected Diagnosed - Detected Diagnosed Three points delay 127 6.4 Fault Detection and Diagnosis Method Performance The performance of the FDD method was analyzed from two different perspectives. First, the performance was analyzed with respect to the number of false alarm generated during normal operating condition using different type of Improved SPC chart. Second, the efficiency of the FDD method with respect to fault detection and diagnosis during faulty condition was analyzed. Ideally, the monitoring data should only affect by the fault during fault detection and diagnosis, FDD process. However, the presence of disturbances and noises may cause an error and thus interfering the monitoring results. Therefore, the generating data from the process for monitoring must be maximally unaffected by these nuisance inputs so that it is robust in the face of disturbances and noises. 100 data were generated without any known fault inserted into the process for false alarm analysis. 6.4.1 False Alarm Analysis False alarm occurred when Improved SPC chart for quality variables give signal out of statistical control whereas the process variables are in statistical control. Tables 6.14 and 6.15 show number of signal given by the Improved SPC charts during normal operating condition for oleic acid composition, xc8 and linoleic acid composition, xc9 respectively. One false alarm was detected using MAMR. Quality variables charts give signal out of statistical control during observation 27th. There is no false alarm found using Shewhart and EWMA. 128 Table 6.14: False alarm result corresponding to oleic acid composition Improved Number of false alarm SPC Process variables Total Quality chart variables F Tf Re P Qr xc8 Shewhart 0 0 0 0 0 0 0 EWMA 0 0 0 0 0 0 0 MAMR 0 0 0 0 0 1 1 Total 0 0 0 0 0 1 Table 6.15: False alarm result corresponding to linoleic acid composition ISPC Number of false alarm chart Process variables Total Quality variables F Tf Re P Qr xc9 Shewhart 0 0 0 0 0 0 0 EWMA 0 0 0 0 0 0 0 MAMR 0 0 0 0 0 1 1 Total 0 0 0 0 0 1 The occurrence of false alarm may cause wrong action taken to the process which operates at normal condition. 129 6.4.2 Fault Detection Efficiency Results and Results Discussions Various kinds of faults were introduced to the simulated precut column at 80 locations. Number of single and multiple faults involving five selected process variables i.e feed flowrate, F, feed temperature, Tf, reflux flowrate Re, pumparound flowrate, P and reboiler heat duty, Qr are listed in Table 6.16. Table 6.16: The known fault causes Fault Single cause fault Multiple faults Total As part As part fault of 2 of 3 causes causes causes F 10 7 9 26 Tf 10 7 9 26 Re 13 8 6 27 P 11 8 6 25 Qr 6 0 15 21 Total 50 30 45 125 The quality variables i.e oleic acid composition, xc8 and linoleic acid composition, xc9 were monitored to identify the Out of Control (OC) condition. Number of OC condition due to single faults and multiple faults detected by each Improved SPC chart are shown in Table 6.17. 130 Table 6.17: The known out of control, OC conditions Improved Quality Due to Due to multiple SPC chart variable single faults Total OC fault 2 causes 3 causes 50 15 15 80 50 15 15 80 33 13 14 60 34 15 15 62 24 13 14 51 28 13 15 56 condition Shewhart Oleic acid, xc8 Linoleic acid, xc9 EWMA Oleic acid, xc8 Linoleic acid, xc9 MAMR Oleic acid, xc8 Linoleic acid, xc9 Based on the result obtained in Table 6.17, the efficiency of each Improved SPC chart was calculated to determine the performance of each chart to detect faults in the process. The computed values of fault detection efficiency are shown in Figure 6.4. The results obtained shows that all known faults for both quality variables were detected by Shewhart chart. EWMA chart can detect 75% of the known faults occurs on oleic acid composition and 77.5% of the known faults occurs on linoleic acid composition. MAMR chart has the lowest percentage, which are 63.8% and 70% fault detected on oleic acid composition and linoleic acid composition respectively. This shows that Shewhart chart performance in detecting faults is better than EWMA and MAMR. Shewhart chart was plotted using 100% current data but MAMR statistic was calculated using window size three consist of 33% current data and 67% previous data. This caused detection delay using MAMR. EWMA statistic was calculated using weighting factor, β = 0.4. This gives the EWMA statistic consists of 40% current data 60% previous data. Based on the percentage of current data and previous data included during calculating the chart statistics, it shows that fault detection efficiency decreases as the percentage of previous data involved in calculating the statistics value for each charts increases. 131 FDetect efficiency (%) 100 80 60 40 20 0 Shewhart EWMA MAMR xc8 100 75 63.8 xc9 100 77.5 70 ISPC chart xc8 xc9 Figure 6.4: Fault detection efficiency using different Improved SPC chart Out of 80 faults inserted into the process, 17.5% fault cases identified fall in region one (mean ± 4standard deviation), 25% fault cases fall in region two (mean ± 5standard deviation) and 57.5% fault cases fall in region three (beyond mean ± 5standard deviation). Figure 6.5 shows the result obtained for percentage of faults detected in three different regions. Shewhart chart was able to detect all known faults in all regions. EWMA chart was able to detect 2.5% fault cases in region one, 16.3% fault cases in region two and 56.3% fault cases in region three while MAMR can only detect fault in region two and region three. This indicates that Shewhart chart was sensitive with small shift in the process while EWMA chart was sensitive with moderate change in the process. MAMR chart started give signal OC condition for large deviation occurs in the process. FDetect efficiency (%) 132 80.0 60.0 40.0 20.0 0.0 Shewhart EWMA MAMR Region 1 17.5 2.5 0.0 Region 2 25.0 16.3 8.8 Region 3 57.5 56.3 55.0 OC region Region 1 Region 2 Region 3 Figure 6.5: Percentage of faults occurs in different region detected by each Improved SPC chart 6.4.3 Fault Diagnosis Efficiency Results and Results Discussions The efficiency of Improved SPC chart for fault diagnosis depends on four situations: 1. Exact diagnosed fault cause situation 2. Partially diagnosed fault cause situation 3. Over diagnosed fault cause situation 4. Undiagnosed fault cause situation Improved SPC chart successfully diagnose fault if it can identify exactly fault cause situations. If Improved SPC charts partially diagnosed, over diagnosed and undiagnosed fault cause situation, then Improved SPC chart is considered failed to diagnose fault causes. Improved SPC chart was constructed using process variables data to diagnose faults. Discussions on the efficiency of Improved SPC charts to diagnose faults are related with multivariate analysis technique used to calculate the control limits for process variables. Figure 6.6 and Figure 6.7 provide fault diagnosis results for oleic acid composition using PCA and PCorrA techniques respectively while Figure 6.8 and Figure 6.9 for linoleic acid composition using PCA and PCorrA 133 techniques respectively. Number of exact fault diagnosed, partially fault diagnosed, over fault diagnosed and undiagnosed fault for single and multiple faults are listed detailed in Appendix D. From the results obtained in Figures 6.6 and 6.8, show that Shewhart chart using PCA technique was not able to diagnose all known faults in the process. Out of 80 fault causes insert into the process, Shewhart chart was able to identify only 87.5% fault causes which contribute out of statistical control to oleic acid composition and linoleic acid composition. However, the number of exact fault diagnosed by Shewhart is higher than EWMA and MAMR. Fault diagnosis using PCorrA technique as shown in Figures 6.7 and 6.9 results 100% in diagnosing exact fault causes using Shewhart chart. As shown in Figures 6.6 to 6.9, number of exact fault diagnosed by EWMA chart and MAMR chart using PCorrA technique was FDiagnose efficiency (%) higher than PCA technique. 100 50 0 Shewhart EWMA MAMR Exact diagnosed 87.5 55 35 Partial diagnosed 6.25 11.25 18.75 Over diagnosed 0 0 0 6.25 33.75 46.25 Undiagnosed ISPC chart Exact diagnosed Partial diagnosed Over diagnosed Undiagnosed Figure 6.6: Fault diagnosis situation using PCA for oleic acid composition 134 FDiagnose efficiency (%) 150 100 50 0 Shewhart EWMA MAMR Exact diagnosed 100 75 63.75 Partial diagnosed 0 0 0 Over diagnosed 0 0 0 Undiagnosed 0 25 36.25 ISPC chart Exact diagnosed Partial diagnosed Over diagnosed Undiagnosed FDiagnose efficiency (%) Figure 6.7: Fault diagnosis situation using PCorrA for oleic acid composition 100 50 0 Shewhart EWMA MAMR Exact diagnosed 87.5 53.75 36.25 Partial diagnosed 6.25 13.75 21.25 Over diagnosed 0 0 0 6.25 32.5 42.5 Undiagnosed ISPC chart Exact diagnosed Partial diagnosed Over diagnosed Undiagnosed Figure 6.8: Fault diagnosis situation using PCA for linoleic acid composition FDiagnose efficiency (%) 135 150 100 50 0 Shewhart EWMA MAMR Exact diagnosed 100 77.5 70 Partial diagnosed 0 0 0 Over diagnosed 0 0 0 Undiagnosed 0 22.5 30 ISPC chart Exact diagnosed Partial diagnosed Over diagnosed Undiagnosed Figure 6.9: Fault diagnosis situation using PCorrA for linoleic acid composition The performance of each Improved SPC chart to diagnose fault was affected by the multivariate technique used to determine the control limit for process variables. The calculated control limits for Improved SPC charts using PCorrA technique performed better than PCA technique because the correlation coefficients developed by this technique are closer to the actual value of the correlation coefficients, representing the correlation between the selected process variables with the quality variables of interest. Some of the process variables were kept constant during determining the correlation between selected quality variable with one of the process variables. By doing so, the correlation coefficient developed using PCorrA was not affected by the variance of other process variables, which influence the actual correlation between variables. The correlation coefficients obtained using PCorrA for each process variables are close to one. Each Improved SPC charts either exact diagnosed or undiagnosed known fault in the process. As shown in Figures 6.7 and 6.9, there is no partially diagnosed fault situation was identified. The PCA method calculates the cross-correlation between variables (interaction between variables) when determining the correlation coefficients between the process variables and quality variables. The correlation coefficient obtained using PCA is smaller than PCorrA. Small correlation coefficient, widen the control limits of Improved SPC chart. As can be seen in Figure 6.6 to 6.9, number of partially diagnosed and undiagnosed faults using PCA is higher than PCorrA. For 136 example as in Figures 6.6 and 6.7, EWMA chart failed to diagnose 45% fault causes using PCA while 25% fault causes failed to diagnose using PCorrA. The correlation coefficient calculated using PCA varies among the process variables. This caused some of the process variables gave signal OC condition while others did not when multiple faults involved. The results using PCA technique caused some multiple faults cases are partially diagnosed. However, PCA technique which used 72% variation of the original data is better than normal correlation, which used 100% variation of original data. 6.5 Chapter Summary A set of NOC data was generated from the simulated dynamic distillation column to observe two quality variables of interest and five selected process variables, each one composed of 1970 observations. NOC data was collected during normal operating condition with quality variables closed to target value. In this study, a sampling frequency at SPC sampling time that was 2.16 hrs enabled such an assumption of zero autocorrelation to be satisfied. The NOC data then was used to develop the correlation coefficient using PCA and PCorrA. Three rules of thumbs are used to determine number of PCs to be retained. Three Principal Components, PCs retained used in this study, which captured 72% variation of the original data when developing the correlation coefficient using PCA. Correlation coefficient calculated using three PCs retained give result close to normal correlation which used 100% variation of original data. Fourth order partial correlation is used to calculate the correlation coefficients using PCorrA and the result obtained shows that selected process variable are highly correlated with the quality variable after the effect of other process variables has been removed. In order to design Improved SPC chart, some parameters need to be selected to determine the control limits using NOC data. Based on theoretical study from literature, three standard deviation limits is used to construct the control limits for 137 Shewhart individual while Shewhart range is calculated using window size of two. Sensitivity test was conducted to select parameters for EWMA chart MAMR chart. The outcome from the test conducted give result; width of control limit equal to 3.054 and weighting factor equal to 0.4 are selected which give the best outcome in fault detection and diagnosis using both techniques i.e PCorrA and PCA for EWMA chart. The sensitivity test conducted for MAMR shows that MAMR was able to detect and diagnose fault using window size three for both techniques i.e PCorrA and PCA. False alarm analysis was performed before faulty condition introduced into the process to ensure the result obtained during faulty condition was not affected by false alarm error. One false alarm case was identified for both quality variables using MAMR chart. The performance of Improved SPC chart to detect and diagnose faults during faulty condition was evaluated. For this purpose, 80 faults location has been identified to insert various kinds of faults to the simulated precut column. Five selected process variables with single and multiple faults were introduced into the process. The results obtained show that Shewhart chart perform better than EWMA and MAMR in detecting and diagnosing faults. Both multivariate analysis technique i.e PCA and PCorrA give significant effect to the efficiency of each Improved SPC chart to diagnose fault causes in the process. The correlation coefficients developed using the PCorrA was better than PCA in representing the correlation between the selected process variables and the quality variables of interest which provided high efficiency in fault diagnosis. CHAPTER 7 CONCLUSIONS AND RECOMMENDATIONS 7.1 Introduction Process monitoring, early fault detection and diagnosis are important in chemical industries for safety, maintaining product quality and reducing cost. A procedure for Fault Detection and Diagnosis, FDD has been described. A new approach known as Improved SPC charts is introduced to detect faults in the process and identify variables, which cause the faults. The proposed method is demonstrated on a simulated precut multicomponent distillation column. Multivariate analysis technique i.e Principal Component Analysis, PCA and Partial Correlation Analysis, PCorrA are utilized to determine the correlation coefficient between quality variables and process variables. The correlation coefficients are used to translate the control limits of SPC charts from quality variables into process variables. This correlation is adopted in conventional SPC charts to construct the control limits for process variables. The quality variables and some of the process variables, which are correlated between the former and the later variables are monitored. The quality variables data is incorporated in the Improved SPC chart during the faulty condition for fault detection purpose, while the process variables, which has been correlated with the quality variables is used for fault diagnosis. The Improve SPC charts applied is not only for quality variables but also for process variables. 139 7.2 Conclusions of Improved SPC framework The proposed Improved SPC monitoring framework consist of the following elements to perform false alarm analysis, fault detection and fault diagnosis: 1) False alarm analysis was conducted using normal operating condition data to identify outliers. 2) Fault detection tool, which monitors the quality variables was used to detect faulty condition in the process. 3) Fault diagnosis tools, which based on setting up the control limits for process variables using multivariate analysis technique was used to identify the causes of faulty condition. 4) Fault coding system, which uses ‘1’ to denote out of control limits situations while uses ‘0’ to denote in control situations, was used to avoid of using too many ISPC charts. The methodology of the proposed method started with variables selections for process monitoring. The variables involved are divided into two categories i.e APC variables and SPC variables. Some of the selected SPC variables are part of APC variables. Even though the simulated precut column has been installed with controllers to keep the process on target, there is a probability of faults to occur due to line blockage, line leakage, sensor faults, valve sticking and controller fault. It is very important to highlight that the ISPC is implemented for the process change that is not due to set point change and disturbance change since the controllers installed are ready to counteract the effect of these changes. Two sets of data i.e NOC data and OC data are generated from the simulated precut column at TSPC interval. NOC data was subject to common cause variation, which creates random behavior and used to construct the control limits for Improved SPC chart and to develop the correlation coefficient via multivariate analysis technique. On the other hand, OC data was subject to assignable cause variation due to faulty condition occur in the process. The performance of the Improved SPC chart is evaluated based on these two sets of data. Both NOC data and OC data are 140 standardized before further analysis since the variables have different units and wide range of data measurements. The proposed Improved SPC framework is analyzed in two different perspectives. First, the number of false alarm generated during normal operating condition. Second, the successful of the proposed method to detect the fault and identify the correct process variable as fault cause for each fault situation. Based on the result obtained, there is no false alarm given by Shewhart and EWMA charts while MAMR chart has one false alarm case. Small number of false alarm produced will result less risk of taking wrong action to the normal process. Result on fault detection efficiency shows that Shewhart chart was able to detect all known faults in the process. EWMA chart can detect 75% of the known faults occurs on oleic acid composition, xc8 and 77.5% of the known faults occurs on linoleic acid composition, xc9. MAMR chart has the lowest percentage of fault detection efficiency, that is 63.8% and 70% fault detected on oleic acid composition and linoleic acid composition respectively. Multivariate analysis technique used to construct the control limits for process variables gives significant effect to the efficiency of Improved SPC chart to identify exact fault causes. Implementation of PCorrA provides 100% fault diagnosis efficiency using Shewhart chart while PCA provides 87.5 % fault diagnosis efficiency. EWMA and MAMR chart using PCorrA also give better performance compared to PCA in terms of percentage of exact fault diagnosed. PCorrA technique performed better than PCA technique because the correlation coefficients developed by this technique are closer to the actual value of the correlation coefficients, representing the correlation between the selected process variables with the quality variables of interest. All Improved SPC charts i.e Shewhart, EWMA and MAMR charts are able to detect faults in the process. However, Shewhart chart gives the best performance in detecting known faults in the process. While for fault diagnosis, implementation of PCorrA in Improved SPC chart performed better than PCA. Among these charts, Shewhart charts give the highest efficiency in exactly identifying fault causes using 141 PCorrA. The simplicity of the presentation of the Improved SPC charts based on coding system used makes these charts easier to analyze and interpret. As a conclusion, the research objective on studying the performance of Improved SPC charts to detect faults and utilizing the multivariate analysis techniques to develop the correlation between quality variables and process variables for fault diagnosis have been achieved. 7.3 Recommendations to Future Research In order to enhance the applicability of the proposed Improved SPC method, the following suggestions are proposed: i. Future research activities can focus to apply the proposed approach plant-widely, either in a simulation plant, pilot plant or real plant. i. The developed FDD tools for incoming research activity can be applied on a multiple unit operations case study. LIST OF REFERENCES Alwan, L. C. (2000). Statistical process analysis. New York: McGraw Hill. Bakshi, B. R. (1998). Multiscale PCA with application to multivariate statistical process monitoring. American Institute of Chemical Engineering Journal. 44: 1596-1610. Bakshi, B. R. (1999). Multiscale analysis and modeling using wavelets. Journal of Chemometrics. 13: 415–434. Bower, K. M. (2000). Using exponentially weighted moving average (EWMA) charts. Minitab Inc., State College, P.A., USA. Chen, Q., Kruger, U., Meronk, M. and Leung, A.Y.T. (2004). Synthesis of T2 and Q statistic for process monitoring. Control Engineering Practice. 12: 745-755. Chen, G., McAvoy, T. J. and Piovoso, M. J. (1997). A multivariate statistical controller for on line quality improvement. Journal of Process Control. 8(2): 139-149. Chen, K. H. (2001). Data-Rich Multivariate Detection and Diagnosis Using Eigenspace Analysis. PhD Thesis. Massachusetts Institute Of Technology. Chen, K. H., Boning, D. S. and Welsch, R. E. (2001). Multivariate statistical process control and signature analysis using eigenfactor detection method. The 33rd Symposium on the Interface of Computer Science and Statistics. June 2001. Costa Mesa, CA. 1996. 1-20. 143 Chiang, L. H., Russell, E. L. and Braatz, R. D. (2000). Fault diagnosis in chemical processes using Fisher discriminant analysis, discriminant partial least squares, and principal component analysis. Chemometrics and Intelligent Laboratory Systems. 50: 243–252 Chiang, L. H., and Braatz, R. D. (2003). Process monitoring using causal map and multivariate statistics: fault detection and identification. Chemometrics and Intelligent Laboratory Systems. 65: 159– 178. Cliff, N. (1987). Analyzing Multivariate Data. Harcourt Brace Jovanovich Publishers. Dash, S. and Venkatasubramanian, V. (2000). Challenges in the industrial applications of fault diagnostic systems. Computers and Chemical Engineering. 24: 785-791. Daszykowski, M., Walczak, B. and Massart, D. L. (2003). Projection methods in chemistry. Chemometrics and Intelligent Laboratory Systems. 65: 97–112. DeVor, R. E., Chang, T. H. and Sutherland, J. W. (1992). Statistical quality design and control: contemporary concepts and methods. New York: Macmillan. DeVries, A. and Conlin, B. J. (2003). Design and performance of statistical process control charts applied to estrous detection efficiency. Journal of Dairy Science. 86: 1970–1984. Doty, L. A. (1996). Statistical process control. 2nd Edition. New York: Industrial press Inc. Duchesne, C., Kourti, T., and MacGregor, J. F. (2002). Multivariate SPC for startups and grade transition. American Institute of Chemical Engineering Journal. 48 (12): 2890-2901. 144 Finn, J.D. (1974). A general model for multivariate analysis. USA: Holt, Rinehart and Winston. Foucart, T. (2000). A decision rule for discarding principal components in regression. Statistical Planning and Inference. 89: 187-195. Frank, P. M., and Ding, X. (1997). Survey of robust residual generation and evaluation methods in observer based fault detection systems. Journal of process control. 7(6): 403-424. Geankoplis, C. J. (1993). Transport processes and unit operations. 3rd Edition. New Jersey: Prentice-Hall. Gertler, J. J. (1998). Fault detection and diagnosis in engineering systems. New York: Marcel Dekker. Goulding, P.R., Lennox, B., Sandoz, D.J., Smith, K. and Marjanovic, O. (2000). Fault detection in continuous processes using multivariate statistical methods. International Journal of Systems Science. 31(11): 1459-1471. Hashimoto, K., Shimizu, E., Komatsu, N., Nakazato, M., Okamura, N., Watanabe, H., Kumakiri, C., Shinoda, N., Okada, S., Takei, N., and Iyo, M. (2003). Increased levels of serum basic fibroblast growth factor in schizophrenia. Psychiatry Research. 120: 211–218. Himmelblau, D.M. (1978). Fault detection and diagnosis in chemical and petrochemical processes. Amsterdam: Elsevier Press. Hurowitz, S., Anderson, J., Duvall, M. and Riggs, J. B. (2003). Distillation control configuration selection. Journal of Process Control. 13: 357–362. Hunter, J.S (1989). One point plot equivalent to the Shewhart chart with Western Electris Rules. In: Montgomery, D. C. Introduction to Statistical Quality Control. 3rd Edition. Canada: John Wiley & Sons. 338. 145 Hwang, D. and Han, C. (1999). Real time monitoring for a process with multiple operating modes. Control Engineering Practice. 7: 891 – 902. Isermann, R. (1997). Supervision, Fault Detection and Fault Diagnosis Methods – An Introduction. Control Engineering Practice. 5(5): 639 – 652. Ibrahim, K.A. (1997). Application of Partial Correlation Analysis in Active Statistical Process Control. Proceedings of Regional Symposium of Chemical Engineering. October 13-15, 1997. Johor: UTM and IEM. 1997. 434 – 439. Jackson, J. E. (1991). A User’s Guide to Principal Components. USA: John Wiley & Sons. JiJi, R. D., Hammond, M. H., Williams, F. W. and Rose-Pehrsson, S. L. (2003). Multivariate statistical process control for continuous monitoring of networked early warning fire detection (EWFD) systems. Sensors and Actuators B. 93: 107– 116. Kahn, M. G., Bailey, T. C., Steib, S. A., Fraser, V. J. and Dunagan, W. C. (1996). SPC for expert system performance monitoring. Journal of American Medical Informatics Association. 3(4): 258-269. Kano, M., Nagao, K., Hasebe, S., Hashimoto, I., Ohno, H., Strauss, R. and Bakshi, B. R. (2000a). Comparison of statistical process monitoring methods: application to the Eastman challenge problem. Computers and Chemical Engineering. 24: 175-181. Kano, M., Segawa, T., Ohno, H., Hasebe, S. and Hashimoto, I. (2000b). Identification of Fault Situations by Using Historical Data Sets. Proceedings of International Symposium on Design, Operation and Control of Next Generation Chemical Plants (PSE Asia 2000). December 6-8. Kyoto, Japan. 345 – 350. 146 Kano, M., Ohno, H., Hasebe, S., Hashimoto, I., Strauss, R. and Bakshi, B. R. (2000c). Contribution plots for fault identification based on dissimilarity of process data. American Institute of Chemical Engineering Annual Meeting. November 12-17. Los Angeles, CA. Unpublished. Kano, M., Nagao, K., Ohno, H., Hasebe, S., Hashimoto, I. (2000d). Dissimilarity of Process Data for Statistical Process Monitoring. Proceedings of IFAC Symposium on Advanced Control of Chemical Processes (ADCHEM). June 14-16. Pisa, Italy: IFAC, 231-236. Kano, M., Hasebe, S., Hashimoto and Ohno, H. (2001a). Fault Detection and Identification Based on Dissimilarity of Process Data. European Control Conference (ECC). September 4-7. Porto, Portugal. 1888-1893. Kano, M., Hasebe, S., Hashimoto, I., Ohno. (2001b). A new multivariate process monitoring method using principal component analysis. Computers and Chemical Engineering. 25: 1103-1113. Kano, M., Nagao, K., Hasebe, S., Hashimoto, I., Ohno, H., Strauss, R. and Bakshi, B. R. (2002). Comparison of multivariate statistical process monitoring methods with application to the Eastman challenge problem. Computers and Chemical Engineering. 26: 161-174. King, C. J. (1980). Separation processes. 2nd Edition. New York: McGraw Hill. Kleinbaum, D. G., Kupper, L. L. and Muller, K. E. (1987). Applied regression analysis and other multivariable methods. Boston: PWS-KENT Publishing Company. Kourti, T., Lee, J. and Macgregor, J. F. (1996). Experiences with industrial applications of projection methods for multivariate statistical process control. Computers chemical engineering. 20: 745-750. 147 Kourti, T. and McGregor, J.F. (1996). Multivariate SPC methods for process and product monitoring. Journal of Quality Technology. 1996. 28 (4): 409–428. Kresta, J., MacGregor, J.F and Marlin, T.E. (1991). Multivariate statistical monitoring of process operating performance. Canada journal chemical engineering. 69: 35-47. Leger, R. P., Garland, Wm. J.and Poehlman, W. F. S. (1998). Fault detection and diagnosis using statistical control charts and artificial neural networks. Artificial Intelligence in Engineering. 12: 35-47. Lennox, B., Goulding, P.R. and Sandoz, D.J. (1999). Analysis of multivariate statistical methods for continuous systems. Computers and Chemical Engineering. 23: 207-210. Lennox, B., Montague, G. A., Hiden, H. G., Kornfeld, G., and Goulding, P. R. (2001). Application of multivariate statistical process control to batch operations. Biotechnology and Bioengineering. 74(2): 125-135. Loong, L. H., and Ibrahim, K. I. (2002). Improved Multivariable Statistical Process Control (MSPC) for Chemical Process Fault Detection and Diagnosis (PFDD)– Cross-Variable Correlation Approach. Regional Symposium on Chemical Engineering, RSCE/SOMChem. 1637-1644. Lopes, J. A. and Menezes, J. C. (1998). Faster Development of Fermentation Processes. Early Stages Process Diagnosis. American Institute of Chemical Engineering Journal. 94( 320): 391-396. Luyben, W. L. (1963). Process Modeling, Simulation, and Control for Chemical Engineer. USA: McGraw Hill. MacGregor, J. F., Jaeckle, C., Kiparissides, C. and Koutoudi, M. (1994). Process Monitoring and Diagnosis by Multi-block Methods. American Institute of Chemical Engineering Journal. 40: 826–838. 148 MacGregor, J.F. and Kourti, T. (1995). Statistical process control of multivariate processes. Control engineering practice. 3(3): 403-414. Marriott, F. H. C. (1974). The Interpretation of Multiple Observations. N.Y.: Academic Press. Martin, E.B., and Morris, A.J. (1996). Non-parametric confidence bounds for process performance monitoring charts. Journal process control. 6(6): 349-358. Martin, E. B., Morris, A.J., Papazoglou, M. C., and Kiparissides, C. (1996). Batch process monitoring for consistent production. Computers Chemical Engineering. 20: 599-604. Martin, E. B., Morris, A.J., and Kiparissides, C. (1999). Manufacturing performance enhancement through multivariate statistical process control. Annual Reviews in Control. 23: 35-44. Mason, R. L. and Young, J. C. (2002). Multivariate statistical process control with industrial applications. U.S.A: Asa-Siam. Miles, J. and Shevlin, M. (2001). Applying regression and correlation: A guide for students and researchers. London: Sage Publication. Miletic, I., Quinn, S., Dudzic, M., Vaculik, V., and Champagne, M. (2004). Review:An industrial perspective on implementing on-line applications of multivariate statistics. Journal of Process Control. 14: 821-836. Misra, M., Yue, H. H., Qin, S. J., and Ling, C. (2002). Multivariate process monitoring and fault diagnosis by multi-scale PCA. Computers and Chemical Engineering. 26: 1281–1293. Montgomery, D. C. (1996). Introduction to Statistical Quality Control. 3rd Edition. Canada: John Wiley & Sons. 149 Nash, J.C. and Lefkovitch, L.P. (1976). Principal component and regression by SVD on a small computer. Apply Statistics. 25(3): 210-216. Nomikos, P. and MacGregor, J. F. (1994). Monitoring of batch processes using Multi-Way Principal Component Analysis. American Institute of Chemical Engineering Journal. 40 (8): 1361 – 1375. Nomikos, P. (1996). Detection and diagnosis of abnormal batch operations based on multi-way principal component analysis. ISA Transactions. 35:25-266. Nong Y., Qiang C., Syed Masum E., and Kyutae N. (2000). Chi-square Statistical Profiling for Anomaly Detection. Proceedings of the 2000 IEEE Workshop on Information Assurance and Security United States Military Academy. June 6-7, 2000. West Point, NY: IEEE. 2000. 187-193. Oakland, J. S. (1996). Statistical Process Control. 3rd Edition. Oxford: Butterworth Heinemann. Ogunnaike, B.A. and Ray, W.H. (1994). Process Dynamics, Modelling and Control. Oxford University Press, Inc. Onat, A., Sansoy, V. and Uysal, O. (1999). Waist circumference and waist-to-hip ratio in Turkish adults: interrelation with other risk factors and association with cardiovascular disease. International Journal of Cardiology. 70: 43–50. Patton, R., Frank, P. and Clark, R. (1989). Fault diagnosis in dynamic systems: Theory and application. UK: Prentice Hall. Quemerais, B., Cossa, D., Rondeau, B., Pham, T. T. and Fortin, B. (1998). Mercury distribution in relation to iron and manganese in the waters of the St. Lawrence river. The Science of the Total Environment. 213: 193-201. Quesenberry, C. P. (1997). Statistical process control methods for quality improvement. New York: John Wiley. 150 Ralston, P., DePuy, G., and Graham, H. (2001). Computer-based monitoring and fault diagnosis: a chemical process case study. ISA transaction. 40: 85-98. Reid, R.C., Prausnitz, J. M. and Poling, B. E. (1987). The properties of gases and liquids. 4th Edition. New York: MacGraw Hill. Ruiz, D., Nougues, J. M., and Puigjaner, L. (2001). Fault diagnosis support system for complex chemical plants. Computers and Chemical Engineering. 25: 151160. Santen, A., Koot, G. L. M., and Zullo, L. C. (1997). Statistical data analysis of a chemical plant. Computers Chemical Engineering. 21: 112-1129. Seborg, D. E., Thomas, C. and Tetsuya, W. (1996). Principal component analysis applied to process monitoring of an industrial distillation column. 13th World IFAC Congress. 1996. San Francisco. Sharma, S. (1996). Applied Multivariate Techniques. USA: John Wiley & Sons. Shirley, M. D. F., Rushton, S. P., Smith, G. C., South, A. B., and Lurz, P. W. W. (2003). Investigating the spatial dynamics of bovine tuberculosis in badger populations: evaluating an individual-based simulation model. Ecological Modelling. 167: 139–157. Shinskey, F. G. (1984). Distillation control for productivity and energy conservation. 2nd Edition. New York: MacGraw Hill. Simoglou, A., Martin, E. B., and Morris, A. J. (2000). Multivariate statistical process control of an industrial fluidised-bed reactor. Control Engineering Practice. 8: 893-909. Singh. R., and Gilbreath, G. (2002). A real time information system for multivariate statistical process control. International Jurnal of Production Economics. 75: 161-172. 151 Smith, J. M., Van Ness, H. C., and Abbott, M. M. (1996). Introduction to chemical engineering thermodynamics. 5th Edition. New York: MacGraw Hill. Souza, A. M., Samohyl, R. W., and Malave, C. O. (2004). Multivariate feed back control: an application in a productive process. Computers & Industrial Engineering. 46: 837–850. Stephanopoulos, G. (1984). Chemical Process Control: An Introduction to Theory and Practice. New Jersey: Prentice Hall. Srivastava, M.S. (2002). Methods of Multivariate Statistics. New York: John Wiley and Sons. Summers, D. C. (2000). Quality. 2nd Edition. Upper Saddle River, N.J.: Prentice Hall. Thorndike, R. M. (1976). Correlational procedures for research. New York: Gardner Press. Tsumoto, S., Hirano, S., Yasuda, A. and Tsumoto, K. (2002). Analysis of aminoacid sequences by statistical technique. Information Sciences. 145: 205–214. Tubb, C., Omerdic, E. (2001). Fault Detection and Handling. Improves technical report FD 001. 1-8. Upadhyaya, B.R., Zhao, K., and Lu, B. (2003). Fault monitoring of nuclear power plant sensors and field devices. Progress in Nuclear Energy. 43: 337-342. Venkatasubramanian, V., Rengaswamy, R., Yin, K., and Kavuri, S. N. (2003a). A review of process fault detection and diagnosis Part I: Quantitative model-based methods. Computers and Chemical Engineering. 27: 293-311. 152 Venkatasubramanian, V., Rengaswamy, and Kavuri, S. N. (2003b). A review of process fault detection and diagnosis Part II: Qualitative models and search strategies. Computers and Chemical Engineering. 27: 313-326. Venkatasubramanian, V., Rengaswamy, R., Kavuri, S. N., and Yin, K. (2003c). A review of process fault detection and diagnosis Part III: Process history based methods. Computers and Chemical Engineering. 27, 327-346. Walas, S. M. (1985). Phase equilibrium in chemical engineering. Butterworth Publisher. Wang, J. C. and Henke, G. E. (1966). Tridiagonal Matrix for Distillation. Hydrocarbon Processing. 45(8): 155 – 165. Wang, S. and Xiao, F. (2004). Detection and diagnosis of AHU sensor faults using principal component analysis method. Energy Conversion and Management. 45: 2667-2686. Wang, H., Song, Z. and Wang, H. (2002). Statistical process monitoring using improved PCA with optimized sensor locations. Journal of Process Control. 12: 735-744. Wetherill, W.B. and Brown, D. (1991). Statistical process control for the process industries. 3rd Edition. London : Chapman and Hall. Wise, B. M., and Gallagher, N. B. (1996). The process chemometrics approach to process monitoring and fault detection. Journal of Process Control. 6: 329-348. Wissemeier, A. H. and Zuhlke, G. (2002). Relation between climatic variables, growth and the incidence of tipburn in filed-grown lettuce as evaluated by simple, partial and multiple regression analysis. Scientia Horticulturae. 93: 193204. 153 Woodall, A. H. (2000). Controversies and contradictions in Statistical Process Control. Journal of Quality Control. 30(4): 341-350. Yoon, S. and MacGregor, J.F. (2000). Statistical and causal model-based approaches to fault detection and isolation. American Institute of Chemical Engineering Journal. 46(9): 1813-1824. Yoon, S. and MacGregor, J.F. (2001). Fault diagnosis with multivariate statistical model part I: Using steady state fault signatures. Journal of process control. 11: 387-400. Yoon, S., Kettaneh, N., Wold, S., Landry, J. and Pepe, W. (2003). Multivariate process monitoring and early fault detection (MSPC) using PCA and PLS. Plant automation and decision support conference. September 21-24, 2003. Washington, DC: National Petrochemical and Refiners Association. 2003. 1-18. Yung, W. K. C. (1996). An integrated model for manufacturing process improvement. Journal of Materials Processing Technology. 61: 39-43. 154 APPENDIX A COLUMN PROFILE COMPARISON A.1 Comparisons on temperature Stage 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 HYSYS 174.4598 176.0961 189.1816 216.3445 216.3445 216.3445 216.3445 216.3445 216.3445 216.3445 216.3445 218.5873 225.0159 229.3237 231.0301 231.0301 231.0301 231.0301 231.0301 231.0301 231.0301 231.0301 231.0301 232.416 235.0202 235.0202 235.0202 238.3387 238.3387 MATLAB 182.6361 186.668 193.2194 197.806 202.2502 207.1006 211.4466 214.4575 216.2516 217.3226 218.0954 218.9426 220.5438 225.9756 227.3566 228.0729 228.5256 228.8687 229.1604 229.4247 229.6729 229.905 230.1275 230.3494 230.5759 230.8451 231.2717 232.4103 234.9 % Difference 4.686 6.003 2.134 -8.568 -6.514 -4.272 -2.263 -0.872 -0.042 0.452 0.809 0.162 -1.987 -1.459 -1.590 -1.279 -1.084 -0.935 -0.809 -0.694 -0.587 -0.486 -0.390 -0.889 -1.891 -1.776 -1.594 -2.487 -1.442 155 A.2 Comparisons on pressure Stage 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 HYSYS 0.1067 0.1076 0.1086 0.112717 0.11276 0.112802 0.112888 0.112974 0.11306 0.113403 0.114089 0.116833 0.119578 0.122322 0.122351 0.122379 0.122408 0.122494 0.12258 0.122665 0.123008 0.123694 0.125067 0.127811 0.1333 0.1333 0.1333 0.1333 0.1333 MATLAB 0.1088 0.1098 0.1107 0.1117 0.1127 0.1136 0.1146 0.1156 0.1166 0.1175 0.1185 0.1195 0.1204 0.1214 0.1224 0.1233 0.1243 0.1253 0.1262 0.1272 0.1282 0.1291 0.1301 0.1311 0.1321 0.133 0.134 0.135 0.1359 % Difference 1.968 2.044 1.933 -0.901 -0.052 0.707 1.516 2.324 3.131 3.612 3.866 2.282 0.687 -0.753 0.040 0.752 1.545 2.290 2.953 3.696 4.220 4.370 4.024 2.573 -0.900 -0.225 0.525 1.275 1.950 156 A.3 Comparisons on distillate composition Distillate Stream n-hexanoic acid n-octanoic acid n-decanoic n-dodecanoic n-tetradecanoic n-hexadecanoic stearic acid oleic acid linoleic acid A.4 HYSYS 0.02148 0.51648 0.39755 0.01448 1.97E-06 6.72E-10 1.43E-14 2.53E-13 2.98E-13 MATLAB 0.02164 0.52339 0.41340 0.01394 0 0 0 0 0 % Difference 0.780 1.337 3.986 -3.742 0 0 0 0 0 MATLAB 0 0 0.00359 0.57694 0.17371 0.07521 0.01611 0.13497 0.02269 % Difference 0.00 0.00 0.389 0.902 -0.220 -0.170 -0.504 -0.212 -0.208 Comparisons on bottom composition Bottom Stream n-hexanoic acid n-octanoic acid n-decanoic n-dodecanoic n-tetradecanoic n-hexadecanoic stearic acid oleic acid linoleic acid HYSYS 3.32E-14 2.18E-07 0.0035 0.5717 0.1740 0.0753 0.0162 0.1352 0.0227 157 APPENDIX B PARAMETER SELECTION TO DESIGN EWMA CHART B.1 Parameter Selection for EWMA Chart via PCA The out of control data was generated during observation 10th. The pumparound flowrate was increased from 49.6 kmol/hr to 49.9 kmol/hr. Figures B1 to B5 show the EWMA charts using different width of the control limit, L and the weighting factor, β . Figure B1: EWMA chart for xc8, xc9 and P using L=3.054 and β =0.4 158 Figure B2: EWMA chart for xc8, xc9 and P using L=2.998 and β =0.25 Figure B3: EWMA chart for xc8, xc9 and P using L =2.962 and β =0.20 Figure B4: EWMA chart for xc8, xc9 and P using L =2.814 and β =0.10 159 Figure B5: EWMA chart for xc8, xc9 and P using L =2.615 and β =0.05 B.2 Parameter Selection for EWMA Chart via PCorrA The out of control data was generated during observation 10th. The pumparound flowrate was increased from 49.6 kmol/hr to 49.9 kmol/hr. Figure B6 to B10 show the EWMA charts using different width of the control limit, L and the weighting factor, β . Figure B6: EWMA chart for xc8, xc9 and P using L =3.054 and β =0.4 160 Figure B7: EWMA chart for xc8, xc9 and P using L =2.998 and β =0.25 Figure B8: EWMA chart for xc8, xc9 and P using L =2.962 and β =0.20 Figure B9: EWMA chart for xc8, xc9 and P using L =2.814 and β =0.10 161 Figure B10: EWMA chart for xc8, xc9 and P using L =2.615 and β =0.05 162 APPENDIX C PARAMETER SELECTION TO DESIGN MAMR CHART C.1 Parameter Selection for MAMR Chart via PCA The out of control data was generated during observation 10th. The feed flowrate was increased from 41.557 kmol/hr to 51 kmol/hr. The control limits is calculated using PCA method. Figures C1 to C6 show the MAMR charts using different window size. xC8 mr 10 5 0 0 10 20 30 40 50 60 70 80 90 100 0 10 20 30 40 50 60 70 80 90 100 0 10 20 30 40 50 60 70 80 90 100 0 10 20 30 40 50 60 Number of observation 70 80 90 100 F mr 10 5 0 xC8 ma 2 0 -2 F ma 5 0 -5 Figure C1: MAMR chart for xc8 and F using window size of three 163 xC9 mr 10 5 0 0 10 20 30 40 50 60 70 80 90 100 0 10 20 30 40 50 60 70 80 90 100 0 10 20 30 40 50 60 70 80 90 100 0 10 20 30 60 50 40 Number of observation 70 80 90 100 F mr 10 5 0 xC9 ma 2 0 -2 F ma 5 0 -5 Figure C2: MAMR chart for xc9 and F using window size of three xC8 mr 10 5 0 0 10 20 30 40 50 60 70 80 90 100 0 10 20 30 40 50 60 70 80 90 100 0 10 20 30 40 50 60 70 80 90 100 0 10 20 30 60 50 40 Number of observation 70 80 90 100 F mr 10 5 0 xC8 ma 2 0 -2 F ma 5 0 -5 Figure C3: MAMR chart for xc8 and F using window size of four 164 xC9 mr 10 5 0 0 10 20 30 40 50 60 70 80 90 100 0 10 20 30 40 50 60 70 80 90 100 0 10 20 30 40 50 60 70 80 90 100 0 10 20 30 60 50 40 Number of observation 70 80 90 100 F mr 10 5 0 xC9 ma 2 0 -2 F ma 5 0 -5 Figure C4: MAMR chart for xc9 and F using window size of four xC8 mr 10 5 0 0 10 20 30 40 50 60 70 80 90 100 0 10 20 30 40 50 60 70 80 90 100 0 10 20 30 40 50 60 70 80 90 100 0 10 20 30 60 50 40 Number of observation 70 80 90 100 F mr 10 5 0 xC8 ma 2 0 -2 F ma 5 0 -5 Figure C5: MAMR chart for xc8 and F using window size of five 165 xC9 mr 10 5 0 0 10 20 30 40 50 60 70 80 90 100 0 10 20 30 40 50 60 70 80 90 100 0 10 20 30 40 50 60 70 80 90 100 0 10 20 30 60 50 40 Number of observation 70 80 90 100 F mr 10 5 0 xC9 ma 2 0 -2 F ma 5 0 -5 Figure C6: MAMR chart for xc9 and F using window size of five C.2 Parameter Selection for MAMR Chart via PCorrA The out of control data was generated during observation 10th. The feed flowrate was increased from 41.557 kmol/hr to 51 kmol/hr. The control limits is calculated using PCorrA method. Figures C7 to C12 show the MAMR charts using different window size. 166 xC8 mr 10 5 0 0 10 20 30 40 50 60 70 80 90 100 0 10 20 30 40 50 60 70 80 90 100 0 10 20 30 40 50 60 70 80 90 100 0 10 20 30 40 50 60 Number of observation 70 80 90 100 F mr 10 5 0 xC8 ma 2 0 -2 F ma 4 2 0 -2 Figure C7: MAMR chart for xc8 and F using window size of three xC9 mr 10 5 0 0 10 20 30 40 50 60 70 80 90 100 0 10 20 30 40 50 60 70 80 90 100 0 10 20 30 40 50 60 70 80 90 100 0 10 20 30 60 50 40 Number of observation 70 80 90 100 F mr 10 5 0 xC9 ma 2 0 -2 F ma 4 2 0 -2 Figure C8: MAMR chart for xc9 and F using window size of three 167 xC8 mr 10 5 0 0 10 20 30 40 50 60 70 80 90 100 0 10 20 30 40 50 60 70 80 90 100 0 10 20 30 40 50 60 70 80 90 100 0 10 20 30 40 50 60 Number of observation 70 80 90 100 F mr 10 5 0 xC8 ma 2 0 -2 F ma 2 0 -2 Figure C9: MAMR chart for xc8 and F using window size of four B B xC9 mr 10 5 0 0 10 20 30 40 50 60 70 80 90 100 0 10 20 30 40 50 60 70 80 90 100 0 10 20 30 40 50 60 70 80 90 100 0 10 20 30 40 50 60 Number of observation 70 80 90 100 F mr 10 5 0 xC9 ma 2 0 -2 F ma 2 0 -2 Figure C10: MAMR chart for xc9 and F using window size of four B B 168 xC8 mr 10 5 0 0 10 20 30 40 50 60 70 80 90 100 0 10 20 30 40 50 60 70 80 90 100 0 10 20 30 40 50 60 70 80 90 100 0 10 20 30 40 50 60 Number of observation 70 80 90 100 F mr 10 5 0 xC8 ma 2 0 -2 F ma 2 0 -2 Figure C11: MAMR chart for xc8 and F using window size of five B B xC9 mr 10 5 0 0 10 20 30 40 50 60 70 80 90 100 0 10 20 30 40 50 60 70 80 90 100 0 10 20 30 40 50 60 70 80 90 100 0 10 20 30 40 50 60 Number of observation 70 80 90 100 F mr 10 5 0 xC9 ma 2 0 -2 F ma 2 0 -2 Figure C12: MAMR chart for xc9 and F using window size of five B B 169 APPENDIX D NUMBER OF FAULT CAUSES D.1 Number of fault causes for single fault and multiple faults Table D1 to Table D4 listed the number of known fault, which exactly diagnosed, partially diagnosed, over diagnosed and undiagnosed for single and multiple faults. The results are arranged according to different Improved SPC chart used corresponding with multivariate analysis technique applied to construct the control limits. 170 Table D1: Fault diagnosis result using PCA technique for oleic acid composition Fault Diagnosis Results OC Condition Shewhart EWMA MAMR 1. Exact fault cause diagnosis • Known 1 fault cause situation 49 31 21 • Known 2 fault cause situation 11 7 4 • Known 3 fault cause situation 10 6 3 70 44 28 Total 2. Partially fault cause diagnosis • Known 1 fault cause situation 0 0 0 • Known 2 fault cause situation 0 1 3 • Known 3 fault cause situation 5 8 12 5 9 15 Total 3. Over diagnosed fault cause • Known 1 fault cause situation 0 0 0 • Known 2 fault cause situation 0 0 0 • Known 3 fault cause situation 0 0 0 0 0 0 Total 4. Undiagnosed fault cause • Known 1 fault cause situation 1 19 29 • Known 2 fault cause situation 4 7 8 • Known 3 fault cause situation 0 1 0 5 27 37 Total 171 Table D2: Fault diagnosis result using PCorrA technique for oleic acid composition OC Condition Fault Diagnosis Results Shewhart EWMA MAMR 1. Exact fault cause diagnosis • Known 1 fault cause situation 50 33 24 • Known 2 fault cause situation 15 13 13 • Known 3 fault cause situation 15 14 14 80 60 51 Total 2. Partially fault cause diagnosis • Known 1 fault cause situation 0 0 0 • Known 2 fault cause situation 0 0 0 • Known 3 fault cause situation 0 0 0 0 0 0 Total 3. Over diagnosed fault cause • Known 1 fault cause situation 0 0 0 • Known 2 fault cause situation 0 0 0 • Known 3 fault cause situation 0 0 0 0 0 0 Total 4. Undiagnosed fault cause • Known 1 fault cause situation 0 17 26 • Known 2 fault cause situation 0 2 2 • Known 3 fault cause situation 0 1 1 0 20 29 Total 172 Table D3: Fault diagnosis result using PCA technique for linoleic acid composition Fault Diagnosis Results OC Condition Shewhart EWMA MAMR 1. Exact fault cause diagnosis • Known 1 fault cause situation 49 31 24 • Known 2 fault cause situation 11 6 3 • Known 3 fault cause situation 10 6 2 70 43 29 Total 2. Partially fault cause diagnosis • Known 1 fault cause situation 0 0 0 • Known 2 fault cause situation 0 2 4 • Known 3 fault cause situation 5 9 13 5 11 17 Total 3. Over diagnosed fault cause • Known 1 fault cause situation 0 0 0 • Known 2 fault cause situation 0 0 0 • Known 3 fault cause situation 0 0 0 0 0 0 Total 4. Undiagnosed fault cause • Known 1 fault cause situation 1 19 26 • Known 2 fault cause situation 4 7 8 • Known 3 fault cause situation 0 0 0 5 26 34 Total 173 Table D4: Fault diagnosis result using PCorrA technique for linoleic acid composition OC Condition Fault Diagnosis Results Shewhart EWMA MAMR 1. Exact fault cause diagnosis • Known 1 fault cause situation 50 34 28 • Known 2 fault cause situation 15 13 13 • Known 3 fault cause situation 15 15 15 80 62 56 Total 2. Partially fault cause diagnosis • Known 1 fault cause situation 0 0 0 • Known 2 fault cause situation 0 0 0 • Known 3 fault cause situation 0 0 0 0 0 0 Total 3. Over diagnosed fault cause • Known 1 fault cause situation 0 0 0 • Known 2 fault cause situation 0 0 0 • Known 3 fault cause situation 0 0 0 0 0 0 Total 4. Undiagnosed fault cause • Known 1 fault cause situation 0 16 22 • Known 2 fault cause situation 0 2 2 • Known 3 fault cause situation 0 0 0 0 18 24 Total