Optimum Feature Selection and Extraction for Fault Diagnosis and Prognosis

Abhinav Saxena and George Vachtsevanos
Intelligent Control Systems Laboratory, School of Electrical and Computer Engineering
Georgia Institute of Technology, Atlanta, GA 30332-0250

A. Saxena is a PhD student in the School of Electrical and Computer Engineering at Georgia Institute of Technology, Atlanta, GA 30332 USA (phone: 404-457-2756; fax: 404-894-4130; e-mail: asaxena@ece.gatech.edu).
G. Vachtsevanos is a professor in the School of Electrical and Computer Engineering at Georgia Institute of Technology, Atlanta, GA 30332 USA (e-mail: gjv@ece.gatech.edu).

Abstract: Fault diagnosis and failure prognosis of critical dynamic systems, such as aircraft and industrial processes, rely on degradation or fatigue models and on measurements typically acquired on-line in real time. Such sensor data must be pre-processed in order to remove artifacts and improve the signal-to-noise ratio. Furthermore, they must be processed appropriately so that useful information in compact form can be extracted and used to detect incipient failures and predict the remaining useful life of failing components. We present a methodology to select an optimum feature vector from a list of candidate features, and to prioritize and rank the features to meet set performance objectives. The enabling technologies include genetic programming tools, data fusion and model-based approaches for feature selection and extraction. We suggest a multi-core processing environment for the efficient and expedient implementation of these technologies. Performance metrics are defined to assess the efficacy of the methodology. Typical examples from aircraft systems are used to demonstrate the proposed techniques.

I. Introduction

This paper introduces a hybrid hardware/software realization for a fundamental problem in the area of Prognostics and Health Management (PHM) systems: How do we process raw sensor data on-board an aircraft in order to detect incipient failures of critical system components/subsystems and predict the remaining useful life of such failing components? The extraction of useful information from raw data (vibrations, temperature, usage patterns, etc.) constitutes the cornerstone for accurate and timely fault diagnosis and failure prognosis. When multiple sensors are monitored and a large vector of features or Condition Indicators is to be extracted, computational resources are taxed severely for on-line, real-time applications. The feature extraction problem is elaborated and examples are presented. A multi-core Cell processor environment is suggested in order to address the computational complexity issues. Parallel processing of multiple sensors and features is accomplished efficiently and effectively through this platform. Such parallel processing and pipelining capabilities of this new processing tool promise to facilitate and expedite the implementation of diagnostic, prognostic and control technologies on-board aircraft.

In the following section we describe the feature extraction and selection process and point out various areas where improvement is desired. We then introduce the Cell processing environment and show how it can be used for feature extraction. We conclude with future directions in making use of this processor.

II. Feature Selection and Extraction

Feature or condition indicator selection and extraction constitute the cornerstone for accurate and reliable fault diagnosis. The classical image recognition and signal processing paradigm of data → information → knowledge becomes most relevant and takes center stage in the fault diagnosis case, particularly since such operations must be performed on-line in a real-time environment [1]. Figure 1 depicts a conceptual schematic for feature extraction and fault mode classification.

Figure 1. A general scheme for feature extraction and fault mode classification.

Feature extraction involves reducing the amount of resources required to describe a large set of data accurately. In many cases, this is considered a data preprocessing step. When performing analysis of complex data, one of the major problems stems from the number of variables involved. Analysis with a large number of variables generally requires a large amount of memory and computational power, or it may lead a classification algorithm to overfit the training sample and generalize poorly to new samples. Feature extraction is a general term for methods of constructing combinations of the variables to get around these problems while still describing the data with sufficient accuracy.

A diagnostic feature is a system parameter (or derived system parameter) that is sensitive to the functional degradation of one or more components contained in the system. Diagnostic features can be used to predict the occurrence of an undesirable system event or failure mode. Fault diagnosis depends mainly on extracting a set of features from sensor data that can distinguish between fault classes of interest and can detect and isolate a particular fault at its early initiation stages. These features should be fairly insensitive to noise and to within-fault-class variations. The latter could arise because of fault location, size, etc. in the frame of a sensor. “Good” features must have the following attributes:
• Computationally inexpensive to measure
• Mathematically definable
• Explainable in physical terms
• Characterized by large interclass mean distance and small intraclass variance
• Insensitive to extraneous variables
• Uncorrelated with other features

Much attention has focused over the past years on the feature extraction problem, whereas feature selection has relied primarily on expertise, observations, past historical evidence, and our understanding of fault signature characteristics.
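
The attribute of large separation between class means combined with small spread within each class can be made concrete with a simple score. The sketch below uses a Fisher-style ratio (squared distance between class means divided by the sum of the class variances) to rank candidate features. This particular formula is an assumption chosen for illustration; the paper does not prescribe it, and the function names and sample values are hypothetical.

```python
import statistics


def fisher_score(class_a, class_b):
    """Squared distance between class means divided by the summed class variances.

    Illustrative separability measure (an assumption, not the paper's metric).
    """
    mean_a = statistics.fmean(class_a)
    mean_b = statistics.fmean(class_b)
    within = statistics.pvariance(class_a) + statistics.pvariance(class_b)
    if within == 0.0:
        # Degenerate case: no spread inside either class.
        return float("inf") if mean_a != mean_b else 0.0
    return (mean_a - mean_b) ** 2 / within


def rank_features(baseline, faulty):
    """Rank features by score; both arguments map feature name -> list of values."""
    scores = {name: fisher_score(baseline[name], faulty[name]) for name in baseline}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical feature values for a no-fault class and a fault class.
baseline = {"rms": [1.0, 1.1, 0.9, 1.0], "kurtosis": [3.0, 3.4, 2.6, 3.0]}
faulty = {"rms": [2.0, 2.1, 1.9, 2.0], "kurtosis": [3.2, 3.6, 2.8, 3.2]}

ranking = rank_features(baseline, faulty)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

With these made-up numbers the rms feature ranks well above kurtosis, because its class means are far apart relative to the scatter within each class. A score of this kind only addresses one of the listed attributes; computational cost and correlation with other features would have to be assessed separately.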

In selecting an “optimum” feature set, we are seeking to address such questions as: Where is the information? How do fault (failure) mechanisms relate to the fundamental “physics” of complex dynamic systems? Fault modes may induce changes in:
• The energy (power) of the system
• Its entropy
• Power spectrum
• Signal magnitude
• Chaotic behavior
• Other

Feature selection is application dependent. We are seeking those features, for a particular class of fault modes, from a large candidate set that possess properties of fault distinguishability and detectability while achieving a reliable fault classification in the minimum amount of time. Feature extraction, on the other hand, is viewed as an algorithmic process where, given sensor data, features are extracted in a computationally efficient manner while preserving the maximum information content. Thus, the feature extraction process converts the fault data into an N-dimensional feature space, such that one class of faults is clustered together and can be distinguished from other classes. However, in general, not all faults of a class need N features to form a compact cluster. It is only the faults that are in the overlapping region of two or more classes that govern the number of features required to perform classification.

Classically, feature selection is carried out by ranking the features on the basis of various metrics. These metrics include [1]:
• Distinguishability: quantifies a feature’s ability to differentiate between various classes by finding the area of the region in which the two classes overlap. The smaller the area, the higher the ability of the feature to distinguish between the classes.
• Detectability or isolatability: the extent to which the diagnostic scheme can detect the presence of a particular fault; it relates to the smallest failure signature that can be detected and the percentage of false alarms.
• Identifiability: tracks the similarity of features as they identify a fault mode k while also distinguishing it from other fault classes.
• Degree of certainty: combines the body of evidence collected from all other metrics until a desired level of false positives and false negatives is obtained.

Typical features or Condition Indicators (CIs) in the time domain may include peak values, rms, energy, kurtosis, etc. In the frequency domain, we focus primarily on features for rotating equipment that exhibit a marked difference between baseline (no-fault) and faulty data [1]. For example, we seek in this category a comparison (amplitude, energy, etc.) of certain sidebands to dominant frequencies, when the sensor signals are transformed via an FFT routine to the frequency domain [2]. Other possible features are extracted through coherence and correlation calculations. When the information is shared between the time and frequency domains, it might be advantageous to extract features in the wavelet domain, which offers an appropriate tradeoff between the two domains.

When multiple features are extracted for a particular fault mode, it might be desirable to combine or fuse uncorrelated features to enhance the fault detectability. There are several methods for feature fusion at different levels. The most popular methods include Bayesian inference, Dempster-Shafer fusion, weighting/voting schemes, fuzzy logic inference, and neural network fusion. In past studies, we also exploited genetic programming in order to derive artificial features from baseline ones that meet prescribed performance metrics. In a genetic programming based feature fusion approach, we define an appropriate fitness function and use genetic operators to construct new feature populations from old ones. These features are non-linear combinations of multiple features and perform better than individual features [3].

III. Multi-Core Cell Processing Environment

The Cell processor is a microprocessor architecture jointly developed by Sony, Toshiba, and IBM, an alliance known as "STI". As shown in Figure 2, the Cell processor can be broadly divided into four components [4,5]: (1) external input and output structures, (2) the main processor, called the Power Processing Element (PPE) (a two-way simultaneous multithreaded Power ISA v.2.03 compliant core), (3) eight fully functional co-processors called the Synergistic Processing Elements (SPEs), and (4) a specialized high-bandwidth circular data bus connecting the PPE, input/output elements and the SPEs, called the Element Interconnect Bus (EIB).

The advantage of the Cell processor lies in its capability to achieve high performance in mathematically intensive tasks, such as decoding/encoding MPEG streams, generating or transforming three-dimensional data, or undertaking Fourier analysis of data, through an efficient allocation of its processing and communication resources. The PPE, which is capable of running a conventional operating system, has control over the SPEs and can start, stop, interrupt and schedule processes running on the SPEs. Despite having Turing-complete architectures, the SPEs are not fully autonomous and require the PPE to initiate them before they can do any useful work. Therefore, the PPE acts as a manager, and most of the "horsepower" of the system comes from the SPEs. Both the PPE and the SPEs are RISC architectures with a fixed-width 32-bit instruction format.

Although the Cell was developed for high-speed gaming applications (Sony PlayStation), its capabilities have led to explorations in blade servers, home cinema, supercomputing, cluster computing, etc. [6]. Recently, Cell-based computer systems have been developed for embedded applications such as medical imaging, industrial inspection, aerospace and defense, seismic processing, and telecommunications [7].
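
The manager/worker division of labor between the PPE and the SPEs can be mimicked in ordinary Python to show how per-channel feature computations might be dispatched. The sketch below is only an analogy and does not use the Cell SDK: a thread pool with eight workers stands in for the SPEs, and the calling function plays the role of the PPE. The function names, the worker count constant and the sample signals are assumptions made for illustration; the condition indicators computed (peak, rms, energy, kurtosis) are the time-domain features named earlier in this section.

```python
import math
from concurrent.futures import ThreadPoolExecutor

NUM_WORKERS = 8  # mirrors the eight SPEs; purely illustrative


def condition_indicators(signal):
    """Compute time-domain condition indicators for one sensor channel."""
    n = len(signal)
    mean = sum(signal) / n
    energy = sum(x * x for x in signal)
    rms = math.sqrt(energy / n)
    peak = max(abs(x) for x in signal)
    # Kurtosis as the fourth central moment over the squared second moment.
    m2 = sum((x - mean) ** 2 for x in signal) / n
    m4 = sum((x - mean) ** 4 for x in signal) / n
    kurtosis = m4 / (m2 * m2) if m2 > 0 else float("nan")
    return {"peak": peak, "rms": rms, "energy": energy, "kurtosis": kurtosis}


def extract_all(channels):
    """Manager role: hand each channel to a worker and gather results in order."""
    with ThreadPoolExecutor(max_workers=NUM_WORKERS) as pool:
        return list(pool.map(condition_indicators, channels))


# Hypothetical sensor channels (short made-up signals).
channels = [
    [1.0, -1.0, 1.0, -1.0],
    [0.0, 3.0, 0.0, -3.0],
]

features = extract_all(channels)
for index, result in enumerate(features):
    print(index, result)
```

Threads are used here only to keep the sketch short and portable. For CPU-bound pure-Python work they will not run truly in parallel, so a process pool, or native code on real SPEs, would be the closer stand-in when actual speedup matters.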
We intend to explore Cell capabilities for onboard real-time data processing for enhanced diagnostics and prognostics for aircraft systems.

Figure 2. Cell architecture schematic.

Cell Processing for Feature Extraction

In a recent study, we identified how Cell capabilities can be appropriately utilized for various steps of a Diagnostics and Prognostics system [8]. Of these steps, feature extraction can itself be carried out in different ways. More specifically, the feature extraction process can utilize Cell capabilities in three different ways, as depicted in Figure 3.

Figure 3. Utilizing Cell capabilities in feature extraction.

Features or Condition Indicators may be extracted from a variety of data and sensors. Some of these features are computationally inexpensive, such as peak value, rms, energy, etc., while others may involve considerable computation. For example, features that are based on chaotic indicators, such as Lyapunov exponents, dimension, etc., derived from a system that exhibits chaos, typically entail a large computational cost. Similarly, derived features employing genetic programming tools may belong to the same category. It is desirable, therefore, to view the processing of multiple sensors as well as the estimation of multiple features via a parallel processing architecture that optimizes resource requirements and affords a real-time solution to the problem. Pipelining capabilities of the Cell processor can be exploited when the outputs of the feature extraction module are fused or used for classification purposes. The problem can be set up as a constrained optimization problem aimed at minimizing time while maximizing processor utilization.

IV. Preliminary Results

Given the unique architecture of the Cell processor, it has been a challenge to write algorithms that can utilize this technology to its full potential.
Therefore, our research team has focused on two fronts: first, porting existing single-processor algorithms to the Cell and, second, conducting research to develop efficient parallel counterparts of these CBM/PHM algorithms. In the initial phase we focused on developing parallel algorithms for Fast Fourier Transforms (FFTs), which form the basis of most signal processing techniques. Our research partners have successfully demonstrated the effectiveness of their FFT algorithms, achieving speedups of up to four times over other implementations and architectures (Figure 4) [9].

Figure 4. FFT implementation performance comparisons with other architectures and Cell implementations [9]. Implementations compared: Cell @ GT (8 SPEs), FFTW on Cell, Intel Core Duo, Intel Pentium 4, AMD Opteron, IBM Power 5. Source: Bader et al. (2007).

We are also working on parallelizing other algorithms that will pave the way for an overall onboard parallelized architecture suitable for real-time aircraft applications.

V. Conclusions and Future Directions

In this paper we have described our efforts in exploring the high-performance capabilities of the Cell platform for computationally expensive feature extraction algorithms for on-board diagnostics and prognostics. We have identified multiple ways in which the Cell can be employed depending on computational demands in specific cases. We also intend to explore possibilities for maximizing Cell utilization through intelligent resource allocation in a constrained optimization problem setting. However, there are a few limitations that one needs to keep in mind before expecting the Cell to perform. The architecture emphasizes efficiency/watt, prioritizes bandwidth over latency, and favors peak computational throughput over simplicity of program code [10]. For these reasons, the Cell is widely regarded as a challenging environment for software development. Even though several research teams are building software libraries for the Cell platform, the full utilization of its capabilities is just beginning to be realized. Software adoption remains a key issue in whether the Cell ultimately delivers on its performance potential. Despite those challenges, research has indicated that the Cell excels at several types of scientific computation. We are collaborating with multiple such research teams to explore ways in which its capabilities can be utilized to their maximum potential on an aircraft.

Acknowledgment

We gratefully acknowledge the support from AveTec Corp. We also thank our collaboration partner teams from Pratt & Whitney, Hartford CT, the University of Connecticut, Hartford and the College of Computing at Georgia Tech, Atlanta.

References

[1] Vachtsevanos, G., Lewis, F.L., Roemer, M., Hess, A., and Wu, B., Intelligent Fault Diagnosis and Prognosis for Engineering Systems, Wiley, September 29, 2006, 456 pages.
[2] Wu, B., Saxena, A., Patrick, R., and Vachtsevanos, G., “Vibration Monitoring for Fault Diagnosis of Helicopter Planetary Gears,” Proceedings of the 16th IFAC World Congress, July 2005.
[3] Wiggins, M.C., Firpi, H.A., Gerstenfeld, E.P., Vachtsevanos, G., and Litt, B., “Electrogram Changes Precede Atrial Fibrillation After Coronary Artery Bypass Graft,” Computers in Cardiology, September 17-20, 2006.
[4] Chen, T., Raghavan, R., Dale, J., and Iwata, E., “Cell Broadband Engine Architecture and its first implementation,” IBM developerWorks, November 29, 2005. Retrieved on 2007-09-05. http://www-128.ibm.com/developerworks/power/library/pacellperf/
[5] Kahle, J.A., Day, M.N., Hofstee, H.P., Johns, C.R., Maeurer, T.R., and Shippy, D., “Introduction to the Cell Multiprocessor,” IBM Journal of Research and Development, 2005-08-07. Retrieved on 2007-07-23. http://researchweb.watson.ibm.com/journal/rd/494/kahle.html
[6] Williams, S., Shalf, J., Oliker, L., Kamil, S., Husbands, P., and Yelick, K., “The Potential of the Cell Processor for Scientific Computing,” Computational Research Division, Lawrence Berkeley National Laboratory. Retrieved on 2007-09-05. http://www.cs.berkeley.edu/~samw/projects/cell/CF06.pdf
[7] “Mercury Wins IBM PartnerWorld Beacon Award,” Supercomputing Online, 2007-04-12. Retrieved on 2007-09-15. http://www.supercomputingonline.com/article.php?sid=13477
[8] Saxena, A. and Kang, S., “A Cell BE Processor Application for Real-time Fault Diagnosis and Failure Prognosis of Aircraft Systems,” Georgia Tech, Sony/Toshiba/IBM Workshop on Software and Applications for the Cell/B.E. Processor, Atlanta, GA, June 18-19, 2007.
[9] Bader, D.A. and Agarwal, V., “FFTC: Fastest Fourier Transform on the IBM Cell Broadband Engine,” 14th IEEE International Conference on High Performance Computing (HiPC 2007), Goa, India, December 18-21, 2007.
[10] Shankland, S., “Octopiler seeks to arm Cell programmers,” CNET, 2006-02-22. Retrieved on 2007-07-28. http://news.com.com/Octopiler+seeks+to+arm+Cell+programmers/2100-1007_3-6042132.html