Noise Reduction Using Autoassociative Neural Networks

Goran Kvaščev, Maja Gajić-Kvašćev, Velibor Andrić, Željko Đurović

Abstract — Noise reduction has always been an important part of any control, acquisition or processing task. In order to increase the use of smaller and cheaper, but less precise, sensor solutions, it is necessary to incorporate signal processing techniques for noise reduction. Nowadays, soft computing techniques such as neural networks are widely used in many signal processing applications and provide very good results. In this paper, an approach to noise reduction using autoassociative neural networks is described. The main idea is to use a more precise, and therefore more expensive, sensor for network training, and afterwards to use this network for processing the signal of a less precise and cheaper sensor. Once the network is formed, the less precise sensor can be used and the network applied to reduce the noise in its signal. With this signal processing tool, less precise sensors can be used in the desired applications. When comparing the results obtained with autoassociative neural networks to those obtained with digital filters, the obvious advantage is that neural networks do not introduce delay into the system, as filters do. All described simulations and data processing are performed in Matlab and Simulink.

Key words — Noise reduction; signal processing; autoassociative neural networks; Matlab; Simulink.

Nataša Kljajić – PhD studies at the School of Electrical Engineering, University of Belgrade, Bulevar Kralja Aleksandra 73, 11020 Belgrade, Serbia; works at the Military Technical Institute, Belgrade, Ratka Resanovića 1 (e-mail: natasha.kljajic@yahoo.com).

I. INTRODUCTION

Noise cancellation, as well as fault detection and correction, has always been a significant part of any control, acquisition or processing task. Control systems that include any kind of navigation system or inertial navigation unit, such as a missile, a UAV (Unmanned Aerial Vehicle), a car navigation system, or even a mobile phone, use sensors such as gyroscopes. Cheaper, smaller sensors that use less power are a more attractive solution for any application, military or commercial, but the problem is noise, especially because it integrates over time (angular random walk, expressed in deg/√hour), which limits the achievable performance of the manufactured device.
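To make the effect of angular random walk concrete: integrating white gyroscope rate noise produces an angle error whose standard deviation grows with the square root of time, in the usual formulation

\sigma_{\theta}(t) = \mathrm{ARW}\cdot\sqrt{t}, \qquad \mathrm{ARW}\ \left[\mathrm{deg}/\sqrt{\mathrm{h}}\right],

so even a modest rate noise density accumulates into a significant angle error over time if it is not reduced.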
There are many different approaches to noise reduction. Owing to their real-time processing capabilities and the possibility of massive parallel computation, neural networks can be used to optimize the coefficients of adaptive filters, as described in [1], [2]. There are also some interesting new approaches based on neural networks and biological systems, such as the one described in [3]. A particularly interesting approach found in the literature is the use of autoassociative neural networks. Autoassociative neural networks are suitable because they perform functional mapping, and their outputs are not discrete classes but continuous variables [5].

In this paper, an approach to noise reduction using autoassociative neural networks is described. The main idea is to use a more precise, and therefore more expensive, sensor for network training, and afterwards to use this network for processing the signal of a less precise and cheaper sensor. Once the network is formed, the less precise gyroscope can be used and the network applied to reduce the noise in its signal. With this signal processing tool, less precise sensors can be used in the desired applications. In order to obtain good results, it is necessary to have a priori information about the tested signal. This is not a difficult condition to meet, since it is always necessary, especially during any system development phase, to know what the expected signal is. The last section compares the results obtained with autoassociative neural networks to the results obtained by filtering with some typical noise reduction filters. The experiments show that the proposed autoassociative neural network design gives very good results, especially in comparison with digital filters, which always introduce delay into the system, whereas neural networks do not.

Goran Kvaščev – School of Electrical Engineering, University of Belgrade, Bulevar Kralja Aleksandra 73, 11020 Belgrade, Serbia (e-mail: kvascev@etf.bg.ac.rs).

II. AUTOASSOCIATIVE NEURAL NETWORK ARCHITECTURE

Artificial neural networks originated from studies of the human brain, which operates in a massively parallel mode based on highly interconnected processing units, or neurons. The same principle is used in neural networks: they are computational structures consisting of large numbers of primitive processing units connected on a massively parallel scale, which have the ability to form generalized representations of complex relationships and data structures [6]. Standard autoassociative neural networks are feedforward neural networks trained to approximate the identity function, so that the network inputs should be equal to the network outputs. If a network learned the identity function exactly, it would have no value, since there would be no transformation of the data. Autoassociative neural networks are useful precisely because they contain an internal constraint that prevents them from learning the identity mapping perfectly. This constraint is a bottleneck layer, a hidden layer whose dimension is smaller than that of the input and output [5]. This is the idea underlying PCA (Principal Component Analysis) and NLPCA (Nonlinear Principal Component Analysis), a technique for mapping multidimensional data into lower dimensions with minimal loss of information [7].
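In the notation of [5] and [7], the mapping and demapping halves of the network, written here as H and G, are trained to minimize the reconstruction error while passing the data through the low-dimensional bottleneck:

\hat{\mathbf{x}} = G\big(H(\mathbf{x})\big), \qquad \dim H(\mathbf{x}) < \dim \mathbf{x}, \qquad \min_{G,\,H}\ \sum_{i}\big\| \mathbf{x}_i - G\big(H(\mathbf{x}_i)\big) \big\|^{2}.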
These networks contain an input layer, three hidden layers (a mapping layer, a bottleneck layer and a demapping layer), and an output layer, as shown in Figure 1. The dimension of the bottleneck layer is required to be the smallest in the network. Nonlinear sigmoidal functions are used in the mapping and demapping layers. The bottleneck plays the key role by forcing an internal compression of the inputs, so that the most important information about the signal is preserved, while noise and other faults in the signal are removed. The training of the autoassociative network is accomplished using the backpropagation algorithm [5].

Figure 1. Autoassociative Neural Network Architecture

III. AUTOASSOCIATIVE NEURAL NETWORK DESIGN

In this section the autoassociative neural network design process is described. The first part covers data preparation for the neural network, such as sensor data simulation. Then the sensor signal processing techniques are described. The next part is the autoassociative neural network training. The last part of this section shows the testing results when autoassociative neural networks are used for noise reduction, as well as the results when some unexpected values appear in the sensor signal.

A. Data preparation and signal processing

In this paper, simulated gyroscope data are used. The simulation is performed in Matlab and Simulink. First, a signal database is formed from various signals similar to those expected from the sensors, with a sample time of 0.01 s and a duration of 5 s; these data are used in the Simulink model. A Simulink block that models a three-axis gyroscope is used. Various gyro parameters can be set, such as the natural frequency, damping ratio, scale factors and cross-coupling, measurement bias, G-sensitive bias, update rate, and also noise parameters such as the noise power. By setting the noise power to a lower value for the more precise gyroscope and to a higher value for the less precise one, two gyroscopes are simulated, one more and one less precise, and these data are used for neural network training and testing. The training and target data consist of three mutually correlated signals, since this is a condition for the network to work properly: the idea of the network architecture is to find the correlation between signals, while noise represents the uncorrelated part of the signal that the network should not treat as important. Usually, when developing a new system, information about the expected values of the signals is available. If this information is used when designing the network, then, depending on how much information is available, the network will provide better results.

B. Signal processing

In order to have proper input, target and testing data for the neural network design and validation, all data should be preprocessed. A database is therefore made for the input data, the target data and the test data, after which data normalization is performed. The input database for neural network training consists of the more precise gyroscope data, where noise taken from the less precise gyroscope data is added in certain time frames. This is accomplished using a custom-made function that adds noise in the specified frames. Three signals with added noise are used, and the input matrix is made out of these three signals. Afterwards, data normalization is performed as preparation for the neural network; a minimal sketch of these two steps is given below.
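The paper does not list the custom-made functions themselves, so the following is only a hypothetical sketch of what the noise-injection and normalization steps could look like; the function names and the noise model (additive white noise with a chosen standard deviation) are assumptions, not the authors' code.

% Hypothetical helper functions (names and noise model assumed).
function Xn = addFrameNoise(X, frames, sigma)
% X      - N-by-3 matrix, columns are the three precise gyroscope signals
% frames - logical N-by-1 vector, true in the time frames where noise is injected
% sigma  - standard deviation of the injected noise
    Xn = X;
    Xn(frames, :) = Xn(frames, :) + sigma * randn(nnz(frames), size(X, 2));
end

function Y = normalizeSym(X)
% Scale every column into [-1, 1]: positive samples are divided by the column
% maximum, negative samples by the absolute value of the column minimum.
    Y = X;
    for k = 1:size(X, 2)
        pos = X(:, k) > 0;
        neg = X(:, k) < 0;
        Y(pos, k) = X(pos, k) / max(X(:, k));
        Y(neg, k) = X(neg, k) / abs(min(X(:, k)));
    end
end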
The result is a normalized input data matrix containing three different signals with added noise, with a sample time of 0.01 s and a duration of 5 s. The target matrix is made in a similar way: for each input signal there is a target vector taken from the more precise gyroscope, so the target data matrix corresponding to the input data matrix is also built from normalized data. The test data are signals taken from the less precise gyroscope. Since noise filtering in an autoassociative network depends on redundancy [5], three measurements of the test signal are taken and the test data matrix is made out of them. The network is supposed to reduce the noise in this sensor's measurements. Since all data should be normalized as a preparation step before neural network training and testing, a custom-made function is used for this purpose. The normalization range is [-1, 1], in order to simplify the training process [4]. The normalization consists of calculating the maximum and minimum of the data in the matrix prepared for the network and multiplying the matrix element-wise by the reciprocal of the maximum or the minimum, for data greater than or less than zero, respectively. The maximum and minimum values lie in the range [-10, 10], since this is the proper range for voltage measurements, and therefore the signals are in this range.

C. Autoassociative Neural Network Training

This step consists of the design and training of the autoassociative neural network. A feedforward neural network with one input and one output (one matrix with three columns, where each column is one signal) and three or more hidden layers is made. The symmetric sigmoid transfer function tansig is used as the transfer function of the nodes. The training function is trainlm, the Levenberg-Marquardt backpropagation training function, which updates the weights and biases according to the Levenberg-Marquardt optimization algorithm. The mean squared normalized error (MSE) is used as the performance function. The input and target data sets are used for training. When training the neural network, it is important to set the maximum number of validation failures (cross-validation errors) as well as the number of iterations, since it is necessary to avoid overfitting [8], which would lead to a network incapable of generalization that would not give good results. While training the network, the plots of performance, regression and training state show how the network is being trained and whether the training was performed properly. A minimal sketch of this training step is given below.
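A minimal sketch of this setup with the MATLAB Neural Network Toolbox is shown below; the layer sizes follow the "3 5 7 3 7 5 3" design discussed in the next subsection, while the epoch and validation-failure limits are illustrative assumptions rather than the paper's actual values.

% Minimal training sketch (epoch and max_fail values assumed).
net = feedforwardnet([5 7 3 7 5], 'trainlm');   % "3 5 7 3 7 5 3": 3 inputs, bottleneck of 3, 3 outputs
for k = 1:numel(net.layers) - 1
    net.layers{k}.transferFcn = 'tansig';        % symmetric sigmoid in the hidden layers
end
net.performFcn          = 'mse';                 % mean squared error performance function
net.trainParam.epochs   = 300;                   % iteration limit (assumed)
net.trainParam.max_fail = 10;                    % maximum number of validation failures (assumed)

% Xtrain, Ttrain: 3-by-N matrices (noisy inputs and precise targets), i.e. the
% normalized data matrices transposed so that samples are stored in columns.
net = train(net, Xtrain, Ttrain);

% Applying the trained network to the less precise gyroscope measurements.
Ydenoised = net(Xtest);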
D. Autoassociative Neural Network Testing

For network testing, the test data set is used with various neural network designs. First, different designs are tested in order to find the optimal number of hidden layers. The results showed that the network works better with more than three hidden layers. The optimal design for this case is the "3 5 7 3 7 5 3" design, with five hidden layers, and the testing results are shown for this optimal design. In Figure 2, the results of testing with the neural network of the "3 5 7 3 7 5 3" design are shown.

Figure 2. Testing data and NN noise reduction data with the "3 5 7 3 7 5 3" design

As can be seen from these results, the network is capable of reducing the noise in the data. When more information about the expected input signal is available, which is normal when developing a system, and these data are used to improve the performance of the network, better results are obtained, as shown in Figure 3.

Figure 3. Testing data and NN noise reduction data with the "3 5 7 3 7 5 3" design and additional signal information

When dealing with the replacement of faulty sensor data, the network is able to ignore some unexpected signal values (Figure 4).

Figure 4. Testing data and NN with the "3 5 7 3 7 5 3" design and unexpected sensor signal values

It is important to emphasize that finding the optimal design of a network is not an easy task. It is crucial to set the number of validation failures and iterations correctly in the training phase, to avoid overfitting, and also to try different combinations for the same network design. When designing a network, more information about the input signals will result in better noise reduction. As for the replacement of faulty and unexpected data, these networks can obtain very good results.

E. Noise reduction with digital filters

In order to compare the results obtained with neural networks to some commonly used noise reduction techniques, digital filters are used. Noise reduction was performed with FIR, Butterworth, moving average, weighted moving average, smoothing, Gaussian and median filters. The results obtained with an FIR filter of order 25 are shown in Figure 5, where the impact of the filter delay, which is proportional to the filter order, is easy to see (a short illustrative sketch is given at the end of this section). By using some other filters that smooth the signal (Figure 6), where the delay is not as obvious as in the previous example (but still exists), other problems appear, such as a change in the shape of the signal: its shape is distorted, while the real signal has a rectangular shape, whereas with neural network filtering this is not the case. A low-pass Butterworth filter preserves the shape of the signal and also has a delay close to zero in the passband, but on the other hand the filtering results themselves are not very good, even in comparison with the other filters, as can be seen in Figure 7. After testing the data with digital filters, the conclusion is that autoassociative neural networks give impressive results and far better performance than all tested filters, with the constraint that a priori information about the signal is necessary. Since in a great number of applications, especially during system development, it is common to know the expected test signal, this is not a limiting constraint.
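As an illustration only, the following sketch quantifies the delay of a 25th-order linear-phase FIR low-pass filter, which is the kind of delay the network-based approach avoids; the cutoff frequency and signal names are assumptions, and the hypothetical normalizeSym function and trained net from the earlier sketches are reused.

% Illustrative comparison of FIR filtering delay (assumed cutoff and names).
b     = fir1(25, 0.1);               % 25th-order linear-phase FIR low-pass filter
yFIR  = filter(b, 1, xNoisy);        % xNoisy: N-by-3 matrix of noisy gyroscope signals
delay = mean(grpdelay(b, 1));        % constant group delay of about order/2 samples
yNet  = net(normalizeSym(xNoisy)')'; % network output on the same data, with no added delay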
IV. CONCLUSION

As the need for more precise, reliable, but also cheaper and smaller sensors grows in the military, in industry and in all other fields, so does the need for more powerful signal processing techniques that can ensure the demanded performance. Neural networks are a technique that imitates the most capable decision-making device that exists, the human brain, and it is evident that this technique will be used and improved more and more in the future. Autoassociative neural networks show good results in noise reduction, but also in gross error and fault detection [5], which is also demonstrated in this paper.

The new approach presented here is to train the network with the more precise and expensive sensor, in order to perform data processing with the autoassociative neural network on the less precise sensor, which can then be used in the desired application with much better performance. The networks show very good results when a priori information about the expected sensor signal is involved in the network design process. In comparison with digital filters, which are commonly used for noise reduction, these networks show an important advantage: neural networks do not introduce delay into the system as all filters do, which can be very important when dealing with real-time systems. It is also shown that the neural network noise reduction system does not change the shape of the signals, as some filters do. Fault detection should be examined further in the future, with a larger fault database. For future work, it is planned to design a more robust and precise algorithm that can also perform on-line noise reduction and fault detection with real-time sensor signals.

LITERATURE

[1] L. Tao, H. K. Kwan, "A neural network method for adaptive noise cancellation", 1999.
[2] S. Dixit, D. Nagaria, "Neural Network Implementation of Least-Mean-Square Adaptive Noise Cancellation", 2014 International Conference on Issues and Challenges in Intelligent Computing Techniques (ICICT), 2014.
[3] K. B. Swain, S. S. Solanki, A. K. Mahakula, "Bio Inspired Cuckoo Search Algorithm Based Neural Network and its Application to Noise Cancellation", International Conference on Signal Processing and Integrated Networks, 2014.
[4] J. Reyes, M. Vellasco, R. Tanscheit, "Fault detection and measurements correction for multiple sensors using a modified autoassociative neural network", Neural Computing and Applications, DOI: 10.1007/s00521-013-1429-4, 2013.
[5] M. A. Kramer, "Autoassociative Neural Networks", Computers & Chemical Engineering, Vol. 16, No. 4, pp. 313-328, 1992.
[6] C. Aldrich, L. Auret, "Unsupervised Process Monitoring and Fault Diagnosis with Machine Learning", Advances in Computer Vision and Pattern Recognition, Springer, 2013.
[7] M. A. Kramer, "Nonlinear Principal Component Analysis Using Autoassociative Neural Networks", AIChE Journal, 1991.
[8] M. L. Mathisen, "Noise filtering from a nonlinear system by using AANN", Master's thesis, University of Stavanger, Stavanger, Norway, 2010.