Review on Driving Condition Detection of Vehicle by Using Neural Network Classifier and SVM Classifier

Bhumika J. Barai
Department of Electronics Engineering, WCEM, Nagpur
bhumika.barai@gmail.com

Abstract: Vehicles may be classified by a number of different criteria and objectives. However, a comprehensive classification is elusive, because a vehicle may fit into multiple categories and there are numerous ways of categorizing vehicles. Many jurisdictions establish vehicle classes according to a vehicle's construction, engine, weight, type of fuel and emissions, as well as the purpose for which it is used. Vehicular acoustic signals have long been regarded as unwanted traffic noise. However, vehicles of different types generate dissimilar sound patterns even in similar working conditions. The sound patterns generated by moving vehicles vary with vehicle type and carry the information necessary to classify the vehicles. The sound sources include the engine, rotating parts and the exhaust system. Other factors influencing the sound are the condition and maintenance of the vehicle, environmental conditions and road conditions. Noise from different vehicles mixes in the environment, so identifying a particular vehicle is challenging. In this work, acoustic signals generated by different categories of vehicles will be used to detect a vehicle's presence and classify its type. Many acoustic factors can contribute to the classification accuracy of ground vehicles. Acoustic signal classification consists of extracting features from an acoustic signal acquired at the roadside and using these features to identify the class to which the signal most likely belongs.

Index Terms: Acoustic signal processing, land vehicles, pattern recognition, road transportation.

I. INTRODUCTION

Vehicles may be classified by a number of different criteria and objectives into categories such as trucks, cars and bikes; this is what we refer to as vehicular classification. Vehicular classification and speed estimation using acoustic signals are active research topics in Intelligent Transportation Systems, with applications ranging from urban to battlefield environments. Research on the vehicular acoustic signal, which is a mixture of engine noise, tyre noise, noise due to mechanical effects and so on, ranges from vehicular speed estimation, a major concern of city authorities in chaotic, non-lane-driven traffic, to vehicular classification. In both cases, estimation using magnetic loop detectors, speed guns and video monitoring seems to work best, but the installation, maintenance and operation costs associated with these approaches are very high. Image recognition techniques may also fail under poor lighting and adverse environmental conditions. Hence the roadside acoustic signal is an attractive alternative, and research shows acceptable accuracy for acoustic signals. Research also indicates that vehicular classification with acoustic signals can be an excellent approach, particularly for battlefield vehicles and also for city vehicles. The classification process includes a sensing unit, class definition, feature extraction, classifier application and system evaluation.

The time-domain and frequency-domain features of the acoustic signatures are used as input to a backpropagation (BP) neural network for classifying motorcycles into bikes and scooters [1]. Signal energy, energy entropy, zero-crossing rate (ZCR), spectral roll-off, spectral centroid (SC) and spectral flux are extracted from vehicle sounds and used for detection and classification of moving vehicles [2]. Two feature extraction methods are investigated for acoustic signal-based classification of moving ground vehicles [3].
The first one is based on spectrum distribution and the second one on the wavelet packet transform. The performances of the k-nearest neighbour (kNN) algorithm and the support vector machine (SVM) are evaluated. A vehicle sound classification system is presented that uses the time-encoded signal processing and recognition method combined with the archetypes technique [4]. That work applies different Butterworth filters for low-pass filtering. A novel methodology for statistical modelling and classification of acoustic signals collected from a wireless sensor network is discussed in [5]. One-dimensional (1D) wavelet decomposition of the acoustic signal is performed and the resulting sub-band coefficients are modelled using the alpha-stable distribution. The similarity between two acoustic signals is measured by employing a variant of the Kullback–Leibler divergence between the characteristic functions of the corresponding sub-band representations. [...] vectors, such as frequency feature vectors in principal component analysis. A fusion approach combines two sets of features, extracted from the engine and from tyre friction [7]. Vehicle classification based on features extracted from the acoustic signals is described in [8]. The approach demonstrates a novel way of using Aura matrices to create a new feature derived from the power spectral density of a signal. An improved BP neural network classifier for vehicle target classification is designed to classify vehicles into cars, jeeps and trucks [9]. It generates the feature vectors from the time-domain energy of the acoustic signals at different scales of wavelet decomposition. Independent component analysis and principal component analysis (PCA) are used to extract features for military vehicle classification based on acoustic and seismic signals [10]. Four different classifiers, namely decision tree (C4.5), kNN, probabilistic neural network and SVM, are used for classification.
The multi-category classification of battlefield ground vehicles is carried out in [11]. It exploits the information inherent in the problem to determine the number of rules and the architecture of the classifiers. The work given in [12] presents a neural network-based approach for vehicle classification. The structural features are extracted from a vehicle image and the vehicles are classified into bus, Opel, Saab or van using a direct solution training method. Vehicle detection and classification are handled by employing time-domain, frequency-domain and cepstral features of acoustic signals as input to an SVM classifier [13]. A fusion model is presented for vehicle classification [14] with the focus on acoustic and visual feature extraction. It employs feature-based data fusion and decision modelling based on SVM and modified Dempster–Shafer theory for classification. The work presented in [15] employs two levels of sensors: tier 1 sensors acquire the acoustic signal and transmit computed feature vectors up to tier 2 processors for maximum-likelihood classification using Gaussian mixture models. Rough neural networks are used for vehicle classification in wireless sensor networks [16]. The work presents the recognition of sedans, vans and trucks using various vehicle descriptors based on vehicle images. Vehicle type is recognised with various classifiers: ANN, kNN, decision tree and random forest [17]. A strain-based vehicle classification approach is developed by applying PCA to extract essential features from the strain time histories [18]. These features are input to a two-layered BP neural network, which is trained to classify vehicles into five classes based on video images.

The cited works mostly rely on temporal, spectral and wavelet features of the acoustic signatures. Owing to variations in experimental conditions and databases, it is difficult to compare these methods. To the best of our knowledge, no work is reported on a particular class of vehicles.
Hence, a study is taken up on the classification of motorcycles. The remainder of the paper is organised into three sections. The proposed methodology, along with a brief description of tools and techniques, is discussed in Section II. The experimental results are presented in Section III. Finally, Section IV concludes the work.

II. PROPOSED METHODOLOGY

The overview of the methodology is depicted in Fig. 1. It comprises three stages, namely segmentation, feature extraction and classification. Subsections A–D discuss the acquisition of sound samples, segmentation, feature extraction and classification, respectively.

A. Audio Data

The sound signatures of the motorcycles are recorded using a Sony ICD-PX720 digital voice recorder. The recording is carried out in service stations, where disturbances from human speech, motorcycles being serviced, the air compressor and vehicle-repair tools are common. In future, embedded applications may be developed for automatic on-ride fault diagnosis. Hence the signals are processed without denoising. The recorder is held close to the engine, which runs in the idling state. The automobile standards recommend a sampling frequency in the range of 9–30 kHz for recording. The sound signals of the motorcycles are recorded with a sampling frequency of 44.1 kHz and quantised with 16 bits. The higher sampling frequency is used to capture the sound effectively in the noisy environment. Motorcycles in the same age range are considered for uniformity in processing.

B. Data Collection

Data collection is simply how information is gathered. We collect data using a BETA 58A microphone, with a sampling frequency of either 8 kHz or 16 kHz. Data is collected either at a workstation or on any road segment of the city. The portion of the signal of duration 1 s, beginning from the local maximum within a 50 ms interval, forms a segment. The next segment begins at the local maximum in the next 50 ms interval.
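The segmentation rule just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation used in the work: the function name `segment_signal` and its parameters are hypothetical, "local maximum" is read as the largest sample value inside each consecutive 50 ms search interval, and segmentation stops as soon as a full 1 s segment no longer fits in the recording.

```python
def segment_signal(samples, sample_rate, segment_s=1.0, search_s=0.05):
    """Cut 1 s segments that each start at the peak of a 50 ms search interval.

    Assumption: the "local maximum" is the largest raw sample value in the
    interval (ties resolve to the earliest sample).
    Returns a list of (start_index, segment) tuples.
    """
    seg_len = int(round(segment_s * sample_rate))   # samples per segment
    win_len = int(round(search_s * sample_rate))    # samples per search interval
    if win_len < 1 or seg_len < 1:
        raise ValueError("sample_rate too low for the requested durations")

    segments = []
    for win_start in range(0, len(samples), win_len):
        window = samples[win_start:win_start + win_len]
        # Index of the peak inside the current search interval.
        peak_offset = max(range(len(window)), key=lambda i: window[i])
        start = win_start + peak_offset
        if start + seg_len > len(samples):
            break  # not enough signal left for a full segment
        segments.append((start, samples[start:start + seg_len]))
    return segments


if __name__ == "__main__":
    # Toy signal: 1.5 s at 1 kHz with two artificial peaks.
    toy = [0.0] * 1500
    toy[10], toy[70] = 1.0, 2.0
    for start, seg in segment_signal(toy, 1000)[:3]:
        print(start, len(seg))
```

Because a new segment starts in every 50 ms interval while each segment lasts 1 s, consecutive segments overlap heavily; whether the original work keeps all of them is not stated in the text.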
No preprocessing is carried out on the acquired sound samples, so that the approach can be applied in real-time environments.

Fig. 1 Block diagram of the proposed methodology

A fair amount of research has been carried out at the broader level, classifying vehicles into trucks, cars and motorcycles. Some reported works use computationally costlier techniques, and some of the cited works are based on image and video signals. A common focus is on the analysis of temporal, spectral and wavelet features of the acoustic signatures.

C. Feature Extraction

The time-domain features ZCR, STE and RMS and the frequency-domain features are extracted and later input to the classifiers.

1) Time-domain features: A rectangular window of size 50 ms is used for all these features. Short-time energy (STE) is a simple short-time speech measurement. It is defined as in (1), where x(m) is the input signal, w(m) is the window function and L is the window length. The zero-crossing rate (ZCR) is defined as given in (2), where w(n) = 1/(2N), 0 ≤ n ≤ N − 1. The root mean square (RMS) value is defined as the square root of the average of the squared signal. Equation (3) defines the RMS formally, where M is the total number of samples in the processing window and x(m) is the value of the mth sample.

2) Frequency-domain features: Spectral centroid (SC): The SC is computed from the frequency spectrum of the signal. Generally, it is used to discriminate between noise, speech and music. The frequency spectrum for this feature is calculated with the discrete Fourier transform (DFT), as defined by (4). The frequency spectrum can be analysed in different ways, and the SC is calculated as given in (5), where k is the order of the DFT and A(n, k) is the DFT of the nth frame of the signal. The SC is a measure that signifies whether the spectrum contains a majority of high or low frequencies. The CMean and the Csd are used as features for classification.

The chosen features exhibit good separability for the sound signals of bikes and scooters with the classifiers used in this work. The power spectra of the sound signatures of motorcycles are shown in Fig. 2. It can be observed that the power spectrum of bikes decays uniformly and has no spurious spikes. However, the variations in the scooter power spectrum are not uniform, and some spurious spikes can be observed at the lower end of the normalised frequency spectrum.

Fig. 2 Spectra of a sample bike and car

D. Classifiers

Two different classifiers are employed in this work and the classification performance of each of them is reported. The following subsections discuss the SVM and the NN, respectively.

1) Support Vector Machine (SVM): The support vector machine is a widely used supervised learning model. SVM is used for both classification and regression. It has no prior knowledge about the problem but learns about it during the training phase. Generalization capability is the major advantage of SVM and makes it better than most of the other models in this field. SVM works for linearly separable as well as non-linearly separable data. It classifies unknown data with high accuracy because it works on the concept of the maximum-margin hyperplane.

2) Neural Network (NN): A neural network is an artificial representation of the human brain that tries to simulate its learning process. An artificial neural network (ANN) is often simply called a neural network or neural net (NN). Traditionally, the term neural network referred to a network of biological neurons in the nervous system that process and transmit information. An ANN is an interconnection of artificial neurons that uses a mathematical model for information processing based on a connectionist approach to computation. In the architecture used here, the three time-domain features and two frequency-domain features of the sound signature are input to the neural network with five input nodes.
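The five-element input vector mentioned above can be sketched as follows. This is a minimal illustration, not the author's code. Several points are assumptions because the corresponding equations are not reproduced in the text: STE is taken as the unnormalised sum of squares per frame, the spectral centroid is weighted by DFT magnitude over bins 0 to N/2, CMean and Csd are read as the mean and standard deviation of the per-frame centroid, and the three time-domain features are averaged over frames. All function names are hypothetical. A naive DFT is used to keep the sketch dependency-free; an FFT routine would normally replace it.

```python
import cmath
import math


def sign(v):
    # Sign convention assumed here: zero counts as positive.
    return 1 if v >= 0 else -1


def short_time_energy(frame):
    # Rectangular window: every sample has weight 1.
    return sum(v * v for v in frame)


def zero_crossing_rate(frame):
    # Sum of |sgn x(m) - sgn x(m-1)| scaled by w = 1/(2N).
    n = len(frame)
    changes = sum(abs(sign(frame[i]) - sign(frame[i - 1])) for i in range(1, n))
    return changes / (2.0 * n)


def rms(frame):
    return math.sqrt(sum(v * v for v in frame) / len(frame))


def spectral_centroid(frame):
    # Magnitude-weighted mean bin index over the non-negative frequencies.
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(frame[m] * cmath.exp(-2j * math.pi * k * m / n) for m in range(n))
        mags.append(abs(acc))
    total = sum(mags)
    return sum(k * a for k, a in enumerate(mags)) / total if total > 0 else 0.0


def feature_vector(samples, frame_len):
    """Return [STE, ZCR, RMS, CMean, Csd] from non-overlapping frames."""
    frames = [samples[i:i + frame_len]
              for i in range(0, len(samples) - frame_len + 1, frame_len)]
    if not frames:
        raise ValueError("signal shorter than one frame")

    def mean(values):
        return sum(values) / len(values)

    centroids = [spectral_centroid(f) for f in frames]
    c_mean = mean(centroids)
    c_sd = math.sqrt(mean([(c - c_mean) ** 2 for c in centroids]))
    return [mean([short_time_energy(f) for f in frames]),
            mean([zero_crossing_rate(f) for f in frames]),
            mean([rms(f) for f in frames]),
            c_mean,
            c_sd]


if __name__ == "__main__":
    # Sinusoid with exactly four cycles per 32-sample frame.
    tone = [math.sin(math.pi * m / 4 + math.pi / 8) for m in range(64)]
    print(feature_vector(tone, 32))
```

For a 50 ms window, `frame_len` would be 0.05 times the sampling frequency, for example 800 samples at 16 kHz.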
The two output nodes correspond to the two-bit output vector indicating the class of the motorcycle. The hidden layer contains ten nodes. The neural network is trained using the BP learning algorithm. The stabilised weights are reloaded and the test vector is input during testing. The training goal is set to 0.00001. If the goal is met during training and the same samples are used for training as well as testing, 100% classification accuracy can be achieved. This usually happens for the feature sets used for training. However, for larger training sets, the user can terminate the training process after a sufficient number of epochs. Further, the learning rate is varied depending on the size of the training set. Smaller values of the learning rate are used for smaller training feature sets and larger values for larger sets. The number of nodes in the hidden layer is chosen as 10 based on the selection criterion given in (6), where n is the number of hidden layer neurons, C is the constant chosen to yield optimal performance, d is the number of features and N is the number of rows in the training sample matrix.

Vehicular Speed Estimation

The Doppler frequency shift is used to provide a theoretical description of the speed of a single vehicle. Assuming that the distance to the closest point of approach is known, the solution can accommodate any line of arrival of the vehicle with respect to the microphone. Speed estimation is more tractable for a single vehicle than for several vehicles, because with several vehicles the acoustic waveforms interfere with one another. Sensing techniques based on passive sound detection are reported. These techniques utilize a microphone array to detect the sound waves generated by roadside vehicles and are capable of monitoring traffic conditions on a lane-by-lane and vehicle-by-vehicle basis in a multilane carriageway. Correlation-based algorithms are used, which extract key data reflecting the road traffic conditions, e.g. the speed and density of vehicles.

S. Chen et al. develop a multilane traffic sensing concept based on passive sound, which is digitized and processed by an on-site computer using a correlation-based algorithm. The system offers low cost (installation, maintenance and operation), safe passive detection, immunity to adverse weather conditions and competitive manufacturing cost. The system performs well for free-flowing traffic; for congested traffic, however, good performance is difficult to achieve. Valcarce et al. exploit differential time delays to estimate the speed. A pair of omnidirectional microphones is used and the technique is based on the maximum likelihood principle. It directly estimates car speed without any assumptions on the acoustic signal emitted by the vehicle. Lo and Ferguson develop a nonlinear least squares method for vehicle speed estimation using multiple microphones. A quasi-Newton method is used for computational efficiency. The estimated speed is obtained using the generalized cross-correlation method based on time-delay-of-arrival estimates. Mohan et al. estimate a vehicle's speed using a combination of smartphone features and basic honk signals. Simple Doppler frequency shift computations are done to estimate the speed. Sen et al. use the Doppler frequency shift rule with the assumption that the vehicle is moving in the same direction as the straight line connecting the vehicle to the microphone. In the presence of multiple honking vehicles, it is not clear how the two microphones identify honks emitted by the same vehicle. The experimental setup covers two roads: a single-lane one-way road and a three-lane bidirectional road. Cevher et al. use a single acoustic sensor to estimate a vehicle's speed, width and length by jointly estimating acoustic wave patterns. The wave patterns are approximated using three envelope shape (ES) components. Results obtained from the experimental setup show that the vehicle speeds are estimated as (18.68, 4.14) m/s by the video camera and (18.60, 4.49) m/s by the acoustic method.
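As a rough illustration of the Doppler computations mentioned above, the following Python sketch applies the textbook Doppler relation for a source moving directly along the line to a stationary microphone, which is the same-line assumption attributed to Sen et al. It is not the implementation of any of the cited works. The function names, the 343 m/s speed of sound and the idea of combining one approaching and one receding frequency measurement of the same tone are assumptions made for the example.

```python
SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at roughly 20 degrees C


def observed_frequency(f_source, speed, approaching=True, c=SPEED_OF_SOUND):
    """Frequency heard by a stationary microphone for a source moving
    straight towards it (approaching) or straight away from it."""
    if not 0 <= speed < c:
        raise ValueError("speed must be non-negative and below the speed of sound")
    if approaching:
        return f_source * c / (c - speed)
    return f_source * c / (c + speed)


def speed_from_pass_by(f_approach, f_recede, c=SPEED_OF_SOUND):
    """Recover the source speed from the tone frequencies measured while the
    vehicle approaches and while it recedes.

    From f_a = f0*c/(c - v) and f_r = f0*c/(c + v) it follows that
    (f_a - f_r) / (f_a + f_r) = v / c, so the source frequency f0 cancels.
    """
    return c * (f_approach - f_recede) / (f_approach + f_recede)


if __name__ == "__main__":
    f_a = observed_frequency(400.0, 20.0, approaching=True)
    f_r = observed_frequency(400.0, 20.0, approaching=False)
    print(round(f_a, 2), round(f_r, 2), round(speed_from_pass_by(f_a, f_r), 2))
```

The cancellation of the source frequency is what makes a honk of unknown pitch usable; when the vehicle does not travel along the line to the microphone, the relation only gives the radial component of the speed.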
Cevher et al. also estimate a single vehicle's speed, engine revolutions per minute (RPM), number of cylinders, and length and width based on its acoustic wave patterns. The wave patterns are determined using the vehicle's speed, the Doppler shift factor, the sensor's distance to the vehicle's closest point of approach, and three envelope shape (ES) components. A vehicle profile vector is estimated, which provides a fingerprint for vehicle identification and classification (Ford F150, Chevy Impala, Honda Accord, Nissan Maxima and Frontier, Mercedes E, Volvo 850 SW, Isuzu Rodeo and VW Passat). This system recognises the vehicle type but is applicable only to a single vehicle.

Traffic Density Estimation

Urban areas are concerned with effective traffic signal control and traffic management. Estimating the travel time from source to destination using real-time traffic density information is a major concern of city authorities. In developing regions, for example in Asia, traffic is characteristically non-lane-driven. In such conditions, traffic density estimation using magnetic loop detectors, speed guns and video monitoring seems to work best, but the installation, maintenance and operation costs associated with these approaches are very high. The roadside acoustic signal is an alternative for traffic density estimation. Jien Kato proposes a method for traffic density estimation based on recognising the temporal variations that appear in the power signal as vehicles pass a reference point. A hidden Markov model (HMM) is used to observe local temporal variations over small periods of time, extracted by wavelet transformation. Experimental results show good accuracy for detecting the passage of vehicles. Vivek Tyagi et al. classify the traffic density state as free-flowing, medium-flow or jammed. They consider short-term spectral envelope features of the cumulative acoustic signal, and the class-conditional probability distribution is then modelled for each of these three broad traffic density states. The experimental setup uses an omnidirectional microphone placed at a height of about 1.5 m, and the cumulative acoustic signal is recorded at a sampling frequency of 16000 Hz. A Bayes classifier is applied to classify the traffic density state, which results in about 95% accuracy; this is then improved by using a discriminative classifier such as an RBF-SVM. This technique is independent of lighting conditions and works well for developing regions. Honk-based techniques require accurate detection of the honk signal to arrive at the average speed.

Vehicular Classification

The problem of vehicular classification is an example of pattern recognition. Acoustic signals collected by acoustic sensors are used to identify the type of moving ground vehicles. A typical classification process consists of sensing, class definition, feature extraction, classifier application and system evaluation. Feature vectors are extracted from the collected acoustic data.

Figure: Typical classification system

The sensing unit collects raw data in order to provide the sensor node with information about the traffic condition. Segmentation refers to the separation of a single vehicle's signal. It imposes a major restriction on an acoustic classification system, because traffic recordings consist of signals from multiple vehicles that overlap one another. Feature extraction refers to extracting a representative set of features that can distinguish different classes of vehicle. Richard O. Duda writes in “Pattern Classification”: “The conceptual boundary between feature extraction and classification proper is somewhat arbitrary: An ideal feature extractor would yield a representation that makes the job of the classifier trivial; conversely, an omnipotent classifier would not need the help of a sophisticated feature extractor.
The distinction is forced upon us for practical rather than theoretical reasons.” Classification decides which class or category a given feature vector belongs to. The broad categories are statistical and structural classifiers. Extensive research work has been carried out on vehicle detection and classification systems, especially for battlefield vehicles.

III. RESULTS AND DISCUSSION

The sound sample database contains 270 samples of bikes and 270 samples of scooters. During training, the number of training samples n is varied in steps of 10 from 60 to 210. The remaining 270 − n samples are used for testing, where 60 ≤ n ≤ 210. Motorcycles of four major Indian manufacturers, namely Hero MotoCorp (earlier Hero Honda Motors), Honda Motors, TVS Motor Co. Ltd. and Bajaj Auto, are considered in this work. The recording is carried out in service stations, under the supervision of expert mechanics. The recording environment has disturbances such as human speech, sound from other vehicles being serviced, and the sound of the air compressor and auto-repair tools. In order to minimise the effect of noise, the recorder is held close to the engine. Fig. 3 depicts the environment maintained during the recording of the motorcycle sounds. The recorder is held 500 mm from the centre line of the exhaust end, at an angle of 45° measured from the centre line of the exhaust end and at the height of the exhaust pipe. The 500 mm distance is critical, since an 80 mm error results in a change of up to 1 dB in sound level. The motorcycle is stationary and in the idling state, with background sound 10 dBA below the maximum allowed. The start of the engine and the control of the throttle are carried out simultaneously by the expert mechanic. The recording is carried out in a real-time environment with disturbances around, with the intention of applicability in real-time systems. The results are satisfactory even without denoising, indicating that the method is robust.

Fig. 3 Recording environment

IV. CONCLUSION

Automatic vehicle recognition and classification is a step towards the development of an automatic fault diagnosis system. The work classifies motorcycles of different models from different manufacturers into bikes and scooters. The ANN-based classifier is costlier in terms of training time, but yields 100% classification accuracy when tested with the same samples used for training. Dynamic time warping (DTW) yields better results for bikes than for scooters, because of variations in the models of scooters. KBC yields better performance for both types of vehicles, especially with an increasing number of training samples. The work finds interesting applications in industry, medicine, tuning of musical instruments, detecting failures in civil constructions, detecting faulty railroads and so on. It helps in designing intelligent transportation and census systems. The work can be extended to fault diagnosis of vehicles, machines, musical instruments and the like, which have scope for acoustic signal-based fault diagnosis. Further, the work leaves scope for fault source localisation, detection of the presence of multiple faults and advanced applications of fault diagnosis to MANETs and VANETs.

V. REFERENCES

1 Anami, B.S., Pagi, V.B.: ‘An acoustic signature based neural network model for type recognition of two-wheelers’. IEEE Int. Conf. on Multimedia Systems, Signal Processing and Communication Technologies (IMPACT-09), AMU Aligarh, March 2009, pp. 28–31
2 Padmavathi, G., Shanmugapriya, D., Kalaivani, M.: ‘Acoustic signal based feature extraction for vehicular classification’. Third Int. Conf. on Advanced Computer Theory and Engineering (ICACTE), Chengdu, China, August 2010, pp. V2-11–V2-14
3 Aljaafreh, A., Dong, L.: ‘An evaluation of feature extraction methods for vehicle classification based on acoustic signals’. Proc. Int. Conf. on Networking, Sensing and Control (ICNSC), Chicago, April 2010, pp. 570–575
4 Ghiurcau, M.V., Rusu, C.: ‘Vehicle sound classification. Application and low pass filtering influence’. Proc. Int. Symp. on Signals, Circuits and Systems (ISSCS), Iasi, Romania, July 2009, pp. 1–4
5 Kornaropoulos, E.M., Panagiotis, T.: ‘A novel KNN classifier for acoustic vehicle classification based on alpha-stable statistical modeling’. Proc. IEEE 15th Workshop on Statistical Signal Processing (SSP’09), Cardiff, Wales, UK, August–September 2009, pp. 1–4
6 Seung, S.Y., Yoon, G.K., Hongsik, C.: ‘Vehicle identification using discrete spectrums in wireless sensor networks’, J. Netw., 2008, 3, (4), pp. 51–63
7 Baofeng, G., Mark, S.N., Thyagaraju, D.: ‘Acoustic information fusion for ground vehicle classification’. Proc. Annual Conf. of ITA (ACITA), London, 2008, pp. 190–196
8 Baljeet, M., Ioanis, N., Janelle, H.: ‘A simple vehicle classification framework for wireless audio-sensor networks’, J. Telecommun. Inf. Technol. Wirel. Ad-Hoc Netw., 2008, 1, pp. 43–50
9 Jinghua, L., Jiadong, X., Hongjuan, L.: ‘Vehicle classification using acoustic energy signature in wavelet scale space and neural network’. Proc. First Int. Conf. on Transportation Engineering, Southwest Jiaotong University, Chengdu, China, July 2007, pp. 926–931
10 Hanguang, X., Congzhong, C., Qianfei, Y., Xinghua, L., Yufeng, W.: ‘A comparative study of feature extraction and classification methods for military vehicle type recognition using acoustic and seismic signals’, in Huang, D.-S., Heutte, L., Loog, M. (Eds.): ‘ICIC 2007’, Sitges, Spain, October 2007 (LNCS, 4681), pp. 810–819
11 Wu, H., Mendel, J.M.: ‘Classification of battlefield ground vehicles using acoustic features and fuzzy logic rule-based classifiers’, IEEE Trans. Fuzzy Syst., 2007, 15, (1), pp. 56–72
12 Anshul, G., Brijesh, V.: ‘A neural network based approach for the vehicle classification’. Proc. IEEE Symp. on Computational Intelligence in Image and Signal Processing (CIISP 2007), Honolulu, HI, April 2007, pp. 226–231
13 Andreas, K., Stefan, E., Allan, T., Bernhard, R.: ‘DSP based acoustic vehicle classification for multi-sensor real-time traffic surveillance’. EUSIPCO 2007, Poznan, 2007, pp. 1916–1920
14 Andreas, K., Allan, T., Bernhard, R.: ‘Vehicle classification on multisensor smart cameras using feature- and decision-fusion’. First ACM/IEEE Int. Conf. on Distributed Smart Cameras, Vienna, September 2007, pp. 67–74
15 Necioglu, B.F., Christou, C.T., George, E.B., Jacyna, G.M.: ‘Vehicle acoustic classification in netted sensor systems using Gaussian mixture models’, in Kadar, I. (Ed.): ‘Proc. Signal Processing, Sensor Fusion, and Target Recognition XIV’, 2005, vol. 5809, pp. 409–419
16 Huang, Q., Xing, T., Liu, H.T.: ‘Vehicle classification in wireless sensors network based on rough neural network’ (Springer-Verlag, Berlin, Heidelberg, 2006) (LNCS, 3973), pp. 58–65
17 Piotr, D., Andrzej, C.: ‘Vehicle classification based on soft computing algorithms’. 2010 (LNCS, 6086), pp. 70–79
18 Linjun, Y., Michael, F., Ahmed, E., Tony, F., Kendra, O.: ‘Neural networks and principal components analysis for strain-based vehicle classification’, J. Comput. Civ. Eng., 2008, 22, (2), pp. 123–132
19 Kishan, M., Chilukuri Mohan, K., Sanjay, R.: ‘Elements of artificial neural networks’ (The MIT Press, October 1996)
20 Xu, S., Chen, L.: ‘A novel approach for determining the optimal number of hidden layer neurons for FNN’s and its application in data mining’. Fifth Int. Conf. on Information Technology and Applications, Queensland, 2008, pp. 683–686
21 Sakoe, H., Chiba, S.: ‘Dynamic programming algorithm optimization for spoken word recognition’, IEEE Trans. Acoust., Speech Signal Process., 1978, 26, (1), pp. 43–49