I. Introduction - Academic Science,International Journal of Computer

Review on Driving Condition Detection of Vehicle by
using Neural Network Classifier and SVM Classifier
Bhumika J. Barai1
Department of Electronics Engineering,WCEM,Nagpur
bhumika.barai@gmail.com
Abstract— Vehicles may be classified by a number of
different criteria and objectives. However, comprehensive
classification is elusive, because a vehicle may fit into multiple
categories; there are numerous ways of categorizing vehicles.
Numerous jurisdictions establish vehicle classification according to
the vehicle's construction, engine, weight, type of fuel and
emissions, as well as the purpose for which it is used.
Vehicular acoustic signals have long been considered unwanted
traffic noise. However, vehicles of different types generate
dissimilar sound patterns even in similar working conditions.
Sound patterns generated by moving vehicles vary depending on
their types and carry the information necessary to classify the vehicles.
The sound sources include the engine, rotating parts and the exhaust
system. Other factors influencing the variation of sound are the condition
of the vehicle, environmental conditions, road conditions,
maintenance of the vehicle and so on. Different types of noise
coming from different vehicles mix in the environment, so
identifying a particular vehicle is challenging. In this
work, acoustic signals generated by different categories of vehicles
are used to detect the presence of a vehicle and classify its type. Many
acoustic factors can contribute to the classification accuracy of
ground vehicles. Acoustic signal classification consists of
extracting features from an acoustic signal acquired at the roadside
and using these features to identify the class to which the acoustic
signal is most likely to belong.
Index Terms— Acoustic signal processing, land vehicles,
pattern recognition, road transportation.
I. INTRODUCTION
Vehicles may be classified by a number of different criteria
and objectives into categories such as trucks, cars and
bikes; this is what we refer to as vehicular classification.
Vehicular classification and speed estimation using acoustic
signals are research topics in Intelligent Transportation Systems,
with applications ranging from urban environments to battlefield
environments. The vehicular acoustic signal is a mixture of
engine noise, tyre noise, noise due to mechanical effects and so on.
Research on it extends from vehicular speed estimation, which is a
major concern of city authorities in chaotic and non-lane-driven
environments, to vehicular classification. In both
cases, estimation using magnetic loop detectors, speed guns
and video monitoring seems to be the best option, but the installation,
maintenance and operation costs associated with these
approaches are very high.
Image recognition techniques may fail under poor
lighting and adverse environmental conditions.
Hence, the use of roadside acoustic signals seems to be an
alternative; research shows acceptable accuracy for acoustic
signals. Reported research states that vehicular classification with
acoustic signals can be an excellent approach, particularly for
battlefield vehicles and also for city vehicles. The
classification process includes a sensing unit, class definition,
feature extraction, classifier application and system evaluation.
The time-domain and frequency-domain features of the
acoustic signatures are used as input for the backpropagation
(BP) neural network for classifying the motorcycles into bikes
and scooters [1]. Signal energy, energy entropy, ZCR, spectral
roll-off, spectral centroid (SC) and spectral flux are extracted
from the vehicle sounds and used for detection and
classification of moving vehicles [2]. Two feature extraction
methods are investigated for acoustic signal-based
classification of moving ground vehicles [3]. The first one is
based on spectrum distribution and the second one on wavelet
packet transform. The performances of k-nearest neighbor (kNN) algorithm and support vector machine (SVM) are
evaluated. A vehicle sound classification system is presented
that uses time-encoded signal processing and recognition
method combined with the archetypes technique [4]. The work
implements different Butterworth low-pass filters for low-pass
filtering. A novel methodology for statistical modelling and
classification of acoustic signals collected from a wireless
sensor network is discussed [5]. One-dimensional (1D) wavelet
decomposition of the acoustic signal is performed and the
resulting sub-band coefficients are modelled using the
alpha-stable distribution. The similarity between two acoustic
signals is measured by employing a variant of the Kullback–
Leibler divergence between the characteristic functions of the
corresponding sub-band representations. vectors, such as,
frequency feature vectors in primary component analysis. A
fusion approach combines two sets of features, extracted from
engine and tyre friction [7]. The are used for vehicle
classification based on the features extracted from the acoustic
signals [8]. The approach demonstrates a novel way of using
Aura matrices to create a new feature derived from the power
spectral density of a signal. An improved BP neural network
classifier for vehicle target classification is designed to classify
the vehicles into cars, jeeps and trucks [9]. It generates the
feature vectors by employing the time-domain energy of the
acoustic signals in different scales of wavelet decomposition.
Independent component analysis and principal component
analysis (PCA) are used to extract features for military vehicle
classification based on acoustic and seismic signals [10]. Four
different classifiers including decision tree (C4.5), k-NN,
probabilistic neural network and SVM are used for
classification. The multi-category classification of battlefield
ground vehicles is carried out [11]. It exploits the information
inherent in a problem to determine the number of rules and the
architecture, of classifiers. The work given in [12] presents a
neural network-based approach for vehicle classification. The
structural features are extracted from a vehicle image and the
vehicles are classified into bus, Opel, Saab or van using a direct
solution training method. Vehicle detection and classification
are handled by employing time-domain features,
frequency-domain features and cepstral features of acoustic
signals, input to an SVM classifier [13]. A fusion model is
presented for vehicle classification [14] with the focus on
acoustic and visual feature extraction. It employs feature-based
data fusion and decision modelling based on SVM and
modified Dempster–Shafer theory for classification.
The work presented in [15] employs two levels of sensors: tier
1 sensors acquire the acoustic signal and transmit computed
feature vectors up to tier 2 processors for maximum-likelihood
classification using Gaussian-mixture models. Rough neural
networks are used for vehicle classification in wireless sensor
networks [16]. The work presents the recognition of sedans,
vans and trucks using various vehicle descriptors based on
vehicle images. Vehicle type is recognised with various
classifiers: ANN, k-NN, decision tree and random forest [17].
A strain-based vehicle classification approach is developed
applying PCA to extract essential features from the strain time
histories [18]. These features are input to a two-layered BP
neural network, which is trained to classify vehicles into five classes
based on video images.
spectral and wavelet features of the acoustic signatures.
Owing to variations in experimental conditions and databases
it is difficult to compare these methods. To the best of our
knowledge, no work is reported on a particular class of
vehicles. Hence, a study is taken up on classification of
motorcycles. The remainder of the paper is organised into
three sections. The proposed methodology of the work, along
with a brief description of tools and techniques, is discussed in
Section 2. The experimental results are presented in Section 3.
Finally, Section 4 concludes the work.
II. PROPOSED METHODOLOGY
The overview of the methodology is depicted in Fig. 1. It
comprises three stages, namely segmentation, feature
extraction and classification. Subsections 2.1–2.4 discuss the
acquisition of sound samples, segmentation, feature extraction
and classification, respectively.
A. Audio Data
The sound signatures of the motorcycles are recorded using a
Sony ICD-PX720 digital voice recorder. The recording is
carried out in service stations, where disturbances from
human speech, motorcycles being serviced, air compressors
and vehicle-repair tools are common. In future, embedded
applications may be developed for automatic on-ride fault
diagnosis. Hence, the signals are processed without denoising.
The recorder is held close to the running engine in the idling
state. The automobile standards recommend a sampling
frequency in the range of 9–30 kHz for recording. The
sound signals of the motorcycles are recorded with a
sampling frequency of 44.1 kHz and quantised with 16-bit
quantisation. The higher sampling frequency is used for recording
to capture the sound effectively in the noisy environment.
Motorcycles in the same age range are considered for
uniformity in processing.
B. Data Collection
Data collection is simply how information is gathered. Data
are collected using a BETA 58A microphone, with a sampling
frequency of either 8 kHz or 16 kHz. Data are collected
either at a workstation or on any road segment of the city.
A portion of the signal of duration 1 s, beginning from the local
maximum within a 50 ms duration, forms a segment.
The next segment begins at the local maximum in the next 50 ms
duration. No preprocessing is carried out on
the acquired sound samples, so that the approach can be
applied in real-time environments.
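
The segmentation rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name segment_signal and its parameters are hypothetical, the "local maximum" is taken to be the largest sample value inside each 50 ms search window, and segments that would run past the end of the recording are dropped.

```python
def segment_signal(samples, sample_rate, search_ms=50, segment_s=1.0):
    """Cut 1 s segments, each starting at the peak of a 50 ms search window.

    Returns a list of (start_index, segment) pairs. Consecutive segments
    overlap, because a new segment starts in every 50 ms window.
    """
    search_len = int(round(sample_rate * search_ms / 1000))
    segment_len = int(round(sample_rate * segment_s))
    segments = []
    for window_start in range(0, len(samples) - search_len + 1, search_len):
        window = samples[window_start:window_start + search_len]
        # Offset of the largest sample in this window (first one on ties).
        peak_offset = max(range(len(window)), key=lambda i: window[i])
        start = window_start + peak_offset
        if start + segment_len > len(samples):
            break  # not enough samples left for a full 1 s segment
        segments.append((start, samples[start:start + segment_len]))
    return segments
```

No filtering or denoising is applied inside the function, in keeping with the statement that the samples are used without preprocessing.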
Fig. 1 Block diagram of the proposed methodology
A fair amount of research has been carried out at the broader level,
classifying the vehicles into trucks, cars and motorcycles.
There are reported works using costlier computational
techniques. Some of the cited works are based on image and
video signals. A common focus is on the analysis of temporal,
C. Feature Extraction
The time-domain features (ZCR, STE and RMS) and the frequency-domain
features are extracted and later input to the classifiers.
1) Time-domain features
A rectangular window of size 50 ms is used for all these
features. Short-time energy (STE) is a simple short-time speech
measurement. It is defined as in (1),
where x(m) is the input signal, w(m) is the window function
and L is the window length.
ZCR is defined as given in (2),
where w(n) = 1/(2N), 0 ≤ n ≤ N − 1.
The RMS value is defined to be the square root of the
average of the squared signal. Equation (3) defines the RMS
formally,
where M is the total number of samples in the processing
window and x(m) is the value of the mth sample.
The frequency spectrum can be analysed in different ways, and
the SC is calculated as given in (5),
where k is the order of the DFT and A(n, k) is the DFT of the nth frame
of the signal. The SC is a measure that signifies whether the spectrum
contains a majority of high or low frequencies. The CMean and
the Csd are used as features for classification. The chosen
features exhibit good separability for the sound signals of
bikes and scooters with the classifiers used in this work. The
power spectra of the sound signatures of motorcycles are
shown in Fig. 2. It can be observed that the power spectrum of
bikes decays uniformly and has no spurious spikes.
However, the variations in the scooter power spectrum are not
uniform, and some spurious spikes can be observed at the
lower end of the normalised frequency spectrum.
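
As an illustration of the five features, the sketch below computes STE, ZCR, RMS and the spectral centroid for one frame, and then assembles a feature vector for a segment. It uses common textbook forms of these measures, which may differ in normalisation from equations (1)–(5). Several choices here are assumptions rather than the method of this work: the function names are hypothetical, CMean and Csd are taken to be the mean and standard deviation of the per-frame spectral centroid, the time-domain features are averaged over the frames of a segment, and the centroid is expressed in DFT bin units.

```python
import cmath
import math
import statistics


def short_time_energy(frame):
    """Sum of squared samples under a rectangular window (w = 1 in the frame)."""
    return sum(x * x for x in frame)


def zero_crossing_rate(frame):
    """Sign changes between neighbouring samples, weighted by w(n) = 1/(2N)."""
    n = len(frame)
    sign = [1 if x >= 0 else -1 for x in frame]
    return sum(abs(sign[i] - sign[i - 1]) for i in range(1, n)) / (2 * n)


def rms(frame):
    """Square root of the mean of the squared samples."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def spectral_centroid(frame):
    """Magnitude-weighted mean DFT bin over bins 0..N/2 (naive O(N^2) DFT)."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        a_k = sum(frame[m] * cmath.exp(-2j * cmath.pi * k * m / n)
                  for m in range(n))
        mags.append(abs(a_k))
    total = sum(mags)
    return sum(k * a for k, a in enumerate(mags)) / total if total else 0.0


def feature_vector(frames):
    """[mean STE, mean ZCR, mean RMS, CMean, Csd] for a list of frames."""
    centroids = [spectral_centroid(f) for f in frames]
    return [
        statistics.mean([short_time_energy(f) for f in frames]),
        statistics.mean([zero_crossing_rate(f) for f in frames]),
        statistics.mean([rms(f) for f in frames]),
        statistics.mean(centroids),      # CMean (assumed definition)
        statistics.pstdev(centroids),    # Csd (assumed definition)
    ]
```

The naive DFT keeps the sketch dependency-free; for 50 ms frames at 44.1 kHz an FFT routine would be the practical choice.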
D. Classifiers
Two different classifiers are employed in this work and
the classification performance of each of them is reported.
The following subsections discuss the SVM and the NN, respectively.
1) Support Vector Machine (SVM):
The Support Vector Machine (SVM) is one of the most widely used
supervised learning models. It is used for both
classification and regression. An SVM does not have
prior knowledge about the problem but learns about it during
the training phase. Generalization capability is a major advantage of
the SVM, and this is what makes it preferable to many other
models in this field. An SVM handles
linearly separable data as well as non-linearly separable data.
Its ability to classify
unknown data with high accuracy comes from the
concept of the maximum-margin hyperplane.
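
To make the maximum-margin idea concrete, here is a minimal sketch of a linear soft-margin SVM trained by sub-gradient descent on the hinge loss. It is not the implementation used in this work: the function names, the regularisation weight lam, the learning rate and the toy data are all assumptions, and kernel-based handling of non-linearly separable data is not covered.

```python
def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Linear soft-margin SVM via sub-gradient descent on the hinge loss.

    X is a list of feature lists, y a list of labels in {-1, +1}.
    Returns the weight vector w and the bias b of the hyperplane w.x + b = 0.
    """
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:
                # Inside the margin or misclassified: hinge term is active.
                w = [wj - lr * (lam * wj - yi * xj) for wj, xj in zip(w, xi)]
                b += lr * yi
            else:
                # Correct with enough margin: only the regulariser acts.
                w = [wj - lr * lam * wj for wj in w]
    return w, b


def svm_predict(w, b, x):
    """Return +1 or -1 depending on the side of the hyperplane."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0 else -1
```

With two-dimensional toy features, points near (0.1, 0.2) labelled -1 and points near (0.9, 0.8) labelled +1 end up on opposite sides of the learned hyperplane.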
Fig. 2 Spectra of a sample bike and car
2) Frequency-domain features. Spectral centroid
(SC):
The SC is computed based on the analysis of the frequency
spectrum of the signal. Generally, it is used to discriminate
between noise, speech and music. The frequency spectrum
for this feature is calculated with the discrete Fourier transform
(DFT), as defined by (4)
2) Neural Network (NN):
An artificial neural network (ANN), often called a neural network or
neural net (NN), is an artificial representation of the
human brain that tries to simulate its learning process.
Traditionally, the term neural network
referred to a network of biological neurons in the nervous
system that process and transmit information. An ANN is an
interconnection of artificial neurons that uses a mathematical
model for information processing based on a connectionist
approach to computation. The architecture of
the neural network is as follows. The three time-domain features and two
frequency-domain features of the sound signature are input to
the neural network with five input nodes. The two output
nodes correspond to the two-bit output vector indicating the
class of the motorcycle. The hidden layer contains ten nodes.
The neural network is trained using BP-learning algorithm.
The stabilized weights are reloaded and the test vector is input
during testing. The training error goal is set to 0.00001. If the goal is met
during training and the same samples are used for training
as well as testing, 100% classification accuracy can be
achieved. This usually happens for the feature sets used for training.
However, for larger training sets, the user can terminate the
training process after a sufficient number of epochs. Further, the
learning rate is varied suitably depending on the size of the
training set: smaller values of the learning rate are used for
smaller training feature sets and larger values for larger sets.
The number of nodes in the hidden layer is chosen as 10 based
on the selection criterion given in (6),
where n is the number of hidden-layer neurons, C is the
constant chosen to yield optimal performance, d is the number of
features and N is the number of rows in the training sample
matrix.
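
The 5-10-2 network and its BP training loop can be sketched as follows. This is an illustrative sketch rather than the implementation used in this work: the class name BPNetwork, the sigmoid activations, the weight initialisation range, the per-sample (online) weight updates and the encoding of the two-bit output as [1, 0] and [0, 1] are assumptions.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class BPNetwork:
    """Feed-forward net (5 inputs, 10 hidden, 2 outputs) trained with BP."""

    def __init__(self, n_in=5, n_hidden=10, n_out=2, seed=0):
        rng = random.Random(seed)
        # The last weight in every row is the bias weight.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                   for _ in range(n_hidden)]
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
                   for _ in range(n_out)]

    def forward(self, x):
        xb = list(x) + [1.0]
        hb = [sigmoid(sum(w * v for w, v in zip(row, xb)))
              for row in self.w1] + [1.0]
        out = [sigmoid(sum(w * v for w, v in zip(row, hb)))
               for row in self.w2]
        return xb, hb, out

    def train(self, X, T, lr=0.5, goal=1e-5, max_epochs=5000):
        """Train until the mean squared error reaches the goal or epochs run out."""
        mse = float("inf")
        for _ in range(max_epochs):
            squared_error = 0.0
            for x, t in zip(X, T):
                xb, hb, out = self.forward(x)
                err = [tk - ok for tk, ok in zip(t, out)]
                squared_error += sum(e * e for e in err)
                # Error terms are computed before any weight is changed.
                delta_o = [e * o * (1 - o) for e, o in zip(err, out)]
                delta_h = [hb[j] * (1 - hb[j]) *
                           sum(delta_o[k] * self.w2[k][j]
                               for k in range(len(out)))
                           for j in range(len(hb) - 1)]
                for k, row in enumerate(self.w2):
                    for j in range(len(row)):
                        row[j] += lr * delta_o[k] * hb[j]
                for j, row in enumerate(self.w1):
                    for i in range(len(row)):
                        row[i] += lr * delta_h[j] * xb[i]
            mse = squared_error / (len(X) * len(T[0]))
            if mse <= goal:
                break
        return mse

    def predict(self, x):
        """Index of the larger output node (0 or 1)."""
        out = self.forward(x)[2]
        return out.index(max(out))
```

The max_epochs argument plays the role of terminating training after a sufficient number of epochs when the goal is not reached.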
Vehicular Speed Estimation
The Doppler frequency shift is used to provide a theoretical
description of the speed of a single vehicle. Under the assumption that the
distance to the closest point of approach is known, the solution
can accommodate any line of arrival of the vehicle with
respect to the microphone. The solution for speed estimation
is more applicable to a single vehicle than to several
vehicles. In the presence of several vehicles, interference is
combined with the acoustic waveforms.
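
The Doppler relation for a single vehicle passing a stationary microphone can be illustrated with a short sketch. The assumptions are: a constant-speed vehicle on a straight line, a source emitting a single tone f0, a speed of sound of 343 m/s, and no correction for propagation delay. The function names are hypothetical. The second function uses the standard result that the frequencies heard long before and long after the pass, f0·c/(c − v) and f0·c/(c + v), give v = c·(fa − fr)/(fa + fr).

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at about 20 degrees C


def observed_frequency(f0, speed, cpa_distance, t, c=SPEED_OF_SOUND):
    """Frequency heard at time t (t = 0 at the closest point of approach).

    cpa_distance is the microphone-to-track distance and must be positive.
    Negative t means the vehicle is still approaching.
    """
    along_track = speed * t
    # Radial speed is positive when the vehicle moves away from the microphone.
    radial_speed = speed * along_track / math.hypot(cpa_distance, along_track)
    return f0 * c / (c + radial_speed)


def speed_from_doppler(f_approach, f_recede, c=SPEED_OF_SOUND):
    """Vehicle speed from the far-field approaching and receding frequencies."""
    return c * (f_approach - f_recede) / (f_approach + f_recede)
```

For example, a 1000 Hz tone from a vehicle at 20 m/s is heard at about 1062 Hz while approaching and about 945 Hz while receding.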
Sensing techniques based on passive sound detection
have been reported. These techniques utilize a microphone array to
detect the sound waves generated by road vehicles and
are capable of monitoring traffic conditions on a
lane-by-lane and vehicle-by-vehicle basis in a multilane
carriageway. Correlation-based algorithms are used to
extract key data reflecting the road traffic conditions, e.g. the
speed and density of vehicles. S. Chen et al. develop a
multilane traffic sensing concept based on passive sound, which is
digitized and processed by an on-site computer using a
correlation-based algorithm. The system offers low cost
(installation, maintenance and operation), safe passive
detection, immunity to adverse weather conditions and
competitive manufacturing cost. The system performs well for
free-flowing traffic; for congested traffic, however, good performance is
difficult to achieve.
Valcarce et al. exploit differential time delays to
estimate the speed. A pair of omnidirectional microphones is
used and the technique is based on the maximum likelihood principle.
It directly estimates car speed without any assumptions on the
acoustic signal emitted by the vehicle. Lo and Ferguson
develop a nonlinear least squares method for vehicle speed
estimation using multiple microphones. A quasi-Newton
method is used for computational efficiency. The estimated
speed is obtained using the generalized cross-correlation method
based on time-delay-of-arrival estimates.
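
The role of time delays in these methods can be illustrated with a much simpler stand-in: plain cross-correlation between two microphone signals, followed by speed = spacing / delay. This is neither the maximum likelihood estimator of Valcarce et al. nor the least squares method of Lo and Ferguson. It assumes two microphones placed along the road a known distance apart, with the vehicle passing close to both, and the function names are hypothetical.

```python
def estimate_delay(sig_a, sig_b, max_lag):
    """Lag (in samples) by which sig_b trails sig_a, found by cross-correlation."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = sum(sig_a[i] * sig_b[i + lag]
                    for i in range(len(sig_a))
                    if 0 <= i + lag < len(sig_b))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def speed_from_delay(lag_samples, sample_rate, mic_spacing_m):
    """Speed in m/s for a vehicle passing two microphones mic_spacing_m apart."""
    if lag_samples == 0:
        raise ValueError("zero delay: speed cannot be resolved")
    return mic_spacing_m / (abs(lag_samples) / sample_rate)
```

With a 3 m spacing and a 100 Hz envelope sampling rate, a lag of 15 samples corresponds to 0.15 s and a speed of 20 m/s.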
Mohan et al. estimate a vehicle’s speed using a
combination of smartphone features and basic honk signals.
Simple Doppler frequency shift computations are done to
estimate speed. Sen et al. use the Doppler frequency shift rule
with the assumption that the vehicle is moving in the same direction as
the straight line connecting the vehicle to the microphone. In
the presence of multiple honking vehicles, it is not clear how
two microphones distinguish honks emitted by the same vehicle. The
experimental setup covers two roads: a single-lane one-way
road and a three-lane bidirectional road.
Cevher et al. use a single acoustic sensor to estimate a
vehicle’s speed, width and length by jointly estimating
acoustic wave patterns. Wave patterns are approximated using
three envelope shape (ES) components. Results obtained from the
experimental setup show that the vehicle speeds are estimated as
(18.68, 4.14) m/s by the video camera and (18.60, 4.49) m/s
by the acoustic method. The same authors estimate a single
vehicle’s speed, the engine’s revolutions per minute (RPM), the
number of cylinders, and its length and width based on its
acoustic wave patterns. Wave patterns are determined using
the vehicle’s speed, the Doppler shift factor, the sensor’s
distance to the vehicle’s closest point of approach, and three
ES components. A vehicle profile vector is estimated, which
provides a fingerprint for vehicle identification and
classification (Ford F150, Chevy Impala, Honda Accord,
Nissan Maxima & Frontier, Mercedes E, Volvo 850 SW,
Isuzu Rodeo and VW Passat). This system is applicable only
to a single vehicle whose type has been recognized.
Traffic Density Estimation
Urban areas are concerned with effective traffic signal
control and traffic management. Estimating the time needed to travel
from source to destination using real-time traffic density
information is a major concern of city authorities. In
developing geographical areas such as Asia, the traffic is
characterised by being non-lane-driven. In such conditions, traffic
density estimation using magnetic loop detectors, speed guns
and video monitoring seems to be the best option, but the installation,
maintenance and operation costs associated with these
approaches are very high. The use of roadside acoustic signals
seems to be an alternative for traffic density estimation.
Jien Kato proposed a method for traffic density
estimation based on the recognition of temporal variations that
appear in the power signals as vehicles pass
through a reference point. A hidden Markov model (HMM) is used for
observation of local temporal variations over small periods of time,
extracted by wavelet transformation. Experimental results show good
accuracy for detection of the passage of vehicles.
Vivek Tyagi et al. classify the traffic density state as
free-flowing, medium flow or jammed. They consider short-term
spectral envelope features of the cumulative acoustic signal, and
then the class-conditional probability distribution is modelled for
each of the three broad traffic density states mentioned above.
The experimental setup uses an omnidirectional microphone placed
at a height of about 1.5 m, and the cumulative acoustic signal is
recorded at a 16000 Hz sampling frequency. A Bayes classifier is
applied to classify the traffic density state, which results in ~95%
accuracy; this is then improved by using a discriminative
classifier such as an RBF-SVM. This technique is independent of
lighting conditions and works well for developing regions.
The technique requires accurate detection of the honk signal to
arrive at the average speed.
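
The Bayes classification step can be illustrated with a minimal sketch that models each traffic density state by an independent Gaussian per feature and picks the state with the highest posterior. This is an assumption-laden illustration, not the cited system: the function names, the diagonal-Gaussian model and the variance floor are hypothetical, and the features of the cited work (short-term spectral envelope features) are replaced by generic numeric vectors.

```python
import math
from collections import defaultdict


def fit_gaussian_bayes(features, labels):
    """Estimate a prior and per-feature mean/variance for every class."""
    grouped = defaultdict(list)
    for x, label in zip(features, labels):
        grouped[label].append(x)
    model = {}
    for label, rows in grouped.items():
        columns = list(zip(*rows))
        means = [sum(col) / len(col) for col in columns]
        # A small floor keeps the variance strictly positive.
        variances = [max(sum((v - m) ** 2 for v in col) / len(col), 1e-6)
                     for col, m in zip(columns, means)]
        model[label] = (math.log(len(rows) / len(labels)), means, variances)
    return model


def classify(model, x):
    """Return the class label with the largest log-posterior for x."""
    def log_posterior(params):
        log_prior, means, variances = params
        log_likelihood = sum(
            -0.5 * math.log(2 * math.pi * var) - (xi - m) ** 2 / (2 * var)
            for xi, m, var in zip(x, means, variances))
        return log_prior + log_likelihood
    return max(model, key=lambda label: log_posterior(model[label]))
```

Trained on labelled feature vectors for the three states, classify returns "free", "medium" or "jammed" for a new vector, whichever state's Gaussian model explains it best.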
Vehicular Classification
The problem of vehicular classification is an example of
pattern recognition. Acoustic signals collected by
acoustic sensors are used to identify the type of moving
ground vehicles. A typical classification process consists of
sensing, class definition, feature extraction, classifier
application and system evaluation. Feature vectors are
extracted from the collected acoustic data.
Figure Typical classification system
The sensing unit collects raw data in order to provide the
sensor node with information about the traffic condition.
Segmentation refers to the separation of the signal of a single vehicle;
it imposes a major restriction on an acoustic classification system because
traffic recordings consist of signals from multiple vehicles
that mutually overlap. Feature extraction refers to
extracting a representative set of features that are able to
distinguish different classes of vehicle. Richard O. Duda
writes in “Pattern Classification”, “The conceptual boundary
between feature extraction and classification proper is
somewhat arbitrary: An ideal feature extractor would yield a
representation that makes the job of the classifier trivial;
conversely, an omnipotent classifier would not need the help
of a sophisticated feature extractor. The distinction is forced
upon us for practical rather than theoretical reasons.”
Classification decides which class or category a given feature
vector belongs to. The broad categories are statistical and structural
classifiers. Extensive research work has been carried out on
vehicle detection and classification systems, especially for
battlefield vehicles.
III. RESULTS AND DISCUSSION
The sound sample database contains 270 samples of bikes
and 270 samples of scooters. During training, the size of the
training sample set is varied in steps of 10 from 60 to 210. The remaining
270 − n samples are used for testing, where 60 ≤ n ≤ 210.
Motorcycles of four major Indian manufacturers, namely Hero
MotoCorp (earlier Hero Honda Motors), Honda Motors, TVS
Motor Co. Ltd. and Bajaj Auto, are considered in this work. The
recording is carried out in service stations, under the
supervision of expert mechanics. The recording environment
has disturbances such as human speech, sound from other
vehicles being serviced, and the sound of the air compressor and
auto-repair tools. In order to minimise the effect of noise, the
recorder is held close to the engine. Fig. 3 depicts the
environment maintained during the recording of the
motorcycle sounds. The recorder is held 500 mm from the
centre line of the exhaust end, at an angle of 45° measured
from the centre line of the exhaust end and at the height of the
exhaust pipe. The 500 mm distance is vital, since an 80 mm error
results in up to a 1 dB change in sound level. The motorcycle is in
the idling state and stationary, with background sound of 10 dBA
below the maximum allowed. The start of the engine and the
control of the throttle are carried out simultaneously by the expert
mechanic. The recording is carried out in a real-time
environment with disturbances around, with the intention of
applicability in real-time systems. The results are
satisfactory even without denoising, which indicates that the method
is robust.
Fig. 3 Recording environment
IV. CONCLUSION
Automatic vehicle recognition and classification is a step
towards the development of an automatic fault diagnosis system. The
work classifies motorcycles of different models from
different manufacturers into bikes and scooters. The ANN-based
classifier is costlier in terms of training time, but yields
100% classification accuracy when tested with the same
samples used for training. The DTW yields better results for
bikes than for scooters, because of variations in the models of
scooters. KBC yields better performance for both types of
vehicles, especially with an increasing number of training
samples. The work finds interesting applications in industry,
medicine, tuning of musical instruments, failures in civil
constructions, faulty railroads and so on. It helps in designing
intelligent transportation and census systems. The work can be
extended to fault diagnosis of vehicles, machines, musical
instruments and the like, which have scope for acoustic
signal-based fault diagnosis. Further, the work leaves scope for fault
source localisation, detection of the presence of multiple faults
and advanced applications of fault diagnosis to MANETs and
VANETs.
V. REFERENCES
1 Anami, B.S., Pagi, V.B.:‘An acoustic signature based neural network
model for type recognition of two-wheelers’. IEEE Int. Conf. on Multimedia
Systems, Signal Processing and Communication Technologies (IMPACT-09),
AMU Aligarh, March 2009, pp. 28–31
2 Padmavathi, G., Shanmugapriya, D., Kalaivani, M.: ‘Acoustic signal based
feature extraction for vehicular classification’. Third Int. Conf. on
Advanced Computer Theory and Engineering (ICACTE), Chengdu,
China, August 2010, pp. V2-11–V2-14
3 Aljaafreh, A., Dong, L.:‘An evaluation of feature extraction methods for
vehicle classification based on acoustic signals’. Proc. Int. Conf. on
Networking, Sensing and Control (ICNSC), Chicago, April 2010, pp. 570–
575
4 Ghiurcau, M.V., Rusu, C.:‘Vehicle sound classification. Application and
low pass filtering influence’. Proc. Int. Symp. on Signals, Circuits and
Systems, ISSCS, Iasi Romania, July 2009, pp. 1–4
5 Kornaropoulos, E.M., Panagiotis, T.:‘A novel KNN classifier for acoustic
vehicle classification based on alpha-stable statistical modeling’. Proc. IEEE
15th Workshop on Statistical Signal Processing (SSP’ 09), Cardiff, Wales,
UK, August–September 2009, pp. 1–4
6 Seung, S.Y., Yoon, G.K., Hongsik, C.:‘Vehicle identification using discrete
spectrums in wireless sensor networks’, J. Netw., 2008, 3, (4), pp. 51–63
7 Baofeng, G., Mark, S.N., Thyagaraju, D.:‘Acoustic information fusion for
ground vehicle classification’. Proc. Annual Conf. of ITA (ACITA), London,
2008, pp. 190–196
8 Baljeet, M., Ioanis, N., Janelle, H.:‘A simple vehicle classification
framework for wireless audio-sensor networks’, J. Telecommun. Inf. Technol.
Wirel. Ad-Hoc Netw., 2008, 1, pp. 43–50
9 Jinghua, L., Jiadong, X., Hongjuan, L.:‘Vehicle classification using
acoustic energy signature in wavelet scale space and neural network’. Proc.
First Int. Conf. on Transportation Engineering, Southwest Jiaotong
University, Chengdu, China, July 2007, pp. 926–931
10 Hanguang, X., Congzhong, C., Qianfei, Y., Xinghua, L., Yufeng, W.:‘A
comparative study of feature extraction and classification methods for
military vehicle type recognition using acoustic and seismic signals’, in
Huang, D.-S., Heutte, L., Loog, M. (Eds.): ‘ICIC 2007’ Sitges, Spain, October
2007, (LNCS, 4681), pp. 810–819
11 Wu, H., Mendel, J.M.:‘Classification of battlefield ground vehicles using
acoustic features and fuzzy logic rule-based classifiers’, IEEE Trans. Fuzzy
Syst., 2007, 15, (1), pp. 56–72
12 Anshul, G., Brijesh, V.:‘A neural network based approach for the vehicle
classification’. Proc. IEEE Symp. on Computational Intelligence in Image and
Signal Processing (CIISP 2007), Honolulu, HI, April 2007, pp. 226–231
13 Andreas, K., Stefan, E., Allan, T., Bernhard, R.:‘DSP based acoustic
vehicle classification for multi-sensor real-time traffic surveillance’.
EUSIPCO 2007, Poznan, 2007, pp. 1916–1920
14 Andreas, K., Allan, T., Bernhard, R.:‘Vehicle classification on
multi-sensor smart cameras using feature- and decision-fusion’. First ACM/IEEE
Int. Conf. on Distributed Smart Cameras, Vienna, September 2007, pp. 67–74
15 Necioglu, B.F., Christou, C.T., George, E.B., Jacyna, G.M.:‘Vehicle
acoustic classification in netted sensor systems using Gaussian mixture
models’, in Kadar, I. (Ed.): ‘Proc. Signal Processing, Sensor Fusion, and
Target Recognition XIV’, 2005, vol. 5809, pp. 409–419
16 Huang, Q., Xing, T., Liu, H.T.:‘Vehicle classification in wireless sensors
network based on rough neural network’ (Springer-Verlag, Berlin,
Heidelberg, 2006), (LNCS, 3973), pp. 58–65
17 Piotr, D., Andrzej, C.:‘Vehicle classification based on soft computing
algorithms’. 2010 (LNCS, 6086), pp. 70–79
18 Linjun, Y., Michael, F., Ahmed, E., Tony, F., Kendra, O.:‘Neural
networks and principal components analysis for strain-based vehicle
classification’, J. Comput. Civ. Eng., 2008, 22, (2), pp. 123–132
19 Kishan, M., Chilukuri Mohan, K., Sanjay, R.:‘Elements of artificial
neural networks’ (The MIT Press, October 1996)
20 Xu, S., Chen, L.:‘A novel approach for determining the optimal number
of hidden layer neurons for FNN’s and its application in data mining’. Fifth
Int. Conf. on Information Technology and Applications, Queensland, 2008,
pp. 683–686
21 Sakoe, H., Chiba, S.: ‘Dynamic programming algorithm optimization for
spoken word recognition’, IEEE Trans. Acoust., Speech Signal Process.,
1978, 26, (1), pp. 43–49