International Journal of Engineering Trends and Technology (IJETT) – Volume 4 Issue 9- Sep 2013
Performance Analysis of Robust Method to Identify
the Speaker Using Lip Segmentation
Kolukula B Shankar#1, K Kanthi Kumar*2, K V Murali Mohan*3
1 Kolukula B Shanker, pursuing M.Tech (ECE) at Holy Mary Institute of Technology and Science (HITS), Bogaram,
Keesara, Hyderabad. Affiliated to JNTUH, Hyderabad, A.P., India
2 K Kanthi Kumar, working as an Associate Professor (ECE) at Holy Mary Institute of Technology and Science (HITS),
Bogaram, Keesara, Hyderabad. Affiliated to JNTUH, Hyderabad, A.P., India
3 K V Murali Mohan, working as a Professor and HOD (ECE) at Holy Mary Institute of Technology and Science (HITS),
Bogaram, Keesara, Hyderabad. Affiliated to JNTUH, Hyderabad, A.P., India
Abstract— This document addresses the problem of providing
security to vehicles based on a unique biometric feature, namely
lip motion. This work proposes the use of explicit lip motion
features for speaker identification, so that the car is unlocked
depending on the result of the identification process. For
identification, lip boundaries are tracked over the images
and compared to a database. For the locking or unlocking
operation, an embedded processor of the PIC family is used along
with other electronic peripherals.
Keywords— Speaker Identification, Image Segmentation, Lip
motion.
I. INTRODUCTION
Biometrics refers to the automatic identification of a living
person based on physiological or behavioural characteristics
for authentication purposes [1]. For car security, work has been
presented on various biometric modalities such as face
recognition [2], fingerprint recognition [3], finger geometry
[4], hand geometry [4], iris recognition [5], vein recognition
[6] and voice recognition [7]. A biometric method requires the
physical presence of the person to be identified. This is its
advantage over traditional methods that rely on ‘what you know’
or ‘what you have’, such as a password or a smartcard.
It is natural to assume that lip movement also characterizes
the identity of an individual, as well as what the individual
is speaking. For the speaker identification problem, however,
the use of lip motion is a more sophisticated issue and has
been addressed in only a few works, such as [8], [9] and [10].
The main challenge is that the principal components of the lip
information are usually not sufficient to discriminate between
speakers. High-frequency or non-principal components of the
signal are also valuable, especially when the objective is to
model the biometrics, i.e. the specific lip movements of an
individual rather than what is uttered. Every biometric
recognition system can be summarized in the following three
steps:
ISSN: 2231-5381
i. Feature Extraction: A set of characteristics is extracted
from the collected samples and a user template is created.
ii. Comparison: When a new user needs to be identified, a
fresh sample is taken and matched against the stored samples.
Different distances (e.g., Euclidean and Hamming), statistical
methods (e.g., Gaussian mixture models) and classifiers have
been successfully applied to this comparison task.
iii. Decision: Based on the comparison results, an action is
taken depending on whether the match is successful or not.
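The comparison and decision steps can be illustrated with a minimal sketch. It assumes that templates are plain lists of numbers and that a nearest-template rule with a distance cut-off is used; the function names and the `max_distance` parameter are hypothetical and are not taken from the paper.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length real-valued feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def hamming_distance(a, b):
    """Number of positions at which two equal-length binary codes differ."""
    if len(a) != len(b):
        raise ValueError("codes must have the same length")
    return sum(x != y for x, y in zip(a, b))


def identify(sample, templates, max_distance):
    """Return the enrolled user nearest to the sample, or None if no
    template lies within max_distance (hypothetical decision rule)."""
    best_user, best_dist = None, float("inf")
    for user, template in templates.items():
        d = euclidean_distance(sample, template)
        if d < best_dist:
            best_user, best_dist = user, d
    return best_user if best_dist <= max_distance else None
```

The Euclidean distance suits real-valued features such as lengths measured on the lip contour, while the Hamming distance applies when the template is a binary code.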
A biometric security system based on lip detection is an
interesting topic on which very little research has been done,
and it has its own benefits. This work explores a car security
system, and a prototype has been developed to check the
feasibility of the project. The prototype uses an embedded
microcontroller to control the locking system of the car. The
method is divided into five stages:
1. Face Region Extraction
2. Mouth Region Localization
3. Key points extraction
4. Feature Vector Comparison
5. Unlocking Action
II. FACE REGION EXTRACTION
To extract the face region, we study the chromatic
distribution of the face region. To reduce the effect of
chromatic variation between different faces, a histogram
is constructed from many face images.
A. Color Space Transformation and Lighting Compensation
We convert the RGB image into the HSI representation using
the formulas:
\[ I = \frac{1}{3}(R + G + B) \]
\[ S = 1 - \frac{3}{R + G + B}\,\min(R, G, B) \]
\[ H = \cos^{-1}\left( \frac{\frac{1}{2}\left[(R - G) + (R - B)\right]}{\left[(R - G)^{2} + (R - B)(G - B)\right]^{1/2}} \right) \]
Here I, H and S stand for intensity, hue and saturation
respectively, and R, G and B are the red, green and blue
components of the pixel. In the HSI color model, the chromatic
and intensity components are separated. Therefore, by removing
the intensity component, the effect of illumination changes can
be reduced.
B. High-Frequency Noise Removal
The most basic filtering operation, the “low-pass” filter, is
used to average out rapid changes in intensity. It simply
calculates the average of a pixel and its eight immediate
neighbours. The result overwrites the actual value of the
pixel. This process is repeated for each pixel in the captured
image to be tested.
C. Face Region Segmentation
Our face segmentation algorithm uses colour information to
segment the facial regions. The pixels of the input image can
be classified into skin color and non-skin color. The facial
region has a special color distribution, which differs
significantly from that of the background objects. Hence
skin-color reference filtering in the YCbCr color space is
adopted. In the chrominance plane, a skin-color region can be
identified by the presence of a certain set of chrominance
(i.e. Cr and Cb) values, which belong to a narrow and
consistent distribution. According to different experiments,
suitable ranges for the skin-color reference filter are
Cr in [133, 173] and Cb in [77, 127]. It has been
experimentally found that these filters are robust against
different types of skin colors.
III. MOUTH REGION LOCALIZATION
A. Mouth Region Extraction
Since the mouth corners are little affected by facial
expression and can be located easily, we define the two mouth
corners as the feature points of the mouth area. Hsu et al.
found that the red chrominance component Cr is more prevalent
in the mouth region than the blue chrominance component Cb. It
was shown that the mouth region has a high Cr^2 response but a
low Cr/Cb response. The mouth map is found as:
\[ \mathrm{MouthMap} = C_r^{2}\left(C_r^{2} - \eta\,\frac{C_r}{C_b}\right)^{2} \]
where η is 0.95 times the ratio of the average Cr^2 to the
average Cr/Cb,
\[ \eta = 0.95\,\frac{\frac{1}{n}\sum_{(x,y)\in F} C_r(x,y)^{2}}{\frac{1}{n}\sum_{(x,y)\in F} \frac{C_r(x,y)}{C_b(x,y)}} \]
Cr^2 and Cr/Cb are in the range [0, 255], and n is the number
of pixels in the face mask F.
IV. LIP FEATURE EXTRACTION
A. Lip Contour Tracking
The accuracy and robustness of the lip contour extraction
method are crucial for a recognition system that uses lip shape
information. A quasi-automatic technique is employed for lip
contour extraction. The algorithm starts by placing a single
seed above the mouth and near its vertical symmetry axis to
initialize the jumping snake. The upper lip boundary is found
after the convergence of the snake, and the three points
forming the Cupid’s bow on the upper lip, i.e., P2, P3 and P4,
are detected by a simple maxima-minima function. Another key
point, P6, on the lower lip boundary near the vertical axis of
the mouth is located by analysing the one-dimensional gradient
of the pseudo-hue along the vertical axis passing through P3.
The mouth corners P1 and P5 are detected using both the minima
of luminance computed along each vertical pixel group and an
edge criterion.
B. Skin and Lips Color Analysis
In RGB space, skin and lip pixels have quite different
components. For both regions, red is the prevalent color.
Moreover, there is more green than blue in the skin color
mixture, whereas for lips these two components are the same.
Hulbert and Poggio [10] propose a pseudo hue definition that
exhibits this difference:
\[ h(x, y) = \frac{R(x, y)}{R(x, y) + G(x, y)} \]
Unlike the usual hue, the pseudo hue is bijective, and it is
higher for lips than for skin.
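As an illustration of the pseudo hue used in the lip color analysis, the sketch below assumes the usual Hulbert and Poggio form h = R / (R + G) and an image stored as rows of (R, G, B) tuples. The two sample pixels are made-up values chosen only to mirror the description (more green than blue for skin, equal green and blue for lips); they are not measurements from the paper.

```python
def pseudo_hue(r, g):
    """Pseudo hue R / (R + G) of one pixel; defined here as 0.0 when R + G is 0."""
    total = r + g
    return r / total if total else 0.0


def pseudo_hue_map(image):
    """Apply pseudo_hue to every (R, G, B) pixel of an image given as a list of rows."""
    return [[pseudo_hue(r, g) for (r, g, _b) in row] for row in image]


# Illustrative pixels (assumed values, not data from the paper).
skin_pixel = (200, 150, 120)  # more green than blue
lip_pixel = (180, 90, 90)     # green equal to blue

hue_row = pseudo_hue_map([[skin_pixel, lip_pixel]])[0]
print("skin:", hue_row[0], "lip:", hue_row[1])
```

Because the lip pixel carries less green relative to red, its pseudo hue comes out higher than that of the skin pixel, which is the property the contour tracker relies on.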
C. Feature Extraction
Once the lip shape has been extracted, we use five kinds of
features:
1. Distance from the left corner of the lip to the centroid
2. Distance from the right corner of the lip to the centroid
3. Minimum distance from the centroid to the upper half of the lip
4. Minimum distance from the centroid to the lower half of the lip
5. Distance between the corners of the lip
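The five features listed above can be computed from the extracted contour roughly as follows. This is a sketch under stated assumptions: the contour is given as (x, y) points, and the centroid is taken as the mean of all contour points, which the paper does not specify. The function name `lip_features` is hypothetical.

```python
import math


def lip_features(left_corner, right_corner, upper_points, lower_points):
    """Return the five distance features as a tuple, in the listed order.

    The centroid is assumed to be the mean of all supplied contour points.
    """
    contour = [left_corner, right_corner, *upper_points, *lower_points]
    centroid = (
        sum(p[0] for p in contour) / len(contour),
        sum(p[1] for p in contour) / len(contour),
    )
    return (
        math.dist(left_corner, centroid),                   # 1. left corner to centroid
        math.dist(right_corner, centroid),                  # 2. right corner to centroid
        min(math.dist(centroid, p) for p in upper_points),  # 3. centroid to upper half
        min(math.dist(centroid, p) for p in lower_points),  # 4. centroid to lower half
        math.dist(left_corner, right_corner),               # 5. corner to corner
    )


# Toy contour in image coordinates (y grows downwards); values are illustrative.
features = lip_features((0, 0), (10, 0), [(5, -3)], [(5, 3)])
```

The resulting tuple is the feature vector that the comparison stage matches against the stored templates.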
V. COMPARISON SYSTEM
The comparison system, or identification stage, gives the
probability that the input lip belongs to a given individual.
Identification succeeds when this probability exceeds a
threshold. Here we use template matching. According to the
result, the PC sends data on its communication port using the
RS-232 protocol. The PIC microcontroller receives this data and
controls the locking operation accordingly.
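A minimal sketch of the threshold decision and of the data handed to the serial link is given below. The paper does not specify the matching score or the RS-232 payload, so the similarity measure 1 / (1 + distance) stands in for the probability, and the one-byte commands `b"U"` and `b"L"` are hypothetical.

```python
import math

# Hypothetical one-byte commands; the actual payload is not given in the paper.
UNLOCK_COMMAND = b"U"
LOCK_COMMAND = b"L"


def match_score(features, template):
    """Similarity in (0, 1]: 1.0 for identical vectors, lower as they diverge.

    This is an assumed stand-in for the matching probability.
    """
    return 1.0 / (1.0 + math.dist(features, template))


def command_for(features, template, threshold):
    """Choose the byte to send to the microcontroller for this comparison."""
    if match_score(features, template) >= threshold:
        return UNLOCK_COMMAND
    return LOCK_COMMAND
```

On the PC side, the returned byte could then be written to the communication port with a serial library such as pySerial; the microcontroller firmware would map the received byte to the lock actuator.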
VI. CONCLUSIONS
In this paper, we have presented an automatic, robust and
accurate lip segmentation method, in which four points are
fitted to the outer lip boundary. Its high flexibility enables
very accurate and realistic results, which makes the method
suitable for applications that require a high level of
precision, such as lip reading. We introduced a new biometric
identification system based on lip shape biomeasures, a field
in which little research has been done. This is considered a
good result and encourages its use in combination with other
biometric systems. A quasi-automatic system to extract and
analyze robust lip-motion features is presented for the
open-set speaker identification problem. Therefore, if
available, accurate and robust lip motion information is an
asset for improving the performance of unimodal (i.e.
speech-only) systems, which are mostly corrupted by noise.
VII. AUTHOR DETAILS
Kolukula Bhavani Shanker, pursuing M.Tech (ECE) at Holy Mary
Institute of Technology and Science (HITS), Bogaram, Keesara,
Hyderabad. Affiliated to JNTUH, Hyderabad, A.P., India
K Kanthi Kumar, Associate Professor (ECE) at Holy Mary
Institute of Technology and Science (HITS), Bogaram, Keesara,
Hyderabad. Affiliated to JNTUH, Hyderabad, A.P., India
K V Murali Mohan, working as a Professor and HOD (ECE) at Holy
Mary Institute of Technology and Science (HITS), Bogaram,
Keesara, Hyderabad. Affiliated to JNTUH, Hyderabad, A.P., India
REFERENCES
[1] Omidiora E. O. (2006), A Prototype of Knowledge-Based System for
Black Face Recognition using Principal Component Analysis and
Fisher Discriminant Algorithms, Ph.D. Thesis, Department of
Computer Science and Engineering, Ladoke Akintola University of
Technology, Ogbomoso, Nigeria.
[2] B. Ramya, N. Anukrishnan, and Saima Mohan, “Design and
Development of Car Ignition Access Control System Based on Face
Recognition Technique”, SASTECH Journal, vol. 9, issue 2,
September 2010.
[3] Karthikeyan A. and Sowndharya J., “Fingerprint Based Ignition
System”, IJCER, vol. 2, issue 2, Mar-Apr 2012, pp. 236-243.
[4] Sotiris Malassiotis, Niki Aifanti, and Michael G. Strintzis,
“Personal Authentication Using 3-D Finger Geometry”, IEEE
Transactions on Information Forensics and Security, vol. 1, no. 1,
March 2006.
[5] Sreekala P., Jose V., Joseph J., and Joseph S., “The human iris
structure and its application in security system of car”, 2012
IEEE International Conference on Engineering Education:
Innovative Practices and Future Trends (AICERA), 19-21 July 2012.
[6] “Finger Vein Authentication”, White Paper by Hitachi, 2006.
[7] Anjali Bala, Abhijeet Kumar, and Nidhika Birla, “Voice Command
Recognition System based on MFCC and DTW”, International Journal
of Engineering Science and Technology, vol. 2 (12), 2010,
pp. 7335-7342.
[8] C. C. Chibelushi, F. Deravi, and J. S. Mason, “A review of speech
based bimodal recognition”, IEEE Trans. on Multimedia, vol. 4,
no. 1, pp. 23-37, 2002.
[9] G. Potamianos, C. Neti, G. Gravier, A. Garg, and A. W. Senior,
“Recent Advances in the Automatic Recognition of Audio Visual
Speech”, Proc. of the IEEE, vol. 91, no. 9, September 2003.
[10] D. G. Stork and M. E. Hennecke, Speechreading by Humans and
Machines: Models, Systems and Applications, Springer-Verlag, 1996.
[11] E. Erzin, Y. Yemez, and A. M. Tekalp, “Multimodal speaker
identification using an adaptive classifier cascade based on
modality reliability”, accepted for publication in IEEE
Transactions on Multimedia, 2004.
http://www.ijettjournal.org