International Journal of Engineering Trends and Technology (IJETT) – Volume 4, Issue 9 – Sep 2013

Performance Analysis of Robust Method to Identify the Speaker Using Lip Segmentation

Kolukula B Shankar#1, K Kanthi Kumar*2, K V Murali Mohan*3
1 Kolukula B Shanker, pursuing M.Tech (ECE) at Holy Mary Institute of Technology and Science (HITS), Bogaram, Keesara, Hyderabad. Affiliated to JNTUH, Hyderabad, A.P., INDIA
2 K Kanthi Kumar, Associate Professor (ECE) at Holy Mary Institute of Technology and Science (HITS), Bogaram, Keesara, Hyderabad. Affiliated to JNTUH, Hyderabad, A.P., INDIA
3 K V Murali Mohan, Professor and HOD (ECE) at Holy Mary Institute of Technology and Science (HITS), Bogaram, Keesara, Hyderabad. Affiliated to JNTUH, Hyderabad, A.P., INDIA

Abstract— This paper addresses the problem of providing security to vehicles based on a unique biometric feature: lip motion. It proposes the use of explicit lip motion features for speaker identification, so that a car can be unlocked depending on the result of the identification process. For identification, lip boundaries are tracked over the images and compared against a database. The locking and unlocking operation is carried out by an embedded processor of the PIC family together with other electronic peripherals.

Keywords— Speaker Identification, Image Segmentation, Lip Motion.

I. INTRODUCTION

Biometrics refers to the automatic identification of a living person based on physiological or behavioural characteristics for authentication purposes [1]. For car security, works on various biometric parameters such as face recognition [2], fingerprint recognition [3], finger geometry [4], hand geometry [4], iris recognition [5], vein recognition [6] and voice recognition [7] have been presented. A biometric method requires the physical presence of the person to be identified.
This underlines its advantage over the traditional "what you have" methods of identification, such as passwords and smartcards. It is natural to assume that lip movement characterizes both the identity of an individual and what the individual is speaking. For the speaker identification problem, however, the use of lip motion is a more sophisticated issue and has been addressed in only a few works, such as [8], [9] and [10]. The main challenge is that the principal components of the lip information are usually not sufficient to discriminate between speakers. High-frequency or non-principal components of the signal should also be valuable, especially when the objective is to model the biometrics, i.e., the specific lip movements of an individual, rather than what is uttered.

Every biometric recognition system can be summarized in the following three steps:
i. Feature Extraction: A set of characteristics is extracted from the collected samples and a user template is created.
ii. Comparison: When a new user needs to be identified, a new sample is taken and matched against the stored templates. Different distances (e.g., Euclidean and Hamming), statistical methods (e.g., Gaussian mixture models) and classifiers have been successfully applied to this comparison task.
iii. Decision: Based on the comparison result, an action is taken depending on whether the identification was successful or not.

A biometric security system based on lip detection is an interesting topic on which very little research has been done, and it has its own benefits. This work explores a car security system, and a prototype has been developed in order to check the feasibility of the project. The prototype uses an embedded micro-controller to control the locking system of the car. The method is divided into five stages:
1. Face Region Extraction
2. Mouth Region Localization
3. Key Points Extraction
4. Feature Vector Comparison
5. Unlocking Action
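The comparison and decision steps above can be sketched in a few lines. The following is an illustrative sketch only — the function names, template values and threshold are assumptions, not the authors' implementation — using Euclidean distance between a sample feature vector and stored user templates:

```python
import math

def euclidean_distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def identify(sample, templates, threshold):
    """Comparison + decision: match a sample against stored templates.

    Returns the best-matching user id, or None when even the closest
    template lies beyond the acceptance threshold (rejection).
    """
    best_user, best_dist = None, float("inf")
    for user_id, template in templates.items():
        d = euclidean_distance(sample, template)
        if d < best_dist:
            best_user, best_dist = user_id, d
    return best_user if best_dist <= threshold else None

# Illustrative enrolment database: one stored feature vector per user.
templates = {"alice": [10.0, 12.0, 5.0, 6.0, 20.0],
             "bob":   [14.0, 15.0, 7.0, 8.0, 26.0]}
print(identify([10.5, 12.2, 5.1, 6.2, 20.3], templates, threshold=2.0))
```

In a real system the threshold would be tuned on enrolment data to trade off false acceptances against false rejections.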
II. FACE REGION EXTRACTION

To extract the face region, we study the chromatic distribution of the face. To reduce the problem created by chromatic variation across different faces, a histogram is constructed from many face images.

A. Color Space Transformation and Lighting Compensation

We convert the RGB image into the HSI representation using the formulas:

    I = (R + G + B) / 3

    S = 1 − [3 / (R + G + B)] · min(R, G, B)

    H = cos⁻¹( ½[(R − G) + (R − B)] / √[(R − G)² + (R − B)(G − B)] )

Here I, H and S stand for Intensity, Hue and Saturation, respectively. In the HSI color model the chromatic and intensity components are separated; therefore, by removing the intensity component, the effect of illumination changes can be reduced.

III. MOUTH REGION LOCALIZATION

A. Mouth Region Extraction

Since the corners of the mouth change little with expression and can be located easily, we define the two mouth corners as the feature points of the mouth area. Hsu et al. found that the red chrominance component Cr is more prevalent in the mouth region than the blue chrominance component Cb. It was shown that the mouth region has a high Cr² response but a low Cr/Cb response. The mouth map is computed as:

    MouthMap = Cr² · (Cr² − η · Cr/Cb)²

where η is the ratio of the average Cr² to the average Cr/Cb over the face mask:

    η = 0.95 · [ (1/n) Σ Cr(x, y)² ] / [ (1/n) Σ Cr(x, y)/Cb(x, y) ]

Here Cr² and Cr/Cb are both normalized to the range [0, 255], and n is the number of pixels in the face mask F.

B. High-Frequency Noise Removal

The most basic of filtering operations, the "low-pass" filter, is used to average out rapid changes in intensity. It simply calculates the average of a pixel and all of its eight immediate neighbours.
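The color-space conversion and mouth-map formulas in this section can be sketched per pixel as follows. This is a minimal illustration, not the authors' code: the function names are assumptions, and for brevity the mouth-map sketch works on raw chrominance values rather than first rescaling Cr² and Cr/Cb to [0, 255].

```python
import math

def rgb_to_hsi(r, g, b):
    """Per-pixel RGB -> (H, S, I), with r, g, b in [0, 1] and H in degrees."""
    i = (r + g + b) / 3.0
    s = 1.0 - 3.0 * min(r, g, b) / (r + g + b) if (r + g + b) > 0 else 0.0
    num = 0.5 * ((r - g) + (r - b))
    den = math.sqrt((r - g) ** 2 + (r - b) * (g - b)) or 1e-12
    h = math.degrees(math.acos(max(-1.0, min(1.0, num / den))))
    if b > g:                      # hue lies in (180, 360) when B > G
        h = 360.0 - h
    return h, s, i

def mouth_map(cr, cb):
    """MouthMap = Cr^2 * (Cr^2 - eta * Cr/Cb)^2 over the face-mask pixels.

    cr and cb are the chrominance values of the n pixels in the mask;
    eta = 0.95 * mean(Cr^2) / mean(Cr/Cb), as in the formula above.
    """
    n = len(cr)
    eta = 0.95 * (sum(c * c for c in cr) / n) / (sum(c / b for c, b in zip(cr, cb)) / n)
    return [c * c * (c * c - eta * c / b) ** 2 for c, b in zip(cr, cb)]
```

Pixels with a strong red-chrominance response dominate the resulting map, which is then thresholded to localize the mouth.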
The result overwrites the actual value of the pixel, and the process is repeated for each pixel of the captured image to be tested.

C. Face Region Segmentation

Our face segmentation algorithm uses colour information to segment the facial regions. The pixels of the input image are classified into skin color and non-skin color. The facial region has a special color distribution, which differs significantly from those of the background objects. Hence a skin-color reference filter in the YCbCr color space is adopted. In the chrominance plane, a skin-color region can be identified by the presence of a certain set of chrominance values (i.e., Cr and Cb) that belong to a narrow and consistent distribution. According to different experiments, the suitable ranges for the skin-color reference filter are SkinR_Cr = [133, 173] and SkinR_Cb = [77, 127]. It has been experimentally found that these filters are robust against different types of skin colors.

IV. LIP FEATURE EXTRACTION

A. Lip Contour Tracking

The accuracy and robustness of the lip contour extraction method are crucial for a recognition system that uses lip shape information. A quasi-automatic technique is employed for lip contour extraction. The algorithm starts by placing a single seed above the mouth, near its vertical symmetry axis, to initialize a jumping snake. The upper lip boundary is found after the convergence of the snake, and the three points forming the Cupid's bow on the upper lip, i.e., P2, P3 and P4, are detected by a simple maxima-minima function. Another key point, P6, on the lower lip boundary near the vertical axis of the mouth, is located by analysing the one-dimensional gradient of the pseudo-hue along the vertical axis passing through P3. The mouth corners P1 and P5 are detected using both the minima of luminance computed along each vertical pixel group and an edge criterion.

B. Skin and Lips Color Analysis

In RGB space, skin and lip pixels have quite different components. For both regions, red is the prevalent color. Moreover, there is more green than blue in the skin color mixture, while for lips these two components are about equal. Hulbert and Poggio [10] propose a pseudo-hue definition that exhibits this difference:

    h(x, y) = R(x, y) / [R(x, y) + G(x, y)]
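As a rough sketch of the pseudo-hue and key-point ideas in this section: the boundary representation, search windows and all function names below are assumptions, and the jumping snake itself is omitted; the last function computes the five centroid-based distances used later as the feature vector.

```python
import math

def pseudo_hue(r, g):
    """Pseudo-hue h = R / (R + G); higher on lips than on skin."""
    return r / (r + g) if (r + g) > 0 else 0.0

def dist(p, q):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def cupids_bow_points(upper_boundary):
    """Detect P2, P3, P4 on an upper-lip boundary given as (x, y) points
    (y grows downward): the two peaks (local y-minima) and the central
    dip (a local y-maximum) of the Cupid's bow, via a simple max/min scan."""
    ys = [p[1] for p in upper_boundary]
    mid, span = len(ys) // 2, max(1, len(ys) // 4)
    p3 = max(range(mid - span, mid + span), key=lambda i: ys[i])  # central dip
    p2 = min(range(0, p3), key=lambda i: ys[i])                   # left peak
    p4 = min(range(p3 + 1, len(ys)), key=lambda i: ys[i])         # right peak
    return upper_boundary[p2], upper_boundary[p3], upper_boundary[p4]

def lip_features(left, right, upper, lower, centroid):
    """Five lip-shape distances used as the feature vector."""
    return [dist(left, centroid), dist(right, centroid),
            min(dist(p, centroid) for p in upper),
            min(dist(p, centroid) for p in lower),
            dist(left, right)]
```

Because these features are ratios of distances to the lip centroid and corners, they are largely insensitive to the absolute position of the mouth in the frame.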
Unlike the usual hue, the pseudo-hue is bijective and takes higher values on lips than on skin.

C. Feature Extraction

Once the lip shape has been extracted, we use five kinds of features:
1. Distance from the left corner of the lip to the centroid
2. Distance from the right corner of the lip to the centroid
3. Minimum distance from the centroid to the upper half of the lip
4. Minimum distance from the centroid to the lower half of the lip
5. Distance between the corners of the lip

V. COMPARISON SYSTEM

The comparison system, or identification stage, gives the probability that the input lip belongs to a given individual; identification succeeds when this score exceeds a threshold. Here we use template matching. According to this result, data is generated by the PC on its communication port using the RS-232 protocol. The PIC micro-controller receives this data and controls the locking operation accordingly.

VI. CONCLUSIONS

In this paper, we have presented an automatic, robust and accurate lip segmentation method, in which four points are fitted to the outer lip boundary. Its high flexibility enables very accurate and realistic results, which makes the method suitable for applications requiring a high level of precision, such as lip reading. We introduced a new biometric identification system based on lip shape biomeasures, a field in which little research has been done. This is considered a good result and encourages its use in combination with other biometric systems. A quasi-automatic system to extract and analyze robust lip-motion features has been presented for the open-set speaker identification problem.
Therefore, if available, accurate and robust lip motion information is an asset for improving the performance of unimodal (i.e., speech-only) systems, which are easily corrupted by noise.

VII. AUTHOR DETAILS

Kolukula Bhavani Shanker is pursuing M.Tech (ECE) at Holy Mary Institute of Technology and Science (HITS), Bogaram, Keesara, Hyderabad. Affiliated to JNTUH, Hyderabad, A.P., INDIA.

K Kanthi Kumar is an Associate Professor (ECE) at Holy Mary Institute of Technology and Science (HITS), Bogaram, Keesara, Hyderabad. Affiliated to JNTUH, Hyderabad, A.P., INDIA.

K V Murali Mohan is a Professor and HOD (ECE) at Holy Mary Institute of Technology and Science (HITS), Bogaram, Keesara, Hyderabad. Affiliated to JNTUH, Hyderabad, A.P., INDIA.

REFERENCES

[1] E. O. Omidiora, A Prototype of Knowledge-Based System for Black Face Recognition using Principal Component Analysis and Fisher Discriminant Algorithms, Ph.D. Thesis, Department of Computer Science and Engineering, Ladoke Akintola University of Technology, Ogbomoso, Nigeria, 2006.
[2] B. Ramya, N. Anukrishnan, and Saima Mohan, "Design and Development of Car Ignition Access Control System Based on Face Recognition Technique", SASTECH Journal, vol. 9, issue 2, September 2010.
[3] A. Karthikeyan and J. Sowndharya, "Fingerprint Based Ignition System", IJCER, vol. 2, issue 2, pp. 236-243, Mar-Apr 2012.
[4] Sotiris Malassiotis, Niki Aifanti, and Michael G. Strintzis, "Personal Authentication Using 3-D Finger Geometry", IEEE Transactions on Information Forensics and Security, vol. 1, no. 1, March 2006.
[5] P. Sreekala, V. Jose, J. Joseph, and S. Joseph, "The human iris structure and its application in security system of car", in Proc. 2012 IEEE International Conference on Engineering Education: Innovative Practices and Future Trends (AICERA), 19-21 July 2012.
[6] "Finger Vein Authentication", White Paper, Hitachi, 2006.
[7] Anjali Bala, Abhijeet Kumar, and Nidhika Birla, "Voice Command Recognition System based on MFCC and DTW", International Journal of Engineering Science and Technology, vol. 2, no. 12, pp. 7335-7342, 2010.
[8] C. C. Chibelushi, F. Deravi, and J. S. Mason, "A review of speech-based bimodal recognition", IEEE Transactions on Multimedia, vol. 4, no. 1, pp. 23-37, 2002.
[9] G. Potamianos, C. Neti, G. Gravier, A. Garg, and A. W. Senior, "Recent Advances in the Automatic Recognition of Audio-Visual Speech", Proceedings of the IEEE, vol. 91, no. 9, September 2003.
[10] D. G. Stork and M. E. Hennecke, Speechreading by Humans and Machines: Models, Systems and Applications, Springer-Verlag, 1996.
[11] E. Erzin, Y. Yemez, and A. M. Tekalp, "Multimodal speaker identification using an adaptive classifier cascade based on modality reliability", IEEE Transactions on Multimedia, 2004 (accepted for publication).

ISSN: 2231-5381    http://www.ijettjournal.org