International Journal of Engineering Trends and Technology (IJETT) – Volume 34 Number 2- April 2016 Using supervised technique for label (tag) refining of weak labels of web facial images Ms Patel Safiya#1 # Student, Department of Computer Science & Engg at DIEMS, Aurangabad, India Abstract: Today’s world is the world of smart phones, internet, multimedia which has made the art of capturing images or image data very immense as well as easy. This has flooded the web with the ample of images all over & the problem of annotating the images and labeling them arises too. Sometimes, it may happen that we may search the web for a person’s image but we may get some different or irrelevant image because the labels of these images may be weak, noisy or with incomplete names. Therefore, this paper aims to give the proper labels of the facial images using supervised learning method through SBFA methodology. We aim to use supervised learning as it proves to be advantageous over other learning methods & give results far better than unsupervised ULR methodology. Keywords — SBFA, supervised learning, SVM, KNN I. INTRODUCTION In today’s world of media and smart phones and the era of social networking, where photo tagging [11] and photo sharing have become popular. Therefore, this has motivated the concept of face annotation which helps the users to tag photos, share and search them online and annotate them easily. This can also be helpful in security purpose for identifying or retrieving the images of particular person from videos and news channels [3] or for other purposes. Therefore annotation of facial images and refining their weak labels serves as an important study. The annotation of facial images through SBFA involves the steps as follows. 1) Collection of images from the web. We will collect some random images of some celebrities or some known personalities from the WWW which consists of irrelevant images with noisy or weak labels names. 2) Preprocessing on an image Pre-processing technique will extract the details from the images like face alignment, detection of face, face region extraction, facial features etc. with the help of GIST feature extraction techniques. 3) Identifying the label of facial image ISSN: 2231-5381 The labels which are weakly labeled in the database with the help of supervised techniques. We aim to compare the two supervised techniques such as KNN and SVM along with unsupervised technique [1] [6] to predict the model of classifiers for labeling the image. The advantage of using supervised learning method is that training data available for supervised learning is in abundance, also provide reasonably accurate classifiers and also helps in risk minimization [8] In the previous paper we use the SBFA architecture for annotating the web facial images and refining the labels through supervised learning method. But in this paper, we mainly focus on label refinement process through KNN and SVM supervised learning methods and comparing the results with unsupervised (ULR) method. Section II reviews the literature survey carried out for producing the proper experimentation results. Section III shows the experimentation carried out for annotating the labels of facial images from the database which are weakly labeled through supervised methods. Section IV shows performance evaluation based on the result Section V draws conclusion to the paper and specifies limitations and future scope. II. LITERATURE SURVEY The work of this paper is related to many areas of research as its aspect is broader one. The first step is the image collection from the web which consists of weak or incomplete labels. So acquiring facial images from the web and processing them are on the task. They simply can be accumulated through various searching techniques from the web. The next group serves the face detection and verification techniques of the web facial images. Recent studies in this domain have attracted the scholars and researchers, they have come up with some benchmarks such as LWF in recent times [10].More research shows advancement in face recognition techniques for annotation of tasks and its verification. There are various image annotation http://www.ijettjournal.org Page 63 International Journal of Engineering Trends and Technology (IJETT) – Volume 34 Number 2- April 2016 techniques existing for classification of training models and refinement of their labels. The next group is about the studies of image annotation. There are various classical approaches for generic image annotation. Various scholars have proposed various works and research in this domain [13][1].Some proposed to annotate images using model based annotation results [5], others proposed to annotate unlabelled images through iterative label propagation scheme also[12]. Recent studies has led the scholars to come up with the SBFA paradigm which has attracted many more attention [9][11].Also Dayong Wang et al. has proposed a novel approach for refinement of labels automatically called as unsupervised label refinement (ULR) scheme[13]. In this paper we tend to use supervised technique for refining the (weak) labels of images through various supervised methods and compare the result with ULR method proposed by [15]. III. EXPERIMENTATION We are using Matlab 2015 for our experimentations. Firstly we collected random images of various celebrities into a database from the web; we refer this image database as “retrieval image database” Therefore retrieval image database now consists of multiple random images with their weak labels. In order to track the different processing time used by different learning methods, we have created separate database varying in number of images and actors. So we created 3 image databases with number of images as 200, 285 and 335 consisting of 4, 4, 5 actors respectively. For the test dataset we divide are same retrieval image database in ratio of 80:20 where 80% consists of retrieval dataset and 20% as test dataset. The next task is applying pre-processing technique on images such as face detection and recognition. As we are working only on facial images, we need to detect the faces from the random image which is done using Viola Jones algorithm. We extract the GIST features of images using HOG (Histogram Oriented Gradient) features to detect object in the image. With the help of the training features available we prepare a classification model by training the classifier using 2 supervised techniques KNN and SVM. Following shows the model classifier used to predict the labels of the images based on the classifier. Model=fit[C/L] [Model type](X,Y);where C stands for classification or regression, model type is any supervised technique applied such as KNN or SVM, X and Y represents the training features and labels respectively. face recognition high dimensionality feature extraction can be used using CascadeObjectDetector feature in Matlab. IV. PERFORMANCE ANALYSIS The experimentation is based on calculating the processing time required to annotate the image from the complete search based process comparing the two supervised algorithms. There is a considerable amount of change in time in both the processing speed of both the supervised techniques. Table 1 shows the results w.r.t to the time required for each image annotation on different databases with different number of actors available in given dataset. I. TABLE I ML technique DB200 DB285 DB335 SVM KNN 18-19 0.10-0.25 28-30 0.10-0.35 38-45 0.10-0.45 Table 1. Above table shows the processing time required by each technique for returning the refined labels of the images. We can see from the above observations that time taken by KNN reduces drastically as compared to SVM. Graph 1. SVM carried out on 3 different databases named as DB200, DB285 & DB335 where each database consists of 200, 285 & 335 images respectively based on the readings shown in Table1. Label=predict (model, input) where input stands for image query and model acquires the above information. In order to increase the accuracy for ISSN: 2231-5381 http://www.ijettjournal.org Page 64 International Journal of Engineering Trends and Technology (IJETT) – Volume 34 Number 2- April 2016 V. CONCLUSION Graph 2. KNN carried out on 3 different databases named as DB200, DB285 & DB335 where each database consists of 200, 285 & 335 images respectively based on the readings shown in Table1. Database N-5 N-15 N-25 N-30 DB200 100 99.8 99.3 98.38 DB285 99.98 99.72 98.45 98.3 DB335 99.9 99.65 98.4 98.18 Table 2. Above table shows the recognition accuracy rate of n, where n is the number of images thrown on each database DB200, DB285 & DB335 and gets the image accuracy recognized in percentage. This paper reviews the need of annotating the images in today’s technologically growing world. Moreover this paper also focuses on the accurate labeling of those noisy or weakly labeled data images through SBFA technique. Further to refine the label quality for annotating the images, KNN & SVM supervised methods are used rather than unsupervised method of learning. The 2 supervised techniques give drastic change in processing time varying with different databases. We recognize the web facial images based on Histogram of oriented gradient features called as HoG features based on feature extraction and the labels are refined by predicting the model based on the training labels acquired from the weakly or noisy labels. Despite of refining the label based on SBFA framework which is new architecture, we also used the supervised method for refining weak labels instead of unsupervised method as mentioned in the previous base paper. Though the work is an achievement in its own sense yet it faces some limitations of refining the labels if 2 persons have the same label o share the same name. Duplicity is little bit difficult to trace. Evaluation of the same methodology on larger databases can be considered as a future scope of this project. Also different supervised as well as semi-supervised algorithms can be used for label refinement. Moreover different facial recognition algorithms can be used to facial recognition to improve the accuracy further. REFERENCES Graph 3. Based on the above observation, we have represented the readings in the graphic format as above where, n represents the number of images treated with different databases for recognition. The results are calculated on the basis of normal probability function. ISSN: 2231-5381 1. Dayong Wang, Steven C. H. Hoi, Member, IEEE, Ying He, and Jianke Zhu, “Mining weekly labeled web facial images for search based annotation” IEEE transaction on Knowledge and data engineering, Vol. 26, No. 1, January 2014. 2. Ms. Shalaka S. Dixit, Ms. Antara Bhattacharya, “A survey on search based web facial image annotation methods” IJERT ISSN: 2278-0181, Vol. 3, Issue 11, November 2014. 3. Lei Zhang, Long bin Chen, Ming Jing Lee, Hong Jiang Zang, “Face annotation of Human faces in family albums” 4. Dayong Wang, Steven C.H. Hoi, Ying He “A unified learning framework for auto face annotation by mining web facial images”, Proc. 21st ACM International conference information and knowledge management (CIKM) pp. 1392-1401, 2012 5. Fabian, Airolab, Tapio and Oliver “Fast and Simple GradientBased Optimization for Semi-Supervised Support Vector Machines”,pp 4-8. 6. D. Wang, S.C.H. Hoi, Y.He, and J. Zhu, “Retrieval based face annotations by weak label regularized local coordinate coding”, Proc. 19th ACM, intl. Conf. Multimedia pp. 353-362, 2011 7. Y. Tian, W. Liu, R, Xiao,, F. Wen, and X. Tang, “A Face Annotation Framework with Partial Clustering and Interactive Labelling,” Proc. IEEE Conf. Computer Vission and Pattern Recognition (CVPR), 2007 9. Matthieu Guillaumin, Jakob Verbeek and Cordelia Schmid, Grenoble, “Multimodal semi-supervised learning for image classification” Laboratoire Jean Kuntzmann. 10. Olivier Chapelle, Vikas Sindhwani, Sathiya S. Keerthi, “Optimization Techniques for Semi-Supervised Support Vector Machines” Journal of Machine Learning Research 9 (2008) 203233, Published Feb,2008. http://www.ijettjournal.org Page 65 International Journal of Engineering Trends and Technology (IJETT) – Volume 34 Number 2- April 2016 11. J. Cui, F. Wen, R. Xiao, Y. Tian and X. Tang, “EasyAlbum: An Interactive Photo Annotation System Based on Face Clustering and Re-Ranking,” Proc. SIGCHI Conf. Human Factors in Computing system (CHI), pp. 367-376, 2007. 12. Colin Campbell and Nello “Simple learning algorithms for training Support Vector Machines” 13. W. Dong, Z. Wang, W. Josephson, M. Charikar, and K. Li, “Modeling LSH for Performance Tuning,” Proc. 17th ACM Conf. Information and Knowledge Management (CIKM), pp. 669-678, 2008. 14.S.C.H. Hoi, R. Jin, J. Zhu, and M.R. Lyu,” Semi-supervised SVM Batch Mode Active Learning with Applications to Image Retrieval,”ACM Trans. Information Systems,vol.27,pp.1-29,2009. 15.Qi Gu, Zhifei Song, “Image Classification Using SVM, KNN and Performance Comparison with Logistic Regression” CS44. ISSN: 2231-5381 http://www.ijettjournal.org Page 66