FACE RECOGNITION IN UNCONSTRAINED ENVIRONMENTS - A LITERATURE SURVEY

Sreelakshmi K R
PG Scholar, CSE Department
College of Engg. Kidangoor
Kottayam, Kerala, India
Email Id: srilakshmy.kr@gmail.com

Anitha R
HOD, Computer Science Department
College of Engg. Kidangoor
Kottayam, Kerala, India
Email Id: anitharshibu@gmail.com

Abstract: Face recognition is one of the challenging areas in image processing, especially in unconstrained environments. The challenges arise from variations in face pose, expression, illumination, occlusion, etc. The same problems are encountered in forensic work on criminal identification. Law enforcement and intelligence agencies try to find a suspect with the aid of automatic face recognition (FR) systems using all available sources of information, which may include images, videos, or face sketches. This survey highlights the approaches suggested for unconstrained face recognition in the forensic area and suggests that multiple media based face recognition is the solution to adopt.

Keywords: Unconstrained face recognition, multiple media based face recognition, media collection.

I. INTRODUCTION

The growth of pattern recognition research has drawn considerable interest to face recognition technology over the past decades, and the performance of face recognition algorithms has advanced significantly as a result. Application areas include law enforcement and forensic agencies, PC logon, e-voting systems, surveillance systems, etc. As applications multiply, so does the demand for new algorithms with better accuracy. Face recognition in unconstrained environments is particularly challenging, and it is of great importance in application areas such as forensics and law enforcement. Face recognition is the task of identifying a human face that has been seen before.
Humans perform this task easily, several times every day, but machine recognition of faces is an interesting and challenging problem, which motivates automated face recognition. Unlike conventional face recognition, unconstrained face recognition systems work with images captured under unreliable conditions and without the subject’s cooperation [1]. The varying conditions include illumination variation, environmental impact, pose deviation, and occlusion due to several factors.

In either setting, face matching takes place and requires two sets of images: gallery images and probe images. Probe images are matched against gallery images to find a true match. Every face recognition technology (FRT) relies on a large database containing images or videos of identified people. Several datasets are available for face matching, some of them publicly. In all of these databases the faces are labeled after some training algorithms are applied, so the matching is against a labeled gallery. A general diagram of the automated face recognition process is shown in Fig. 1.

Fig. 1. Automated face recognition

The first step in automated face recognition is face detection: locating the faces in the given image so that their identity can be determined. Matching is then supported by feature extraction, which identifies invariant key features of both probe and gallery images; this includes finding the locations of the eyes, nose, and mouth and measuring the distances between these characteristic locations. Many face recognition algorithms with a wide range of features are now available, such as PCA (principal component analysis) using Eigenfaces, ANN (artificial neural networks), and algorithms using LBP (local binary pattern) features. Recognition nevertheless remains challenging because of the unreliable conditions under which images are captured.
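The probe-against-gallery matching step described above can be illustrated with a minimal sketch. It assumes that face detection and feature extraction have already reduced each face to a fixed-length feature vector; the gallery names, the feature values, and the choice of cosine similarity are illustrative assumptions and are not taken from any of the surveyed systems.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector carries no direction to compare
    return dot / (norm_a * norm_b)


def rank_gallery(probe, gallery):
    """Return (label, score) pairs sorted from best to worst match."""
    scores = [(label, cosine_similarity(probe, feat))
              for label, feat in gallery.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical labeled gallery: identity -> feature vector
# (for example, normalized distances between eye, nose, and mouth locations).
gallery = {
    "subject_a": [0.42, 0.31, 0.55],
    "subject_b": [0.10, 0.80, 0.20],
    "subject_c": [0.60, 0.10, 0.30],
}
probe = [0.40, 0.33, 0.52]  # hypothetical probe features

ranking = rank_gallery(probe, gallery)
best_label, best_score = ranking[0]
print(best_label, round(best_score, 3))
```

In a semi-automatic forensic setting, the full ranking rather than only the top entry would typically be handed to a human examiner.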
In the unconstrained face recognition problem, especially in forensic areas, face preprocessing beyond normalization is necessary, including pose correction of the faces in the image. Pose correction generally refers to making the face frontal, since matching benefits from frontal face image databases. Pose correction algorithms generate a 3D face model, from which a frontal 2D face image is rendered and matched against the gallery. A variety of modeling algorithms exist, along with software solutions.

This paper surveys various unconstrained face recognition methods, concentrating mainly on forensic areas, since face recognition has become an indispensable forensic tool for criminal investigators and law enforcement agencies.

II. LITERATURE SURVEY

Compared to conventional automated face recognition, face recognition in the forensic area is more difficult because it must handle facial images captured under uncontrolled conditions. In [2], improvements in forensic face recognition through research on facial aging, facial marks, forensic sketch recognition, face recognition in video, etc. are discussed. Limitations remain, however, due to pose variation, illumination, occlusion, etc.

Unconstrained face recognition has been studied extensively, but most studies consider a single medium as input to the system. Among them, two database oriented solutions are widely discussed and studied.

Face recognition has benefited greatly from the many databases used to study it. Most of these databases were created under restricted conditions to facilitate the study of particular parameters of the face matching problem.
These parameters include variations in position, lighting, facial expression, image background, camera quality, face occlusion, age, gender, and so on. Huang et al. [3] proposed a database called LFW: Labeled Faces in the Wild. It is an initial attempt to supply a set of labeled face photographs spanning the range of conditions people encounter in everyday life. The database contains 13,233 images of 5,749 different persons, each image 250 by 250 pixels in JPEG format. Of these persons, 1,680 have two or more images and the remaining 4,069 have only a single image. Each image is labeled with the name of the person pictured, and the face is centered in the image. Most of the images are in color, but some grayscale images are also present. The authors defined two “Views” of the database: one for algorithm development and the other for performance reporting. Several studies are based on the LFW dataset, including [7]-[12], but the results still vary across methods.

A next step in database creation for unconstrained environments is the recognition of faces from video. This is in growing demand in the forensic field, since most of the available evidence comes from ubiquitous surveillance cameras. When a crime occurs, the only proof available may be video captured by surveillance cameras, possibly of low quality, so recognition from video is very important for law enforcement agencies. Wolf et al. [4] proposed a comprehensive database of labeled videos of faces in uncontrolled conditions called “YouTube Faces”. Videos naturally provide more information about the person of interest than single images. Indeed, several existing methods have obtained remarkable recognition performance by exploiting the fact that a single face may be visible in several successive frames of a video.
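One simple way to exploit several frames of the same face is to score each frame against a gallery identity and then pool the per-frame scores into a single video-level score. The sketch below assumes the per-frame similarity scores are already available; the mean and max pooling rules and the example numbers are illustrative assumptions and do not reproduce any specific method from the cited works.

```python
from statistics import mean


def pool_frame_scores(frame_scores, method="mean"):
    """Combine per-frame similarity scores into one video-level score."""
    if not frame_scores:
        raise ValueError("at least one frame score is required")
    if method == "mean":
        return mean(frame_scores)
    if method == "max":
        return max(frame_scores)
    raise ValueError(f"unknown pooling method: {method}")


# Hypothetical similarity scores of one gallery identity against five
# successive frames of a probe video; the third frame is badly occluded.
frame_scores = [0.71, 0.68, 0.12, 0.74, 0.70]

video_score_mean = pool_frame_scores(frame_scores, "mean")
video_score_max = pool_frame_scores(frame_scores, "max")
print(f"mean-pooled: {video_score_mean:.2f}, max-pooled: {video_score_max:.2f}")
```

Mean pooling dampens the effect of a single poor frame, while max pooling keeps only the best frame; which rule works better depends on the data and is not settled by this sketch.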
Wolf et al. [4] also defined a new similarity measure, called the Matched Background Similarity. It is based on information from multiple frames while being robust to pose, lighting conditions, and other confounding factors. The YouTube Faces (YTF) database, released in 2011, is the video counterpart to LFW for unconstrained face recognition in videos; that is, the database contains videos of persons who have labeled images in the LFW database. It contains 3,425 videos of 1,595 persons. Unconstrained face recognition using YTF is studied in [5], [7], and [10], and video to video matching is performed by several authors in [15]-[18]. However, video face recognition in unconstrained environments demands much more attention, given the abundance of ubiquitous video sources of poor quality.

Because of varying and confounding factors such as pose, lighting, expression, occlusion, and low resolution, current face recognition technology deployed in forensic applications works in a semi-automatic manner: a human operator reviews the top matches from the system and manually determines the final match. It is therefore essential to investigate the accuracy achieved by both machines and humans on unconstrained face recognition tasks. A typical framework for measuring human accuracy on unconstrained faces in still images and videos is crowdsourcing on Amazon Mechanical Turk [4]. Amazon Mechanical Turk, or simply MTurk, is a website used for crowdsourcing from a large number of human participants. By crowdsourcing we mean retrieving valuable information through the combined effort of many participants, who are called “workers”. MTurk helps individuals known as “requesters” find solutions to tasks that are currently impossible for computers by posting Human Intelligence Tasks (HITs) and collecting the results from workers.
The process is handled economically, so that both workers and requesters benefit from it. This helps in analyzing human versus machine performance and is very useful in the forensic and security fields.

Face images in unconstrained environments usually show considerable pose variation, which reduces the performance of algorithms developed to recognize frontal faces. Some form of pose correction is therefore necessary prior to matching. There is a large body of work on pose normalization of faces, and the authors of [13] propose a face recognition framework able to handle pose variations within ±90° of yaw, where yaw is rotation about the vertical axis. The first step in this framework transforms the original pose-invariant face recognition problem into a partial frontal face recognition problem. A frontal 2D face image is then rendered from the generated 3D model and used for matching, which yields a better matching rate. Thus, in forensic identification, one major task is to pose correct the image before applying any matching algorithm. Several pose correction algorithms have been studied on different datasets [19]-[21], and their accuracies vary. Better face synthesis methods are therefore needed, able to generate facial shape and texture under varied poses by using more features in the image or by combining multiple synthesis strategies.

A detailed study of face detection in unconstrained images is conducted by Cheney et al. in [6]. They considered nine face detection frameworks acquired through government rights, open source, or commercial licensing. The dataset used for the analysis is IJB-A, a recently released unconstrained face recognition dataset containing 67,183 labeled faces in 5,712 images and 20,408 video frames.
The study finds that even the top performing detectors are still unable to detect faces with severe pose, partial occlusion, or poor illumination. FRT therefore still needs to advance.

From another perspective, existing face recognition systems can be classified into two broad categories: single media based unconstrained face recognition (UFR) and multiple media based UFR. Single media based UFR takes a single input, such as an image or a video. Multiple media based UFR takes as input everything available for a person of interest, including images, videos, and face sketches. Recently, NIST (National Institute of Standards and Technology) conducted a study on single media based and multiple media based UFR [14]. The results show, for the face datasets discussed so far, that recognition accuracy increases when multiple media sources are combined to recognize a person. FRGC v.2 (Face Recognition Grand Challenge) is an example of UFR based on multiple media: it matches a single image and a single 3D image against a single 3D image and yields 79% accuracy at 0.1% FAR (False Acceptance Rate).

These observations suggest that in unconstrained environments a single face media probe of poor quality may not give an adequate picture of a face. This motivates the use of a collection of face media, which can include any source of information available for a probe. It is thus important to establish how much face recognition accuracy improves when a set of face media of different types, even of diverse quality, is used as the probe. The survey therefore points toward multiple media based algorithms, and one work showing significant improvement in this area is [7], by Best-Rowden et al.
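To make figures of the form "accuracy at 0.1% FAR" concrete, the following sketch computes a true accept rate (TAR) at a fixed false accept rate (FAR) from comparison scores, and shows how scores from several media could be fused before thresholding. Mean fusion, the media names, and all score values are assumptions chosen for illustration; they are not the fusion scheme or the results reported in [7] or [14].

```python
def fuse_media_scores(scores_by_media):
    """Mean (sum-rule) fusion of similarity scores from several media."""
    return sum(scores_by_media.values()) / len(scores_by_media)


def tar_at_far(genuine, impostor, target_far):
    """TAR at the lowest threshold whose FAR does not exceed target_far.

    A comparison is accepted when its score is strictly above the threshold.
    """
    if not genuine or not impostor:
        raise ValueError("both score lists must be non-empty")
    for threshold in sorted(set(genuine) | set(impostor)):
        far = sum(s > threshold for s in impostor) / len(impostor)
        if far <= target_far:
            return sum(s > threshold for s in genuine) / len(genuine)
    return 0.0  # not reached: FAR is 0 at the highest score


# Hypothetical comparison scores, one dict per probe/gallery comparison.
genuine = [  # probe and gallery show the same person
    {"image": 0.62, "video": 0.70, "sketch": 0.55},
    {"image": 0.35, "video": 0.66, "sketch": 0.60},
    {"image": 0.80, "video": 0.75, "sketch": 0.58},
    {"image": 0.41, "video": 0.72, "sketch": 0.63},
]
impostor = [  # probe and gallery show different people
    {"image": 0.45, "video": 0.30, "sketch": 0.25},
    {"image": 0.20, "video": 0.35, "sketch": 0.40},
    {"image": 0.38, "video": 0.28, "sketch": 0.33},
    {"image": 0.15, "video": 0.22, "sketch": 0.30},
]

target_far = 0.001  # 0.1% FAR; with four impostors this allows no false accepts

tar_image = tar_at_far([g["image"] for g in genuine],
                       [i["image"] for i in impostor], target_far)
tar_fused = tar_at_far([fuse_media_scores(g) for g in genuine],
                       [fuse_media_scores(i) for i in impostor], target_far)
print(f"TAR at {target_far:.1%} FAR, image only: {tar_image:.2f}, fused: {tar_fused:.2f}")
```

With these made-up scores the fused probe separates genuine from impostor comparisons better than the still image alone, which is the kind of effect the multiple media argument relies on; real gains depend on the matcher and the media quality.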
The media collection proposed in [7] includes an image, one or more video frames of the subject, a forensic face sketch drawn by an expert artist, and demographic information (age, gender, race, etc.). The final aim is to recognize a person of interest from poor quality face images and videos, so anything available about the person should be used. While traditional face recognition methods generally take a single medium (i.e., a still face image, video track, 3D face model, 2D face image rendered from a 3D model, or face sketch) as input, [7] takes all of these sources as input to find a true mate for the person of interest. Fig. 2, from [7], shows the different sources that can be combined to identify a subject.

Fig. 2. A sample collection of face media for a subject [7]

The pose-corrected face images from the LFW database and the pose-corrected video frames from the YTF database used in [7] have also been made publicly available. Anyone can use the datasets for experiments via these URLs: http://vis-www.cs.umass.edu/lfw for the LFW database and http://www.cs.tau.ac.il/~wolf/ytfaces/ for the YTF database. This makes the idea attractive for further study.

Work in the direction of multiple media is further motivated by the case study in [22]. On April 15, 2013, two bombs exploded near the Boston Marathon (US), and it was a missed opportunity for the existing single media based automated face recognition systems. The existing technology could not identify the two suspects, who were brothers, even though images of both existed in the face database. This was the motivation behind the work in [7]. The two face recognition systems used in the case study were NEC NeoFace 3.1 and PittPatt 5.2.2. NeoFace [23] was the top matcher in the National Institute of Standards and Technology (NIST) Multiple Biometrics Evaluation (MBE) 2010 test. Yet both failed to identify the suspects in the Boston Marathon attack. More progress must therefore be made in overcoming challenges such as pose, resolution, and occlusion in order to increase the recognition accuracy of existing systems, and using multiple media of a person should give higher accuracy.

III. CONCLUSION

A set of face media can be used to better identify a subject in forensic face recognition. Pose correction algorithms should improve identification accuracy. With media fusion and a good 3D face modeling technique, FRT in unconstrained environments can achieve a high TAR (True Accept Rate) at a minimum FAR. Further investigation into better face media quality measures can improve the recognition rate and decrease the effort required in unconstrained environments. Our ongoing work investigates a fully algorithmic approach to multiple media based face recognition in unconstrained environments.

ACKNOWLEDGMENT

The authors are sincerely thankful to their guide, other faculty members, and friends for their support and help in completing this work.

REFERENCES

[1] A. K. Jain, B. Klare, and U. Park, “Face matching and retrieval in forensics applications,” IEEE Multimedia.
[2] A. K. Jain, B. Klare, and U. Park, “Face recognition: Some challenges in forensics,” in Proc. 9th IEEE Int. Conf. Automatic Face and Gesture Recognition, Santa Barbara, CA, USA, Mar. 2011.
[3] G. B. Huang, M. Ramesh, T. Berg, and E. Learned-Miller, “Labeled faces in the wild: A database for studying face recognition in unconstrained environments,” Univ. Massachusetts Amherst, Amherst, MA, USA, Tech. Rep. 07-49, Oct. 2007.
[4] L. Wolf, T. Hassner, and I. Maoz, “Face recognition in unconstrained videos with matched background similarity,” in Proc. IEEE Conf. CVPR, Jun. 2011.
[5] L. Best-Rowden, B. Klare, J. Klontz, and A. K. Jain, “Video-to-video face matching: Establishing a baseline for unconstrained face recognition,” in Proc. IEEE 6th Int. Conf. BTAS, Sep./Oct. 2013.
[6] J. Cheney, B. Klein, A. K. Jain, and B. F. Klare, “Unconstrained face detection: State of the art baseline and challenges.”
[7] L. Best-Rowden, “Unconstrained face recognition: Identifying a person of interest from a media collection,” IEEE Trans. Inf. Forensics Security, vol. 9, no. 12.
[8] K. Simonyan, O. M. Parkhi, A. Vedaldi, and A. Zisserman, “Fisher vector faces in the wild,” in Proc. BMVC, 2013.
[9] D. Chen, X. Cao, F. Wen, and J. Sun, “Blessing of dimensionality: High-dimensional feature and its efficient compression for face verification,” in Proc. IEEE Conf. CVPR, Jun. 2013.
[10] Y. Taigman, M. Yang, M. Ranzato, and L. Wolf, “DeepFace: Closing the gap to human-level performance in face verification,” in Proc. IEEE Conf. CVPR, Jun. 2014.
[11] Y. Sun, X. Wang, and X. Tang, “Deep learning face representation from predicting 10,000 classes,” in Proc. IEEE Conf. CVPR, Jun. 2014.
[12] S. Liao, Z. Lei, D. Yi, and S. Z. Li, “A benchmark study of large-scale unconstrained face recognition,” in Proc. IAPR/IEEE IJCB, Sep./Oct. 2014.
[13] C. Ding, C. Xu, and D. Tao, “Multi-task pose-invariant face recognition,” IEEE Trans. Image Process., vol. 24.
[14] National Institute of Standards and Technology (NIST). (Jun. 2013). Face Homepage. [Online]. Available: http://face.nist.gov
[15] B. Li and R. Chellappa, “A generic approach to simultaneous tracking and verification in video,” IEEE Trans. Image Process., vol. 11.
[16] K. Lee, J. Ho, M. Yang, and D. Kriegman, “Video-based face recognition using probabilistic appearance manifolds,” vol. 1.
[17] N. Ye and T. Sim, “Towards general motion-based face recognition,” in Proc. IEEE Conf. CVPR, 2010.
[18] X. Liu and T. Chen, “Video-based face recognition using adaptive hidden Markov models,” in Proc. IEEE Computer Society Conf. CVPR, vol. 1, 2003.
[19] Y. Lin, G. Medioni, and J. Choi, “Accurate 3D face reconstruction from weakly calibrated wide baseline images with profile contours,” in Proc. IEEE Conf. CVPR, Jun. 2010.
[20] A. Asthana, T. K. Marks, M. J. Jones, K. H. Tieu, and M. Rohith, “Fully automatic pose-invariant face recognition via 3D pose normalization,” in Proc. IEEE ICCV, Nov. 2011.
[21] C. P. Huynh, A. Robles-Kelly, and E. R. Hancock, “Shape and refractive index from single-view spectro-polarimetric images,” Int. J. Comput. Vis., vol. 101, no. 1.
[22] J. C. Klontz and A. K. Jain, “A case study on unconstrained facial recognition using the Boston marathon bombings suspects,” Michigan State Univ., Lansing, MI, USA, Tech. Rep. MSU-CSE-13-4, May 2013.
[23] www.nec.com/en/global/solutions/security/products/face_recognition.html