WHO IS HERE: LOCATION AWARE FACE RECOGNITION
Wang, Z., et al.
Presented by: Kayla Henneman
October 27, 2014

INTRODUCTION

CHALLENGES
• Many facial expressions
• Changes in appearance
  • Hair style
  • Cosmetics
  • With or without glasses
• Illumination
• Varying viewpoints

SOLUTION
• Use location to narrow down the search space
  • An increasing number of photos are taken with mobile devices
  • Use the location information associated with a photo to narrow down who is in the photo

FACE RECOGNITION PROBLEM
GIVEN:
• Training set: a set of face images labeled with each person's identity
• Testing set: a set of unlabeled photos of the same group of people
GOAL:
• Identify each person in the testing photos

ASSUMPTIONS
• A user has different probabilities of appearing in a photo depending on the location
  • Example: Alice lives in Palo Alto, CA. Pictures taken at Alice's home have a high probability of showing Alice, Alice's family, or Alice's friends, and a low probability of showing someone from Norfolk, VA
• When trying to identify someone in a photo, only compare against photos taken at places where that person usually appears

HOW IT WORKS
• Each face image is associated with a location
• The server creates clusters of locations from the training set
• Each location cluster contains the set of users who have photos at that location, their photos, and photos of their friends
• The client can take a photo, attach its location information, send it to the server, and query who is in the photo
• The server answers the query and returns the identification of the person in the photo

CHALLENGES
• How to form the location clusters and choose the granularity of the locations
• How to process the photo and extract useful features
• How to search smartly in order to recognize the face and identify the person
• How to accelerate the entire process and avoid a long response time on the client side

MAIN CONTRIBUTIONS (1)
• Make use of the location information in mobile-taken photos and propose a face recognition algorithm that reduces the search space
• Build a hybrid face recognition algorithm
FIRST SEARCH AND MATCH PHOTOS WITHIN THE GIVEN LOCATION; IF THIS FAILS, SEARCH OVER ALL PHOTOS

MAIN CONTRIBUTIONS (2)
• Take social network information into account
• When a user appears frequently at a location, the user's friends also have a high probability of showing up at that location
FRIENDS' PHOTOS ARE USED TO TRAIN THE FACE CLASSIFIER FOR THE LOCATION

MAIN CONTRIBUTIONS (3)
• Transmit a compressed face descriptor to the server for the query, rather than sending the original image
SAVES NETWORK TRAFFIC AND REDUCES RESPONSE TIME

FRAMEWORK
• Client side:
  • The user takes a photo of people on a mobile phone and sends the recognition query to the server over the wireless network
  • Face features and location information are transmitted to the server for recognition
• Server side:
  • Organizes the face database by location
  • Maintains a backup classifier trained on all images in the database
  • Sends back the identification result

LOCATION CLUSTERING
GIVEN A COLLECTION OF LABELED PHOTOS WITH GEO-LOCATION INFORMATION, USE AGGLOMERATIVE CLUSTERING TO DISCOVER LOCATION CLUSTERS
• Treat each geo-location, given by its longitude and latitude, as a point in two-dimensional space
• Initially, there are n points assigned to n clusters
• In each iteration, merge the two clusters whose distance is the minimum among all pairs of clusters, using the average-linkage distance
  d(A, B) = \frac{1}{|A|\,|B|} \sum_{a \in A} \sum_{b \in B} d(a, b)
• Keep merging clusters until the minimum distance in an iteration exceeds a threshold or the desired number of clusters is reached
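The following is a minimal Python sketch of the agglomerative clustering step described above, assuming plain Euclidean distance on (latitude, longitude) pairs; the function and parameter names (cluster_locations, dist_threshold, max_clusters) are illustrative, not from the paper.

    import numpy as np

    def cluster_locations(points, dist_threshold, max_clusters=None):
        # Agglomerative clustering of (latitude, longitude) points with average
        # linkage, as on the LOCATION CLUSTERING slide.
        pts = np.asarray(points, dtype=float)
        clusters = [[i] for i in range(len(pts))]    # start with n singleton clusters

        def avg_link(a, b):
            # d(A, B) = (1 / (|A||B|)) * sum over a in A, b in B of d(a, b)
            diffs = pts[a][:, None, :] - pts[b][None, :, :]
            return np.linalg.norm(diffs, axis=-1).mean()

        while len(clusters) > 1:
            # find the pair of clusters with the smallest average-linkage distance
            pairs = [(i, j) for i in range(len(clusters))
                            for j in range(i + 1, len(clusters))]
            i, j = min(pairs, key=lambda p: avg_link(clusters[p[0]], clusters[p[1]]))
            if avg_link(clusters[i], clusters[j]) > dist_threshold:
                break                                # closest pair is already too far apart
            if max_clusters is not None and len(clusters) <= max_clusters:
                break                                # desired number of clusters reached
            clusters[i].extend(clusters[j])          # merge cluster j into cluster i
            del clusters[j]
        return clusters                              # lists of photo indices per location

    # Example: the Palo Alto / Norfolk scenario from the ASSUMPTIONS slide
    photos = [(37.44, -122.14), (37.45, -122.15), (37.44, -122.16),   # Palo Alto, CA
              (36.85, -76.29), (36.86, -76.28)]                       # Norfolk, VA
    print(cluster_locations(photos, dist_threshold=0.1))
    # expected: two clusters, one per city, e.g. [[0, 1, 2], [3, 4]]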
FACE FEATURE
• Training a classifier for each location:
  • Convert the photos associated with the location into feature descriptors
• Describing each face:
  • Adopt a local-descriptor-based face feature pipeline
• Detecting faces:
  • A Viola-Jones face detector is used to detect face patches
  • A nested nose detector is then applied
  • Face patches are normalized to the same size

FACE FEATURE
• Use an algorithm to detect facial landmarks
• Align each face patch
• Remove the effects of illumination
FROM EACH LANDMARK, TWO SIFT DESCRIPTORS OF DIFFERENT SCALES ARE EXTRACTED AND CONCATENATED TO FORM THE FACE FEATURE DESCRIPTOR

SYSTEM
• Server:
  • Face descriptors are extracted from the photos and used to train a Support Vector Machine (SVM) classifier
  • Each location has its own SVM classifier and is represented by the coordinates of its cluster center
  • When a query is received, the server checks the location information and finds the nearest location in the database
  • This location is used for face recognition
  • A confidence score is defined; if the confidence score is too low, the backup database is used (a code sketch of this query flow is given after the CONCLUSION slide)
• Mobile client:
  • The face descriptor is computed using the pipeline above
  • The client compresses the descriptor and sends it to the server

EVALUATIONS
• Dataset:
  • 2,001 images
  • 60 people
  • 6 locations
  • Names and social network relations among the 60 people are known

FACE RECOGNITION ACCURACY
EVALUATION SUMMARY
• Accuracy of 5 tests at a particular location
• 80% of the images were used as the training set; 20% as the testing set

COMPARISON OF PERFORMANCE
• Compared with a baseline method, i.e., the same method without location information

CONCLUSION
• Using location seems to improve accuracy
• Limitations:
  • Only supports finding people who are already in the dataset
• Future work:
  • Scalability: more locations will pose an issue
  • Increase the training set incrementally through the social network or crowd wisdom
  • Handle the case when people move from one location to another
  • Predict which locations a person will appear in, since one person can be at many locations
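As a companion to the SYSTEM slide, the sketch below illustrates one way the server-side query flow could be implemented: a per-location SVM selected by nearest cluster center, a top-class-probability confidence score, and a fall-back to the backup classifier trained on all photos. The class and parameter names (LocationAwareServer, location_data, min_conf), the use of scikit-learn's SVC, and the exact confidence-score definition are assumptions for illustration; the paper does not publish its implementation.

    import numpy as np
    from sklearn.svm import SVC

    class LocationAwareServer:
        def __init__(self, location_data, all_descriptors, all_labels, min_conf=0.6):
            # location_data: {(lat, lon) cluster center: (descriptors, labels)} for the
            # users (and their friends) whose photos fall into that location cluster
            self.centers = np.array(list(location_data.keys()))
            self.local_clfs = [SVC(kernel="linear", probability=True).fit(X, y)
                               for X, y in location_data.values()]
            # backup classifier trained on every face in the database
            self.backup_clf = SVC(kernel="linear", probability=True).fit(all_descriptors, all_labels)
            self.min_conf = min_conf                 # assumed confidence threshold

        def query(self, descriptor, lat, lon):
            # 1. find the location cluster whose center is nearest to the photo's geo-tag
            idx = int(np.argmin(np.linalg.norm(self.centers - np.array([lat, lon]), axis=1)))
            clf = self.local_clfs[idx]
            # 2. classify with the per-location SVM; use the top class probability as
            #    the confidence score (one plausible definition, not necessarily the paper's)
            proba = clf.predict_proba([descriptor])[0]
            if proba.max() >= self.min_conf:
                return clf.classes_[int(np.argmax(proba))]
            # 3. confidence too low: fall back to the classifier trained on all photos
            proba = self.backup_clf.predict_proba([descriptor])[0]
            return self.backup_clf.classes_[int(np.argmax(proba))]

Training the per-location classifiers only on the people (and their friends) seen at that location is what shrinks the search space; the backup classifier preserves correctness when the location prior is wrong.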