Semantic Object Selection
Ejaz Ahmed (1), Scott Cohen (2) and Brian Price (2)
(1) University of Maryland, College Park; (2) Adobe Research, San Jose

Interactive FG/BG Segmentation
- Magic Wand
- Graph Cut
- Intelligent Scissors
- Bayes Matting
- GrabCut

Goal and Motivation
[Figure: Image; User Input ("dog", "person"); Output]
- The goal is to reduce user effort for object selection.
- The user simply names the desired object (verbally, NLP, search box).
- E.g. PixelTone request: "Make the *cat* brighter".
- Important for advancing image editing via natural language input.

Related Work : Saliency
- Aims to select the main object in an image by determining which image regions are most "salient".
- Yan et al.: saliency is determined by optimizing an energy function that encourages pixels to be salient based on contrast.
- Cheng et al.: a similar approach combined with GrabCut.
- Rubinstein et al.: Object Discovery in internet images.
[Figure: results of Yan et al. and Cheng et al.]
- These methods select whichever region stands out.
- They cannot address semantic object selection (SOS) when the object of interest is not the most salient object.

Related Work : Semantic Segmentation
- Semantic segmentation attempts to automatically segment every object in the image.
- Object selection becomes trivial if we have such a labeling.
- This solves a much larger problem and does not focus on the object of interest.
- Requires a large amount of pre-labeled data.
- The predetermined label set may not contain the desired object.
- Computationally expensive (large number of classifiers).
- Exceptions: the non-parametric methods of Liu et al. and Tighe et al.

Related Work : Product Search
- The user takes a clean, close-up picture of a product, and the goal is to find similar products in a database.
- Iterates between localized image retrieval and selection estimation.
- The product image DB has object masks; these masks are transferred to the query image.
- Similar goal to ours, but with simplified input (rigid object, large and centered, little viewpoint variation).
- Masks for the database images are required.

Proposed Approach
[Diagram labels: Retrieved Dog Images; Background Images; Input Image + "Dog"; Text-Image DB; Segmentation; Localization Information; Localization/Confidence Map; MRF / Graph Cut]

Localization

Localization : Overview
- Input: name of the object, input image.
- Output: object location prior for the target image.
- Construct an image retrieval database: positives corresponding to the desired object, plus generalized negatives.
- Break the target image into object proposals.
- Validate the presence of the object in the object proposals.
- Use SIFT flow to transfer the location associated with validating exemplar areas to the target image.

Exemplar Data
[Figure]

Negative Data
[Figure]

Use Object Detection?
- Having positives and negatives makes this a detection problem.
- State-of-the-art detectors can be used.
- Such detectors are expensive to train.
- Exemplar positives are downloaded from the internet and therefore have a lot of variation. Difficult to train a meaningful detector?
- We leverage concepts from image retrieval to solve this.

Spatially Constrained Similarity Measure
- Shen et al. (CVPR 2012) propose a spatially constrained similarity measure.
- BoW: loss of spatial information.
- A voting-based scheme is used to get the similarity measure.
[Figure: Query Image; Database; Retrieved Images (Shen et al., CVPR 2012)]

Detection via Object Proposal Validation
- Query each object proposal against the exemplar database.
- For each visual word k in the query, retrieve the image IDs and locations of k in these images using inverted files.
- Object center locations and scores are then determined.
- This results in a ranking of all the exemplars in the database.
- Potential locations of the object in the exemplars are also obtained.
- Consider the top t retrieved results for object verification.
- Some exemplars in the top t belong to the object, others to the background.
- It may also happen that an exemplar belongs to the object but the localization is not centered on the exemplar.
- Majority voting over the top t determines whether the object proposal contains the desired object.
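
The top-t majority vote over retrieved exemplars can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `proposal_contains_object`, the `(label, score)` tuple layout, the label strings, and the strict-majority threshold are all assumptions made for illustration. The sketch also ignores whether the localization is centered on the exemplar.

```python
from typing import List, Tuple


def proposal_contains_object(
    ranked_exemplars: List[Tuple[str, float]],
    t: int = 5,
    positive_label: str = "object",
) -> bool:
    """Decide whether an object proposal contains the desired object.

    ranked_exemplars: hypothetical retrieval output for one proposal, as
        (label, similarity_score) pairs sorted by descending score, where
        label is "object" for a positive exemplar and "background" for a
        generalized negative.
    t: number of top-ranked exemplars allowed to vote.
    """
    if t <= 0:
        raise ValueError("t must be positive")

    # Only the top-t retrieved exemplars take part in the vote.
    top = ranked_exemplars[:t]

    # Count votes from positive (object) exemplars.
    positive_votes = sum(1 for label, _score in top if label == positive_label)

    # Assumption: a strict majority of the voters is required.
    return positive_votes > len(top) / 2


if __name__ == "__main__":
    ranked = [
        ("object", 0.91),
        ("background", 0.84),
        ("object", 0.77),
        ("object", 0.62),
        ("background", 0.55),
    ]
    print(proposal_contains_object(ranked, t=5))
```

In this toy ranking, three of the five voters are positive exemplars, so the proposal would be accepted; with `t=2` the vote is tied and the strict-majority rule rejects it.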

Detection via Object Proposal Validation : Examples
[Figures, 2 slides: Object Proposal; Top 5 Retrieved Results; Localization]

Location Prior
- Retrieved images have no ground-truth mask.
- Our object proposals match uncluttered retrieved images (objects on a white background).
- We use saliency on the retrieved images to obtain a soft mask.
- This mask is transferred to the object proposal using SIFT flow warping.
[Figure: Location Prior]

Segmentation

Segmentation : Framework
[Diagram labels: Input Image; tag: "dog"; localization module (Waldo, Object Proposal, SIFT Flow, Saliency); location prior; fg prob; bg prob; update appearance model; graph cut; final segmentation]

Result : MSRC

Class     Ours    Object Discovery  Joulin '12  Kim    Joulin '10  Mukherjee
Bike      55.34   54.08             43.3        29.9   42.3        42.8
Bird      64.58   67.33             47.7        29.9   33.2        -
Car       66.85   66.74             59.7        37.1   59.0        52.5
Cat       70.66   66.17             31.9        28.7   37.6        39.4
Chair     60.30   62.21             39.6        28.7   37.6        39.4
Cow       78.48   79.39             52.7        33.5   45.0        26.1
Dog       69.14   67.47             41.8        33.0   41.3        -
Plane     58.85   56.71             21.6        25.1   21.7        33.4
Sheep     81.17   78.86             66.3        60.8   60.4        45.7
Average   67.26   66.55             44.96       34.08  42.01       39.9

MSRC Result : Qualitative
[Figure]

MSRC : Qualitative Comparison
[Figures, 3 slides. Columns: Source Image; Ours; Object Discovery; Joulin '12; Joulin '10]

100 ImageNet Dog and 100 Object Discovery Aeroplanes

Dataset                     Ours    Object Discovery  Joulin '10  Joulin '12  DPM + GrabCut  Centered + GrabCut  GrabCut on GT BB (upper bound)
Object Discovery Aeroplane  64.27   55.81             15.36       11.72       39.47          37.29               50.87
Object Discovery Car        71.84   64.42             37.15       35.15       68.00          64.96               80.82
Object Discovery Horse      55.08   51.65             30.16       29.53       50.12          48.89               65.99
ImageNet Dog                69.91   -                 28.65       24.69       48.24          34.53               79.52

Object Discovery : Qualitative
[Figure]

Object Discovery : Qualitative Comparison
[Figure. Labels: Ours; OD; Joulin; DPM + GrabCut]

Conclusion and Future Work
- A far simpler interface for object selection.
- A step towards advancing image editing via NLP.
- The new method for validating object proposals can be extended to object detection.
- Future work includes taking complex queries from the user (dog on right or tree, etc.).

Thank You!!!