Semantic Object Selection

Ejaz Ahmed¹, Scott Cohen² and Brian Price²
¹ University of Maryland, College Park
² Adobe Research, San Jose
Interactive FG/BG Segmentation
Magic Wand
Graph Cut
Intelligent Scissors
Bayes Matting
Grab Cut
Goal and Motivation
[Figure: columns Image, User Input (“dog”, “person”), Output]
• The goal is to reduce user effort for object selection.
• The user simply names the desired object (verbally, via NLP, or in a search box).
• E.g. the PixelTone request “Make the *cat* brighter”.
• Important for advancing image editing via natural language input.
Related Work : Saliency
• Saliency methods aim to select the main object in an image by determining which image regions are most “salient”.
• Yan et al.: saliency is determined by optimizing an energy function.
• Encourages pixels to be salient based on contrast.
• Cheng et al.: a similar approach combined with GrabCut.
• Rubinstein et al.: Object Discovery in internet images.
[Figure: example results from Yan et al. and Cheng et al.]
• These methods select whichever region stands out.
• They cannot address Semantic Object Selection (SOS) when the object of interest is not the most salient object.
Related Work : Semantic Segmentation
• Semantic segmentation attempts to automatically segment every object in the image.
• Object selection becomes trivial if we have such a labeling.
• This solves a much larger problem and does not focus on the object of interest.
• Requires a large amount of pre-labeled data.
• The predetermined label set may not contain the desired object.
• Computationally expensive (large number of classifiers).
• Exceptions: the non-parametric methods of Liu et al. and Tighe et al.
Related Work : Product Search
• The user takes a clean, close-up picture of a product, and the goal is to find similar products in a database.
• Iterates between localized image retrieval and selection estimation.
• The product image database has object masks.
• These masks are transferred to the query image.
• Similar goal to ours, but with simplified input (rigid object, large and centered, little viewpoint variation).
• Masks for the database images are required.
Proposed Approach
[Figure: pipeline overview. Labels: Input Image + “Dog”; Text-Image DB; Retrieved Dog Images; Background Images; Localization Information; Localization/Confidence Map; Segmentation; MRF/Graph Cut]
Localization
Localization : Overview
• Input: name of the object, input image.
• Output: object location prior for the target image.
• Construct an image retrieval database.
• Positives corresponding to the desired object, plus generalized negatives.
• Break the target image into object proposals.
• Validate the presence of the object in the object proposals.
• Use SIFT flow to transfer the location associated with validating exemplar areas to the target image.
Exemplar Data
[Figure: example exemplar images]
Negative Data
[Figure: example negative images]
Use Object Detection?
• Having positives and negatives makes this a detection problem.
• State-of-the-art detectors could be used.
• Such detectors are expensive to train.
• Exemplar positives are downloaded from the internet and therefore have a lot of variation.
• Difficult to train a meaningful detector?
• We leverage concepts from image retrieval to solve this.
Spatially Constrained Similarity Measure
• Shen et al. (CVPR 2012) propose a spatially constrained similarity measure.
• Bag-of-words (BoW) matching loses spatial information.
• A voting-based scheme is used to compute the similarity measure.
[Figure: Query Image, Database, Retrieved Images (Shen et al., CVPR 2012)]
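A minimal sketch of the voting idea, assuming features are already quantized into visual words and that exemplar feature positions are stored relative to the exemplar's object center. The function names, the grid cell size, and the plain vote counting are illustrative assumptions; the measure of Shen et al. also handles details such as word weighting and scale changes, which are left out here.

```python
from collections import defaultdict


def build_inverted_file(exemplars):
    """Map each visual word to the (image_id, x, y) places where it occurs.

    `exemplars` is {image_id: [(word, x, y), ...]}. Positions are offsets
    from the exemplar's object center (an assumption of this sketch).
    """
    inverted = defaultdict(list)
    for image_id, features in exemplars.items():
        for word, x, y in features:
            inverted[word].append((image_id, x, y))
    return inverted


def vote_for_centers(query_features, inverted, cell=10):
    """Accumulate object-center votes per exemplar on a coarse grid.

    Returns {image_id: (score, (cx, cy))}: the vote count of the best grid
    cell and that cell's midpoint in query coordinates.
    """
    votes = defaultdict(lambda: defaultdict(int))
    for word, qx, qy in query_features:
        for image_id, ex, ey in inverted.get(word, []):
            # The matched exemplar feature sits at offset (ex, ey) from its
            # center, so the implied center in the query is (qx-ex, qy-ey).
            cx, cy = qx - ex, qy - ey
            votes[image_id][(cx // cell, cy // cell)] += 1
    best = {}
    for image_id, grid in votes.items():
        (gx, gy), score = max(grid.items(), key=lambda kv: kv[1])
        best[image_id] = (score, (gx * cell + cell // 2, gy * cell + cell // 2))
    return best


def rank_exemplars(best):
    """Sort exemplars by their best-cell vote count, highest first."""
    return sorted(best.items(), key=lambda kv: kv[1][0], reverse=True)
```

Unlike a plain BoW histogram comparison, matches only reinforce each other here when they agree on where the object center is, which is how the spatial information is kept.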
Detection via Object Proposal Validation
• Query each object proposal against the exemplar database.
• For each visual word k in the query, retrieve the image IDs and locations of k in those images using inverted files.
• The object center location and score are then determined.
• This results in a ranking of all the exemplars in the database.
• Potential locations of the object in the exemplars are also obtained.
• Consider the top t retrieved results for object verification.
• Some exemplars in the top t belong to the object, others to the background.
• It can also happen that an exemplar belongs to the object but the localization is not centered on the exemplar.
• Majority voting over the top t determines whether the object proposal contains the desired object.
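The majority vote over the top t results can be sketched as follows. The data layout, the function name `validate_proposal`, and the centering tolerance `tol` are assumptions made for illustration; in particular, discarding votes whose localization is far from the proposal center is only one plausible way to handle the off-center case mentioned above.

```python
def validate_proposal(retrieved, positive_ids, proposal_size, t=5, tol=0.25):
    """Decide by majority vote whether a proposal contains the object.

    `retrieved` is a best-first list of (image_id, score, (x, y)), where
    (x, y) is the voted object center in proposal coordinates.
    `positive_ids` is the set of exemplar IDs that show the desired object.
    A retrieved result counts as a vote for the object only if it is a
    positive exemplar and its center lies within `tol` of the proposal
    center (as a fraction of the proposal width and height).
    """
    width, height = proposal_size
    center_x, center_y = width / 2, height / 2
    top = retrieved[:t]
    if not top:
        return False
    votes = 0
    for image_id, _score, (x, y) in top:
        centered = (abs(x - center_x) <= tol * width
                    and abs(y - center_y) <= tol * height)
        if image_id in positive_ids and centered:
            votes += 1
    # Strict majority of the results that were actually considered.
    return votes > len(top) / 2
```

With t = 5, at least three of the top results must be well-centered positives for the proposal to be accepted.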
Detection via Object Proposal Validation
[Figure (two slides): Object Proposal, Top 5 Retrieved Results, Localization]
Location Prior
• Retrieved images have no ground-truth mask.
• Our object proposals match uncluttered retrieved images (objects on a white background).
• We use saliency on the retrieved images to obtain a soft mask for each of them.
• This mask is transferred to the object proposal using SIFT flow warping.
[Figure: Location Prior]
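The mask transfer step can be illustrated with a small warping routine. This is only a sketch: it assumes a dense flow field has already been computed (for example by SIFT flow, which is not implemented here) and that the flow stores, for every proposal pixel, the integer displacement to its matching exemplar pixel. The name `warp_mask` and the nested-list image layout are hypothetical.

```python
def warp_mask(mask, flow):
    """Warp an exemplar's soft mask onto an object proposal.

    `mask[y][x]` is a saliency value in [0, 1] on the exemplar image.
    `flow[y][x]` is (dx, dy): proposal pixel (x, y) matches exemplar
    pixel (x + dx, y + dy). Pixels whose match falls outside the
    exemplar receive a prior of 0.
    """
    height, width = len(flow), len(flow[0])
    src_height, src_width = len(mask), len(mask[0])
    warped = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            dx, dy = flow[y][x]
            sx, sy = x + dx, y + dy
            if 0 <= sx < src_width and 0 <= sy < src_height:
                warped[y][x] = mask[sy][sx]
    return warped
```

The warped soft mask then serves as a per-pixel location prior on the proposal; how priors from several exemplars are merged is not specified on the slides and is left open here.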
Segmentation
Segmentation : Framework
[Figure: framework diagram. Labels: Input Image + tag: “dog”; localization module (Waldo, Object Proposal, SIFT Flow, Saliency); location prior; update appearance model (fg prob, bg prob); graph cut; final segmentation]
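One way to read the framework is that the foreground/background appearance probabilities and the location prior are combined into per-pixel costs that the graph cut then minimizes together with a smoothness term. The sketch below covers only that data term, under the assumption that the two sources are multiplied (summed in the negative-log domain); the slides do not state the exact combination, and the function names are hypothetical. The final helper labels pixels from the data term alone and is not a graph cut.

```python
import math


def unary_costs(fg_prob, bg_prob, prior, eps=1e-6):
    """Per-pixel (fg_cost, bg_cost) from appearance and location prior.

    `fg_prob` and `bg_prob` are appearance-model probabilities and
    `prior` is the location prior, all as nested lists of equal shape.
    Costs are negative log-probabilities, clamped by `eps` to stay finite.
    """
    costs = []
    for fg_row, bg_row, prior_row in zip(fg_prob, bg_prob, prior):
        row = []
        for p_fg, p_bg, p_loc in zip(fg_row, bg_row, prior_row):
            fg_cost = -math.log(max(p_fg * p_loc, eps))
            bg_cost = -math.log(max(p_bg * (1.0 - p_loc), eps))
            row.append((fg_cost, bg_cost))
        costs.append(row)
    return costs


def label_without_smoothness(costs):
    """Pick the cheaper label per pixel (1 = foreground, 0 = background).

    A real graph cut would also penalize label changes between
    neighboring pixels; this helper ignores that term entirely.
    """
    return [[1 if fg < bg else 0 for fg, bg in row] for row in costs]
```

In an iterative scheme, the labeling from one round would be used to re-estimate the appearance model before the costs are recomputed.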
Result : MSRC
Class     Ours    Object Discovery  Joulin '12  Kim    Joulin '10  Mukherjee
Bike      55.34   54.08             43.3        29.9   42.3        42.8
Bird      64.58   67.33             47.7        29.9   33.2        -
Car       66.85   66.74             59.7        37.1   59.0        52.5
Cat       70.66   66.17             31.9        28.7   37.6        39.4
Chair     60.30   62.21             39.6        28.7   37.6        39.4
Cow       78.48   79.39             52.7        33.5   45.0        26.1
Dog       69.14   67.47             41.8        33.0   41.3        -
Plane     58.85   56.71             21.6        25.1   21.7        33.4
Sheep     81.17   78.86             66.3        60.8   60.4        45.7
Average   67.26   66.55             44.96       34.08  42.01       39.9
MSRC Result : Qualitative
MSRC : Qualitative Comparison
[Figure (three slides): Source Image, Ours, Object Discovery, Joulin '12, Joulin '10]
100 ImageNet Dog and 100 Object Discovery Aeroplanes
Dataset           Class      Ours   Object Discovery  Joulin '10  Joulin '12  DPM+Grabcut  Centered+Grabcut  Grabcut on GT BB (upper bound)
Object Discovery  Aeroplane  64.27  55.81             15.36       11.72       39.47        37.29             50.87
Object Discovery  Car        71.84  64.42             37.15       35.15       68.00        64.96             80.82
Object Discovery  Horse      55.08  51.65             30.16       29.53       50.12        48.89             65.99
ImageNet          Dog        69.91  -                 28.65       24.69       48.24        34.53             79.52
Object Discovery : Qualitative
Object Discovery : Qualitative Comparison
[Figure: Ours, OD (Object Discovery), Joulin, DPM + Grabcut]
Conclusion and Future Work
• A far simpler interface for object selection.
• A step towards advancing image editing via NLP.
• The new method for validating object proposals can be extended to object detection.
• Future work includes handling complex queries from the user (dog on right or tree, etc.).
Thank You!!!