Presentation - eBay Research Labs | eBay Research Labs

Using Attributes to
Describe What People
Wear
Andy Gallagher
October 14, 2013
with Huizhong Chen and Bernd Girod
Objective
Attribute
learning
List of attributes
Men’s
Black color
Sweater
Long sleeve
Solid pattern
Low skin exposure
…
Outline
 Attributes
 Describing Clothing with Attributes
 ! Miscellaneous Topics !
Attributes
Attributes
 Describing objects by their attributes, A
Farhadi, I Endres, D Hoiem, D Forsyth
Computer Vision and Pattern Recognition,
2009. CVPR 2009
 Learning To Detect Unseen Object Classes
by Between-Class Attribute Transfer, C.
Lampert, H. Nickisch, S. Harmeling, CVPR
2009
 Many others
Computer Vision
image
features
classification
Computer Vision
image
features
classification
[ .1  -.9  .1  .231  -.1 ]
?
Computer Vision
image
features
What feature representation should we use?
classification
Computer Vision
image
[ .1  -.9  .1  .231  -.1 ]
features
attributes
classification
Has hair, has
skin, has ear, has
eye, has arms
Now we
can talk…
Attributes
 Properties shared by many objects
 Explicit semantics
 Facilitate human-CPU communication
 Materials (glass, fur, wood, etc.)
 Parts (has wheel, has tail, etc.)
 Shape (boxy, cylindrical, etc.)
Based on a slide by David Forsyth
Example Attributes
Face Tracer Image Search
“Smiling Asian Men With Glasses”
Kumar et al., 2008
Example Attributes
Farhadi et al. 2009
Example Attributes
Lampert et al. 2009
Slide credit: Devi Parikh
Example Attributes
Welinder et al. 2010
Slide credit: Devi Parikh
Attribute Models
 Classifiers for binary attributes
Kumar et al. 2010
Slide credit: Devi Parikh
Why attributes?
 How humans naturally describe visual
concepts
 Image search
I want elegant
silver sandals
with high heels
Slide credit: Devi Parikh
Example Attributes
Verification
classifier
SAME
Kumar et al., 2010
Why attributes?
 An okapi is a mammal with a reddish dark
back, with striking horizontal white stripes
on the front and back legs. (Wikipedia)
Zero-shot Learning
 Aye-ayes
 Are nocturnal
 Live in trees
 Have large eyes
 Have long middle fingers
Which one of these is an aye-aye?
Humans can learn from descriptions (zero examples).
Slide adapted from Christoph Lampert by Devi Parikh
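The idea of recognizing a class from its description alone can be sketched in a few lines of Python. This is only an illustration, not the method of any cited paper: the attribute names echo the slide, while the other classes, the signature values, and the function name `zero_shot_classify` are all hypothetical. Each class is scored by how well its attribute description explains the attribute classifiers' outputs, treating attributes as independent.

```python
import math

# Hypothetical attribute signatures for illustration only. The attribute names
# echo the slide; the classes other than aye-aye and all values are made up.
ATTRIBUTES = ["nocturnal", "lives_in_trees", "large_eyes", "long_middle_finger"]
CLASS_SIGNATURES = {
    "aye-aye": [1, 1, 1, 1],
    "squirrel": [0, 1, 0, 0],
    "owl": [1, 1, 1, 0],
}


def zero_shot_classify(attribute_probs, signatures=CLASS_SIGNATURES):
    """Return the class whose attribute description best explains the predictions.

    attribute_probs[i] is an estimate of P(attribute i is present | image).
    Each class is scored by the log-likelihood of its binary signature,
    treating the attributes as independent (an assumption of this sketch).
    """
    eps = 1e-9  # guards against log(0)
    best_class, best_score = None, -math.inf
    for name, signature in signatures.items():
        score = 0.0
        for p, present in zip(attribute_probs, signature):
            score += math.log(max(p if present else 1.0 - p, eps))
        if score > best_score:
            best_class, best_score = name, score
    return best_class
```

No training image of an aye-aye is needed here: only the attribute classifiers are learned from data, and the class is introduced purely through its description.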
Is this a giraffe?
No.
Is this a giraffe?
Yes.
Is this a giraffe?
No.
Slide credit: Devi Parikh
Parkash and Parikh, 2012
Current belief
Focused feedback
Knowledge of the world
I think this is a
giraffe. What
do you think?
No, its neck is
too short for it
to be a giraffe.
 Learner learns better from its mistakes
 Accelerated discriminative learning with few
examples
[Animals with even shorter necks]
Ah! These must
not be giraffes
either then.
……
Feedback on one, transferred to many
Slide credit: Devi Parikh
Which Attributes to Describe?
(a)
(b)
(c)
(f)
(d)
(e)
Please choose a person to the left of the person who is
frowning
Sadovnik et al. 2013
Describing Clothing with Attributes
Objective
Attribute
learning
List of attributes
Men’s
Black color
Sweater
Long sleeve
Solid pattern
Low skin exposure
…
Recommend and Analyze
Recommendations
Formal
Sport
Related Work
Person identification with clothing
 Bounding box under face [Anguelov, 2007]
 Clothing segmentation [Gallagher, 2008]
Dataset Preparation
 1856 people from the web.
 Images are unconstrained.
Dataset Preparation
$400 spent for collecting 283,107 labels on
Amazon Mechanical Turk (AMT).
3 Multiclass
23 Binary
Dataset Statistics
The System
[Diagram: Pose estimation -> Feature extraction & quantization -> Feature 1 … Feature N -> SVM 1 … SVM N -> Combine features -> SVM -> Attribute classifier 1 … Attribute classifier M -> Multi-attribute CRF inference (A: attribute, F: feature) -> Predictions]
Predictions: Blue, Solid pattern, Outerwear, Wear scarf, Long sleeve, …

Pose Estimation [Eichner et al., 2010]
 Perform upper body detection using complementary results from a face detector and deformable part models.
 Foreground highlighting within the enlarged upper body bounding box.
 Parse the upper body into head, torso, and the upper and lower parts of the left and right arms.
Feature Extraction
 SIFT descriptor extracted over the
sampling grid.
 Similar procedure for the arm regions.
Feature Extraction
 Maximum Response Filters [Varma 2005]
 LAB color
 Skin probability
MRF bank
RGB
image
Skin
probability
Feature Extraction
 Raw features are quantized using soft K-means (K=5 in our implementation).
 Quantized features are aggregated over
various body regions, by max or average
pooling.
Feature type: SIFT, Texture, Color, Skin probability
Region: Torso, Left upper arm, Right upper arm, Left lower arm, Right lower arm
Pooling method: Average, Max
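The quantize-then-pool step can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the slide does not say how the soft assignment is weighted or what K=5 refers to, so the exponential weighting, the `beta` parameter, and the function names `soft_assign` and `pool` are hypothetical.

```python
import math


def soft_assign(descriptor, codebook, beta=1.0):
    """Soft-quantize one descriptor into weights over codewords that sum to 1.

    The weighting exp(-beta * squared distance) is an assumption of this sketch.
    """
    sq_dists = [
        sum((x - c) ** 2 for x, c in zip(descriptor, center))
        for center in codebook
    ]
    d_min = min(sq_dists)  # subtract the minimum for numerical stability
    weights = [math.exp(-beta * (d - d_min)) for d in sq_dists]
    total = sum(weights)
    return [w / total for w in weights]


def pool(codes, method="average"):
    """Aggregate the per-descriptor codes of one body region into one vector."""
    columns = list(zip(*codes))
    if method == "max":
        return [max(col) for col in columns]
    if method == "average":
        return [sum(col) / len(col) for col in columns]
    raise ValueError(f"unknown pooling method: {method}")
```

Applying `pool` once per feature type, region, and pooling method yields one vector per combination, which is what the per-feature classifiers would consume.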
Feature Fusion
 SVM is a kernel-based classification technique.
 Feature fusion solution: combined SVM is
trained using weighted sum of the kernels.
 Combining features consistently outperforms
the single best feature.
[Diagram: each kernel K1 … KN trains its own SVM (SVM 1 … SVM N), giving prediction accuracy 1 … N; the combined SVM takes K1 … KN together and outputs the attribute prediction.]
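A weighted sum of kernels can be sketched in plain Python. The chi-squared kernel is the one named in the experiment setup; the exponential form used here, the `gamma` parameter, and the choice of normalized per-feature accuracies as weights are assumptions of this sketch, since the slides do not state how the weights are set.

```python
import math


def chi2_kernel(x, y, gamma=1.0):
    """Exponential chi-squared kernel between two non-negative histograms."""
    dist = sum((a - b) ** 2 / (a + b) for a, b in zip(x, y) if a + b > 0)
    return math.exp(-gamma * dist)


def gram_matrix(samples, gamma=1.0):
    """Kernel (Gram) matrix of one feature type over all samples."""
    return [[chi2_kernel(a, b, gamma) for b in samples] for a in samples]


def combine_kernels(gram_matrices, accuracies):
    """Weighted sum of Gram matrices.

    Weights are the normalized single-feature accuracies (an assumption).
    """
    total = sum(accuracies)
    weights = [acc / total for acc in accuracies]
    n = len(gram_matrices[0])
    return [
        [sum(w * K[i][j] for w, K in zip(weights, gram_matrices)) for j in range(n)]
        for i in range(n)
    ]
```

The combined matrix can then be handed to any SVM implementation that accepts a precomputed kernel, for example scikit-learn's `SVC(kernel="precomputed")`.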
Recap
[Diagram: Pose estimation -> Feature extraction & quantization -> Feature 1 … Feature N -> SVM 1 … SVM N -> Combine features -> SVM -> Attribute classifier 1 … Attribute classifier M -> Multi-attribute CRF inference (A: attribute, F: feature) -> Predictions]
Predictions: Blue, Solid pattern, Outerwear, Wear scarf, Long sleeve, …
Attribute Dependencies
Necktie and T-Shirt?
Attribute Inference with CRF
 Each attribute is a node. All nodes are pair-wise
connected.
 The edge connecting 2 nodes corresponds to the
joint probability of these 2 attributes.
[Diagram: fully connected graph over attribute nodes A1 … A6, each attached to its feature node F1 … F6.]
Ai: Attribute i
Fi: Features for Ai
CRF for Attribute Learning
$$P(A_1, A_2 \mid F_1, F_2) \propto P(F_1, F_2 \mid A_1, A_2)\, P(A_1, A_2)$$
$$= P(F_1 \mid A_1)\, P(F_2 \mid A_2)\, P(A_1, A_2) \quad \text{[following CRF model]}$$
$$\propto \frac{P(A_1 \mid F_1)}{P(A_1)} \cdot \frac{P(A_2 \mid F_2)}{P(A_2)} \cdot P(A_1, A_2)$$

Taking logarithms, with C a constant:

$$\log P(A_1, A_2 \mid F_1, F_2) = \underbrace{\log \frac{P(A_1 \mid F_1)}{P(A_1)}}_{\text{Node 1 potential } \psi(A_1)} + \underbrace{\log \frac{P(A_2 \mid F_2)}{P(A_2)}}_{\text{Node 2 potential } \psi(A_2)} + \underbrace{\log P(A_1, A_2)}_{\text{Edge potential } \phi(A_1, A_2)} + C$$

 For a fully connected CRF over attributes A1 … AM, we maximize the sum of node potentials and edge potentials:

$$\sum_{A_i \in S} \psi(A_i) + \sum_{(A_i, A_j) \in E} \phi(A_i, A_j)$$

 The CRF potential is maximized using the standard belief
propagation technique [Tappen et al., 2003].
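For a handful of binary attributes, the objective (node potentials plus pairwise edge potentials) can be maximized by exhaustive search, which makes the effect of the edge term easy to see. The sketch below is a stand-in for belief propagation, not the talk's inference code, and every probability in it is made up for illustration. It uses the necktie and T-shirt pair from the Attribute Dependencies slide.

```python
import itertools
import math


def node_potential(posterior, prior):
    """Node potential log(P(A_i | F_i) / P(A_i)) for one attribute value."""
    return math.log(posterior / prior)


def map_assignment(posteriors, priors, joint):
    """Exhaustively maximize the sum of node and edge potentials.

    posteriors[i][a]: classifier estimate of P(A_i = a | F_i), a in {0, 1}
    priors[i][a]:     P(A_i = a)
    joint[(i, j)][a][b]: P(A_i = a, A_j = b) for each pair i < j
    """
    n = len(posteriors)
    best, best_score = None, -math.inf
    for labels in itertools.product((0, 1), repeat=n):
        score = sum(
            node_potential(posteriors[i][a], priors[i][a])
            for i, a in enumerate(labels)
        )
        score += sum(
            math.log(joint[(i, j)][labels[i]][labels[j]])
            for i, j in itertools.combinations(range(n), 2)
        )
        if score > best_score:
            best, best_score = labels, score
    return best


# Attribute 0: wears necktie, attribute 1: is a T-shirt (values are hypothetical).
posteriors = [[0.45, 0.55], [0.2, 0.8]]   # classifiers weakly favor "necktie"
joint = {(0, 1): [[0.30, 0.45], [0.24, 0.01]]}  # necktie with T-shirt is rare
priors = [[0.75, 0.25], [0.54, 0.46]]     # marginals of the joint table
result = map_assignment(posteriors, priors, joint)
```

Taken independently, the two classifiers would predict a necktie on a T-shirt; with the edge potential included, `result` is `(0, 1)`, meaning no necktie and a T-shirt. Exhaustive search scales as 2^M, so it only serves to illustrate the objective for small M.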
[Figure: example predictions for three images; parenthesized values are reproduced as shown on the slide]
Image 1: No necktie (Wear necktie), Has collar, Men’s, Has placket, Low exposure, No scarf, Solid pattern, Black, Short sleeve (Long sleeve), V-shape neckline, Dress (Suit)
Image 2: Wear necktie, Has collar, Men’s, Has placket, High exposure (Low exposure), No scarf, Solid pattern, Gray & black, Long sleeve, V-shape neckline, Suit
Image 3: No necktie, Has collar, Men’s, Has placket, Low exposure, Wear scarf, Solid pattern, Brown & black, No sleeve (long sleeve), V-shape neckline, Tank top (outerwear)
Experimental Results
 Questions that we are interested in:
 Does combining features improve
performance?
 Does the pose model help?
 Does the CRF work?
Pose Vs No Pose - Experiment Setup
 Positive and negative examples are
balanced.
 SVM classification
 Chi-squared kernel
 Leave-1-out cross validation
 Comparison with attribute learning
without pose model.
 Features are extracted within a scaled
clothing mask under the face.
 Evaluation performed under the same
experiment settings.
[Figure: The clothing mask [Gallagher 2008]]
[Figure: per-attribute Accuracy (binary-class) / MAP (multi-class), axis from 45% to 95%, for three settings: Best feature (with pose), Combined feature (with pose), Combined feature (no pose). Attributes: Necktie, Collar, Gender, Placket presence, Skin exposure, Scarf, Pattern solid, Pattern floral, Pattern spot, Pattern graphics, Pattern plaid, Pattern stripe, Color red, Color yellow, Color green, Color cyan, Color blue, Color purple, Color brown, Color white, Color gray, Color black, >2 colors, sleevelength, neckline, category.]
Multiclass Confusion Matrix
[Figure: per-attribute G-mean, axis from 45% to 95%, Before CRF vs. After CRF. Attributes: Necktie, Collar, Gender, Placket presence, Skin exposure, Scarf, Pattern solid, Pattern floral, Pattern spot, Pattern graphics, Pattern plaid, Pattern stripe, Color red, Color yellow, Color green, Color cyan, Color blue, Color purple, Color brown, Color white, Color gray, Color black, >2 colors, sleevelength, neckline, category.]
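The slides do not define G-mean. Assuming the common definition for a binary attribute, the geometric mean of the recall on positives and the recall on negatives, it can be computed as in the sketch below; the function name `g_mean` is hypothetical. Unlike plain accuracy, this score drops to zero if either class is never predicted correctly, which matters for rare attributes.

```python
import math


def g_mean(y_true, y_pred):
    """Geometric mean of per-class recall for binary labels (0 or 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    positives = sum(1 for t in y_true if t == 1)
    negatives = len(y_true) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative example")
    return math.sqrt((tp / positives) * (tn / negatives))
```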
Steve Jobs:
“solid pattern, men’s clothing, black color,
long sleeves, round neckline, outerwear,
wearing scarf”
The predicted dressing style of weddings:
 Male: “solid pattern, suit, long-sleeves, V-shape neckline, wearing necktie, wearing
scarf, has collar, has placket”
 Female: “high skin exposure, no sleeves,
dress, other neckline shapes, white, >2
colors, floral pattern”
Gender Recognition
Face-based: Project faces in the Fisher space.
Clothing-based: The gender output of our
system.
Better gender recognition is achieved by
combining face and clothing.
Conclusions
 Clothing attributes can be better learned
with a human pose model.
 CRF offers improved performance by
exploring attribute relations.
 Proposed novel applications that exploit
the predicted attributes.
Miscellaneous
What do you have?
AutoCropping
AutoCropping
Auction Probability: 97%
AutoCropping
Eigenvector
Quantized Eigenvector
How do photos affect value?
Angled, high
contrast:
~$115
How do photos affect value?
Frontal,
Flash reflection
~$88
Thank You!
Future Work
 Expect even better performance by using
the (almost) ground truth pose estimated
by Kinect sensors [Shotton et al., Best Paper CVPR 2011].
 Incorporate clothing information in person
identification.
The Loop
What we know
about people
Images and
Computer Vision
The Loop: This talk
 Examples of how social data has helped
understand images of people
 Some things I’ve learned about people
from computer vision
What is context?
Context
Which monster is larger?
Shepard RN (1990)
Mind Sights: Original
Visual Illusions,
Ambiguities, and other
Anomalies, New York:
WH Freeman and
Company
Your brain specializes in faces
Find The Face In the beans:
http://www.michaelbach.de/ot/sze_muelue/index.html
Understanding images of people