Ci = [HOGi,CNi]

advertisement
C OLOR ATTRIBUTES FOR O BJECT D ETECTION
FAHAD S HAHBAZ K HAN1,∗ , R AO M UHAMMAD A NWER2,∗ , J OOST VAN DE W EIJER2 , A NDREW D. B AGDANOV2,3 , M ARIA VANRELL2 , A NTONIO M. L ÓPEZ2
1 C OMPUTER V ISION L ABORATORY, L INKÖPING U NIVERSITY, S WEDEN
2 C OMPUTER V ISION C ENTER ,U NIVERSITAT A UTONOMA DE B ARCELONA ,S PAIN
3 M EDIA I NTEGRATION AND C OMMUNICATION C ENTER , U NIVERSITY OF F LORENCE , I TALY
P ROBLEM
C OLOR D ESCRIPTORS
Goal: Augmenting existing intensity based
detectors with color information.
C OLOR D ESCRIPTOR S ELECTION
E XPERIMENTAL VALIDATION
Evaluation criteria: To avoid laborious cross
validation, we use the KL-ratio to compare the
discriminative power. A high KL-ratio reflects a
more discriminative descriptor.
PASCAL VOC2007: no early fusion method
improves over standard HOG.
Cartoon: Our approach provides a gain of 14%
over standard HOG.
KL-ratio =
E ARLY VS L ATE
HUE descriptor (HUE): cells are represented by
a histogram:
O1
hue = arctan
O2
→s=
√
O12 + O22
P
minj ∈C
/ m KL(pj ,pk )
P k∈C m
m
k∈C m mini∈C ,i6=k KL(pj ,pk )
(5)
where
KL(pi , pj ) =
PN
x=1
pi (x) log
pi (x)
pj (x)
(6)
PASCAL VOC: Our approach improves on 15 out
of 20 object categories compare to baseline.
(1)
Opponent descriptor (OPP): cells are
represented by a histogram over the opponent
angle:
angxO
It is known that late fusion obtains better results
in higher level pyramid representations.
= arctan
O1x
O2x
→s=
p
O12x + O22x
Method
HOG
Best 2007
UCI
UOCTTI
LEO
Oxford-MKL
LBP-HOG
Our Approach
(2)
Color names: are linguistic color labels which
human assign to colors in the world. In this work
we use the mapping learned from Google images
[Van de Weijer, TIP09].
CN = {p(cn1|R), p(cn2|R), ..., p(cn11|R)}
with
p(cni |R) =
1
p
P
x∈R
p(cni |f(x))
(3)
(4)
mean AP(VOC07)
32.3
23.3
27.1
29.6
32.1
34.3
34.8
mean AP(VOC09)
28.0
27.9
27.7
21.9
29.4
Method
mean AP (Cartoon)
mean AP (VOC07)
HOG
27.6
32.3
OPPHOG
34.2
30.6
RGBHOG
35.3
31.3
C-HOG
35.2
30.1
Our Approach
41.7
34.8
Cartoon Detection:
Conclusion: We select color names since it has
more discriminative power while being compact.
C OLORING O BJECT D ETECTION
Part-Based: We extend the HOG feature with
color attributes within the part-based framework
[Felzenswalb, PAMI11].
Ci = [HOGi , CNi ]
Feature
Dimension
HOG
31
OPPHOG
93
RGBHOG
93
(7)
C-HOG
93
CN-HOG
42
C ONCLUSIONS
C ARTOON C HARACTER D ETECTION
1. We propose the use of color attributes as explicit color representation.
18 Classes: The Simpsons, Tom, Jerry, Fred,
Barney, Sylvester, Mickymouse, Donaldduck,
Tweety, Coyote, Roadrunner, Buggs, Daffy,
Shaggy and Scooby.
Images: 586(train 304, testing 282).
Source: Google.
Properties:
Compactness: only an 11D histogram for each cell
is computed.
Invariance: possess a degree of photometric
invariance.
Discriminative power: separate bins represent the
achromatic colors: black, grey, and white.
ESS-Based: We add a separate color vocabulary
to the ESS framework [Lampert CVPR08]. Color
and shape are combined with late fusion.
Cartoon Character Detection:
mean AP
SIFT
8.8
CN-SIFT
12.9
C-SIFT
10.3
OPPSIFT
9.3
2. We introduce a new dataset of cartoon character images.
3. Early fusion based approaches yield inferior
results for object detection. Our approach
achieves state-of-the-art on PASCAL VOC
2007, 2009 and cartoon datasets despite its
simplicity.
Download