Online Tracking by Learning Discriminative Saliency Map
with Convolutional Neural Network
Seunghoon Hong¹, Tackgeun You¹, Suha Kwak², Bohyung Han¹
¹Dept. of Computer Science and Engineering, POSTECH
²INRIA - WILLOW Project team
Problem
1. Pre-trained CNN for feature descriptor
Using a pre-trained CNN to represent a general object
in visual tracking
• Base network: R-CNN[Girshick14] pre-trained with PASCAL VOC images
Advantage: CNN provides strong target representation
robust to various appearance changes.
2. Target-specific saliency map estimation
• Input: sub-image 𝒛𝒊 extracted from each target candidate proposal 𝒙𝒊
• Output: responses of the first fully-connected layer, 𝜙(𝒙𝒊)
Limitation: CNN feature is not appropriate for precise
localization due to spatial abstraction.
• Class-specific saliency map [Simonyan14]
Identify relevance of pixels w.r.t. specific class by
$g_c(I) = \frac{\partial S_c(I)}{\partial I}$
$I$: input image
$S_c(I)$: score of class $c$
Problem: No predefined class for target in tracking
• Target-specific saliency map
Our approach: Compute target-specific saliency map as
observation for tracking.
Online SVM as the last fully-connected layer of the network
4. Model update
• Generative model: temporal sliding of target filters
$H_t = H_{t-1} - M_{t-m} + M_t$
• Discriminative model:
Update incremental SVM with new examples { 𝑥𝑖 ′, 𝑦𝑖 ′ }
\[
y_i' = \begin{cases}
+1, & \text{if } x_i' = x_t^* \\
-1, & \text{if } \dfrac{BB(x_t^*) \cap BB(x_i')}{BB(x_t^*) \cup BB(x_i')} < \delta
\end{cases}
\]
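The labeling rule for the discriminative update can be sketched in a few lines. This is a minimal illustration, not the authors' code: the box format `(x1, y1, x2, y2)`, the helper names `iou` and `label_examples`, and the default threshold `delta=0.3` are all assumptions made for the example.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def label_examples(target_box, candidate_boxes, delta=0.3):
    """Return (box, label) pairs for the SVM update.

    The tracked box gets +1; candidates whose overlap with it is below
    delta get -1. Other candidates are skipped. delta=0.3 is a
    hypothetical value chosen only for this sketch.
    """
    examples = [(target_box, +1)]
    for box in candidate_boxes:
        if box == target_box:
            continue
        if iou(target_box, box) < delta:
            examples.append((box, -1))
    return examples
```

Candidates that overlap the target by `delta` or more are left unlabeled here, since the rule above only defines the two cases.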
① Computing target-specific feature
Localization by sequential Bayesian filtering
$x_t^* = \arg\max_{x_t} p(x_t \mid M_{1:t}) = \arg\max_{x_t} p(M_t \mid x_t)\, p(x_t \mid M_{1:t-1})$
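As a small illustration of the MAP rule above, the sketch below multiplies a likelihood and a prior over a discrete set of candidate states and picks the best one. The dictionary representation and the function name `map_estimate` are assumptions for this example; how the prior is propagated between frames is not covered.

```python
def map_estimate(likelihood, prior):
    """Pick the candidate state maximizing p(M_t | x_t) * p(x_t | M_{1:t-1}).

    Both arguments are dicts keyed by candidate state (hypothetical
    representation). Returns the best state and the normalized posterior.
    """
    posterior = {x: likelihood[x] * prior[x] for x in likelihood}
    total = sum(posterior.values())
    if total > 0:
        posterior = {x: p / total for x, p in posterior.items()}
    best = max(posterior, key=posterior.get)
    return best, posterior
```

For example, with likelihoods 0.2, 0.5, 0.3 and priors 0.6, 0.3, 0.1 over three candidates, the second candidate wins because 0.5 × 0.3 is the largest product.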
• Construct generative model 𝐻𝑡 by accumulating
𝑚 recent tracking results on saliency map
Target segmentation
• Compute likelihood by convolution: $p(M_t \mid x_t) \propto H_t \otimes M_t(x_t)$
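The generative model and its likelihood can be sketched together. This is a rough illustration under stated assumptions: saliency patches are plain nested lists of equal size, the class `TargetFilter` and function `likelihood_map` are hypothetical names, and the ⊗ operator is implemented as a sliding inner product (correlation without kernel flipping), which is one plausible reading of "convolution" here.

```python
from collections import deque


class TargetFilter:
    """Running sum of the m most recent target saliency patches."""

    def __init__(self, m, height, width):
        self.m = m
        self.recent = deque()
        self.H = [[0.0] * width for _ in range(height)]

    def update(self, patch):
        """Apply H_t = H_{t-1} - M_{t-m} + M_t for a new patch M_t."""
        self.recent.append(patch)
        for r, row in enumerate(patch):
            for c, v in enumerate(row):
                self.H[r][c] += v
        if len(self.recent) > self.m:
            old = self.recent.popleft()  # the patch from time t - m
            for r, row in enumerate(old):
                for c, v in enumerate(row):
                    self.H[r][c] -= v


def likelihood_map(H, M):
    """Unnormalized likelihood of each top-left position of H inside M."""
    h, w = len(H), len(H[0])
    rows, cols = len(M) - h + 1, len(M[0]) - w + 1
    out = [[0.0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            out[y][x] = sum(
                H[r][c] * M[y + r][x + c] for r in range(h) for c in range(w)
            )
    return out
```

Until `m` patches have been seen, `update` only adds, so the filter is a plain cumulative sum during the first frames.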
Employing GrabCut[Rother04] on saliency map
• Given tracking result, select FG/BG seeds based on saliency value
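One way the seed selection could look is sketched below. The poster does not give the thresholds or the exact rule, so everything here is an assumption: the function name `select_seeds`, the half-open box format `(x1, y1, x2, y2)`, and the ratios `fg_ratio` and `bg_ratio` relative to the maximum saliency. The resulting labels would then initialize a GrabCut-style segmentation.

```python
def select_seeds(saliency, box, fg_ratio=0.7, bg_ratio=0.1):
    """Label each pixel 'FG', 'BG', or 'UNKNOWN' from its saliency value.

    Hypothetical rule: pixels inside the tracked box with saliency at or
    above fg_ratio * max become foreground seeds; pixels at or below
    bg_ratio * max become background seeds; the rest stay unknown.
    """
    x1, y1, x2, y2 = box  # x2 and y2 are exclusive
    peak = max(max(row) for row in saliency)
    hi, lo = fg_ratio * peak, bg_ratio * peak
    labels = []
    for r, row in enumerate(saliency):
        out_row = []
        for c, v in enumerate(row):
            inside = x1 <= c < x2 and y1 <= r < y2
            if inside and peak > 0 and v >= hi:
                out_row.append("FG")
            elif v <= lo:
                out_row.append("BG")
            else:
                out_row.append("UNKNOWN")
        labels.append(out_row)
    return labels
```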
Quantitative results
Evaluation based on bounding box (1, 2) and segmentation (3) ground-truth
$g_{FG}(z_i) = \frac{\partial S_{FG}(z_i)}{\partial z_i} = w_+^T \frac{\partial \phi(x_i)}{\partial z_i} = \frac{\partial \phi^+(x_i)}{\partial z_i}$
Then compute the target-specific saliency map 𝑀 by
3. Target localization with saliency map
Qualitative results
\[
\phi_k^+(x_i) = \begin{cases}
w_k \phi_k(x_i), & \text{if } w_k > 0 \\
0, & \text{otherwise}
\end{cases}
\]
② Computing gradient map $g_{FG}(z_i)$ by back-propagating $\phi_k^+(x_i)$
③ Aggregating sample gradient maps
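The three steps can be illustrated on a toy model. This sketch swaps the CNN for a single hypothetical layer 𝜙(z) = ReLU(Az) so that the gradient can be written by hand; the matrix `A`, the function names, and the aggregation by pixelwise maximum of gradient magnitude are all assumptions, since the poster only says the sample gradient maps are aggregated.

```python
def features(z, A):
    """Toy stand-in for the CNN: phi(z) = ReLU(A z) on a flattened patch z."""
    return [max(0.0, sum(a * v for a, v in zip(row, z))) for row in A]


def target_specific_feature(phi, w):
    """Step 1: keep w_k * phi_k where the SVM weight w_k is positive."""
    return [wk * pk if wk > 0 else 0.0 for wk, pk in zip(w, phi)]


def gradient_map(z, A, w):
    """Step 2: gradient of sum_k phi_k^+ w.r.t. z for the toy ReLU layer."""
    grad = [0.0] * len(z)
    for row, wk in zip(A, w):
        active = sum(a * v for a, v in zip(row, z)) > 0
        if wk > 0 and active:
            for j, a in enumerate(row):
                grad[j] += wk * a
    return grad


def aggregate(grads, offsets, patch_hw, image_hw):
    """Step 3 (assumed rule): pixelwise max of |gradient| over samples."""
    H, W = image_hw
    h, w = patch_hw
    M = [[0.0] * W for _ in range(H)]
    for g, (top, left) in zip(grads, offsets):
        for r in range(h):
            for c in range(w):
                val = abs(g[r * w + c])
                if val > M[top + r][left + c]:
                    M[top + r][left + c] = val
    return M
```

With a real network, `gradient_map` would be replaced by back-propagation through the layers; the masking by positive weights and the aggregation step would stay structurally the same.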
• Examples of obtained target-specific saliency map