General Form of the Objective Function
[Figure: MAP estimation split into two subproblems, object detection and data association.]
The MAP estimate
$$\max_X \; p(X \mid Y)$$
is posed as two coupled subproblems:
$$\min_{X_1, X_2} \; g(X_1, Y) + h(X_2) \quad \text{s.t.} \quad X_1 = q(X_2)$$
g(·) can be seen as the objective function of the detection problem (image
likelihood term); h(·) is the objective function of the data association problem (temporal
smoothness term); q(·) is the coupling constraint that enforces agreement between the
solutions of the two subproblems.
The choices of g, h, q and their combination are flexible. Typically, we want g and h to be
relatively easy to optimize.
For the traditional detection-tracking scheme with an independent-likelihood assumption
(h(·) does not model the joint image likelihood), there is no need to introduce a
coupling constraint; the objective function is equivalent to that of a classic tracker such as
Multiple-Hypothesis-Tracking or Network-Flow-Tracker.
The overall optimization can be solved through Dual Decomposition.
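As a sketch of how the coupling constraint in the general form can be relaxed, assume a multiplier vector λ of the same size as X_1 (the same symbol the Dual Decomposition section uses). Under that assumption the Lagrangian and its dual separate into one term per subproblem:

```latex
\begin{aligned}
L(X_1, X_2, \lambda) &= g(X_1, Y) + h(X_2) + \lambda^\top \big( X_1 - q(X_2) \big) \\
\max_{\lambda} \; \min_{X_1, X_2} L(X_1, X_2, \lambda)
  &= \max_{\lambda} \Big[ \min_{X_1} \big( g(X_1, Y) + \lambda^\top X_1 \big)
   + \min_{X_2} \big( h(X_2) - \lambda^\top q(X_2) \big) \Big]
\end{aligned}
```

For a fixed λ the two inner minimizations are independent, and by weak duality the dual value is a lower bound on the constrained minimum.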
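Bayesian Formulation
X is the joint state vector of all objects in the scene.
Y is the observation vector for the entire image, which depends on the states of
all objects. Example of Y: a binary image obtained after background subtraction.
$$
\begin{aligned}
\max_X \; p(X \mid Y) &= \max_X \; p(Y \mid X)\, p(X) \\
&= \max_X \; \prod_t p(Y_t \mid X_t) \Big[ p(X_1) \prod_t p(X_t \mid X_{t-1}) \Big] \\
&\approx \max_X \; \prod_t p(Y_t \mid X_t) \prod_i p(x_{i,1}) \prod_{i,t} p(x_{i,t} \mid x_{i,t-1})
\end{aligned}
$$
Here X_t and Y_t are the joint state and the observation at frame t, and x_{i,t} is the state of object i at frame t.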
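The factorized posterior can be evaluated in log space as a plain sum of its three factor groups. The following is a minimal sketch; the function name `log_posterior` and all probability values are hypothetical and chosen only for illustration.

```python
import math

def log_posterior(frame_loglik, init_logp, trans_logp):
    """Sum the three factor groups of the approximate posterior in log space.

    frame_loglik[t]  : log p(Y_t | X_t), one value per frame
    init_logp[i]     : log p(x_{i,1}), one value per object
    trans_logp[i][k] : log p(x_{i,t} | x_{i,t-1}) for the k-th transition of object i
    """
    total = sum(frame_loglik)
    total += sum(init_logp)
    total += sum(sum(steps) for steps in trans_logp)
    return total

# Hypothetical numbers: 3 frames, 2 objects, 2 transitions per object.
frame_loglik = [math.log(0.9), math.log(0.8), math.log(0.9)]
init_logp = [math.log(0.5), math.log(0.5)]
trans_logp = [[math.log(0.7), math.log(0.6)], [math.log(0.9), math.log(0.8)]]

score = log_posterior(frame_loglik, init_logp, trans_logp)
print(round(score, 4))
```

Maximizing this score over candidate states is equivalent to maximizing the product form, since the logarithm is monotone.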
[Figure: Traditional Tracking System (Detection-Tracking) compared with Our System (Coupling).]
Abstract
We present a novel framework for multiple object tracking in which the problems of
object detection and data association are expressed by a single objective function.
The framework follows the Lagrange dual decomposition strategy, taking advantage
of the often complementary nature of the two subproblems. The advantages of our
coupling framework are:
No error propagation, from which traditional “detection-tracking” approaches to
multiple object tracking suffer.
No need to apply “non-maximum suppression” during the detection stage.
Concrete Example
Sparsity-constrained Detection
Given a dictionary D that encodes the shape and spatial information of objects in
the image, instantiate binary templates at selected positions through a selector X such
that the generated image (DX) looks similar to the observation Y.
Single Template
$$g: \; \min_X \; \| Y - D X \|_0, \quad X \in \{0,1\}^N$$
Multiple Templates
$$g: \; \min_X \; \Big\| Y - \sum_i D_i X_i \Big\|_0 \quad \text{s.t.} \quad \sum_i X_i \le 1, \; X_i \in \{0,1\}^N$$
[Figure: Localization in 2D (Ground Plane) and Localization in 3D (Triangulation), with labels d1, d2, d3.]
Min-cost Flow Data Association
[Figure: flow network from source S to sink T over frames t-1, t, t+1; nodes are object locations and each source-to-sink path is a track.]
$$h: \; \min_f \; \sum_t \sum_i \sum_j c_{i,j}^{(t)} f_{i,j}^{(t)} \quad \text{s.t.} \quad \sum_i f_{i,n}^{(t)} = \sum_j f_{n,j}^{(t)}, \; \forall t \, \forall n$$
Each edge of the network is associated with a flow variable f and a cost c that
measures how likely an object is to move from one location to another. Push a certain
amount of flow into the network at minimum cost such that each flow path
connects a set of detections across time to form a unique track.
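The min-cost flow data association can be illustrated on a toy network. The sketch below uses successive shortest paths with Bellman-Ford, which is one standard way to solve min-cost flow and is not necessarily the solver used by the authors. The function name `min_cost_flow`, the node numbering, and all edge costs are hypothetical. For brevity it caps the flow on each edge only; a fuller model would also split each location node to cap its throughput.

```python
def min_cost_flow(num_nodes, edges, source, sink, flow_units):
    """Successive shortest paths with Bellman-Ford on the residual graph.

    edges: list of (u, v, capacity, cost). Returns (total_cost, flow_per_edge).
    """
    graph = [[] for _ in range(num_nodes)]    # adjacency lists of arc indices
    arcs = []                                 # each arc: [to, residual_cap, cost]
    for u, v, cap, cost in edges:
        graph[u].append(len(arcs)); arcs.append([v, cap, cost])
        graph[v].append(len(arcs)); arcs.append([u, 0, -cost])  # reverse arc

    total_cost = 0
    for _unit in range(flow_units):
        dist = [float("inf")] * num_nodes
        prev_arc = [-1] * num_nodes
        dist[source] = 0
        for _pass in range(num_nodes - 1):    # Bellman-Ford handles negative costs
            updated = False
            for u in range(num_nodes):
                if dist[u] == float("inf"):
                    continue
                for a in graph[u]:
                    v, cap, cost = arcs[a]
                    if cap > 0 and dist[u] + cost < dist[v]:
                        dist[v] = dist[u] + cost
                        prev_arc[v] = a
                        updated = True
            if not updated:
                break
        if dist[sink] == float("inf"):
            raise ValueError("not enough capacity for the requested flow")
        v = sink
        while v != source:                    # push one unit along the path
            a = prev_arc[v]
            arcs[a][1] -= 1
            arcs[a ^ 1][1] += 1               # paired arc sits at index a ^ 1
            v = arcs[a ^ 1][0]
        total_cost += dist[sink]
    flow = [edges[k][2] - arcs[2 * k][1] for k in range(len(edges))]
    return total_cost, flow

# Hypothetical network: 0 = source, 1-2 = locations in frame t, 3-4 = locations
# in frame t+1, 5 = sink. Every edge has capacity 1.
EDGES = [(0, 1, 1, 0), (0, 2, 1, 0),
         (1, 3, 1, 1), (1, 4, 1, 2), (2, 3, 1, 2), (2, 4, 1, 5),
         (3, 5, 1, 0), (4, 5, 1, 0)]
cost, flow = min_cost_flow(6, EDGES, source=0, sink=5, flow_units=2)
print(cost, flow)
```

Pushing two units yields two tracks. The greedy first path 1 -> 3 is later rerouted through the residual graph, so the final tracks are 1 -> 4 and 2 -> 3, which is cheaper overall than keeping 1 -> 3 and paying for 2 -> 4.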
Zheng Wu, Ashwin Thangali, Stan Sclaroff, Margrit Betke
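Constraints of the coupled problem:
Variable agreement: $X_{t,n} = \sum_j f_{n,j}^{(t)}, \; \forall t \, \forall n$
Flow conservation: $\sum_i f_{i,n}^{(t)} = \sum_j f_{n,j}^{(t)}, \; \forall t \, \forall n$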
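The variable agreement and flow conservation constraints can be checked mechanically for a candidate solution. This is a minimal sketch under simplifying assumptions: nodes carry globally unique ids instead of (t, n) pairs, and the helper names `inflow`, `outflow`, and `check_constraints` are hypothetical.

```python
def inflow(flow, n):
    """Total flow entering node n."""
    return sum(f for (i, j), f in flow.items() if j == n)

def outflow(flow, n):
    """Total flow leaving node n."""
    return sum(f for (i, j), f in flow.items() if i == n)

def check_constraints(X, flow):
    """X maps each location node to 0/1; flow maps directed edges (i, j) to 0/1."""
    for n, x in X.items():
        if inflow(flow, n) != outflow(flow, n):   # flow conservation
            return False
        if x != outflow(flow, n):                 # variable agreement
            return False
    return True

# Hypothetical example: two tracks, S -> a -> d -> T and S -> b -> c -> T.
flow = {("S", "a"): 1, ("a", "d"): 1, ("d", "T"): 1,
        ("S", "b"): 1, ("b", "c"): 1, ("c", "T"): 1}
X = {"a": 1, "b": 1, "c": 1, "d": 1}
X_missing = dict(X, a=0)   # detection at "a" dropped while flow still passes through

print(check_constraints(X, flow), check_constraints(X_missing, flow))
```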
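[Table: tracking results. Row alignment was lost in extraction; values are listed per column in extracted order.]
Method: Our CP, RT[1], Our CP, Our CP, RT[1], RT[1], OM[2], Our CP, Our CP, OM[2], ILP[3]
#Objects: 127, 127, 75, 75, 19, 19, 36, 36, 23, 23, 23
Mostly Tracked: 95, 60, 71, 68, 19, 19, 24, 20, 22, 20, 20
Mostly Lost: 5, 8, 1, 0, 0, 0, 2, 7, 0, 8, 1
MOTA: 0.87, -0.34, 0.92, 0.51, 0.90, 0.80, 0.89, 0.64, 0.94, 0.26, 0.88
MOTP: 11.4cm, 11.6cm, 9.7cm, 9.9cm, 9.5cm, 9.0cm, 0.61, 0.67, 0.70, 0.67, 0.76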
References
[1] Z. Wu, N. I. Hristov, T. H. Kunz, and M. Betke. Tracking-reconstruction or reconstruction-tracking? Comparison of two multiple hypothesis tracking approaches to interpret 3D object motion from several camera views. In IEEE Wkshp Motion and Video Computing (WMVC), 2009.
[2] A. Andriyenko, S. Roth, and K. Schindler. An analytical formulation of global occlusion reasoning for multi-target tracking. In 11th IEEE Intl. Workshop on Visual Surveillance, 2011.
[3] A. Andriyenko and K. Schindler. Globally optimal multi-target tracking on a hexagonal lattice. In 11th European Conf. on Computer Vision, 2010.
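[Table columns Sequence and Density. Row alignment was lost in extraction; values are listed per column in extracted order.]
Sequence: Infrared S2, Infrared S3, Infrared S1, PETS S2L1, PETS S1L12
Density: High, Median, Low, Median, Low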
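Experiment
Datasets: PETS2009 (single view), Infrared Video (three views)
Coupling Detection and Data Association
$$\min_{X, f} \; \sum_t \| Y_t - D_t X_t \|_0 + \sum_t \sum_i \sum_j c_{i,j}^{(t)} f_{i,j}^{(t)}$$
subject to:
Variable agreement: If there is a detection at a node, there is a flow going through it.
Flow conservation: If there is a flow coming into a node, there is a flow coming out.
Dual Decomposition
Relaxing the variable agreement constraint with multipliers λ separates the problem into a detection subproblem and a min-cost flow subproblem:
$$g(\lambda) = \min_X \; \sum_t \big( \| Y_t - D_t X_t \|_0 + \lambda_t^\top X_t \big)$$
$$h(\lambda) = \min_f \; \sum_t \sum_i \sum_j \big( c_{i,j}^{(t)} - \lambda_{t,i} \big) f_{i,j}^{(t)}$$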
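The interplay of the subproblems g(λ) and h(λ) can be illustrated with a subgradient loop on a toy instance. This is a sketch under stated assumptions, not the authors' implementation: the data (`TEMPLATES`, `Y`, `EDGE_COST`), the function names, and the step-size schedule are all hypothetical. The association side is reduced to choosing a single track, and because the toy network has no sink edges, λ is subtracted at both nodes of the chosen edge. Both subproblems are solved by brute force.

```python
from itertools import product

# Hypothetical toy data: two frames, two candidate locations per frame.
# Nodes 0, 1 belong to frame 1 and nodes 2, 3 to frame 2; templates have 3 pixels.
TEMPLATES = {0: (1, 1, 0), 1: (0, 1, 1), 2: (1, 1, 0), 3: (0, 1, 1)}
FRAME_NODES = [(0, 1), (2, 3)]
Y = [(1, 1, 0), (0, 1, 0)]                    # observed binary images
EDGE_COST = {(0, 2): 0.2, (0, 3): 1.0, (1, 2): 1.0, (1, 3): 0.3}

def detection_cost(X):
    """sum_t ||Y_t - D_t X_t||_0: count pixels where the rendering differs from Y."""
    cost = 0
    for t, nodes in enumerate(FRAME_NODES):
        for p in range(len(Y[t])):
            rendered = sum(TEMPLATES[n][p] * X[n] for n in nodes)
            cost += rendered != Y[t][p]
    return cost

def solve_g(lam):
    """Detection subproblem: brute force over all binary selectors X."""
    return min(product((0, 1), repeat=4),
               key=lambda X: detection_cost(X) + sum(l * x for l, x in zip(lam, X)))

def solve_h(lam):
    """Association subproblem: cheapest single track under the costs c - lambda."""
    i, j = min(EDGE_COST, key=lambda e: EDGE_COST[e] - lam[e[0]] - lam[e[1]])
    return tuple(int(n in (i, j)) for n in range(4))

def dual_decomposition(max_iters=50, step=0.5):
    lam = [0.0] * 4
    for k in range(1, max_iters + 1):
        X, z = solve_g(lam), solve_h(lam)
        if X == z:                            # subproblems agree: solution is optimal
            return X, k
        # Subgradient ascent on the dual with a diminishing step size.
        lam = [l + (step / k) * (x - zz) for l, x, zz in zip(lam, X, z)]
    return None, max_iters

solution, iterations = dual_decomposition()
print(solution, iterations)
```

In this instance the second frame is ambiguous for the detector alone, and the multipliers let the association cost break the tie. Agreement between the two subproblems is not guaranteed in general for integer problems, which is why the loop returns `None` when the iteration limit is reached.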
Department of Computer Science
Coupling Detection and Data Association for Multiple Object Tracking