Coupling Detection and Data Association for Multiple Object Tracking
Zheng Wu, Ashwin Thangali, Stan Sclaroff, Margrit Betke
Department of Computer Science

Abstract
We present a novel framework for multiple object tracking in which the problems of object detection and data association are expressed by a single objective function. The framework follows the Lagrange dual decomposition strategy, taking advantage of the often complementary nature of the two subproblems. The advantages of our coupling framework are:
- No error propagation, from which traditional "detection-tracking" approaches to multiple object tracking suffer.
- No need to apply non-maximum suppression during the detection stage.

Bayesian Formulation
X is the joint state vector of all objects in the scene. Y is the observation vector for the entire image, which depends on the states of all objects; an example of Y is the binary image obtained after background subtraction.

MAP estimation:
\max_X p(X \mid Y) = \max_X p(Y \mid X)\, p(X)
= \max_X \prod_t p(Y_t \mid X_t)\, \Big[ p(X_1) \prod_t p(X_t \mid X_{t-1}) \Big]
\approx \max_X \prod_t p(Y_t \mid X_t) \prod_i p(x_{i,1}) \prod_{i,t} p(x_{i,t} \mid x_{i,t-1})

[Diagram: Traditional Tracking System (Detection-Tracking) vs. Our System (Coupling); both consist of per-frame object detection and data association stages, but in our system the two stages are solved jointly within the MAP estimation rather than sequentially.]

General Form of the Objective Function
\min_{X_1, X_2} g(X_1, Y) + h(X_2) \quad \text{s.t.} \quad X_1 = q(X_2)

g(.) is the objective function of the detection problem (image likelihood term); h(.) is the objective function of the data association problem (temporal smoothness term); q(.) is the coupling constraint that enforces agreement between the solutions of the two sub-problems. The choices of g, h, q and their combination are flexible; typically, we want g and h to be relatively easy to optimize. For the traditional detection-tracking scheme with an independent-likelihood assumption (h(.) does not model the joint image likelihood), there is no need to introduce the coupling constraint, and the objective function is equivalent to that of a classic tracker such as Multiple Hypothesis Tracking or the network-flow tracker. The overall optimization can be solved through dual decomposition.

Concrete Example

Sparsity-constrained Detection
Given a dictionary D that encodes the shape and spatial information of objects in the image, instantiate binary templates at selected positions through a selector X such that the generated image DX looks similar to the observation Y.
Single template: g: \min_X \| Y - D X \|_0, \quad X \in \{0,1\}^N
Multiple templates: g: \min_X \| Y - \sum_i D_i X_i \|_0, \quad \text{s.t.} \; \sum_i X_i \le 1, \; X_i \in \{0,1\}
[Figure: object localization in 2D on the ground plane and in 3D by triangulating detections d1, d2, d3 across camera views.]

Min-cost Flow Data Association
Each edge of the network is associated with a flow variable f and a cost c that measures how likely an object is to move from one location to another. A certain amount of flow is pushed through the network at minimum cost, so that each flow path connects a set of detections across time into a unique track.
h: \min_f \sum_t \sum_i \sum_j c^{(t)}_{i,j} f^{(t)}_{i,j} \quad \text{s.t.} \; \sum_i f^{(t)}_{i,n} = \sum_j f^{(t)}_{n,j} \;\; \forall t, \forall n
[Figure: flow network over frames t-1, t, t+1 with source S and sink T; each path from source to sink passes through candidate object locations and forms one track.]

Coupling Detection and Data Association
\min_{X, f} \; \sum_t \| Y_t - D_t X_t \|_0 + \sum_t \sum_i \sum_j c^{(t)}_{i,j} f^{(t)}_{i,j}
\text{s.t.} \; X_{t,n} = \sum_j f^{(t)}_{n,j} \;\; \forall t, \forall n \quad (variable agreement: if there is a detection at a node, there is a flow going through it)
\phantom{\text{s.t.}} \; \sum_i f^{(t)}_{i,n} = \sum_j f^{(t)}_{n,j} \;\; \forall t, \forall n \quad (flow conservation: if there is a flow coming into a node, there is a flow coming out)

Dual Decomposition
Relaxing the variable-agreement constraint with Lagrange multipliers \lambda splits the problem into the two sub-problems
g(\lambda) = \min_X \sum_t \big( \| Y_t - D_t X_t \|_0 + \lambda_t^\top X_t \big)
h(\lambda) = \min_f \sum_t \sum_i \sum_j \big( c^{(t)}_{i,j} - \lambda_{t,i} \big) f^{(t)}_{i,j},
which are solved independently and coordinated by subgradient updates of \lambda; a minimal sketch of this loop is given below.
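To make the dual decomposition concrete, here is a minimal, self-contained sketch, written by us rather than taken from the authors' implementation. The toy image size, random binary dictionary, transition costs, step size, and the single-object assumption (under which the min-cost flow sub-problem reduces to a shortest path through the trellis of candidate locations) are all illustrative choices; the detection sub-problem g(λ) is solved by brute force, which is only feasible for the tiny grid used here.

```python
# Sketch of Lagrangian dual decomposition for coupled detection / data association.
# Toy setting (our assumptions): one object, T frames, N candidate locations,
# 12-pixel binary "images", random binary templates.
import itertools
import numpy as np

T, N = 4, 5                      # frames, candidate locations per frame
rng = np.random.default_rng(0)

D = rng.integers(0, 2, size=(T, 12, N))                 # per-frame template dictionary
true_loc = [1, 2, 2, 3]                                  # ground-truth location per frame
Y = np.array([D[t, :, true_loc[t]] for t in range(T)])   # noiseless observations

# Pairwise association cost: penalize large jumps between consecutive locations.
c = np.array([[abs(i - j) for j in range(N)] for i in range(N)], dtype=float)

def detect(lmbda):
    """Detection sub-problem g(lambda): per frame, minimize
    ||Y_t - D_t X_t||_0 + lambda_t . X_t by brute force over binary selectors."""
    X = np.zeros((T, N))
    for t in range(T):
        best, best_x = None, None
        for bits in itertools.product([0, 1], repeat=N):
            x = np.array(bits)
            rendered = np.clip(D[t] @ x, 0, 1)           # superposition of selected templates
            cost = np.count_nonzero(Y[t] - rendered) + lmbda[t] @ x
            if best is None or cost < best:
                best, best_x = cost, x
        X[t] = best_x
    return X

def associate(lmbda):
    """Association sub-problem h(lambda) for a single track: shortest path through
    the location trellis with transition costs c and node rewards lambda."""
    dp = -lmbda[0]
    back = np.zeros((T, N), dtype=int)
    for t in range(1, T):
        cand = dp[:, None] + c - lmbda[t][None, :]
        back[t] = np.argmin(cand, axis=0)
        dp = cand.min(axis=0)
    path = [int(np.argmin(dp))]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    path.reverse()
    U = np.zeros((T, N))
    U[np.arange(T), path] = 1                            # node-usage indicator of the track
    return U, path

lmbda = np.zeros((T, N))
for it in range(50):                                     # subgradient ascent on the dual
    X = detect(lmbda)
    U, path = associate(lmbda)
    subgrad = X - U                                      # violation of X_{t,n} = sum_j f_{n,j}
    if not subgrad.any():                                # detections and track agree
        break
    lmbda += (1.0 / (it + 1)) * subgrad

print("track found:", path, "| ground truth:", true_loc)
```

On this toy example the multipliers push the per-frame detections and the track toward agreement within a few iterations. With multiple objects the same update applies, but the association sub-problem must be solved as a true min-cost flow with unit node capacities so that tracks remain disjoint.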
Experiment
Datasets: PETS2009 (single view; sequences S2L1 and S1L12) and infrared video (three camera views; sequences S1, S2, and S3), covering low, medium, and high object densities.
[Results table: our coupling approach ("Our CP") is compared with RT [1] on the three infrared sequences (19, 75, and 127 objects) and with OM [2] and ILP [3] on the two PETS sequences (23 and 36 objects). Performance is reported as the number of mostly tracked and mostly lost objects, MOTA, and MOTP; MOTP is given in cm for the 3D infrared experiments.]

Reference
[1] Z. Wu, N. I. Hristov, T. H. Kunz, and M. Betke. Tracking-reconstruction or reconstruction-tracking? Comparison of two multiple hypothesis tracking approaches to interpret 3D object motion from several camera views. In IEEE Workshop on Motion and Video Computing (WMVC), 2009.
[2] A. Andriyenko, S. Roth, and K. Schindler. An analytical formulation of global occlusion reasoning for multi-target tracking. In 11th IEEE International Workshop on Visual Surveillance, 2011.
[3] A. Andriyenko and K. Schindler. Globally optimal multi-target tracking on a hexagonal lattice. In 11th European Conference on Computer Vision, 2010.