Kanade–Lucas–Tomasi Tracking (KLT tracker) Tomáš Svoboda, svoboda@cmp.felk.cvut.cz Czech Technical University in Prague, Center for Machine Perception http://cmp.felk.cvut.cz Last update: November 26, 2007 importance for Computer Vision gradient based optimization good features to track Talk Outline experiments Importance in Computer Vision Improved many times, most importantly by Carlo Tomasi [5, 4] Free implementation(s) available1. Firstly published in 1981 as an image registration method [3]. 2/22 1 2 After more than two decades, a project2 at CMU dedicated to this single algorithm and results published in a premium journal [1]. Part of plethora computer vision algorithms. http://www.ces.clemson.edu/∼stb/klt/ http://www.ri.cmu.edu/projects/project 515.html Tracking of dense sequences — camera motion 3/22 I J Tracking of dense sequences — object motion 4/22 I J Alignment of an image (patch) 5/22 Goal is to align a template image T (x) to an input image I(x). x column vector containing image coordinates [x, y]>. The I(x) could be also a small subwindow withing an image. Original Lucas-Kanade algorithm I 6/22 Goal is to align a template image T (x) to an input image I(x). x column vector containing image coordinates [x, y]>. The I(x) could be also a small subwindow withing an image. Set of allowable warps W(x; p), where p is a vector of parameters. For translations x + p1 W(x; p) = y + p2 W(x; p) can be arbitrarily complex The best alignment minimizes image dissimilarity X x [I(W(x; p)) − T (x)]2 Original Lucas-Kanade algorithm II 7/22 X [I(W(x; p)) − T (x)]2 x is a nonlinear optimization! The warp W(x; p) may be linear but the pixels value are, in general, non-linear. In fact, they are essentially unrelated to x. It is assumed that some p is known and best increment ∆p is sought. The the modified problem X [I(W(x; p + ∆p)) − T (x)]2 x is solved with respect to ∆p. When found then p gets updated p ← p + ∆p Original Lucas-Kanade algorithm III 8/22 X [I(W(x; p + ∆p)) − T (x)]2 x linearized by performing first order Taylor expansion X [I(W(x; p)) + ∇I x ∂W ∆p − T (x)]2 ∂p ∂I ∂I , ∂y ] is the gradient image3 computed at W(x; p). The term ∇I = [ ∂x the Jacobian of the warp. ∂W ∂p is 3 As a vector it should have been a column wise oriented. However, for sake of clarity of equations row vector is exceptionally considered here. Original Lucas-Kanade algorithm IV 9/22 Derive X [I(W(x; p)) + ∇I x ∂W ∆p − T (x)]2 ∂p with respect to ∆p > X ∂W ∂W 2 ∇I I(W(x; p)) + ∇I ∆p − T (x) ∂p ∂p x setting equality to zero yields > X ∂W −1 ∆p = H ∇I [T (x) − I(W(x; p))] ∂p x where H is the Hessian matrix > X ∂W ∂W H= ∇I ∇I ∂p ∂p x The Lucas-Kanade algorithm—Summary 10/22 Iterate: 1. Warp I with W(x; p) 2. Warp the gradient ∇I with W(x; p) 3. Evaluate the Jacobian image ∇I ∂∂W p ∂W ∂p 4. Compute the Hessian H = 5. Compute ∆p = H−1 P h x at (x; p) and compute the steepest descent P h ∇I x ∇I ∂W ∂p i> ∂W ∂p i> h ∇I ∂W ∂p i [T (x) − I(W(x; p))] 6. Update the parameters p ← p + ∆p until k∆pk ≤ Example of convergence 11/22 video Example of convergence 12/22 Convergence video: Initial state is within the basin of attraction Example of divergence 13/22 Divergence video: Initial state is outside the basin of attraction What are good features (windows) to track? 14/22 How to select good templates T (x) for image registration, object tracking. −1 ∆p = H > X ∂W [T (x) − I(W(x; p))] ∇I ∂p x where H is the Hessian matrix > X ∂W ∂W H= ∇I ∇I ∂p ∂p x The stability of the iteration is mainly influenced by the inverse of Hessian. We can study its eigenvalues. Consequently, the criterion of a good feature window is min(λ1, λ2) > λmin (texturedness). What are good features (windows) to track? 15/22 Consider translation W(x; p) = 1 0 ∂W = ∂p 0 1 x + p1 . The Jacobian is then y + p2 i> h i ∂W ∂W ∇I ∇I x ∂p " ∂p ∂I # P 1 0 1 0 ∂I ∂I ∂x = [ ∂x , ∂x ] ∂I x 0 1 0 1 ∂y 2 2 ∂I ∂I P 2 ∂x2 ∂x∂y = x ∂I ∂I H = P h ∂x∂y ∂y The image windows with varying derivatives in both directions. Homeogeneous areas are clearly not suitable. Texture oriented mostly in one direction only would cause instability for this translation. What are the good points for translations? 16/22 The Hessian matrix H= X x ∂I 2 ∂x ∂I 2 ∂x∂y ∂I 2 ∂x∂y 2 ∂I ∂y Should have large eigenvalues. We have seen the matrix already, where? Harris corner detector [2]! Experiments - no occlusions 17/22 video Experiments - occlusions 18/22 video Experiments - occlusions with dissimilarity 19/22 video Experiments - object motion 20/22 video References 21/22 [1] Simon Baker and Iain Matthews. Lucas-Kanade 20 years on: A unifying framework. International Journal of Computer Vision, 56(3):221–255, 2004. [2] C. Harris and M. Stephen. A combined corner and edge detection. In M. M. Matthews, editor, Proceedings of the 4th ALVEY vision conference, pages 147–151, University of Manchaster, England, September 1988. on-line copies available on the web. [3] Bruce D. Lucas and Takeo Kanade. An iterative image registration technique with an application to stereo vision. In Proceedings of the 7th International Conference on Artificial Intelligence, pages 674–679, August 1981. [4] Jianbo Shi and Carlo Tomasi. Good features to track. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 593–600, 1994. [5] Carlo Tomasi and Takeo Kanade. Detection and tracking of point features. Technical Report CMU-CS-91-132, Carnegie Mellon University, April 1991. End 22/22