Motion Computing in Image Analysis - Mani V Thomas CISC 489/689 Roadmap Optic Flow Constraint Optic Flow Computation Gradient Based Approach Feature Based Approach Estimation Criterion Block Matching algorithms Conclusion Some slides and illustrations are from M. Pollefeys and M. Shah Importance of Visual Motion Apparent motion of objects on the image plane is a strong cue to understand structure and 3D motion Biological visual systems infer properties of the 3D world via motion Two sub-problems of motion Problem of correspondence estimation Which elements of a frame correspond to which elements of the next frame Problem of reconstruction Given the correspondence and the camera’s intrinsic parameters can we infer 3D motion and/or structure Courtesy: E. Trucco and A. Verri, “Introductory techniques for 3D Computer Vision” Apparent Motion Apparent motion of objects on the image plane Caution required!! Consider a perfectly uniform sphere that is rotating but no change in the light direction Perfectly uniform sphere that is stationary but the light is changing Optic flow is zero Optic flow exists Hope – apparent motion is very close to the actual motion Courtesy: E. Trucco and A. Verri, “Introductory techniques for 3D Computer Vision” Optic Flow Computation Two strategies for computing motion Differential Methods Spatio temporal derivatives for estimation of flow at every position Multi-scale analysis required if motion not constrained within a small range Dense flow measurements Matching Methods Feature extraction(Image edges, corners) Feature/Block Matching and error minimization Sparse flow measurements Courtesy: E. Trucco and A. Verri, “Introductory techniques for 3D Computer Vision” Optic Flow Computation Image Brightness Constancy assumption Let E be the image intensity as captured by the camera Using Taylor series to expand E E x x, y y, t t E x, y, t E E E x y t x y t E x x, y y, t t E x, y, t E x E y E Lt t 0 t 0 t x t y t t Lt Apparent brightness of moving objects remains constant E dx E dy E dE 0 x dt y dt t dt Optic Flow Computation Image Brightness Constancy assumption Apparent brightness of moving objects remains constant E dx E dy E x dt y dt t 0 E x, E y E The dx dt, dy dt v are the image gradient while the motion field are the components of the E T v Et 0 Courtesy: E. Trucco and A. Verri, “Introductory techniques for 3D Computer Vision” Aperture Problem We can measure Terms that can be measured E x, E y , E t Terms to be computed dx dt , dy dt Number of equations - 1 The component of the motion field that is orthogonal to the spatial image gradient is not constrained by the image brightness constancy assumption Intuitively The component of the flow in the gradient direction is determined The component of the flow parallel to an edge is unknown Courtesy: E. Trucco and A. Verri, “Introductory techniques for 3D Computer Vision” Different physical motion but same measurable motion within a fixed window Roadmap Optic Flow Constraint Optic Flow Computation Gradient Based Approach Feature Based Approach Estimation Criterion Block Matching algorithms Conclusion Some slides and illustrations are from M. Pollefeys and M. Shah Optic Flow Constraint How to get more equations for a pixel? Basic idea: impose additional constraints Most common is to assume that the flow field is smooth locally One method: pretend the pixel’s neighbors have the same (u,v) If we use a 5x5 window, that gives us 25 equations per pixel! E pi .u v Et pi 0 E x p1 E y p1 Et p1 E p E p E p u y 2 x 2 t 2 v E p E p x 25 y 25 Et p 25 A252 d 21 b251 Lucas-Kanade Optic Flow We now have more equations than unknowns A252 d 21 b251 min Ad b Solve the least squares problem Minimum least squares solution (in d) is given by A A Ex Ex E y E x T d A 22 21 T E E E E x y b 225 251 u E x Et E E v y t y y First proposed by Lucas-Kanade in 1981 Summation performed over all the pixels in the window Lucas-Kanade Optic Flow Lucas-Kanade Optic flow Ex Ex E y E x E E E E x y u E x Et E y Et y v y When is the Lucas-Kanade equations solvable ATA should be invertible ATA should not be too small (effects of noise) Eigenvalues of ATA, 1 and 2 should not be small ATA should be well conditioned 1/2 should not be large (1 = larger eigenvalue) Edge Gradient is large in magnitude Large 1 but small 2 Low texture region Gradients has small magnitude Small 1 and small 2 High texture region Gradients are different with large magnitudes Large 1 and large 2 Improving the Lucas-Kanade method When our assumptions are violated Brightness constancy is not satisfied The motion is not small A point does not move like its neighbors Iterative Lucas-Kanade Algorithm Estimate velocity at each pixel by solving Lucas-Kanade equations Warp H towards I using the estimated flow field use image warping techniques Repeat until convergence Iterative Lucas-Kanade method u=1.25 pixels u=2.5 pixels u=5 pixels image H Gaussian pyramid of image H u=10 pixels image I Gaussian pyramid of image I Iterative Lucas-Kanade method run iterative L-K warp & upsample run iterative L-K . . . image H J Gaussian pyramid of image H image I Gaussian pyramid of image I Roadmap Optic Flow Constraint Optic Flow Computation Gradient Based Approach Feature Based Approach Estimation Criterion Block Matching algorithms Conclusion Some slides and illustrations are from M. Pollefeys and M. Shah Feature Based Method Feature Extraction Maxima in first derivative of the Image Local peak in the first derivative T f f G f x, y x y Numerical Approximation Gx f i, j 1 f i, j G y f i 1, j f i, j Compute the motion parameters from the best bipartite graph Correspondence between the feature points in one image with those in the other For more information: Ramesh Jain, Rangachar Kasturi, Brian Schunck: Machine Vision 1995 (140 159) Roadmap Optic Flow Constraint Optic Flow Computation Gradient Based Approach Feature Based Approach Estimation Criterion Block Matching algorithms Conclusion Some slides and illustrations are from M. Pollefeys and M. Shah Estimation Criterion Pixel domain Criterion MAE/MSE Lorentzian Correlation Frequency Domain Criterion Cross Correlation Phase Correlation Estimation Criterion(contd.) Pixel Domain Criterion Estimation criterion aim at minimizing n, d k (d ) I k (n ) I k 1 (n d ) prediction error is sensitive to noise if number of pixels is not large or if region is poorly textured Common choice of estimation criterion (d ) I k n I k 1 n d n ,dR Quadratic function is not good since a single large error can bias the estimate of the field Absolute value function is better than the quadratic since cost grows linearly with error Does not require multiplications and is better suited for real-time video encoders Estimation Criterion(contd.) A more robust criterion is based on the Lorentzian function 2 log 1 2 2 Grows slower than |x| for larger errors Similarity measure using Correlation C (d ) I k (n )I k 1 (n d ) n ,dR Computationally complex because of the multiplications This criterion requires maximization I k n I k n I k n d I n d k r I k n I nd ,1 r 1 k Usually the normalized Cross correlation is computed For more details: M. Black, “Robust Incremental Optical Flow” Estimation Criterion(contd.) Estimation Criterion 6 5 4 3 2 1 0 -6 -4 absolute -2 0 lorentzian w=0.1 2 4 lorentzian w=0.3 6 Estimation Criterion(contd.) Frequency Domain Criterion F I k (n ) Iˆk (u ) F I k 1 (n d ) Iˆk 1 (u)e j 2u T z F I k ( n ) F I k 1 ( n z ) 2u T z Amplitudes of both the FT are independent of z Argument difference depends linearly on translation Global motion is recovered by evaluating the phase difference over a number of frequencies and solving the resulting system of equations In practice, this method will work only for a single object moving across a uniform background Estimation Criterion(contd.) Phase Correlation ˆ ˆ I (u ) I (u ) k 1,k ( n ) F 1 k k1 Iˆk (u ) Iˆk 1 (u ) In the case of a single global translation, the correlation surface becomes a Kronecker delta function k 1,k (n ) I 0 ( x) 1 k ( n ) I k 1 ( n z ) F e j 2uz (n z ) x0 x0 In practice, there are numerous peaks which correspond to the dominant displacements between the two images The locations are relatively independent to illumination changes Roadmap Optic Flow Constraint Optic Flow Computation Gradient Based Approach Feature Based Approach Estimation Criterion Block Matching algorithms Conclusion Some slides and illustrations are from M. Pollefeys and M. Shah Block Matching Algorithms Sparse motion measurements Motion is spatially constant and temporally linear over a rectangular region of support b1 x x xt t x t xt d t , x b2 The minimization problem is min dm d m P d m I k n I k 1 n d m nBm m P n n1 , n2 : P n1 P,P n2 P Bm is an M x N block of pixels with the top-left corner co-ordinate at m m1 , m2 Block Matching Algorithms(contd.) N -p -p N M M p Current Picture p Reference Frame v (x,y) -p N (x+u,y+v) M -p p p u Block Matching Algorithms(contd.) Principle of Locality of Reference Block Matching algorithms Exhaustive Search Always finds the “deepest” minimum Computationally very expensive If I x J is the picture resolution and rate is F fps the overall operations in comparing MxN blocks would be IJF 2 p 12 * MN * 3 MN This corresponds to 29.89 GOPS for p=15 at 30fps for a 720x480 image (3 operations per pixel of one subtraction, one absolute value and one addition) Block Matching Algorithms(contd.) Logarithmic Search Sub-optimal and may get trapped in a local minima Computationally feasible for real-time video encoders Search Method Divide the search space at [-p/2, -p/2] Search at (0,0) and at 8 major points at the perimeter of the rectangle at [-p/2, p/2] Using best match position as starting point, search in the eight perimeter points at the half distance window If I x J is the picture resolution and rate is F fps the overall operations in comparing MxN blocks would be IJF 8k 1 * MN * 3 MN k log2 p This corresponds to 1.03 GOPS for p=15 at 30fps for a 720x480 image For more information refer the work by Dr. Lai-Man Po and C. K. Cheung (http://www.ee.cityu.edu.hk/~lmpo/publications/index.html) Block Matching Algorithms(contd.) Hierarchical Search Sub-optimal for regions containing detail and increased storage requirements Computationally feasible for real-time video encoders Search method Form several low resolution images by low pass filtering At the lowest resolution perform a sub-optimal search like log search Propagate search vectors to higher resolution images and perform search If I x J is the picture resolution and rate is F fps the overall operations in comparing MxN blocks would be IJF MN 2 p MN *3 2 1 180 * 4 16 This corresponds to 507.38 MOPS for p=15 at 30fps for a 720x480 image Conclusion Motion estimation Aperture problem Different algorithms to perform motion analysis Lucas-Kanade algorithm Estimation criterion for motion field computation Block Matching Algorithms Computational complexity of motion analysis