Or where do the pixels move?
Alon Gat

Problem Definition
Given: two or more frames of an image sequence
Wanted: the displacement field between two consecutive frames (the optical flow)
• Vector plot: subsample the vector field and use arrows for visualization
• Color plot: visualize direction as color and magnitude as brightness

Extraction of Motion Information
• robot navigation/driver assistance
• surveillance/tracking
• action recognition

Processing of Image Sequences
• video compression
• ego motion compensation

Related Correspondence Problems
• stereo reconstruction
• structure-from-motion
• medical image registration

• How to estimate pixel motion from two images?
  – Find pixel correspondences
  – Given a pixel in img1, look for nearby pixels of the same color in img2
• Key assumptions
  – color constancy: a point in img1 looks “the same” in img2 (for grayscale images, this is brightness constancy)
  – small motion: points do not move very far

Brightness Constancy Assumption
Optical flow: the vector field $(u, v)$
Displacement: a point at $(x, y)$ at time $t$ moves by $(u\,\Delta t,\; v\,\Delta t)$ to $(x + u\,\Delta t,\; y + v\,\Delta t)$ at time $t + \Delta t$
Assume the brightness of a patch remains the same in both images:
$$I(x + u\,\Delta t,\; y + v\,\Delta t,\; t + \Delta t) = I(x, y, t)$$

The Linearized Brightness Constancy Assumption
Idea: if $u$ and $v$ are small and $I$ is sufficiently smooth, one may linearize this constancy assumption via a first-order Taylor expansion around the point $(x, y, t)$:
$$I(x + u\,\Delta t,\; y + v\,\Delta t,\; t + \Delta t) \approx I(x, y, t) + \frac{\partial I}{\partial x}\,u\,\Delta t + \frac{\partial I}{\partial y}\,v\,\Delta t + \frac{\partial I}{\partial t}\,\Delta t$$
Combining this with brightness constancy and dividing by $\Delta t$ gives
$$I_x u + I_y v + I_t = 0$$
Known: $I_x, I_y, I_t$. Unknown: $u, v$.

Smoothness Constraint
Neighboring pixels in the image have similar velocities:
$$e_s = \iint \left( |\nabla u|^2 + |\nabla v|^2 \right) dx\,dy = \iint \left( (u_x^2 + u_y^2) + (v_x^2 + v_y^2) \right) dx\,dy$$
In other words, nearby pixels move together.

We seek the functions $u(x, y)$, $v(x, y)$ that minimize:
$$E(u, v) = \iint \Big[ \underbrace{(I_x u + I_y v + I_t)^2}_{\text{data term}} + \alpha \underbrace{\left( (u_x^2 + u_y^2) + (v_x^2 + v_y^2) \right)}_{\text{smoothness term}} \Big]\, dx\,dy$$
• data term (brightness constancy): penalizes deviations from the constancy assumptions
• smoothness term: penalizes deviations from smoothness of the solution
• regularization parameter α: determines the degree of smoothness
Output: the optical flow!

Idea: to reduce the influence of noise and outliers, we convolve $I_0$ with a Gaussian $G_\sigma$ of mean $\mu = 0$ and standard deviation $\sigma$:
$$I(x, y, t) = (G_\sigma * I_0)(x, y, t)$$

Euler-Lagrange equations
$$E(u, v) = \iint F(x, y, u, v, u_x, u_y, v_x, v_y)\, dx\,dy$$
According to the calculus of variations, a minimizer of $E$ must fulfill the Euler-Lagrange equations:
$$F_u - \frac{\partial}{\partial x} F_{u_x} - \frac{\partial}{\partial y} F_{u_y} = 0$$
$$F_v - \frac{\partial}{\partial x} F_{v_x} - \frac{\partial}{\partial y} F_{v_y} = 0$$
In general these form a highly nonlinear system of equations… So we linearize again!

For the functional $E(u, v)$ above, the Euler-Lagrange equations are
$$0 = (I_x u + I_y v + I_t)\, I_x - \alpha\,(u_{xx} + u_{yy})$$
$$0 = (I_x u + I_y v + I_t)\, I_y - \alpha\,(v_{xx} + v_{yy})$$
or, with $\Delta = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}$,
$$\alpha\,\Delta u = (I_x u + I_y v + I_t)\, I_x$$
$$\alpha\,\Delta v = (I_x u + I_y v + I_t)\, I_y$$

The flow derivatives here are discretized on a grid with spacing $h$ via
$$\Delta u \approx \frac{u_{i+1,j} - u_{i,j}}{h^2} + \frac{u_{i-1,j} - u_{i,j}}{h^2} + \frac{u_{i,j+1} - u_{i,j}}{h^2} + \frac{u_{i,j-1} - u_{i,j}}{h^2}$$
This gives a linear system of equations at every pixel $(k, l)$, where $\bar{u}_{kl}$, $\bar{v}_{kl}$ are the averages over the neighbors and constant factors are absorbed into α:
$$(I_x^{kl} u_{kl} + I_y^{kl} v_{kl} + I_t^{kl})\, I_x^{kl} - \alpha\,(\bar{u}_{kl} - u_{kl}) = 0$$
$$(I_x^{kl} u_{kl} + I_y^{kl} v_{kl} + I_t^{kl})\, I_y^{kl} - \alpha\,(\bar{v}_{kl} - v_{kl}) = 0$$
In matrix form:
$$\begin{pmatrix} \alpha + I_x^{kl} I_x^{kl} & I_x^{kl} I_y^{kl} \\ I_x^{kl} I_y^{kl} & \alpha + I_y^{kl} I_y^{kl} \end{pmatrix} \begin{pmatrix} u_{kl} \\ v_{kl} \end{pmatrix} = \begin{pmatrix} \alpha\,\bar{u}_{kl} - I_x^{kl} I_t^{kl} \\ \alpha\,\bar{v}_{kl} - I_y^{kl} I_t^{kl} \end{pmatrix}$$

Update Rule:
$$u_{kl}^{n+1} = \bar{u}_{kl}^{n} - \frac{I_x^{kl}\,\bar{u}_{kl}^{n} + I_y^{kl}\,\bar{v}_{kl}^{n} + I_t^{kl}}{\alpha + (I_x^{kl})^2 + (I_y^{kl})^2}\; I_x^{kl}$$
$$v_{kl}^{n+1} = \bar{v}_{kl}^{n} - \frac{I_x^{kl}\,\bar{u}_{kl}^{n} + I_y^{kl}\,\bar{v}_{kl}^{n} + I_t^{kl}}{\alpha + (I_x^{kl})^2 + (I_y^{kl})^2}\; I_y^{kl}$$

• Hard to find boundaries.
• Two approximations.
• Less accurate.

First, instead of approximating brightness constancy with a first-order Taylor expansion, we added one more order.
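To make "one more order" concrete, here is a sketch of the standard second-order Taylor expansion with $\Delta t = 1$. The second-derivative symbols $I_{xx}, I_{xy}, I_{yy}, I_{xt}, I_{yt}, I_{tt}$ are assumptions introduced for illustration, and the weighting the authors actually used may differ from this textbook form.

```latex
% Second-order Taylor expansion of I(x+u, y+v, t+1) around (x, y, t)
I(x+u,\, y+v,\, t+1) \approx I + I_x u + I_y v + I_t
  + \tfrac{1}{2}\left( I_{xx} u^2 + 2 I_{xy} u v + I_{yy} v^2
  + 2 I_{xt} u + 2 I_{yt} v + I_{tt} \right)

% Brightness constancy, I(x+u, y+v, t+1) = I(x, y, t), then gives the
% second-order data residual that replaces I_x u + I_y v + I_t = 0
I_x u + I_y v + I_t
  + \tfrac{1}{2} I_{xx} u^2 + I_{xy} u v + \tfrac{1}{2} I_{yy} v^2
  + I_{xt} u + I_{yt} v + \tfrac{1}{2} I_{tt} = 0
```

Unlike the linearized constraint, this residual is quadratic in $u$ and $v$, so the squared data term is no longer quadratic in the flow.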
Second, and more important, instead of solving the E-L equations (which needed to be linearized), we wrote the functional as n*m equations and minimized it with regular minimization methods (gradient descent, quasi-Newton, and others),
where P are the image derivatives and u, v is the optical flow.

Mathematica Symbolic Toolbox for MATLAB--Version 2.0 (http://library.wolfram.com/infocenter/MathSource/5344/)

Quasi-Newton Iteration
• The problem was that computing the inverse of a non-sparse matrix took a long time.
• And then multiplying two non-sparse matrices…

Gradient Descent
• None of the above problems, but a linear convergence rate.
• And convergence to a local minimum.

To get things going faster (moving a loooooooong string from MATLAB to Mathematica takes a while), we found the constant structure of the functional's gradient and calculate it directly in MATLAB:

    % Accumulate the gradient of the discretized functional over interior pixels.
    for i = 2:imageSizeN-1
        for j = 2:imageSizeM-1
            % Second-order data residual at pixel (i,j)
            r = Iz(i,j) + Izz(i,j) ...
              + Ix(i,j)*c(i,j)*m(i,j) + Ixz(i,j)*c(i,j)*m(i,j) ...
              + Ixx(i,j)*c(i,j)^2*m(i,j)^2 ...
              + Iy(i,j)*m(i,j)*s(i,j) + Iyz(i,j)*m(i,j)*s(i,j) ...
              + Ixy(i,j)*c(i,j)*m(i,j)^2*s(i,j) ...
              + Iyy(i,j)*m(i,j)^2*s(i,j)^2;
            % q = c^2 + s^2 - 1; 4*exp(q^2)*q*c is the c-derivative of exp(q^2)
            q = -1 + c(i,j)^2 + s(i,j)^2;

            gradC(i,j) = gradC(i,j) ...
              + alpha*(2*(-c(i-1,j)+c(i,j)) + 2*(-c(i,j-1)+c(i,j))) ...
              - 2*alpha*(-c(i,j)+c(i,j+1)) - 2*alpha*(-c(i,j)+c(i+1,j)) ...
              + 4*exp(q^2)*c(i,j)*q ...
              + 2*(Ix(i,j)*m(i,j) + Ixz(i,j)*m(i,j) ...
                   + 2*Ixx(i,j)*c(i,j)*m(i,j)^2 + Ixy(i,j)*m(i,j)^2*s(i,j))*r;

            gradS(i,j) = gradS(i,j) ...
              + alpha*(2*(-s(i-1,j)+s(i,j)) + 2*(-s(i,j-1)+s(i,j))) ...
              - 2*alpha*(-s(i,j)+s(i,j+1)) - 2*alpha*(-s(i,j)+s(i+1,j)) ...
              + 4*exp(q^2)*s(i,j)*q ...
              + 2*(Iy(i,j)*m(i,j) + Iyz(i,j)*m(i,j) ...
                   + Ixy(i,j)*c(i,j)*m(i,j)^2 + 2*Iyy(i,j)*m(i,j)^2*s(i,j))*r;

            gradM(i,j) = gradM(i,j) ...
              + beta*(2*(-m(i-1,j)+m(i,j)) + 2*(-m(i,j-1)+m(i,j))) ...
              - 2*beta*(-m(i,j)+m(i,j+1)) - 2*beta*(-m(i,j)+m(i+1,j)) ...
              + 2*(Ix(i,j)*c(i,j) + Ixz(i,j)*c(i,j) + 2*Ixx(i,j)*c(i,j)^2*m(i,j) ...
                   + Iy(i,j)*s(i,j) + Iyz(i,j)*s(i,j) ...
                   + 2*Ixy(i,j)*c(i,j)*m(i,j)*s(i,j) + 2*Iyy(i,j)*m(i,j)*s(i,j)^2)*r;
        end
    end

• All methods need a couple of unknown parameters, which have to be selected by an educated guess. (Condor to the rescue)
• All the image derivatives and gradients are calculated in a linear manner.
• Takes about 45 min (high-end algorithms take around 20 sec…)

The Ground Truth