Colorado School of Mines Computer Vision Professor William Hoff Dept of Electrical Engineering &Computer Science Colorado School of Mines Computer Vision http://inside.mines.edu/~whoff/ 1 Corners Colorado School of Mines Computer Vision 2 Sum of squared differences • Say we want to track a patch from image I0 to image I1 – We’ll assume that intensities don’t change, and only translational motion – So we expect that I1 ( x i u ) I 0 ( x i ) – In practice we will have noise, so they won’t be exactly equal • But we can search for the displacement u=(x,y) that minimizes the sum of squared differences E (u) wx i I1 (x i u) I 0 (x i ) 2 i Colorado School of Mines Computer Vision 3 Stability of SSD • We would like the SSD score to have a distinct minimum at the correct value of u – If not, there will be uncertainty in the value of u • The image texture in the patch will determine the stability of the SSD score – Bland, featureless patches will not be able to be matched uniquely • We want to choose patches to track that will be stable • We can predict how stable a patch will be by comparing its SSD score with itself, at various displacements Du Dx 2 E (Du) wx i I 0 (x i Du) I 0 (x i ) , where Du i Dy • If the surface E(Du) has a distinct minimum, it is a good feature to track Colorado School of Mines Computer Vision w(x) is optional weighting function 4 From Computer Vision: Algorithms and Applications, Richard Szeliski, 2010, Springer Colorado School of Mines Computer Vision 5 Stability of SSD Score • We can estimate the stability of the SSD score • First, take the Taylor series expansion of the image – Recall Taylor series expansion of a function f(x) near x0 df f ( x) f ( x0 ) ( x x0 ) dx x0 – For vectors x, u I 0 (x i Du) I 0 (x i ) I 0 (x i ) Du • Then E (Du) w(x i )I 0 (x i Du) I 0 (x i ) 2 i w(x i )I 0 (x i ) I 0 (x i ) Du I 0 (x i ) 2 i w(x i )I 0 (x i ) Du where • xi = (xi,yi)T is some image point • Du = (Dx,Dy)T is the displacement from that point • 𝛻I0(xi) is the gradient of the image at xi 2 i Colorado School of Mines Computer Vision 6 Stability of SSD Score (continued) • Expand the SSD score E (Du) w(x i )I 0 (x i ) Du 2 i I (x ) I (x ) w(x i ) Dx 0 i Dy 0 i x y i 2 2 I (x ) 2 I ( x ) I ( x ) I ( x ) w(x i ) Dx 0 i Dx 2Dx 0 i 0 i Dy Dy 0 i Dy x y i x y 2 2 I 0 (x i ) I 0 (x i ) I 0 (x i ) I 0 (x i ) Dy Dy w(x i ) Dy Dx w(x i ) Dx 2Dx w(x i ) x x y y i i i Dx w I xx Dx 2Dx w I xy Dy Dy w I yy Dy w I xx Dx Dy w I xy • where w I xy Dx w I yy Dy I I I I I x2 , I xy , I y2 x x y y Colorado School of Mines 2 2 Computer Vision 7 Stability of SSD Score (continued) • The SSD score is I I I I , I xy , x x y 2 w I xx E (Du) Dx Dy w I xy DuT ADu w I xy Dx w I yy Dy 2 x I I y2 y 2 • where I x2 A w I xy I xy 2 Iy • • w is a small NxN averaging filter such as a box filter it essentially averages the values over a small local neighborhood Note – the “*” indicates correlation of w with each element of A • A can tell us how stable the SSD score is, at a given point Colorado School of Mines Computer Vision 8 Example Patches 2500 300 400 300 2000 250 250 300 200 200 150 150 100 100 50 50 0 5 0 5 0 -5 -5 0 0 0 22 500 100 0 5 0 5 5 0 5 0 0 -5 22 0 0 5 0 0 -5 -5 -5 0 0 -5 -5 12 1 1 12 I x2 A w I xy Colorado School of Mines 1000 200 5 0 1500 21 -21 -21 21 I xy 2 Iy Computer Vision 9 Stability of SSD Score (continued) • E=DuT A Du is a second order polynomial a b A b c – Has the form E(x,y)=ax2+bxy+cy2 – The matrix A gives the curvature of the surface 5 3 4 ax2 2 cy2 3 2 1 1 0 -5 -4 -3 -2 -1 0 1 2 3 4 5 0 -5 -4 -3 -2 -1 0 1 2 3 4 5 • A large value of a means that E curves up sharply as x varies – Similarly, large c means that E curves up sharply as y varies – Both a,c large mean that we have a good minimum in E 2500 2000 1500 • Problem – if b is not zero, the surface may be flat along the diagonal 1000 500 0 5 0 5 0 -5 Colorado School of Mines Computer Vision -5 10 Estimating stability (continued) • Recall eigenvalues A vi = li vi where li is the ith eigenvalue, vi is the ith eigenvector – We can write A v1 l1 0 v1 v 2 0 l 2 v2 – Or AR=LR • Now, R = (v1, v2) is an orthonormal matrix (its columns are mutually orthogonal, and all unit vectors). So RT = R-1 • So RTAR=L – Where the columns of R are the eigenvectors of A – L is a diagonal matrix whose elements are the eigenvalues of A Colorado School of Mines Computer Vision 11 Estimating stability (continued) • R is a rotation matrix – If we rotate the coordinate system to Du=R Dv – Then E = DuT A Du= (DvTRT) A (R Dv ) = DvT L Dv – Which has the form E(v1,v2) = l1 v12 + l2 v22 • So if both eigenvalues are large, we have a good minimum (there is no diagonal component) 400 300 200 100 0 5 5 0 0 -5 -5 300 • If one eigenvalue is low, surface is flat along that direction 250 200 150 100 50 0 5 5 0 0 -5 Colorado School of Mines Computer Vision -5 12 Possible interest point measures • Look at the smallest eigenvalue – if it is sufficiently large, then we have a candidate point (Shi-Tomasi) • Alternative measures that don’t require square roots: (Harris) det( A) tr ( A) 2 l0 l1 l0 l1 2 From Computer Vision: Algorithms and Applications, Richard Szeliski, 2010, Springer (Harmonic mean) ll det( A) 0 1 tr ( A) l0 l1 Colorado School of Mines Computer Vision 13 Matlab Corner Interest Operator % Find corner points clear all close all I = double(imread('test000.jpg')); % Apply Gaussian blur sd = 1.0; I = imfilter(I, fspecial('gaussian', round(6*sd), sd)); imshow(I, []); % Compute the gradient components Gx = imfilter(I, [-1 1]); Gy = imfilter(I, [-1; 1]); % Compute the outer product g'g at each pixel Gxx = Gx .* Gx; Gxy = Gx .* Gy; Gyy = Gy .* Gy; Colorado School of Mines Computer Vision 14 Matlab Corner Interest Operator (continued) % Size of neighborhood over which to compute corner features. N = 13; w = ones(N); % Sum A11 = A12 = A22 = % The neighborhood the G's over the window size. imfilter(Gxx, w); imfilter(Gxy, w); imfilter(Gyy, w); % At each pixel (x,y), we have the 2x2 matrix % [A11(x,y) A12(x,y); % A21(x,y) A22(x,y)] % Of course, A21 = A12. % Compute the interest score: det(A)/trace(A) detA = A11.*A22 - A12.^2; % Computes det(A) at each pixel traceA = A11 + A22; % Computer trace(A) at each pixel s = detA./traceA; % The interest score % If trace(A)=0 anywhere, we get "not a number". s(isnan(s)) = 0; % If any NaN's, just set them to zero Colorado School of Mines Computer Vision 15 Finding Local Maxima • Simply finding all points greater than a threshold will yield too many points interest scores interest scores, thresholded • We should find points that are – greater than the threshold – a local maximum Colorado School of Mines Computer Vision 16 Finding Local Maxima • If peaks are smooth, you can test each point to see if it is higher than its neighbors 6 A 1D cross section score 5 4 In 2D, check 4 nearest neighbors thresh 3 2 1 x 0 1 • 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 Matlab code Lmax = false(size(S)); Lmax(S(2:end-1, 2:end-1) = ... ( S(2:end-1, 2:end-1) > S(3:end, 2:end-1) ... & S(2:end-1, 2:end-1) > S(1:end-2, 2:end-1) ... & S(2:end-1, 2:end-1) > S(2:end-1, 3:end) ... & S(2:end-1, 2:end-1) > S(2:end-1, 1:end-2) ); Colorado School of Mines Computer Vision 17 Finding Local Maxima (continued) • Problem – what if you don’t have smooth peaks? – Too many detected features score Local max within ±2 6 5 • Instead, find points that are local regional maxima – they are the largest points within a region (e.g., square of size W) 4 3 • Can do by calculating the largest value within distance N (this is a dilation) 2 1 x 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 • Then find those points that equal the local maxima Smax = (S==imdilate(S, ones(N))); Colorado School of Mines Computer Vision 18 Finding Local Maxima (continued) Score array S Score array dilated with square of side 15 Sd Points where S=Sd Points where S=Sd and are > threshold Smax Colorado School of Mines Computer Vision 19 Matlab % Choose the suppression radius, for non-maxima suppression r = N; % Find local maxima within each neighborhood of radius 2r Lmax = (s==imdilate(s, strel('disk',2*r))); % Note - we don't want to detect points too close to the border, so just % zero out everything near the border. Lmax(1:N,:) = false; Lmax(:,1:N) = false; Lmax(end-N:end,:) = false; Lmax(:,end-N:end) = false; % Get a list of the indices of all the potential interest points [rows cols] = find(Lmax); % Get the values of those interest points vals = s(Lmax); Look at range of scores Colorado School of Mines Computer Vision 20 Identifying interest points • Once you have the scores, you can decide which points to identify as interest points – Could choose the points with the top M scores – Or, choose all points with scores greater than a threshold % Identify interest points above a threshold, and draw % a box around them, of size NxN for i=1:length(vals) if vals(i) > threshold x = cols(i); y = rows(i); rectangle('Position', [x-N/2 y-N/2 N N], ... 'EdgeColor', 'r', ... 'Linewidth', 2.0); % default is 0.5 text(x,y, sprintf('%d', i), ... 'Color', 'r', ... % label with id number 'FontSize', 16); % default is 10 end end Colorado School of Mines Computer Vision 21 Corner interest points: • Window size = 15 • Threshold = 100 • Local maxima within 15 pixels Colorado School of Mines Computer Vision 22 Tracking Example • • • • Colorado School of Mines Computer Vision Tracked points from truckmounted camera (Hoff 2010) Size of window = 19x19 Red = tracked point in 2D Yellow = have 3D information 23 Handling More General Viewpoint Changes • Cross correlation assumes translation only – But appearance of image patches changes with rotation, scale, viewpoint • Need features that are invariant to translation, rotation, scale, and viewpoint 24 Colorado School of Mines Computer Vision Affine Corner Detector • Affine transformation • where I 2 Ax d I1 (x) a xx A a yx a xy a yy – d is the translation, A is the deformation matrix • This can account for rotation and scale changes • Also viewpoint changes, if planar patch is small compared to distance from camera 25 Colorado School of Mines Computer Vision KLT Tracker • Find interest points in the first frame • Track (assuming translation only) points from each frame to subsequent frame • Test consistency with original frame by computing affine transformation from: Shi and Tomasi, “Good Features to Track”, IEEE CVPR 1994 26 Colorado School of Mines Computer Vision