Structure From Motion Examples
Computer Vision
Professor William Hoff
Dept of Electrical Engineering & Computer Science
Colorado School of Mines
http://inside.mines.edu/~whoff/

Example – cube images
• First find corresponding points by hand
  – An easy way is to use the Matlab cpselect function
  – Use lots of points for best results (I used N=24)

    I1 = imread('cube1.jpg');
    I2 = imread('cube2.jpg');
    % Start the GUI to select corresponding points
    [Pts1,Pts2] = cpselect(I1,I2, 'Wait', true);
    save('Pts1.mat', 'Pts1');
    save('Pts2.mat', 'Pts2');

  – Then put the points into the arrays u1,u2 (size is 3xN) and save them

    N = size(Pts1,1);
    u1 = [Pts1'; ones(1,N)];
    u2 = [Pts2'; ones(1,N)];
    save('u1.mat', 'u1');
    save('u2.mat', 'u2');

Example – cube images
[Figure: the two cube images, with the 24 selected points drawn and numbered in each view]

    % Draw points on images to make sure they are correct
    imshow(I1, []);
    for i=1:length(u1)
        x = round(u1(1,i));  y = round(u1(2,i));
        rectangle('Position', [x-4 y-4 8 8], 'EdgeColor', 'r');
        text(x+4, y+4, sprintf('%d', i), 'Color', 'r');
    end
    figure, imshow(I2, []);
    for i=1:length(u2)
        x = round(u2(1,i));  y = round(u2(2,i));
        rectangle('Position', [x-4 y-4 8 8], 'EdgeColor', 'r');
        text(x+4, y+4, sprintf('%d', i), 'Color', 'r');
    end

Data is from http://perception.csl.uiuc.edu/ece497ym/lab2.htm.
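• Before going further, it can also be worth verifying the saved arrays programmatically. The following is a small sanity check, not part of the original scripts:

    % Sanity-check the saved correspondence arrays (assumes u1.mat, u2.mat
    % were saved as above)
    load u1
    load u2
    assert(isequal(size(u1), size(u2)), 'u1 and u2 must be the same size');
    assert(size(u1,1) == 3, 'points must be homogeneous, i.e. 3xN');
    assert(all(u1(3,:)==1) && all(u2(3,:)==1), 'third row must be all ones');
    assert(size(u1,2) >= 8, 'the eight-point algorithm needs at least 8 points');
    fprintf('Loaded %d corresponding point pairs\n', size(u1,2));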
Example – cube images
• Run program "essential" to calculate the essential matrix
• Be sure to use the correct intrinsic camera parameters

True essential matrix (from known ground truth):
    E =  0.0788   -0.0695    0.2098
        -0.2410   -0.0823   -0.9423
        -0.2723    0.9581    0.0066

Calculated essential matrix:
        -0.5003    0.3416   -1.3529
         1.3185    0.6416    5.2696
         1.6933   -5.3247   -0.0286

(Keep in mind that the essential matrix can only be recovered up to an arbitrary scale factor and sign, so the two matrices should only agree up to a common scaling.)

Program "essential.m" (1 of 2)

    % Calculate the essential matrix.
    clear all, close all

    % intrinsic camera parameters
    K = [ 655.3076   0        340.3110;
            0      653.5052   245.3426;
            0        0          1.0000];

    load u1
    load u2

    I1 = imread('cube1.jpg');
    I2 = imread('cube2.jpg');

    % Display points on the images for visualization
    imshow(I1, []);
    for i=1:length(u1)
        x = round(u1(1,i));  y = round(u1(2,i));
        rectangle('Position', [x-4 y-4 8 8], 'EdgeColor', 'r');
        text(x+4, y+4, sprintf('%d', i), 'Color', 'r');
    end
    figure, imshow(I2, []);
    for i=1:length(u2)
        x = round(u2(1,i));  y = round(u2(2,i));
        rectangle('Position', [x-4 y-4 8 8], 'EdgeColor', 'r');
        text(x+4, y+4, sprintf('%d', i), 'Color', 'r');
    end

    % Get normalized image points
    p1 = inv(K)*u1;
    p2 = inv(K)*u2;

    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    % Scale and translate image points so that the centroid of the points
    % is at the origin, and the average distance of the points to the
    % origin is equal to sqrt(2).
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    xn = p1(1:2,:);           % xn is a 2xN matrix
    N = size(xn,2);
    t = (1/N) * sum(xn,2);    % this is the (x,y) centroid of the points
    xnc = xn - t*ones(1,N);   % center the points; xnc is a 2xN matrix
    dc = sqrt(sum(xnc.^2));   % dist of each new point to 0,0; dc is 1xN vector
    davg = (1/N)*sum(dc);     % average distance to the origin
    s = sqrt(2)/davg;         % the scale factor, so that avg dist is sqrt(2)
    T1 = [s*eye(2), -s*t ; 0 0 1];
    p1s = T1 * p1;

    % Repeat the normalization for the points in the second image
    xn = p2(1:2,:);           % xn is a 2xN matrix
    N = size(xn,2);
    t = (1/N) * sum(xn,2);    % this is the (x,y) centroid of the points
    xnc = xn - t*ones(1,N);   % center the points; xnc is a 2xN matrix
    dc = sqrt(sum(xnc.^2));   % dist of each new point to 0,0; dc is 1xN vector
    davg = (1/N)*sum(dc);     % average distance to the origin
    s = sqrt(2)/davg;         % the scale factor, so that avg dist is sqrt(2)
    T2 = [s*eye(2), -s*t ; 0 0 1];
    p2s = T2 * p2;

Program "essential.m" (2 of 2)

    % Compute essential matrix E from point correspondences. We know that
    % p1s' E p2s = 0, where p1s,p2s are the scaled image coords. We write
    % out the equations in the unknowns E(i,j):  A x = 0
    A = [p1s(1,:)'.*p2s(1,:)'  p1s(1,:)'.*p2s(2,:)'  p1s(1,:)' ...
         p1s(2,:)'.*p2s(1,:)'  p1s(2,:)'.*p2s(2,:)'  p1s(2,:)' ...
         p2s(1,:)'  p2s(2,:)'  ones(length(p1s),1)];

    % The solution to Ax=0 is the singular vector of A corresponding to the
    % smallest singular value; that is, the last column of V in A=UDV'
    [U,D,V] = svd(A);
    x = V(:,size(V,2));       % get last column of V

    % Put unknowns into a 3x3 matrix. Transpose because Matlab's "reshape"
    % uses the order E11 E21 E31 E12 ...
    Escale = reshape(x,3,3)';

    % Force rank=2 and equal (nonzero) singular values
    [U,D,V] = svd(Escale);
    Escale = U*diag([1 1 0])*V';

    % Undo scaling
    E = T1' * Escale * T2;

    disp('Calculated essential matrix:');
    disp(E);
    save('E.mat', 'E');       % Save to file
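• Because of the scale/sign ambiguity, differencing the calculated E directly against the ground-truth matrix is not meaningful. One way to compare them is to normalize both first. This is a sketch, not part of the original scripts; it assumes the ground-truth matrix has been placed in a variable Etrue:

    % Compare two essential matrices up to an overall scale factor and sign
    En  = E / norm(E, 'fro');          % scale to unit Frobenius norm
    Etn = Etrue / norm(Etrue, 'fro');
    if norm(En - Etn, 'fro') > norm(En + Etn, 'fro')
        En = -En;                      % resolve the sign ambiguity
    end
    fprintf('Difference after normalizing: %f\n', norm(En - Etn, 'fro'));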
Example – cube images
• Run program "drawepipolar" to visualize epipolar lines
  – Example:
    • Choose point 1 in the second image
    • Draw the corresponding epipolar line on the first image
    • Verify visually that the line passes through (or very close to) the corresponding point in the first image

Program "drawepipolar.m" (1 of 2)

    % Draw epipolar lines, from the essential matrix.
    clear all
    close all

    I1 = imread('cube1.jpg');
    I2 = imread('cube2.jpg');

    % intrinsic camera parameters
    K = [ 655.3076   0        340.3110;
            0      653.5052   245.3426;
            0        0          1.0000];

    load E
    load u1
    load u2

    % Display images
    subplot(1,2,1), imshow(I1, []), title('View 1');
    for i=1:length(u1)
        rectangle('Position', [u1(1,i)-4 u1(2,i)-4 8 8], 'EdgeColor', 'r');
        text(u1(1,i)+4, u1(2,i)+4, sprintf('%d', i), 'Color', 'r');
    end
    subplot(1,2,2), imshow(I2, []), title('View 2');
    for i=1:length(u2)
        rectangle('Position', [u2(1,i)-4 u2(2,i)-4 8 8], 'EdgeColor', 'g');
        text(u2(1,i)+4, u2(2,i)+4, sprintf('%d', i), 'Color', 'g');
    end
    pause

    p1 = inv(K) * u1;    % get normalized image coordinates
    p2 = inv(K) * u2;    % get normalized image coordinates

    % Draw epipolar lines on image 1
    for i=1:length(p2)
        subplot(1,2,1), imshow(I1,[]);

        % The product l=E*p2 is the equation of the epipolar line
        % corresponding to p2, in the first image. Here, l=(a,b,c), and
        % the equation of the line is ax + by + c = 0.
        l = E * p2(:,i);

        % Calculate residual error. The product p1'*E*p2 should = 0; the
        % difference is the residual.
        res = p1(:,i)' * E * p2(:,i);
        fprintf('Residual is %f to point %d\n', res, i);

Program "drawepipolar.m" (2 of 2)

        % Let's find two points on this line. First set x=-1 and solve
        % for y, then set x=1 and solve for y.
        pLine0 = [-1; (-l(3)-l(1)*(-1))/l(2); 1];
        pLine1 = [ 1; (-l(3)-l(1))/l(2); 1];

        % Convert from normalized to unnormalized coords
        pLine0 = K * pLine0;
        pLine1 = K * pLine1;
        line([pLine0(1) pLine1(1)], [pLine0(2) pLine1(2)], 'Color', 'r');

        subplot(1,2,2), imshow(I2,[]);
        rectangle('Position', [u2(1,i)-4 u2(2,i)-4 8 8], 'EdgeColor', 'g');
        text(u2(1,i)+4, u2(2,i)+4, sprintf('%d', i), 'Color', 'g');
        pause;
    end

    % Draw epipolar lines on image 2
    for i=1:length(p1)
        subplot(1,2,2), imshow(I2, []);

        % The product l=E'*p1 is the equation of the epipolar line
        % corresponding to p1, in the second image. Here, l=(a,b,c), and
        % the equation of the line is ax + by + c = 0.
        l = E' * p1(:,i);

        % Let's find two points on this line. First set x=-1 and solve
        % for y, then set x=1 and solve for y.
        pLine0 = [-1; (-l(3)-l(1)*(-1))/l(2); 1];
        pLine1 = [ 1; (-l(3)-l(1))/l(2); 1];

        % Convert from normalized to unnormalized coords
        pLine0 = K * pLine0;
        pLine1 = K * pLine1;
        line([pLine0(1) pLine1(1)], [pLine0(2) pLine1(2)], 'Color', 'r');

        subplot(1,2,1), imshow(I1,[]);
        rectangle('Position', [u1(1,i)-4 u1(2,i)-4 8 8], 'EdgeColor', 'r');
        text(u1(1,i)+4, u1(2,i)+4, sprintf('%d', i), 'Color', 'r');
        pause;
    end
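• The printed residual p1'*E*p2 is an algebraic error and has no direct geometric meaning. A more interpretable check is the perpendicular distance from each point to its epipolar line, measured in pixels. This is a sketch, not part of the original scripts; it assumes K, E, u1, u2 are loaded as above:

    % Report point-to-epipolar-line distances in pixels
    p2 = inv(K) * u2;                    % normalized points, image 2
    for i=1:size(u1,2)
        lp = inv(K)' * (E * p2(:,i));    % epipolar line in image 1, pixel coords
        lp = lp / norm(lp(1:2));         % make (a,b) a unit normal
        dPix = abs(lp' * u1(:,i));       % perpendicular point-to-line distance
        fprintf('Point %d is %.2f pixels from its epipolar line\n', i, dPix);
    end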
Example – cube images
• Run program "twoview" to calculate structure and motion

Reconstructed pose of camera2 wrt camera1:
    0.9394    0.0382   -0.3406    0.9644
   -0.0739    0.9930   -0.0924    0.2481
    0.3347    0.1120    0.9357    0.0918
         0         0         0    1.0000

Reconstructed points wrt camera1:
   -0.953001  -0.742815   2.708128
   -0.444555  -0.669706   2.395233
    0.090243  -0.589506   2.057979
    0.446757  -0.719637   2.614346
   -0.977814  -0.435416   2.808429
    :

[Figure: 3D plot of reconstructed points]

Program "twoview.m" (1 of 3)

    % Calculate structure and motion.
    clear all
    close all

    I1 = imread('cube1.jpg');
    I2 = imread('cube2.jpg');

    % intrinsic camera parameters
    K = [ 655.3076   0        340.3110;
            0      653.5052   245.3426;
            0        0          1.0000];

    load E
    load u1
    load u2

    % These are the normalized image points
    p1 = inv(K)*u1;
    p2 = inv(K)*u2;

    % Display images
    subplot(1,2,1), imshow(I1, []), title('View 1');
    for i=1:length(u1)
        rectangle('Position', [u1(1,i)-4 u1(2,i)-4 8 8], 'EdgeColor', 'r');
        text(u1(1,i)+4, u1(2,i)+4, sprintf('%d', i), 'Color', 'r');
    end
    subplot(1,2,2), imshow(I2, []), title('View 2');
    for i=1:length(u2)
        rectangle('Position', [u2(1,i)-4 u2(2,i)-4 8 8], 'EdgeColor', 'g');
        text(u2(1,i)+4, u2(2,i)+4, sprintf('%d', i), 'Color', 'g');
    end
    pause

    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    % Extract motion parameters from essential matrix.
    % We know that E = [tx] R, where
    %   [tx] = [ 0 -t3 t2; t3 0 -t1; -t2 t1 0]
    %
    % If we take the SVD of E, we get E = U diag(1,1,0) V',
    % and t is the last column of U.
    [U,D,V] = svd(E);

Program "twoview.m" (2 of 3)

    W = [0 -1 0; 1 0 0; 0 0 1];

    % The four possible poses of camera 2 wrt camera 1
    Hresult_c2_c1(:,:,1) = [ U*W*V'    U(:,3) ; 0 0 0 1];
    Hresult_c2_c1(:,:,2) = [ U*W*V'   -U(:,3) ; 0 0 0 1];
    Hresult_c2_c1(:,:,3) = [ U*W'*V'   U(:,3) ; 0 0 0 1];
    Hresult_c2_c1(:,:,4) = [ U*W'*V'  -U(:,3) ; 0 0 0 1];

    % make sure each rotation component is a legal rotation matrix
    for k=1:4
        if det(Hresult_c2_c1(1:3,1:3,k)) < 0
            Hresult_c2_c1(1:3,1:3,k) = -Hresult_c2_c1(1:3,1:3,k);
        end
    end
    disp('Calculated possible poses, camera 2 to camera 1:');
    disp(Hresult_c2_c1);
    pause

    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    % Find the correct motion. We will need to reconstruct the actual 3D
    % position of one of the points (say, point 1).
    %
    % We have matching image points p1,p2. We know that p1 x M1 P = 0 and
    % p2 x M2 P = 0, where M1,M2 are the projection matrices and P is the
    % unknown 3D point. Or, we have
    %   A P = 0, where
    %   A = ( [p1x] M1 )
    %       ( [p2x] M2 )
    % Here, M1 is identity and M2 is H_c1_c2.
    M1 = [ 1 0 0 0; 0 1 0 0; 0 0 1 0];

    % Get skew symmetric matrices for point number 1
    p1x = [  0       -p1(3,1)  p1(2,1);
             p1(3,1)  0       -p1(1,1);
            -p1(2,1)  p1(1,1)  0      ];
    p2x = [  0       -p2(3,1)  p2(2,1);
             p2(3,1)  0       -p2(1,1);
            -p2(2,1)  p2(1,1)  0      ];

    % See which of the four solutions will yield a 3D point position that
    % is in front of both cameras (ie, has its z>0 for both).
    for i=1:4
        Hresult_c1_c2 = inv(Hresult_c2_c1(:,:,i));
        M2 = Hresult_c1_c2(1:3,1:4);

Program "twoview.m" (3 of 3)

        A = [ p1x * M1; p2x * M2 ];

        % The solution to AP=0 is the singular vector of A corresponding to
        % the smallest singular value; that is, the last column of V in A=UDV'
        [U,D,V] = svd(A);
        P = V(:,4);         % get last column of V
        P1est = P/P(4);     % normalize
        P2est = Hresult_c1_c2 * P1est;

        if P1est(3) > 0 && P2est(3) > 0
            % We've found a good solution.
            Hest_c2_c1 = Hresult_c2_c1(:,:,i);
            break;          % break out of for loop; can stop searching
        end
    end

    % Now we have the transformation between the cameras (up to a scale factor)
    fprintf('Reconstructed pose of camera2 wrt camera1:\n');
    disp(Hest_c2_c1);

    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    Hest_c1_c2 = inv(Hest_c2_c1);
    M2est = Hest_c1_c2(1:3,:);

    % Reconstruct point positions (these are good to the same scale factor)
    fprintf('Reconstructed points wrt camera1:\n');
    for i=1:length(p1)
        p1x = [  0       -p1(3,i)  p1(2,i);
                 p1(3,i)  0       -p1(1,i);
                -p1(2,i)  p1(1,i)  0      ];
        p2x = [  0       -p2(3,i)  p2(2,i);
                 p2(3,i)  0       -p2(1,i);
                -p2(2,i)  p2(1,i)  0      ];
        A = [ p1x * M1; p2x * M2est ];
        [U,D,V] = svd(A);
        P = V(:,4);            % get last column of V
        P1est(:,i) = P/P(4);   % normalize
        fprintf('%f %f %f\n', P1est(1,i), P1est(2,i), P1est(3,i));
    end

    % Show the reconstruction result in 3D
    figure, plot3(P1est(1,:),P1est(2,:),P1est(3,:),'d');
    axis equal; axis vis3d
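• As a final check on the reconstruction, the estimated 3D points can be reprojected into both images and compared against the measured pixel locations. This is a sketch, not part of the original script; it assumes the variables from twoview.m are still in the workspace:

    % Reproject reconstructed points into both images; report pixel errors
    for i=1:size(P1est,2)
        uh1 = K * M1 * P1est(:,i);      uh1 = uh1/uh1(3);   % view 1
        uh2 = K * M2est * P1est(:,i);   uh2 = uh2/uh2(3);   % view 2
        fprintf('Point %2d: error %.2f px (view 1), %.2f px (view 2)\n', ...
            i, norm(uh1(1:2)-u1(1:2,i)), norm(uh2(1:2)-u2(1:2,i)));
    end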
Example – cube images
• If the size of one check is 100 mm, what is the true translation of the camera?

[Figure: first cube image with the numbered points]

• Solution
  – Points 1 and 2 are 4 checks apart, so the true distance between them is 400 mm
  – In our reconstruction, we found that the distance between them was d = 0.6015

      >> d = norm( P1est(1:3,1) - P1est(1:3,2) )

  – So the scale factor of the solution is s = 400/d = 665.04, meaning that the camera translation and all point locations must be multiplied by this number
  – The true translation of the camera (in mm) is

      >> s * Hest_c2_c1(1:3,4)
         641.3491
         164.9887
          61.0310
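• The same scale factor puts the entire reconstruction into millimeters. The following sketch (not part of the original scripts) continues from the twoview.m workspace, using the 400 mm spacing derived above:

    % Scale the whole reconstruction to metric units
    d = norm( P1est(1:3,1) - P1est(1:3,2) );   % reconstructed distance (unitless)
    s = 400 / d;                               % mm per reconstruction unit
    Pmm = s * P1est(1:3,:);                    % all points, now in mm
    tmm = s * Hest_c2_c1(1:3,4);               % camera translation, in mm
    figure, plot3(Pmm(1,:), Pmm(2,:), Pmm(3,:), 'd');
    axis equal; axis vis3d
    xlabel('X (mm)'); ylabel('Y (mm)'); zlabel('Z (mm)');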