ECSE 6650 Computer Vision Final Project

An Analysis of Stereo Image Rectification Algorithms

Zhiwei Zhu, Ming Jiang, Chih-ting Wu

1. Introduction and Motivation

The matching problem that arises when recovering 3D structure can be solved much more efficiently if the images are rectified first. Rectification transforms the images so that the epipolar lines are aligned horizontally. Stereo matching algorithms can then exploit the epipolar constraint directly and reduce the search space to one dimension, namely the corresponding rows of the rectified images.

In a previous project (ECSE 6650 Computer Vision, Project #2, 3-D Reconstruction), we applied rectification in practice and recovered a 3D reconstruction from a pair of stereo images. During the implementation, however, three algorithms led to different results: the one in the class handout [1], the one in the textbook [2], and the one by Fusiello et al. [3]. At first we were confused by the differences and the constraints underlying these three algorithms; we treated them as three distinct methods and considered them worth investigating, so we decided to extend this investigation into our final project. After a series of mathematical derivations and numerical experiments, the results were not as "significant" as we had expected. Even so, we would not call this a fruitless effort: the project brought us to a much deeper understanding of rectification.

2. Full Perspective Projection Camera Model

The pinhole model we use assumes full perspective projection in a homogeneous coordinate system.
Thus, each calibration point $(x_i, y_i, z_i)$ projects onto an image-plane point with coordinates $(c_i, r_i)$ determined by

$$\lambda \begin{pmatrix} c_i \\ r_i \\ 1 \end{pmatrix} = P \begin{pmatrix} x_i \\ y_i \\ z_i \\ 1 \end{pmatrix} = \begin{pmatrix} p_1^T & p_{14} \\ p_2^T & p_{24} \\ p_3^T & p_{34} \end{pmatrix} \begin{pmatrix} x_i \\ y_i \\ z_i \\ 1 \end{pmatrix} \qquad (2\text{-}1)$$

and

$$P = W M \qquad (2\text{-}2)$$

where $\lambda$ is a scale factor, $P$ is the homogeneous projection matrix,

$$W = \begin{pmatrix} f s_x & 0 & c_0 \\ 0 & f s_y & r_0 \\ 0 & 0 & 1 \end{pmatrix}$$

is the intrinsic matrix, and

$$M = (R \mid T) = \begin{pmatrix} r_1 & t_x \\ r_2 & t_y \\ r_3 & t_z \end{pmatrix}$$

is the extrinsic matrix, with $r_1, r_2, r_3$ the rows of the rotation matrix $R$ and $T = (t_x, t_y, t_z)^T$ the translation vector. Hence, equation (2-1) can be rewritten as

$$\lambda \begin{pmatrix} c_i \\ r_i \\ 1 \end{pmatrix} = \begin{pmatrix} s_x f r_1 + c_0 r_3 & s_x f t_x + c_0 t_z \\ s_y f r_2 + r_0 r_3 & s_y f t_y + r_0 t_z \\ r_3 & t_z \end{pmatrix} \begin{pmatrix} x_i \\ y_i \\ z_i \\ 1 \end{pmatrix} \qquad (2\text{-}3)$$

3. Epipolar Geometry

Figure (1): Epipolar geometry.

The search for corresponding elements is governed by the epipolar geometry shown in Figure 1. The three points C1, C2, and P form what is called an epipolar plane, and the intersections of this plane with the two image planes form the epipolar lines. The line connecting the two centers of projection C1 and C2 intersects the image planes at the conjugate points e1 and e2, which are called epipoles. Assume that the 3D point P projects into the two image planes as the points p1 and p2, expressed in homogeneous coordinates as (u1, v1, 1) and (u2, v2, 1).

4. Analysis of the Rectification Algorithms

4.1 The algorithm in the lecture notes

We need the extrinsic parameters of the stereo system to construct the rectification matrix. For a point P in the world reference frame we have

$$P_r = R_r P + T_r \qquad (4\text{-}1)$$

and

$$P_l = R_l P + T_l \qquad (4\text{-}2)$$

where $P_r$ and $P_l$ are the coordinates of the 3D point in the right and left camera frames, $R_r$ and $R_l$ are the rotation matrices of the right and left cameras, and $T_r$ and $T_l$ are the corresponding translation vectors.
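The projection of equation (2-1) can be sketched in a few lines of NumPy. The intrinsic and extrinsic values below are hypothetical, chosen only to make the example concrete:

```python
import numpy as np

# Hypothetical intrinsics (f*s_x, f*s_y, image center c_0, r_0) -- illustrative only.
W = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])
R = np.eye(3)                        # camera aligned with the world frame
T = np.array([[0.0], [0.0], [5.0]])  # camera 5 units from the world origin along z

P = W @ np.hstack([R, T])            # 3x4 projection matrix P = W (R | T), eq. (2-2)

def project(P, X):
    """Project a 3D world point X (3-vector) to pixel (c, r) per eq. (2-1)."""
    x = P @ np.append(X, 1.0)        # homogeneous image coordinates
    return x[:2] / x[2]              # divide out the scale factor lambda

c, r = project(P, np.array([0.0, 0.0, 0.0]))
print(c, r)  # the world origin projects to the image center: 320.0 240.0
```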
From equations (4-1) and (4-2), we have

$$P_l = R_l P + T_l = R_l [R_r^{-1}(P_r - T_r)] + T_l = R_l R_r^{-1} P_r - R_l R_r^{-1} T_r + T_l \qquad (4\text{-}3)$$

Since the relationship between $P_l$ and $P_r$ is given by $P_l = R P_r + T$, we equate the terms to get

$$R = R_l R_r^{-1} = R_l R_r^T \qquad (4\text{-}4)$$

and

$$T = T_l - R_l R_r^T T_r = T_l - R T_r \qquad (4\text{-}5)$$

as the extrinsic parameters of the stereo system. Once these extrinsic parameters are obtained, we can construct the rectification matrix $R_{rect}$, which consists of a triple of mutually orthogonal unit vectors:

$$e_1 = \frac{T}{\|T\|} \qquad (4\text{-}6)$$

$$e_2 = \frac{1}{\sqrt{T_x^2 + T_y^2}} \begin{pmatrix} -T_y \\ T_x \\ 0 \end{pmatrix} \qquad (4\text{-}7)$$

$$e_3 = e_1 \times e_2 \qquad (4\text{-}8)$$

$$R_{rect} = \begin{pmatrix} e_1^T \\ e_2^T \\ e_3^T \end{pmatrix} \qquad (4\text{-}9)$$

The following reasoning shows how the image is rectified with the rectification matrix, and what adjustment must be made to the intrinsic camera parameters to ensure the correctness of the rectification. Apply the rectification matrix to both sides of $P_l = R P_r + T$:

$$R_{rect} P_l = R_{rect} R P_r + R_{rect} T \qquad (4\text{-}10)$$

Let

$$P_l' = R_{rect} P_l \qquad (4\text{-}11)$$

$$P_r' = R_{rect} R P_r \qquad (4\text{-}12)$$

be the coordinates of the point in the rectified left and right camera frames. Since

$$R_{rect} T = (\|T\| \;\; 0 \;\; 0)^T \qquad (4\text{-}13)$$

we have

$$P_l' = P_r' + (\|T\| \;\; 0 \;\; 0)^T \qquad (4\text{-}14)$$

At this stage, we know from (4-14) that the point has the same y and z coordinates in both the rectified left and right camera frames. We then investigate the effect of rectification on the left and right images. Let

$$P_l' = \begin{pmatrix} x_l' \\ y_l' \\ z_l' \end{pmatrix} \quad \text{and} \quad P_r' = \begin{pmatrix} x_r' \\ y_r' \\ z_r' \end{pmatrix} \qquad (4\text{-}15)$$

We set the camera intrinsic parameter matrices in the rectification procedure as

$$W_l^{rect} = \begin{pmatrix} f s_{xl} & 0 & c_{0l} \\ 0 & f s_{yl} & r_{0l} \\ 0 & 0 & 1 \end{pmatrix} \quad \text{and} \quad W_r^{rect} = \begin{pmatrix} f s_{xr} & 0 & c_{0r} \\ 0 & f s_{yr} & r_{0r} \\ 0 & 0 & 1 \end{pmatrix} \qquad (4\text{-}16)$$

These intrinsic parameters may differ from those obtained through the calibration procedure, since they may need to be adjusted to ensure the rectification. The following argument reveals the exact requirements on them.
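The construction (4-6)–(4-9) can be sketched in NumPy; the baseline vector below is a hypothetical value for illustration:

```python
import numpy as np

def rectification_matrix(T):
    """Build R_rect from the stereo translation T, per eqs. (4-6)-(4-9)."""
    e1 = T / np.linalg.norm(T)                        # unit vector along the baseline
    Tx, Ty = T[0], T[1]
    e2 = np.array([-Ty, Tx, 0.0]) / np.hypot(Tx, Ty)  # orthogonal to e1 and the optical axis
    e3 = np.cross(e1, e2)                             # completes the orthonormal triple
    return np.vstack([e1, e2, e3])                    # rows e1^T, e2^T, e3^T

T = np.array([0.2, 0.05, 0.01])   # hypothetical baseline vector
Rrect = rectification_matrix(T)
# R_rect maps T onto the x axis: R_rect @ T = (||T||, 0, 0)^T, cf. eq. (4-13)
print(np.round(Rrect @ T, 6))
```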
By projecting the points $P_l$ and $P_r$ onto the image frames, we have

$$\begin{pmatrix} u_l' \\ v_l' \\ w_l' \end{pmatrix} = W_l^{rect} \begin{pmatrix} x_l' \\ y_l' \\ z_l' \end{pmatrix} = \begin{pmatrix} f s_{xl} x_l' + c_{0l} z_l' \\ f s_{yl} y_l' + r_{0l} z_l' \\ z_l' \end{pmatrix} \qquad (4\text{-}17)$$

and

$$\begin{pmatrix} u_r' \\ v_r' \\ w_r' \end{pmatrix} = W_r^{rect} \begin{pmatrix} x_r' \\ y_r' \\ z_r' \end{pmatrix} = \begin{pmatrix} f s_{xr} x_r' + c_{0r} z_r' \\ f s_{yr} y_r' + r_{0r} z_r' \\ z_r' \end{pmatrix} \qquad (4\text{-}18)$$

The objective of the rectification procedure is to make the projections of $P_l$ and $P_r$ have the same coordinate along the vertical direction in the images, which means

$$\frac{f s_{yl} y_l' + r_{0l} z_l'}{z_l'} = \frac{f s_{yr} y_r' + r_{0r} z_r'}{z_r'} \qquad (4\text{-}19)$$

We already know from (4-14) that $z_l' = z_r'$ and $y_l' = y_r'$. Thus

$$f s_{yl} y_l' + r_{0l} z_l' = f s_{yr} y_r' + r_{0r} z_r' \qquad (4\text{-}20)$$

which is the same as

$$(f s_{yl} - f s_{yr}) y_l' + (r_{0l} - r_{0r}) z_l' = 0 \qquad (4\text{-}21)$$

For the above equation to hold for arbitrary y and z coordinates, we must have

$$f s_{yl} = f s_{yr} \quad \text{and} \quad r_{0l} = r_{0r} \qquad (4\text{-}22)$$

Meeting the constraints (4-22) ensures the correctness of the rectification. We must therefore use a new set of intrinsic parameters to rectify the left and right images: the parameters obtained from camera calibration are likely to contain significant errors and will generally not satisfy (4-22), which would result in different y coordinates for the same 3D point in the two images. In practice, we can set the intrinsic parameter matrices of both cameras to their average, or to either one of them, as long as (4-22) is satisfied. In the following steps we call this new intrinsic matrix $W_{new}$; it is applied to both the left and right rectification.

For the rectification of the left image, relate the original image to the rectified image:

$$\lambda \begin{pmatrix} c_l \\ r_l \\ 1 \end{pmatrix} = W_{ol} R_l P \qquad (4\text{-}23)$$

$$\lambda' \begin{pmatrix} c_l' \\ r_l' \\ 1 \end{pmatrix} = W_{new} R_{rect} R_l P \qquad (4\text{-}24)$$

where $W_{ol}$ is the intrinsic matrix obtained from camera calibration and the $\lambda$'s are scale factors. From (4-23) and (4-24),

$$\lambda'' \begin{pmatrix} c_l' \\ r_l' \\ 1 \end{pmatrix} = W_{new} R_{rect} W_{ol}^{-1} \begin{pmatrix} c_l \\ r_l \\ 1 \end{pmatrix} \qquad (4\text{-}25)$$

The rectification procedure is realized by equation (4-25); other techniques, such as bilinear interpolation, may be used to improve the quality of the rectified image.
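Equation (4-25) amounts to warping the original image with the homography $H = W_{new} R_{rect} W_{ol}^{-1}$. A minimal sketch using inverse mapping with nearest-neighbour sampling follows; bilinear interpolation, as mentioned above, would improve quality. The function and its arguments are illustrative, not the code used in the project:

```python
import numpy as np

def rectify_image(img, W_old, W_new, Rrect):
    """Warp an image with H = W_new @ R_rect @ inv(W_old), cf. eq. (4-25).

    Uses inverse mapping: for each rectified pixel, find the original pixel
    it came from and copy its intensity (nearest neighbour)."""
    H = W_new @ Rrect @ np.linalg.inv(W_old)
    Hinv = np.linalg.inv(H)
    rows, cols = img.shape[:2]
    out = np.zeros_like(img)
    for r in range(rows):
        for c in range(cols):
            src = Hinv @ np.array([c, r, 1.0])    # rectified pixel -> original pixel
            cs, rs = src[0] / src[2], src[1] / src[2]
            ci, ri = int(round(cs)), int(round(rs))
            if 0 <= ri < rows and 0 <= ci < cols:
                out[r, c] = img[ri, ci]
    return out
```

With $R_{rect} = I$ and $W_{new} = W_{old}$ the homography is the identity and the image is returned unchanged, which is a convenient sanity check.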
We have also noticed in previous experiments that not all rectified pixels are visible in the rectified image. This is a serious problem when the object we want to reconstruct in 3D is not contained in the rectified image. Is there a way to adjust the rectification procedure so that we can control, at least partially, which part of the scene appears in the rectified image? While changing the focal length can keep all the points within an image of the same size as the original, we propose an alternative approach: shifting the image center along the horizontal direction. It is obvious from (4-21) and (4-22) that changing $c_{0l}$ or $c_{0r}$ does not destroy the rectification effect. By changing the image center, we can move points from outside the image to inside it and then find matching points in the shifted images.

4.2 The rectification method in the textbook

The following discussion resembles that of the section above. The main difference is that the coordinate transformation defined in this method differs from the one in the lecture notes, which results in different forms of the rectification matrices:

$$P_r = R (P_l - T) \qquad (4\text{-}26)$$

$$P_l = R^T P_r + T \qquad (4\text{-}27)$$

Multiply both sides of (4-27) by $R_{rect}$:

$$R_{rect} P_l = R_{rect} R^T P_r + R_{rect} T \qquad (4\text{-}28)$$

It is now clear that for the coordinate definition (4-26), the rectifying rotations for the left and right images are

$$R_l^{rect} = R_{rect} \qquad (4\text{-}29)$$

$$R_r^{rect} = R_{rect} R^T \qquad (4\text{-}30)$$

where

$$R = R_r R_l^T \qquad (4\text{-}31)$$

and

$$T = T_l - R^T T_r \qquad (4\text{-}32)$$

4.3 A compact algorithm for rectification of stereo pairs

In this section, we present the algorithm proposed by Fusiello et al. [3] to rectify a calibrated stereo rig of unconstrained geometry mounting general cameras. We analyze the components of this algorithm step by step.

(1) Optical center

In homogeneous coordinates, we have the following projection transformation:

$$\lambda \begin{pmatrix} c \\ r \\ 1 \end{pmatrix} = \tilde{P} \begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix} \qquad (4.3\text{-}1)$$

where $\tilde{P}$ is the projection matrix.
Let us represent the projection matrix as

$$\tilde{P} = \begin{pmatrix} q_1^T & q_{14} \\ q_2^T & q_{24} \\ q_3^T & q_{34} \end{pmatrix} = [W R \mid W T] \qquad (4.3\text{-}2)$$

where $W$ is the intrinsic parameter matrix and $R$ and $T$ are the rotation matrix and the translation vector. From equation (4.3-1), we get

$$c = \frac{q_1^T w + q_{14}}{q_3^T w + q_{34}}, \qquad r = \frac{q_2^T w + q_{24}}{q_3^T w + q_{34}} \qquad (4.3\text{-}3)$$

where $w = (x \;\; y \;\; z)^T$. Here, the focal plane is the plane parallel to the image plane that contains the optical center. It is the locus of the points projected to infinity; hence its equation is $q_3^T w + q_{34} = 0$. The two planes defined by $q_1^T w + q_{14} = 0$ and $q_2^T w + q_{24} = 0$ intersect the image plane in the vertical and horizontal axes of the image coordinates, respectively. The optical center $C$ is the intersection of these three planes; hence its coordinates $c$ are the solution of

$$\tilde{P} \begin{pmatrix} c \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} \qquad (4.3\text{-}4)$$

From the above equation, we obtain the coordinates of the left and right camera optical centers as

$$c = -(W R)^{-1} W T \qquad (4.3\text{-}5)$$

(2) The rotation matrix $R_{rect}$

After rectification, the rotated cameras have the same orientation but different positions. The positions of the optical centers are the same as those of the old cameras, while the orientation changes because we rotate both cameras around their optical centers in such a way that the focal planes become coplanar and contain the baseline. The matrix $R_{rect}$ is the same for both rotated cameras and is computed as follows:

※ The new x axis is parallel to the baseline: $r_1 = (c_1 - c_2)/\|c_1 - c_2\|$

※ The new y axis is orthogonal to x (mandatory) and to k: $r_2 = k \times r_1$

※ The new z axis is orthogonal to x and y (mandatory): $r_3 = r_1 \times r_2$

where $k$ is an arbitrary unit vector that fixes the position of the new y axis in the plane orthogonal to x. The matrix

$$R_{rect} = \begin{pmatrix} r_1^T \\ r_2^T \\ r_3^T \end{pmatrix} \qquad (4.3\text{-}6)$$

has as its rows the x, y, and z axes of the new camera reference frame, expressed in world coordinates.
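Steps (1) and (2) above can be sketched in NumPy as follows. This is a minimal illustration, not the project's code; note that we additionally normalize $r_2$, since $k \times r_1$ is a unit vector only when $k$ happens to be orthogonal to $r_1$:

```python
import numpy as np

def optical_center(W, R, T):
    """Optical center c = -(W R)^{-1} W T, per eq. (4.3-5)."""
    return -np.linalg.inv(W @ R) @ (W @ T)

def fusiello_rrect(c1, c2, k):
    """New common camera orientation from the two optical centers c1, c2
    and an arbitrary unit vector k (fixes the new y axis)."""
    r1 = (c1 - c2) / np.linalg.norm(c1 - c2)  # new x axis, parallel to the baseline
    r2 = np.cross(k, r1)
    r2 = r2 / np.linalg.norm(r2)              # new y axis, orthogonal to x and k
    r3 = np.cross(r1, r2)                     # new z axis, orthogonal to x and y
    return np.vstack([r1, r2, r3])            # rows r1^T, r2^T, r3^T, eq. (4.3-6)

# Example with trivial extrinsics: the center sits at -T in the world frame.
c = optical_center(np.eye(3), np.eye(3), np.array([1.0, 0.0, 0.0]))
Rr = fusiello_rrect(np.zeros(3), np.array([1.0, 0.0, 0.0]),
                    np.array([0.0, 0.0, 1.0]))
```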
(3) How to do the rectification

For the original left image, we have

$$\lambda \begin{pmatrix} c_{left} \\ r_{left} \\ 1 \end{pmatrix} = W_{left} [R_{left} \mid T] \begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix} \qquad (4.3\text{-}7)$$

If we assume that the object frame has the same origin as the camera frame, the translation vector $T$ equals 0, and we get the simplified equation

$$\lambda \begin{pmatrix} c_{left} \\ r_{left} \\ 1 \end{pmatrix} = W_{left} R_{left} \begin{pmatrix} x \\ y \\ z \end{pmatrix} \qquad (4.3\text{-}8)$$

For the rectified left image, we similarly have

$$\lambda_n \begin{pmatrix} c_n \\ r_n \\ 1 \end{pmatrix} = W_{new} R_{rect} \begin{pmatrix} x \\ y \\ z \end{pmatrix} \qquad (4.3\text{-}9)$$

Combining the above two equations gives

$$\lambda' \begin{pmatrix} c_n \\ r_n \\ 1 \end{pmatrix} = W_{new} R_{rect} R_{left}^{-1} W_{left}^{-1} \begin{pmatrix} c_{left} \\ r_{left} \\ 1 \end{pmatrix} \qquad (4.3\text{-}10)$$

which is the final equation used to rectify the image.

5. Reconstruction from the Rectified Image Pair

The original pixel coordinates can be recovered from the rectified coordinates through (4-25). The 3D coordinates can then be solved through the perspective projection

$$\lambda \begin{pmatrix} c \\ r \\ 1 \end{pmatrix} = W M \begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix} \qquad (5\text{-}1)$$

With the above equations, we have

$$\lambda_l \begin{pmatrix} c_l' \\ r_l' \\ 1 \end{pmatrix} = W_{new} R_{rect} W_{ol}^{-1} \begin{pmatrix} c_l \\ r_l \\ 1 \end{pmatrix} = W_{new} R_{rect} W_{ol}^{-1} W_{ol} M_l \begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix} = W_{new} R_{rect} M_l \begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix} \qquad (5\text{-}2)$$

and

$$\lambda_r \begin{pmatrix} c_r' \\ r_r' \\ 1 \end{pmatrix} = W_{new} R_{rect} R W_{or}^{-1} \begin{pmatrix} c_r \\ r_r \\ 1 \end{pmatrix} = W_{new} R_{rect} R W_{or}^{-1} W_{or} M_r \begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix} = W_{new} R_{rect} R M_r \begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix} \qquad (5\text{-}3)$$

Let $P_l = W_{new} R_{rect} M_l$ and $P_r = W_{new} R_{rect} R M_r$ be the projection matrices for the rectified left and right images, respectively. Then

$$\lambda_l \begin{pmatrix} c_l' \\ r_l' \\ 1 \end{pmatrix} = \begin{pmatrix} P_{l11} & P_{l12} & P_{l13} & P_{l14} \\ P_{l21} & P_{l22} & P_{l23} & P_{l24} \\ P_{l31} & P_{l32} & P_{l33} & P_{l34} \end{pmatrix} \begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix} \qquad (5\text{-}4)$$

$$\lambda_r \begin{pmatrix} c_r' \\ r_r' \\ 1 \end{pmatrix} = \begin{pmatrix} P_{r11} & P_{r12} & P_{r13} & P_{r14} \\ P_{r21} & P_{r22} & P_{r23} & P_{r24} \\ P_{r31} & P_{r32} & P_{r33} & P_{r34} \end{pmatrix} \begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix} \qquad (5\text{-}5)$$

Eliminating the scale factors $\lambda_l$ and $\lambda_r$ from (5-4) and (5-5) yields four linear equations in $(x, y, z)$:

$$\begin{pmatrix} P_{l11} - c_l' P_{l31} & P_{l12} - c_l' P_{l32} & P_{l13} - c_l' P_{l33} \\ P_{l21} - r_l' P_{l31} & P_{l22} - r_l' P_{l32} & P_{l23} - r_l' P_{l33} \\ P_{r11} - c_r' P_{r31} & P_{r12} - c_r' P_{r32} & P_{r13} - c_r' P_{r33} \\ P_{r21} - r_r' P_{r31} & P_{r22} - r_r' P_{r32} & P_{r23} - r_r' P_{r33} \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} c_l' P_{l34} - P_{l14} \\ r_l' P_{l34} - P_{l24} \\ c_r' P_{r34} - P_{r14} \\ r_r' P_{r34} - P_{r24} \end{pmatrix} \qquad (5\text{-}6)$$

The least-squares solution of the linear system $A X = B$ is given by

$$X = (A^T A)^{-1} A^T B \qquad (5\text{-}7)$$

The 3D coordinates can thus be obtained from the two corresponding image points.

6.
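The triangulation of (5-6) and (5-7) can be sketched as follows; `np.linalg.lstsq` computes the same least-squares solution as $(A^T A)^{-1} A^T B$. The 3×4 projection matrices passed in are whichever pair applies (here the test values are hypothetical):

```python
import numpy as np

def triangulate(Pl, Pr, pl, pr):
    """Least-squares 3D point from two 3x4 projection matrices and the
    corresponding pixels pl = (cl, rl), pr = (cr, rr), per eqs. (5-6)-(5-7).

    Each camera contributes two linear equations in (x, y, z), obtained by
    eliminating the scale factor from lambda*(c, r, 1)^T = P*(x, y, z, 1)^T.
    """
    cl, rl = pl
    cr, rr = pr
    A = np.vstack([
        Pl[0, :3] - cl * Pl[2, :3],
        Pl[1, :3] - rl * Pl[2, :3],
        Pr[0, :3] - cr * Pr[2, :3],
        Pr[1, :3] - rr * Pr[2, :3],
    ])
    B = np.array([
        cl * Pl[2, 3] - Pl[0, 3],
        rl * Pl[2, 3] - Pl[1, 3],
        cr * Pr[2, 3] - Pr[0, 3],
        rr * Pr[2, 3] - Pr[1, 3],
    ])
    X, *_ = np.linalg.lstsq(A, B, rcond=None)  # same as inv(A^T A) A^T B
    return X
```

As a sanity check, projecting a known 3D point through two hypothetical cameras and triangulating its two images recovers the original point.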
Experiments and Results

We set the new intrinsic parameter matrices for the left and right images to different values.

(a) First Experiment

For the left image, we set the following intrinsic parameter matrix for the left camera:

$$W_{new} = \begin{pmatrix} 860.0405 & 0 & 315.7700 \\ 0 & 850.6958 & 278.5259 \\ 0 & 0 & 1.0000 \end{pmatrix} \qquad (6\text{-}1)$$

The corresponding rectified image is shown in Figure (2).

Figure (2): the rectified left image.

For the right image, we set the following intrinsic parameter matrix for the right camera:

$$W_{new} = \begin{pmatrix} 860.0405 & 0 & 500.0000 \\ 0 & 850.6958 & 278.5259 \\ 0 & 0 & 1.0000 \end{pmatrix} \qquad (6\text{-}2)$$

The corresponding rectified image is shown in Figure (3).

Figure (3): the rectified right image.

After obtaining the left and right rectified images, we found that the correspondence points of the left image can be found accurately in the same row of the right rectified image, as shown in Figure (4). Also, from Figure (4), we found that the right rectified image is shifted to the right by setting $c_0$ to a different value in the intrinsic parameter matrix $W_{new}$, so that the whole stone is visible in the image.

Figure (4): the correspondence points marked by the white lines in the rectified left and right images.

(b) Second Experiment

For the left image, we set the following intrinsic parameter matrix for the left camera:

$$W_{new} = \begin{pmatrix} 600.0000 & 0 & 157.8850 \\ 0 & 425.3479 & 139.2630 \\ 0 & 0 & 0.5000 \end{pmatrix} \qquad (6\text{-}3)$$

The corresponding rectified image is shown in Figure (5).

Figure (5): the rectified left image.

For the right image, we set the following intrinsic parameter matrix for the right camera:

$$W_{new} = \begin{pmatrix} 600.0000 & 0 & 450.0000 \\ 0 & 425.3479 & 139.2630 \\ 0 & 0 & 0.5000 \end{pmatrix} \qquad (6\text{-}4)$$

The corresponding rectified image is shown in Figure (6).

Figure (6): the rectified right image.

Figure (7) shows that for each point in the rectified left image, the corresponding point can be located in the same row of the rectified right image.
Figure (7): the correspondence points marked by the white lines in the rectified left and right images.

From the above experiments, we draw the following conclusions. First, for each point of the rectified left image to have its correspondence in the same row of the rectified right image, two entries of the intrinsic parameter matrix, $f s_y$ and $r_0$, must be the same in the new $W$ for both the left and right cameras, as required by (4-22). Second, we can adjust the entry $c_0$ of the intrinsic parameter matrix to shift the rectified image so that the region of interest becomes visible.

7. Summary and Conclusions

In this project, three algorithms for rectifying a pair of stereo images are studied. For each algorithm, the main steps are analyzed systematically and the necessary verifications are given. Finally, we explored the constraints required for the algorithms to work correctly. Under the proposed constraints, each of the three algorithms is proved, via mathematical derivation, to rectify the stereo pair successfully. Experiments conducted on real images also confirm the correctness of each algorithm.

References:

[1] ECSE 6650 Computer Vision Class Handouts, RPI, 2002.

[2] E. Trucco and A. Verri. Introductory Techniques for 3-D Computer Vision. Prentice Hall, 1998.

[3] A. Fusiello, E. Trucco, and A. Verri. "A compact algorithm for rectification of stereo pairs". Machine Vision and Applications, Springer-Verlag, 2000.