ECSE 6650 Computer Vision
Final Project

Analysis of Stereo Image Rectification Algorithms

Zhiwei Zhu, Ming Jiang, Chih-ting Wu
1. Introduction and motivations
The matching problems that arise while recovering 3D objects can be solved much more efficiently if the images are rectified. This step consists of transforming the images so that the epipolar lines are aligned horizontally. In this case stereo matching algorithms can easily take advantage of the epipolar constraint and reduce the search space to one dimension (i.e., corresponding rows of the rectified images).
In the previous project (ECSE 6650 Computer Vision, Project #2, 3-D Reconstruction), we applied the rectification theory in practice and recovered the 3D structure from a pair of stereo images. However, during the implementation, three algorithms led to different results. They come from the class handout [1], the textbook [2], and Fusiello et al. [3]. At first we were confused by the differences and constraints among these three algorithms; we therefore treated them as three different methods and regarded them as worth investigating, and we decided to extend this question into our final project. After a series of mathematical derivations and numerical analyses, the results were not as "significant" as we expected. Even so, we would not call this a fruitless effort: the project brought us to a much deeper understanding of rectification.
2. Full Perspective Projection Camera Model
The pinhole model we use is the full perspective projection in a homogeneous coordinate system. Thus, each calibration
point $(x_i, y_i, z_i)$ projects onto an image plane point with coordinates $(c_i, r_i)$ determined by the following equation:

$$\lambda \begin{pmatrix} c_i \\ r_i \\ 1 \end{pmatrix} = P \begin{pmatrix} x_i \\ y_i \\ z_i \\ 1 \end{pmatrix} = \begin{pmatrix} p_1^T & p_{14} \\ p_2^T & p_{24} \\ p_3^T & p_{34} \end{pmatrix} \begin{pmatrix} x_i \\ y_i \\ z_i \\ 1 \end{pmatrix} \quad (2\text{-}1)$$
and

$$P = WM \quad (2\text{-}2)$$

where $\lambda$ is a scale factor, $P$ is the homogeneous projection matrix,

$$W = \begin{pmatrix} fs_x & 0 & c_0 \\ 0 & fs_y & r_0 \\ 0 & 0 & 1 \end{pmatrix}$$

is the intrinsic matrix, and

$$M = (R \mid T) = \begin{pmatrix} r_1 & t_x \\ r_2 & t_y \\ r_3 & t_z \end{pmatrix}$$

is the extrinsic matrix, where $r_1$, $r_2$, $r_3$ are the rows of the rotation matrix $R$ and $T = (t_x, t_y, t_z)^T$ is the translation vector. Hence, equation (2-1) can be rewritten as

$$\lambda \begin{pmatrix} c_i \\ r_i \\ 1 \end{pmatrix} = \begin{pmatrix} s_x f r_1 + c_0 r_3 & s_x f t_x + c_0 t_z \\ s_y f r_2 + r_0 r_3 & s_y f t_y + r_0 t_z \\ r_3 & t_z \end{pmatrix} \begin{pmatrix} x_i \\ y_i \\ z_i \\ 1 \end{pmatrix} \quad (2\text{-}3)$$
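As a concrete illustration of the model above, the projection of equations (2-1) and (2-2) can be sketched in a few lines of NumPy. All parameter values below are hypothetical, chosen only to show the mechanics of P = WM:

```python
import numpy as np

# Hypothetical intrinsic matrix W (focal length times scale, principal point)
W = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])

# Hypothetical extrinsic matrix M = (R | T): identity rotation, small translation
R = np.eye(3)
T = np.array([0.1, 0.0, 2.0])
M = np.hstack([R, T.reshape(3, 1)])

P = W @ M  # homogeneous projection matrix, equation (2-2)

# Project a 3D point onto the image plane, equation (2-1)
X = np.array([0.5, 0.25, 3.0, 1.0])      # homogeneous world point
uvw = P @ X                              # lambda * (c, r, 1)
c, r = uvw[0] / uvw[2], uvw[1] / uvw[2]  # divide out the scale factor lambda
```

Dividing the first two components by the third recovers the pixel coordinates $(c_i, r_i)$, which is exactly what the scale factor $\lambda$ expresses in (2-1).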
3. Epipolar Geometry
Figure (1): Epipolar geometry
The search for corresponding elements is governed by the epipolar geometry shown in Figure (1). The three points C1, C2, and P form what is called an epipolar plane, and the intersections of this plane with the two image planes form the epipolar lines. The line connecting the two centers of projection C1 and C2 intersects the image planes at the conjugate points e1 and e2, which are called the epipoles. Assume that the 3D point P projects into the two image planes as the points p1 and p2, expressed in homogeneous coordinates as (u1, v1, 1) and (u2, v2, 1).
4. Analysis of the Rectification Algorithms
4.1 The algorithm in the lecture notes
We need the extrinsic parameters of the stereo system to construct
the rectification matrix. For a point P in the world reference frame we
have
$$P_r = R_r P + T_r \quad (4\text{-}1)$$

and

$$P_l = R_l P + T_l \quad (4\text{-}2)$$
where $P_r$ is the coordinates of the 3D point in the right camera frame, $P_l$ is the coordinates of the 3D point in the left camera frame, $R_r$ and $R_l$ are the rotation matrices into the right and left camera frames respectively, while $T_r$ and $T_l$ are the corresponding translation vectors.
From equations (4-1) and (4-2), we have

$$P_l = R_l P + T_l = R_l [R_r^{-1}(P_r - T_r)] + T_l = R_l R_r^{-1} P_r - R_l R_r^{-1} T_r + T_l \quad (4\text{-}3)$$

Since the relationship between $P_l$ and $P_r$ is given by $P_l = R P_r + T$, we equate the terms to get

$$R = R_l R_r^{-1} = R_l R_r^T \quad (4\text{-}4)$$

and

$$T = T_l - R_l R_r^T T_r = T_l - R\,T_r \quad (4\text{-}5)$$

as the extrinsic parameters of the stereo system.
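Equations (4-4) and (4-5) translate directly into code. Below is a minimal NumPy sketch; the two cameras' extrinsic parameters are made-up values used only for illustration:

```python
import numpy as np

def stereo_extrinsics(Rl, Tl, Rr, Tr):
    """Relative rotation and translation of the stereo rig,
    equations (4-4) and (4-5): R = Rl Rr^T, T = Tl - R Tr."""
    R = Rl @ Rr.T
    T = Tl - R @ Tr
    return R, T

# Hypothetical calibration results for the two cameras
Rl = np.eye(3)
Tl = np.array([0.0, 0.0, 1.0])
theta = np.deg2rad(5.0)  # right camera rotated 5 degrees about the Y axis
Rr = np.array([[ np.cos(theta), 0.0, np.sin(theta)],
               [ 0.0,           1.0, 0.0          ],
               [-np.sin(theta), 0.0, np.cos(theta)]])
Tr = np.array([-0.2, 0.0, 1.0])

R, T = stereo_extrinsics(Rl, Tl, Rr, Tr)
```

A quick sanity check on the result is that $P_l = R P_r + T$ holds for any world point transformed into both camera frames with (4-1) and (4-2).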
Once these extrinsic parameters are obtained, we can construct the rectification matrix $R_{rect}$, which consists of a triple of mutually orthogonal unit vectors:

$$e_1 = \frac{T}{\|T\|} \quad (4\text{-}6)$$

$$e_2 = \frac{1}{\sqrt{T_x^2 + T_y^2}} \begin{pmatrix} -T_y \\ T_x \\ 0 \end{pmatrix} \quad (4\text{-}7)$$

$$e_3 = e_1 \times e_2 \quad (4\text{-}8)$$

$$R_{rect} = \begin{pmatrix} e_1^T \\ e_2^T \\ e_3^T \end{pmatrix} \quad (4\text{-}9)$$
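The construction (4-6) through (4-9) can be sketched directly in NumPy; the baseline T below is an arbitrary example value:

```python
import numpy as np

def rectification_matrix(T):
    """Build R_rect from the baseline T, equations (4-6)-(4-9)."""
    e1 = T / np.linalg.norm(T)            # (4-6): unit vector along the baseline
    e2 = np.array([-T[1], T[0], 0.0])     # (4-7): orthogonal to e1 and the optical axis
    e2 /= np.sqrt(T[0]**2 + T[1]**2)
    e3 = np.cross(e1, e2)                 # (4-8): completes the orthonormal triple
    return np.vstack([e1, e2, e3])        # (4-9): rows are e1, e2, e3

T = np.array([-0.5, 0.02, 0.01])  # example baseline
Rrect = rectification_matrix(T)
```

By construction, $R_{rect}$ maps the baseline $T$ onto the x-axis, which is exactly the property used in equation (4-13) below.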
The following reasoning shows how the image is rectified with the
rectification matrix and the adjustment made on the intrinsic camera
parameters to ensure the correctness of the rectification process.
Applying the rectification matrix to both sides of $P_l = R P_r + T$ gives

$$R_{rect} P_l = R_{rect} R P_r + R_{rect} T \quad (4\text{-}10)$$

Let

$$P_l' = R_{rect} P_l \quad (4\text{-}11)$$

$$P_r' = R_{rect} R P_r \quad (4\text{-}12)$$

be the coordinates of the point in the rectified left and right camera frames. Since

$$R_{rect} T = \begin{pmatrix} \|T\| \\ 0 \\ 0 \end{pmatrix} \quad (4\text{-}13)$$

we have

$$P_l' = P_r' + \begin{pmatrix} \|T\| \\ 0 \\ 0 \end{pmatrix} \quad (4\text{-}14)$$
At this stage, we know from (4-14) that the point has the same y and z coordinates in both the rectified left and right camera frames. We then investigate the rectification effect on the left and right images. Let

$$P_l' = \begin{pmatrix} x_l' \\ y_l' \\ z_l' \end{pmatrix} \quad \text{and} \quad P_r' = \begin{pmatrix} x_r' \\ y_r' \\ z_r' \end{pmatrix} \quad (4\text{-}15)$$
We set the camera intrinsic parameters in the rectification procedure as

$$W_{lrect} = \begin{pmatrix} fs_{xl} & 0 & c_{0l} \\ 0 & fs_{yl} & r_{0l} \\ 0 & 0 & 1 \end{pmatrix} \quad \text{and} \quad W_{rrect} = \begin{pmatrix} fs_{xr} & 0 & c_{0r} \\ 0 & fs_{yr} & r_{0r} \\ 0 & 0 & 1 \end{pmatrix} \quad (4\text{-}16)$$
These intrinsic camera parameters could differ from those we obtain through the calibration procedure, since the intrinsic parameters may need to be adjusted to ensure correct rectification. The following argument reveals the details of the requirements on the intrinsic parameters.
By projecting the points $P_l'$ and $P_r'$ onto the image frames, we have

$$\begin{pmatrix} u_l' \\ v_l' \\ w_l' \end{pmatrix} = W_{lrect} \begin{pmatrix} x_l' \\ y_l' \\ z_l' \end{pmatrix} = \begin{pmatrix} fs_{xl} x_l' + c_{0l} z_l' \\ fs_{yl} y_l' + r_{0l} z_l' \\ z_l' \end{pmatrix} \quad (4\text{-}17)$$

and

$$\begin{pmatrix} u_r' \\ v_r' \\ w_r' \end{pmatrix} = W_{rrect} \begin{pmatrix} x_r' \\ y_r' \\ z_r' \end{pmatrix} = \begin{pmatrix} fs_{xr} x_r' + c_{0r} z_r' \\ fs_{yr} y_r' + r_{0r} z_r' \\ z_r' \end{pmatrix} \quad (4\text{-}18)$$
The objective of the rectification procedure is to make the points $P_r$ and $P_l$ have the same coordinates along the vertical direction in the images, which means

$$\frac{fs_{yl} y_l' + r_{0l} z_l'}{z_l'} = \frac{fs_{yr} y_r' + r_{0r} z_r'}{z_r'} \quad (4\text{-}19)$$

We already know from (4-14) that $z_l' = z_r'$ and $y_l' = y_r'$. Thus

$$fs_{yl} y_l' + r_{0l} z_l' = fs_{yr} y_r' + r_{0r} z_r' \quad (4\text{-}20)$$

This is the same as

$$(fs_{yl} - fs_{yr})\, y_l' + (r_{0l} - r_{0r})\, z_l' = 0 \quad (4\text{-}21)$$

To make the above equation hold for any y and z coordinates, we must have

$$fs_{yl} = fs_{yr} \quad \text{and} \quad r_{0l} = r_{0r} \quad (4\text{-}22)$$
Meeting the constraints (4-22) ensures the correctness of the rectification. We must therefore use a new set of camera intrinsic parameters to rectify the left and right images, since the parameters obtained from the camera calibration procedure are likely to have significant errors and will generally not meet the constraints of (4-22), which would result in different y coordinates for the same 3D point in the two images.
In practice, we can set the intrinsic parameter matrices of both cameras to their average, or set them to either one, as long as they meet the requirements of (4-22). In the following steps, we will call this new camera intrinsic matrix $W_{new}$; it will be applied to both the left and right rectifications.
For the rectification of the left image, relate the original image to the rectified image:

$$\lambda_l \begin{pmatrix} c_l \\ r_l \\ 1 \end{pmatrix} = W_{ol} R_l P \quad (4\text{-}23)$$

$$\lambda_l' \begin{pmatrix} c_l' \\ r_l' \\ 1 \end{pmatrix} = W_{new} R_{rect} R_l P \quad (4\text{-}24)$$

where $W_{ol}$ is the intrinsic matrix obtained from camera calibration and the $\lambda$'s are scale factors.
From (4-23) and (4-24),

$$\lambda \begin{pmatrix} c_l' \\ r_l' \\ 1 \end{pmatrix} = W_{new} R_{rect} W_{ol}^{-1} \begin{pmatrix} c_l \\ r_l \\ 1 \end{pmatrix} \quad (4\text{-}25)$$

where $\lambda$ absorbs the scale factors. The rectification procedure is realized by equation (4-25), while interpolation methods such as bilinear interpolation may be used to improve the quality of the rectified image.
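Equation (4-25) says the rectification is a planar homography H = W_new R_rect W_ol^(-1) on homogeneous pixel coordinates. In practice one iterates over the destination grid and samples the source image through the inverse of H, so that every rectified pixel gets a value. The sketch below uses nearest-neighbour sampling (bilinear interpolation would improve quality, as noted above); all matrices are placeholders for values obtained from calibration:

```python
import numpy as np

def rectify_image(img, W_old, W_new, Rrect, R=np.eye(3)):
    """Warp img with H = W_new Rrect R W_old^-1 (equation (4-25); pass the
    stereo rotation R for the right image). Nearest-neighbour inverse mapping."""
    H = W_new @ Rrect @ R @ np.linalg.inv(W_old)
    Hinv = np.linalg.inv(H)
    h, w = img.shape[:2]
    out = np.zeros_like(img)
    # Destination pixel grid in homogeneous coordinates (c, r, 1)
    rs, cs = np.mgrid[0:h, 0:w]
    dst = np.stack([cs.ravel(), rs.ravel(), np.ones(h * w)])
    src = Hinv @ dst
    sc = np.round(src[0] / src[2]).astype(int)   # source columns
    sr = np.round(src[1] / src[2]).astype(int)   # source rows
    valid = (sc >= 0) & (sc < w) & (sr >= 0) & (sr < h)
    out[rs.ravel()[valid], cs.ravel()[valid]] = img[sr[valid], sc[valid]]
    return out
```

Pixels whose inverse-mapped source location falls outside the original image are left at zero, which is exactly the visibility problem discussed next.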
We have also noticed in previous experiments that not all of the rectified pixels fall inside the rectified image. This is a serious problem if the object we want to reconstruct in 3D is not contained in the rectified image. Is there any method to adjust the rectification procedure so that we can control, at least partially, which part of the scene appears in the rectified image? While changing the focal length can keep all the points within an image of the same size as the original, we propose an alternative approach: shifting the image center along the horizontal direction.
It is obvious from (4-21) and (4-22) that changing $c_{0l}$ or $c_{0r}$ will not destroy the rectification effect. By changing the image center, we can move points from outside the image to inside it and find matching points in the newly shifted images.
4.2 The rectification method in the textbook
The following discussion resembles that in the above section. The major difference is that the coordinate transformation defined in this method differs from that in the course lecture notes, which results in different forms of the rectification matrices:

$$P_r = R(P_l - T) \quad (4\text{-}26)$$

$$P_l = R^T P_r + T \quad (4\text{-}27)$$

Multiplying both sides of (4-27) by $R_{rect}$ gives

$$R_{rect} P_l = R_{rect} R^T P_r + R_{rect} T \quad (4\text{-}28)$$

It is clear that for a coordinate definition such as (4-26), the rectification matrices for the left and right images are

$$R_l = R_{rect} \quad (4\text{-}29)$$

$$R_r = R_{rect} R^T \quad (4\text{-}30)$$

where

$$R = R_r R_l^T \quad (4\text{-}31)$$

and

$$T = T_l - R^T T_r \quad (4\text{-}32)$$
4.3. A compact algorithm for rectification of stereo pairs
In this section, we present another algorithm, proposed by Fusiello et al. [3], to rectify a calibrated stereo rig with unconstrained geometry and general cameras. We analyze the components of this algorithm step by step:
(1) Optical Center
In homogeneous coordinates, we have the following projection transformation:

$$\lambda \begin{pmatrix} c \\ r \\ 1 \end{pmatrix} = P \begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix} \quad (4.3\text{-}1)$$

where $P$ is the projection matrix. Let us represent the projection matrix as

$$P = \begin{pmatrix} q_1^T & q_{14} \\ q_2^T & q_{24} \\ q_3^T & q_{34} \end{pmatrix} = [WR \mid WT] \quad (4.3\text{-}2)$$
where $W$ is the intrinsic parameter matrix and $R$ and $T$ are the rotation matrix and the translation vector. From equation (4.3-1), we can get the following equations:

$$c = \frac{q_1^T w + q_{14}}{q_3^T w + q_{34}}, \qquad r = \frac{q_2^T w + q_{24}}{q_3^T w + q_{34}} \quad (4.3\text{-}3)$$

where $w = (x\ y\ z)^T$. Here, the focal plane is the plane that is parallel to the image plane and contains the optical center. It is the locus of the points projected to infinity; hence its equation is $q_3^T w + q_{34} = 0$. The two planes defined by $q_1^T w + q_{14} = 0$ and $q_2^T w + q_{24} = 0$ intersect the image plane in the vertical and horizontal axes of the image coordinates, respectively. The optical center $C$ is the intersection of these three planes; hence its coordinates $c$ are the solution of
$$P \begin{pmatrix} c \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} \quad (4.3\text{-}4)$$

From the above equation, we can obtain the coordinates of the left and right camera optical centers as follows:

$$c = -(WR)^{-1} WT \quad (4.3\text{-}5)$$

(2) The rotation matrix $R_{rect}$
After rectification, the rotated cameras have the same orientation but different positions. The positions of the optical centers are the same as those of the old cameras, while the orientation changes because we rotate both cameras around their optical centers in such a way that the focal planes become coplanar and contain the baseline.
The matrix $R_{rect}$ is the same for both rotated cameras and is computed as follows:
※ The new X axis is parallel to the baseline: $r_1 = (c_1 - c_2) / \|c_1 - c_2\|$
※ The new Y axis is orthogonal to X (mandatory) and to k: $r_2 = k \times r_1$
※ The new Z axis is orthogonal to X and Y (mandatory): $r_3 = r_1 \times r_2$
where $k$ is an arbitrary unit vector, which fixes the position of the new Y axis in the plane orthogonal to X. The matrix $R_{rect}$ is then

$$R_{rect} = \begin{pmatrix} r_1^T \\ r_2^T \\ r_3^T \end{pmatrix} \quad (4.3\text{-}6)$$

whose rows are the X, Y and Z axes, respectively, of the rectified camera reference frame, represented in world coordinates.
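In code, the three axes can be computed directly from the two optical centers; a common choice for k is the old left camera's Z unit vector. A sketch with hypothetical center coordinates:

```python
import numpy as np

def rrect_from_centers(c1, c2, k=np.array([0.0, 0.0, 1.0])):
    """Fusiello-style R_rect: rows are the new X, Y, Z axes (4.3-6)."""
    r1 = (c1 - c2) / np.linalg.norm(c1 - c2)   # new X: parallel to the baseline
    r2 = np.cross(k, r1)                       # new Y: orthogonal to X and to k
    r2 /= np.linalg.norm(r2)
    r3 = np.cross(r1, r2)                      # new Z: orthogonal to X and Y
    return np.vstack([r1, r2, r3])

c1 = np.array([0.0, 0.0, 0.0])     # hypothetical left optical center
c2 = np.array([0.5, 0.01, 0.02])   # hypothetical right optical center
Rrect = rrect_from_centers(c1, c2)
```

Normalizing r2 keeps the matrix orthonormal even when k is not exactly orthogonal to the baseline, which is the usual case in practice.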
(3) How to do the rectification:
For the original left image, we have the following equation:

$$\lambda_{left} \begin{pmatrix} c \\ r \\ 1 \end{pmatrix} = W_{left} [R_{left}, T] \begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix} \quad (4.3\text{-}7)$$
If we assume that the coordinate system of the object frame has the same origin as the coordinate system of the camera frame, then the translation vector $T$ is equal to 0, and we have the following simplified equation:

$$\lambda_{left} \begin{pmatrix} c \\ r \\ 1 \end{pmatrix} = W_{left} R_{left} \begin{pmatrix} x \\ y \\ z \end{pmatrix} \quad (4.3\text{-}8)$$
For the rectified left image, we have the following simplified equation:

$$\lambda_n \begin{pmatrix} c_n \\ r_n \\ 1 \end{pmatrix} = W_{new} R_{rect} \begin{pmatrix} x \\ y \\ z \end{pmatrix} \quad (4.3\text{-}9)$$
Combining the above two equations, we get

$$\lambda_n' \begin{pmatrix} c_n \\ r_n \\ 1 \end{pmatrix} = W_{new} R_{rect} R_{left}^T W_{left}^{-1} \begin{pmatrix} c \\ r \\ 1 \end{pmatrix} \quad (4.3\text{-}10)$$

which is the final equation used to perform the image rectification.
5. Reconstruction from Rectified Image Pair
The original pixel coordinates can be recovered from the rectified coordinates through equation (4-25) and its right-image counterpart. The 3D coordinates can then be solved through the perspective projection

$$\lambda \begin{pmatrix} c \\ r \\ 1 \end{pmatrix} = W M \begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix} \quad (5\text{-}1)$$
With the above equations, we have

$$\lambda_l \begin{pmatrix} c_l' \\ r_l' \\ 1 \end{pmatrix} = W_{new} R_{rect} W_{ol}^{-1} \cdot \lambda \begin{pmatrix} c_l \\ r_l \\ 1 \end{pmatrix} = W_{new} R_{rect} W_{ol}^{-1} W_{ol} M_l \begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix} = W_{new} R_{rect} M_l \begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix} \quad (5\text{-}2)$$

and

$$\lambda_r \begin{pmatrix} c_r' \\ r_r' \\ 1 \end{pmatrix} = W_{new} R_{rect} R W_{or}^{-1} \cdot \lambda \begin{pmatrix} c_r \\ r_r \\ 1 \end{pmatrix} = W_{new} R_{rect} R W_{or}^{-1} W_{or} M_r \begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix} = W_{new} R_{rect} R M_r \begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix} \quad (5\text{-}3)$$
Let $P_l = W_{new} R_{rect} M_l$ and $P_r = W_{new} R_{rect} R M_r$ represent the projection matrices for the left and right images respectively; thus we have
$$P_l \begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix} - \lambda_l \begin{pmatrix} c_l' \\ r_l' \\ 1 \end{pmatrix} = \begin{pmatrix} P_{l11} & P_{l12} & P_{l13} & P_{l14} \\ P_{l21} & P_{l22} & P_{l23} & P_{l24} \\ P_{l31} & P_{l32} & P_{l33} & P_{l34} \end{pmatrix} \begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix} - \lambda_l \begin{pmatrix} c_l' \\ r_l' \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} \quad (5\text{-}4)$$

$$P_r \begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix} - \lambda_r \begin{pmatrix} c_r' \\ r_r' \\ 1 \end{pmatrix} = \begin{pmatrix} P_{r11} & P_{r12} & P_{r13} & P_{r14} \\ P_{r21} & P_{r22} & P_{r23} & P_{r24} \\ P_{r31} & P_{r32} & P_{r33} & P_{r34} \end{pmatrix} \begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix} - \lambda_r \begin{pmatrix} c_r' \\ r_r' \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} \quad (5\text{-}5)$$
Combining equations (5-4) and (5-5), we have

$$\begin{pmatrix}
P_{l11} & P_{l12} & P_{l13} & -c_l' & 0 \\
P_{l21} & P_{l22} & P_{l23} & -r_l' & 0 \\
P_{l31} & P_{l32} & P_{l33} & -1 & 0 \\
P_{r11} & P_{r12} & P_{r13} & 0 & -c_r' \\
P_{r21} & P_{r22} & P_{r23} & 0 & -r_r' \\
P_{r31} & P_{r32} & P_{r33} & 0 & -1
\end{pmatrix}
\begin{pmatrix} x \\ y \\ z \\ \lambda_l \\ \lambda_r \end{pmatrix}
= \begin{pmatrix} -P_{l14} \\ -P_{l24} \\ -P_{l34} \\ -P_{r14} \\ -P_{r24} \\ -P_{r34} \end{pmatrix} \quad (5\text{-}6)$$
The least-squares solution of the linear system $AX = B$ is given by

$$X = (A^T A)^{-1} A^T B \quad (5\text{-}7)$$
The 3-D coordinates can thus be obtained from the two corresponding image points.
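The linear system (5-6) and its least-squares solution (5-7) can be sketched as follows. The projection matrices below are made-up placeholders, and `np.linalg.lstsq` is used instead of forming (A^T A)^(-1) explicitly, which is numerically safer:

```python
import numpy as np

def triangulate(Pl, Pr, cl, rl, cr, rr):
    """Least-squares solution of system (5-6): the unknowns are
    (x, y, z, lambda_l, lambda_r); the 3D point (x, y, z) is returned."""
    A = np.zeros((6, 5))
    A[:3, :3] = Pl[:, :3]
    A[3:, :3] = Pr[:, :3]
    A[:3, 3] = [-cl, -rl, -1.0]   # -lambda_l column
    A[3:, 4] = [-cr, -rr, -1.0]   # -lambda_r column
    B = -np.concatenate([Pl[:, 3], Pr[:, 3]])
    X, *_ = np.linalg.lstsq(A, B, rcond=None)  # equation (5-7)
    return X[:3]

# Hypothetical 3x4 projection matrices for a rectified pair
W = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
Pl = W @ np.hstack([np.eye(3), np.zeros((3, 1))])
Pr = W @ np.hstack([np.eye(3), np.array([[-0.5], [0.0], [0.0]])])
```

With noise-free correspondences the system is consistent and the least-squares solution reproduces the 3D point exactly; with noisy matches it returns the best fit in the least-squares sense.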
6. Experiments and Results:
We set the new intrinsic parameter matrices for the left and right
images to different values.
(a) First Experiment
For the left image, we set the following intrinsic parameter matrix for
the left camera:
$$W_{new} = \begin{pmatrix} 860.0405 & 0 & 315.7700 \\ 0 & 850.6958 & 278.5259 \\ 0 & 0 & 1.0000 \end{pmatrix} \quad (6\text{-}1)$$

The corresponding rectified image is shown in Figure (2).

Figure (2): the rectified left image
For the right image, we set the following intrinsic parameter matrix for
the right camera:
$$W_{new} = \begin{pmatrix} 860.0405 & 0 & 500.0000 \\ 0 & 850.6958 & 278.5259 \\ 0 & 0 & 1.0000 \end{pmatrix} \quad (6\text{-}2)$$
The corresponding rectified image is shown in Figure (3).
Figure (3): the rectified right image
After obtaining the left and right rectified images, we found that the correspondence points for the left image can be found accurately in the same row of the right rectified image, as shown in Figure (4). Also, from Figure (4), we found that the right rectified image is shifted to the right by setting $c_0$ to a different value in the intrinsic parameter matrix $W_{new}$, and the whole stone is visible in the image.
Figure (4): the correspondence points marked by the white lines in the rectified left and right images.
(b) Second Experiment
For the left image, we set the following intrinsic parameter matrix for
the left camera:
$$W_{new} = \begin{pmatrix} 600.0000 & 0 & 157.8850 \\ 0 & 425.3479 & 139.2630 \\ 0 & 0 & 0.5000 \end{pmatrix} \quad (6\text{-}3)$$

The corresponding rectified image is shown in Figure (5).

Figure (5): the rectified left image
For the right image, we set the following intrinsic parameter matrix for
the right camera:
$$W_{new} = \begin{pmatrix} 600.0000 & 0 & 450.0000 \\ 0 & 425.3479 & 139.2630 \\ 0 & 0 & 0.5000 \end{pmatrix} \quad (6\text{-}4)$$

The corresponding rectified image is shown in Figure (6).

Figure (6): the rectified right image
Figure (7) shows that for each point in the rectified left image, the corresponding point in the rectified right image can be located in the same row of the right image.
Figure (7): the correspondence points marked by the white lines in the rectified left and right images
From the above experiments, we draw the following conclusions. First, for each point in the left rectified image, in order to locate its correspondence point in the rectified right image, we have to make sure that two entries of the intrinsic parameter matrix, $fs_y$ and $r_0$, are the same in the new $W$ for both the left and right cameras. Second, we can adjust the entry $c_0$ of the intrinsic parameter matrix to shift the rectified image so that the region of interest is visible.
7. Summary and Conclusions
In this project, three algorithms for rectification of a pair of stereo images are studied. For each algorithm, the main steps are analyzed systematically and the necessary verifications are given. Finally, we explored the constraints necessary for them to work successfully. Under the proposed constraints, via mathematical derivation, each of the three algorithms is shown to work successfully on the pair of stereo images. Experiments conducted on real images also confirm the correctness of each algorithm.
Reference:
[1] ECSE 6650 Computer Vision Class Handouts, RPI, 2002.
[2] E. Trucco and A. Verri, Introductory Techniques for 3-D Computer Vision, Prentice Hall, 1998.
[3] A. Fusiello, E. Trucco, and A. Verri, "A compact algorithm for rectification of stereo pairs", Machine Vision and Applications, Springer-Verlag, 2000.