Visualization Determining Depth From Stereo

Saurav Basu
BITS Pilani
2002
Introduction
• Human vision as an example
• Depth is perceived from the left-eye and right-eye images
• An object appears at a different relative position in the left and right eyes
• This difference between the 2 views carries the depth information
• This is the basis for stereo imaging
[Figure: LEFT VIEW and RIGHT VIEW of a scene, as seen by the LEFT EYE and RIGHT EYE]
The Stereo Problem
– The stereo problem is usually broken into two subproblems:
• Extraction of depth information from stereo pairs
• Use of the depth data to visualize the world scene in 3 dimensions by a suitable projection technique
[Figure: Stereo Images → Depth Estimation → Visualization]
What are Stereo Images?
Images of the same world scene
taken from slightly displaced viewpoints
are called stereo images.
To illustrate how a typical stereo
imaging system works, let us take a look at the
camera model for obtaining stereo
images.
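Camera Model Of A Stereo System
[Figure: stereo camera model. Image 1 and Image 2, each with its own x and y axes and optical axis, are separated by the baseline distance; the world point W (X,Y,Z) projects to (x1,y1) in Image 1 and (x2,y2) in Image 2]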
Important Points about the Model
• The cameras are identical.
• The coordinate systems of both cameras are perfectly aligned.
• Once the camera and world coordinate systems are aligned, the xy plane of the image is aligned with the XY plane of the world coordinate system, so the Z coordinate of W is the same for both camera coordinate systems.
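Relating Depth with Image Coordinates
[Figure: top view of the two cameras. Image 1 sits at the origin of the world coordinate system and Image 2 is displaced by the baseline B; the world point W (X,Y,Z) projects to (x1,y1) and (x2,y2); λ denotes the focal length]
• By similar triangles:
$$X_1 = \frac{x_1}{\lambda}(\lambda - Z_1), \qquad X_2 = \frac{x_2}{\lambda}(\lambda - Z_2)$$
Because the cameras are aligned:
$$X_2 = X_1 + B, \qquad Z_1 = Z_2 = Z$$
so that
$$X_1 = \frac{x_1}{\lambda}(\lambda - Z), \qquad X_1 + B = \frac{x_2}{\lambda}(\lambda - Z)$$
Subtracting the first equation from the second eliminates $X_1$:
$$(Z - \lambda) = \frac{\lambda B}{x_1 - x_2}$$
Put $Z - \lambda = \text{Depth}$ and $\lambda B = K$:
$$\text{Depth} = \frac{K}{x_1 - x_2}, \qquad \text{where } x_1 - x_2 = \text{Disparity}$$
$$\therefore \ \text{Depth} \propto \frac{1}{\text{Disparity}}$$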
Important Result
• Thus depth is inversely proportional to (x1 − x2), where x1 and x2 are the pixel coordinates of the same world point when projected onto the two stereo image planes.
• (x1 − x2) is called the DISPARITY.
• Finding x1 and x2 in the stereo pair is the job of a stereo matching technique.
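As a small worked illustration of this result, the sketch below evaluates Depth = λB / (x1 − x2) for two made-up points. The function name and all numeric values are hypothetical, and the units are whatever units the focal length and baseline are given in.

```python
def depth_from_disparity(x1, x2, focal_length, baseline):
    """Return Z - lambda = lambda * B / (x1 - x2) for one matched point."""
    disparity = x1 - x2
    if disparity == 0:
        # Zero disparity corresponds to a point infinitely far away.
        return float("inf")
    return focal_length * baseline / disparity


# Hypothetical values: K = lambda * B = 8 * 100 = 800.
near = depth_from_disparity(12.0, 4.0, focal_length=8.0, baseline=100.0)
far = depth_from_disparity(12.0, 10.0, focal_length=8.0, baseline=100.0)
```

The point with the larger disparity (8) comes out nearer than the point with the smaller disparity (2), which is the inverse relation stated above.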
Stereo Matching
– The goal of stereo matching algorithms is to find matching locations in the left and right images.
– Specifically, find the coordinates of the pixels in the left and right images that correspond to the same world point.
– This is also called the correspondence problem.
Correlation-based approaches
– A common approach to finding correspondences is to search for local regions that appear similar.
– Try to match a window of pixels in the left image with a window of the same size in the right image.
Matching Pixels
[Figure: diagram illustrating stereo matching, showing a matching window in the left image and the corresponding window in the right image]
The disparity of this pixel has magnitude 1, since x1 = 0 and x2 = 1, so x2 − x1 = 1.
Assumption: matching pixels lie on the same horizontal raster line (rectified stereo).
The SSD and SAD are commonly used correlation functions. Each per-pixel term below is summed over all pixels of the matching window, and a lower sum means a better match.
SSD: Sum of Squared Deviations
$$\psi(I_l(x,y), I_r(x+d,y)) = (I_l(x,y) - I_r(x+d,y))^2$$
SAD: Sum of Absolute Differences
$$\psi(I_l(x,y), I_r(x+d,y)) = |I_l(x,y) - I_r(x+d,y)|$$
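A minimal sketch of window matching with these two cost functions, assuming rectified images stored as lists of rows and assuming the candidate match in the right image lies at x + d. The function names and the tiny test images are hypothetical, not taken from the original slides.

```python
def squared_diff(a, b):
    return (a - b) ** 2


def abs_diff(a, b):
    return abs(a - b)


def window_cost(left, right, x, y, d, half, per_pixel):
    """Sum a per-pixel cost over a (2*half+1)-square window centred on (x, y).

    The caller must keep the left window inside the image.
    """
    total = 0
    for j in range(-half, half + 1):
        for i in range(-half, half + 1):
            total += per_pixel(left[y + j][x + i], right[y + j][x + i + d])
    return total


def best_disparity(left, right, x, y, max_d, half=1, per_pixel=squared_diff):
    """Try d = 0..max_d on the same raster line and keep the lowest cost."""
    width = len(right[0])
    best_d, best_cost = None, None
    for d in range(max_d + 1):
        if x + d + half >= width:  # window would leave the right image
            break
        cost = window_cost(left, right, x, y, d, half, per_pixel)
        if best_cost is None or cost < best_cost:
            best_d, best_cost = d, cost
    return best_d


# Hypothetical pair: the right image is the left image shifted by 2 pixels.
row = [0, 0, 0, 9, 5, 7, 0, 0, 0, 0]
shifted = [0, 0, 0, 0, 0, 9, 5, 7, 0, 0]
left = [row[:] for _ in range(5)]
right = [shifted[:] for _ in range(5)]
d_ssd = best_disparity(left, right, x=4, y=2, max_d=4)
d_sad = best_disparity(left, right, x=4, y=2, max_d=4, per_pixel=abs_diff)
```

Both cost functions recover the 2-pixel shift here. SAD is cheaper to compute, while SSD penalizes large intensity differences more heavily.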
The Multi Window Algorithm
• In this algorithm, 9 different windows are used to calculate the disparity of a single pixel.
• The window that gives the maximum correlation is used for the disparity calculation.
[Figure: demonstration of the 9 different windows used for the correlation, shown on the left and right images]
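One possible reading of the nine-window idea is sketched below. The exact window layout from the original figure is not recoverable, so this sketch assumes nine square windows that all contain the pixel of interest, with centres shifted by −half, 0 or +half in each direction. It also treats "maximum correlation" as minimum SSD and assumes the match lies at x + d. All names are hypothetical.

```python
def shifted_window_ssd(left, right, x, y, d, half, ox, oy):
    """SSD over a window whose centre is shifted by (ox, oy) from (x, y).

    Returns None when the window leaves either image.
    """
    height, width = len(left), len(left[0])
    total = 0
    for j in range(-half, half + 1):
        for i in range(-half, half + 1):
            yy = y + oy + j
            xl = x + ox + i
            xr = xl + d
            if not (0 <= yy < height and 0 <= xl < width and 0 <= xr < width):
                return None
            diff = left[yy][xl] - right[yy][xr]
            total += diff * diff
    return total


def multi_window_disparity(left, right, x, y, max_d, half=1):
    """Pick the disparity whose best of the nine windows has the lowest SSD."""
    offsets = [(ox, oy) for oy in (-half, 0, half) for ox in (-half, 0, half)]
    best = None  # (cost, disparity)
    for d in range(max_d + 1):
        for ox, oy in offsets:
            cost = shifted_window_ssd(left, right, x, y, d, half, ox, oy)
            if cost is not None and (best is None or cost < best[0]):
                best = (cost, d)
    return None if best is None else best[1]


# Hypothetical pair: the right image is the left image shifted by 2 pixels.
row = [0, 0, 0, 9, 5, 7, 0, 0, 0, 0]
shifted = [0, 0, 0, 0, 0, 9, 5, 7, 0, 0]
left = [row[:] for _ in range(5)]
right = [shifted[:] for _ in range(5)]
d_center = multi_window_disparity(left, right, x=4, y=2, max_d=4)
d_corner = multi_window_disparity(left, right, x=0, y=0, max_d=4)
```

A side effect of the shifted windows is that pixels on the image border, such as the corner pixel in the second call, still get at least one window that fits inside the image.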
Disparity Map
• From the calculated disparities a disparity map is obtained.
• The disparity map is a grayscale map in which intensity represents depth.
• The lighter shades (greater disparities) represent regions with less depth, as opposed to the darker regions, which are farther away from the viewer.
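A disparity map of this kind can be produced with a simple linear stretch to the range 0..255. The sketch below is one such mapping; the function name and the sample values are hypothetical.

```python
def disparity_to_gray(disparity_map):
    """Linearly map disparities to 0..255 so larger disparity is lighter."""
    flat = [d for row in disparity_map for d in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    if span == 0:
        # A constant map has no contrast; render it as black.
        return [[0 for _ in row] for row in disparity_map]
    return [[round(255 * (d - lo) / span) for d in row] for row in disparity_map]


gray = disparity_to_gray([[0, 2], [4, 8]])
```

The pixel with the greatest disparity (the nearest point) becomes white and the pixel with the smallest disparity becomes black.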
Visualization
• Visualization is the process of using the depth estimates from stereo matching to build projections of the scene.
• 3-D information can be represented in many ways:
- Orthographic projections
- Perspective projections
Perspective Projections
• Perspective projections allow a more realistic visualization of a world scene.
• The visual effect of perspective projections is similar to that of the human visual system and photographic systems.
• Hence perspective projection of the 3-D data was implemented for the stereo pairs.
[Figure: projectors from the object points A and B cross the projection plane at A' and B' and converge at the center of projection]
• In perspective projections the projectors are of finite length and converge at a point called the center of projection.
• In perspective projection the projected size of an object is inversely proportional to its distance from the center of projection.
Specifying a 3-D View
• To specify a 3-D view we need to specify a projection plane and a center of projection.
• The projection plane is specified by:
1. A point on the plane called the View Reference Point (VRP)
2. The normal to the view plane, i.e. the View Plane Normal (VPN)
• We define a VRC (View Reference Coordinate) system on the projection plane, with u, v and n being its 3 axes, forming a right-handed coordinate system.
• The origin of the VRC system is the VRP.
• The VPN defines the 'n' axis of the VRC system.
• A View Up Vector (VUP) determines the 'v' axis of the VRC system: the projection of the VUP onto the view plane, taken parallel to the VPN, is coincident with the 'v' axis.
The 'u' axis direction is defined such that 'u', 'v' and 'n' form a right-handed coordinate system.
A view window is defined on the view plane; projections lying outside the view window are not displayed.
The coordinates (Umin,Vmin) and (Umax,Vmax) define this window.
The center of projection is given by the Projection Reference Point (PRP).
[Figure: THE 3-D VIEWING REFERENCE COORDINATE SYSTEM, showing the view plane with the U, V and N axes, the VRP, VPN and VUP, the window from (Umin,Vmin) to (Umax,Vmax) with its center CW, the center of projection, and the direction of projection DOP]
• The semi-infinite pyramid formed by the PRP and the projectors passing through the corners of the view window forms a view volume.
• A canonical view volume is one where the VRC system is aligned with the world coordinate system.
[Figure: Canonical view volume for perspective projections, drawn as X or Y against −Z, showing the PRP, the front plane and the back plane]
The 6 bounding planes of the canonical view volume have equations:
x = z, x = −z, y = z, y = −z, z = zmin, z = −1
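The six bounding planes translate directly into a containment test, sketched below. It assumes the volume lies on the negative z side with the apex at the origin, so that zmin is a negative number between −1 and 0. The function name and sample points are hypothetical.

```python
def inside_canonical_volume(x, y, z, z_min):
    """True if (x, y, z) lies inside the canonical perspective view volume.

    The volume is bounded by x = z, x = -z, y = z, y = -z, z = z_min and
    z = -1, with z negative throughout, so |x| <= -z and |y| <= -z.
    """
    return (-1.0 <= z <= z_min) and abs(x) <= -z and abs(y) <= -z


# Hypothetical front plane at z = -0.1.
inside = inside_canonical_volume(0.2, -0.3, -0.5, z_min=-0.1)
outside_side = inside_canonical_volume(0.6, 0.0, -0.5, z_min=-0.1)
outside_back = inside_canonical_volume(0.0, 0.0, -1.5, z_min=-0.1)
outside_front = inside_canonical_volume(0.0, 0.0, -0.05, z_min=-0.1)
```

Points that fail this test project outside the view window or lie beyond the front or back plane, so they would not be displayed.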
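Perspective projection when VRC aligned with World Coordinate system
[Figure: the point P(X,Y,Z) is projected to P(Xp,Yp,d) on the view plane at distance d from the PRP along Z; the U, V and N axes coincide with X, Y and Z, and CW is the center of the window]
From similar triangles:
$$X_p = \frac{X}{Z/d}, \qquad Y_p = \frac{Y}{Z/d}$$
The transformation can be represented as a matrix:
$$M_{per} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 1/d & 0 \end{bmatrix}$$
$$M_{per} \cdot \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix} = \begin{bmatrix} x & y & z & \frac{z}{d} \end{bmatrix}^T = \begin{bmatrix} \frac{x}{z/d} & \frac{y}{z/d} & d & 1 \end{bmatrix}^T$$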
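The projection can be sketched in a few lines: multiply the homogeneous point by M_per, then divide by the fourth coordinate w = z/d. The helper names and the sample point are hypothetical, and the sketch assumes z is not zero.

```python
def mat_vec(m, v):
    """Multiply a 4x4 matrix (list of rows) by a 4-element column vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]


def m_per(d):
    """Perspective matrix for a view plane at distance d from the PRP."""
    return [
        [1.0, 0.0, 0.0, 0.0],
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 1.0 / d, 0.0],
    ]


def project(point, d):
    """Project (x, y, z) onto the plane z = d; assumes z != 0."""
    x, y, z, w = mat_vec(m_per(d), [point[0], point[1], point[2], 1.0])
    return (x / w, y / w, z / w)  # homogeneous divide by w = z / d


# Hypothetical point at depth 10 with the view plane at d = 5, so w = 2.
xp, yp, zp = project((2.0, 4.0, 10.0), d=5.0)
```

The projected z coordinate always equals d, which confirms that every projected point lands on the view plane.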
• This holds only when the view volume is canonical.
• For an arbitrary view volume:
- First transform the view volume into canonical form, then apply the above formula to take projections.
• To transform a view volume we do the following:
1) Translate the VRP to the origin.
2) Rotate the VRC to align the u, v and n axes with the X, Y and Z axes.
3) Translate the PRP to the origin.
4) Shear to make the center line of the view volume the z-axis.
5) Scale such that the view volume becomes the canonical perspective view volume.
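1. The translation matrix is
$$T(dx, dy, dz) = \begin{bmatrix} 1 & 0 & 0 & dx \\ 0 & 1 & 0 & dy \\ 0 & 0 & 1 & dz \\ 0 & 0 & 0 & 1 \end{bmatrix}$$
Translating the VRP to the origin uses T(−VRP).
[Figure: the VRP and its N axis shown relative to the world Z axis]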
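2. The rotation matrix is
$$R = \begin{bmatrix} r_{1x} & r_{1y} & r_{1z} & 0 \\ r_{2x} & r_{2y} & r_{2z} & 0 \\ r_{3x} & r_{3y} & r_{3z} & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$
where
$$R_z = [r_{3x} \ r_{3y} \ r_{3z}] = \frac{VPN}{\|VPN\|}$$
$$R_x = [r_{1x} \ r_{1y} \ r_{1z}] = \frac{VUP \times R_z}{\|VUP \times R_z\|}$$
$$R_y = [r_{2x} \ r_{2y} \ r_{2z}] = R_z \times R_x$$
[Figure: the rotation brings the U and V axes onto the X and Y axes]
3. Translating the PRP to the origin uses T(−PRP).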
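4. The shear matrix is
$$SH_{per} = \begin{bmatrix} 1 & 0 & sh_x & 0 \\ 0 & 1 & sh_y & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$
Let DOP = CW − PRP, where
DOP: Direction of Projection
CW: Center of Window
$$sh_x = -\frac{dop_x}{dop_z}, \qquad sh_y = -\frac{dop_y}{dop_z}$$
[Figure: the direction of projection from the PRP through the CW, drawn in the Y, −Z plane]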
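A short sketch of the shear step, assuming shx = −dopx/dopz and shy = −dopy/dopz so that the sheared direction of projection has no x or y component. The function name and the sample CW and PRP values are hypothetical.

```python
def shear_matrix(cw, prp):
    """Build SH_per from the center of window CW and the PRP (3-vectors)."""
    dop = [c - p for c, p in zip(cw, prp)]  # direction of projection
    if dop[2] == 0:
        raise ValueError("DOP must have a non-zero z component")
    shx = -dop[0] / dop[2]
    shy = -dop[1] / dop[2]
    matrix = [
        [1.0, 0.0, shx, 0.0],
        [0.0, 1.0, shy, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]
    return matrix, dop


# Hypothetical off-center window: CW = (1, 2, 0), PRP = (0, 0, 4).
SH, dop = shear_matrix(cw=(1.0, 2.0, 0.0), prp=(0.0, 0.0, 4.0))
# Shearing the DOP (a direction, so w = 0) should leave only a z component.
sheared_dop = [sum(SH[r][c] * v for c, v in enumerate(dop + [0.0])) for r in range(3)]
```

After the shear the center line of the view volume runs along the z-axis, which is what step 4 requires.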
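5. The scale transformation is
$$S_{per} = \begin{bmatrix} s_x & 0 & 0 & 0 \\ 0 & s_y & 0 & 0 \\ 0 & 0 & s_z & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$
Let $VRP' = SH_{per} \cdot T(-PRP) \cdot [0 \ 0 \ 0 \ 1]^T$. Then
$$s_x = \frac{2\,VRP'_z}{(u_{max} - u_{min})(VRP'_z + B)}$$
$$s_y = \frac{2\,VRP'_z}{(v_{max} - v_{min})(VRP'_z + B)}$$
$$s_z = \frac{-1}{VRP'_z + B}$$
Here B locates the back clipping plane relative to the VRP; it is not the stereo baseline B used earlier.
[Figure: the scaled view volume drawn in the Y, −Z plane, showing the PRP, the CW and the bounding plane Y = −Z]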
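The scale factors can be checked numerically, as in the sketch below. It assumes sz carries a minus sign and that B is the signed position of the back clipping plane relative to the VRP along n. The function name and all numbers are hypothetical.

```python
def perspective_scale(vrp_z, b, umin, umax, vmin, vmax):
    """Return (sx, sy, sz) for the canonical perspective view volume.

    vrp_z is the z coordinate of VRP' (the VRP after translation and shear).
    b is the signed position of the back clipping plane relative to the VRP.
    """
    back = vrp_z + b  # z coordinate of the back clipping plane
    sx = 2.0 * vrp_z / ((umax - umin) * back)
    sy = 2.0 * vrp_z / ((vmax - vmin) * back)
    sz = -1.0 / back
    return sx, sy, sz


# Hypothetical setup: VRP' at z = -4, back plane 6 units behind the VRP,
# and a window spanning -1..1 in both u and v.
sx, sy, sz = perspective_scale(-4.0, -6.0, -1.0, 1.0, -1.0, 1.0)

back_plane_z = sz * (-4.0 + -6.0)           # back plane after scaling
corner_x, corner_z = sx * 1.0, sz * -4.0    # window corner after scaling
```

With these values the back plane lands at z = −1 and the window corner lands on the plane x = −z, matching the bounding planes of the canonical view volume.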
• Once all the projected points have been calculated, scale the coordinates to fit the display screen.
• A wireframe display of the image is obtained by joining the projections of all points lying on the same row or column.
• Map the pixel colors of the image onto the projected points to create a realistic effect.
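The first two display steps can be sketched as follows, assuming the projected points are stored in row-major order for a grid of rows × cols pixels. The function names, the uniform-scale choice and the sample values are hypothetical.

```python
def fit_to_screen(points, width, height):
    """Uniformly scale 2-D projected points into a width x height screen."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    min_x, min_y = min(xs), min(ys)
    span = max(max(xs) - min_x, max(ys) - min_y) or 1.0
    scale = min(width - 1, height - 1) / span
    return [(round((x - min_x) * scale), round((y - min_y) * scale)) for x, y in points]


def wireframe_edges(rows, cols):
    """Index pairs joining each point to its right and lower neighbours."""
    edges = []
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            if c + 1 < cols:
                edges.append((i, i + 1))      # same row
            if r + 1 < rows:
                edges.append((i, i + cols))   # same column
    return edges


screen = fit_to_screen([(0.0, 0.0), (2.0, 1.0)], width=101, height=51)
edges = wireframe_edges(2, 2)
```

Drawing a line for every pair in `edges` between the corresponding screen points gives the wireframe; a single scale factor for both axes keeps the aspect ratio of the projection.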
Limitations
• Works well only for stereo images where minute details are not required.
• More suited to depth estimation of landscapes from images taken from above.
• No accurate metric calculations are done.