COLLISION DETECTION USING STEREOVISION
Rakesh S. Patel
B.E., K.K.Wagh C.O.E. India, 2007
PROJECT
Submitted in partial satisfaction of
the requirements for the degree of
MASTER OF SCIENCE
in
ELECTRICAL AND ELECTRONIC ENGINEERING
at
CALIFORNIA STATE UNIVERSITY, SACRAMENTO
SUMMER
2011
COLLISION DETECTION USING STEREOVISION
A Project
by
Rakesh S. Patel
Approved by:
__________________________________________________, Committee Chair
Fethi Belkhouche, Ph.D.
___________________________________________________, Second Reader
Preetham Kumar, Ph.D.
________________________________
Date
Student: Rakesh S. Patel
I certify that this student has met the requirements for format contained in the university
format manual, and that this project report is suitable for shelving in the Library and
credit is to be awarded for the project.
_________________________ , Graduate Coordinator
Preetham B. Kumar, Ph.D.
Department of Electrical and Electronic Engineering
________________
Date
Abstract
of
COLLISION DETECTION USING STEREOVISION
by
Rakesh S. Patel
Advanced Driving Assistance Systems (ADAS) are developed to reduce the
increasing number of accidents. Collision detection and warning is one of the main
features of ADAS. Different techniques are available for collision detection, such as
ultrasonic sensors, laser, radar, vision, etc. The main aim of this project is to design a
system that can detect the possibility of a collision using cameras. A stereovision method
is used to design the collision detection system. In this method, two cameras are used.
First, camera calibration is performed, in which the intrinsic and extrinsic parameters of
the cameras are calculated, and then the cameras are mounted on the vehicle. The cameras
are mounted at the same level from the reference plane and capture the same scene in
front of the car at the same time. Videos of the scene are stored and then converted into
images for further processing. The images are slightly different from each other because
the two cameras are separated by some distance. One point P, normally the centroid of
the obstacle, is detected and projected on the left image plane to calculate its 2D
coordinates. Epipolar geometry is used to find the coordinates of the same point in the
right image plane. Triangulation geometry is then used to calculate the distance of the
obstacle from the vehicle. Depending upon this distance, the likelihood of a collision is
determined and further precautionary measures can be taken. Different cases are
considered to test the functionality of the design.
_________________________________________________, Committee Chair
Fethi Belkhouche, Ph.D.
________________
Date
ACKNOWLEDGEMENTS
I would like to express my sincere gratitude to all those people who helped me during
this project. These people helped me to achieve an important milestone in my life.
Firstly, I would like to thank my project guide, Dr. Belkhouche. He gave me
continuous and valuable advice which helped me to be on the right track to complete this
project. I would also like to extend my gratitude to Dr. Kumar for reviewing my report
and providing valuable comments.
Finally, I would like to thank my family and friends for supporting and helping me
through the entire project and making this project successful.
TABLE OF CONTENTS
Page
Acknowledgements…………………………………………………………..……….....vi
List of Tables….…………………………………………………………..……………...ix
List of Figures.…………………………………..……………………………...…….......x
Chapter
1. INTRODUCTION…………………………………………....………………………...1
2. RELATED WORK………………………..…………………..………………………..3
3. CAMERA CALIBRATION…………………………………...………………………20
   3.1 Intrinsic Calibration………………………………………………………...21
   3.2 Extrinsic Calibration………………………………………………………..22
4. STEREOVISION GEOMETRY AND TRIANGULATION SYSTEM……………....26
   4.1 Stereovision Geometry……………………………………………………..26
   4.2 Epipolar Concept……………………………………………………………27
   4.3 Triangulation System……………………………………………………….28
5. EGO MOTION ESTIMATION………………………………………………………..31
   5.1 Feature Selection and Matching…………………………………………….31
   5.2 Estimation of tM……………………………………………………………..34
   5.3 Estimation of RM…………………………………………………………….34
   5.4 Estimation of Xi and Yi……………………………………………………...35
   5.5 Estimation of λ and µ………………………………………………………..35
6. SIMULATION AND CALCULATION………………………………………………36
   6.1 Introduction………………………………………………………………….36
   6.2 Case I………………………………………………………………………...38
   6.3 Case II………………………………………………………………………..41
   6.4 Case III……………………………………………………………………….46
   6.5 Analysis of Results…………………………………………………………..50
7. CONCLUSION AND FUTURE WORKS……………………………..……………...51
Appendix……………………………………………………………………...………….52
References………………………………………………………………………………..54
LIST OF TABLES
1. Table 6.1: Distance Between the Two Cars in Case I……………………………38
2. Table 6.2: Distance Between the Two Cars in Case II………………………...…41
3. Table 6.3: Distance Between the Two Cars in Case III…………….…………….46
LIST OF FIGURES
1. Figure 2.1: Overview of the Approach.…………………………………………...4
2. Figure 2.2: FCW System Architecture……………………….……………………5
3. Figure 2.3: Border Scanner………………………………………………………..7
4. Figure 2.4: Drivable Tunnel Model………………………………………………..7
5. Figure 2.5: Drivable Tunnel Model Projection…………………………………….8
6. Figure 2.6: Procedure to Determine ROI………………………………………...12
7. Figure 2.7: Tracking of Obstacle………………………………………………...14
8. Figure 2.8: Moving Object and Obstacle Detection System……………………..16
9. Figure 2.9: Pinhole Camera Model……………………………………………....17
10. Figure 3.1: Place of Camera Calibration in Detection System…………………..21
11. Figure 3.2: System of World Coordinates and Camera Coordinates…………….25
12. Figure 4.1: Stereovision Geometry…………….………………………………...27
13. Figure 4.2: Triangulation Geometry for Standard Model ……………………….28
14. Figure 4.3: Triangulation Geometry for General Model………………………...29
15. Figure 5.1: Stereo and Ego Motion Relationship…………………………….......32
16. Figure 6.1: Stereovision System………………………………………………....36
17. Figure 6.2: Disparity Calculation………………………………………………...37
18. Figure 6.3: Different Positions of the Car in Case I…………………………......39
19. Figure 6.4: Different Positions of the Car in Case II…………………………….42
20. Figure 6.5: Different Positions of the Car in Case III………………....................47
21. Figure 6.6: Graph of Results……………………………………………………..50
Chapter 1
INTRODUCTION
The increase in the number of cars has resulted in a higher number of accidents.
In order to decrease the number of accidents, we need to develop an efficient driver
assistance system which will warn the driver before a collision takes place. Implementing
such a system will help reduce the number of accidents, improve road safety and thus
save human lives. In recent years, various Advanced Driving Assistance Systems
(ADAS) have been developed and deployed at a fast rate to improve the safety of drivers
on the road. ADAS monitors the surroundings and detects other traffic participants. It
warns the driver about a possible accident situation or acts autonomously, if necessary. It
is an active safety system, since action is taken before any accident occurs. ADAS
includes features like Adaptive Cruise Control (ACC), Collision Warning (CW), Blind
Spot Detection (BSD), Emergency Brake Assistance (EBA), Intelligent Headlamp
Control (IHC), Lane Departure Warning (LDW) and Traffic Sign Recognition (TSR)
[1]. In ACC, the driver selects the speed of the car and its distance from the front vehicle.
When ACC detects that the front vehicle is slowing down, it decreases the speed of the
car and maintains a safe distance, and when the traffic gets clear it increases the speed of
the car to the selected level. ACC is sometimes combined with CW. This system alerts
the driver when it senses the possibility of a collision. In case of an emergency, it will
pre-charge the brakes to increase their sensitivity. The BSD system alerts the driver when
vehicles are in, or approaching, his blind spot. It may include a visual warning, like a
light indicator in the corresponding mirrors, or an audio warning [2]. The LDW feature
warns the driver when he is about to leave his current lane. This system uses a camera to
detect the lanes by identifying lane markers.
The main advantage of ADAS is safety. According to a study by the Insurance
Institute for Highway Safety, nearly 1.2 million accidents are avoided due to the Collision
Warning feature of ADAS. Accidents are directly related to injury, death and damage to
cars, so avoiding accidents means saving human lives and avoiding car damage [3]. ACC
helps to keep a preset speed and a safe distance between two vehicles. LDW helps to keep
the car in one lane. The success of ADAS depends upon the driver's ability to learn the
new technology and willingness to use it.
This project mainly focuses on the Collision Warning (CW) feature of ADAS.
A stereovision technique is used for collision detection. In stereovision, two cameras are
used. The cameras record the same scene at the same time. The videos taken by the
cameras are later converted into images and used to calculate the distance between two
vehicles. If this distance is less than the safe distance, then necessary action is taken.
Chapter 2
RELATED WORK
In this chapter, we discuss different techniques used for collision detection. These
techniques mainly use the stereovision method for collision detection. In [4], obstacle
detection is divided into two stages: detection and confirmation. In the detection stage,
the real-time relation between the obstacle and the ground is computed. In the
confirmation stage, the positions of the obstacles are determined. Figure 2.1 represents
the overview of this approach. In stereovision, two cameras are used to capture images of
the scene. The images from the left camera are identified as left images, while images
from the right camera are identified as right images. In the first stage, left and right
images are captured by the respective cameras. The tradeoff between smoothing
homogeneous areas and preserving the structural characteristics of the images is handled
by anisotropic diffusion of the images. The U-V disparities of these images are
calculated, and this disparity information is used for the detection of the obstacles. By
using the above information the region of interest can be determined; obstacles are
present in this area only. In the second stage, obstacles are confirmed by using the depth
discontinuity of the images. The geometry and location of each obstacle is also
determined [4]. Advantages of this method are as follows:
1) This method can detect negative obstacles and thin obstacles.
2) In this technique the effect of reflections can be eliminated.
3) This system can be used under all conditions, at any time of the day and night.
Figure 2.1: Overview of the Approach [4] (flow diagram: getting left and right images; smoothing the input images while preserving shape boundaries; extracting approximately vertical or horizontal edges and the horizon; identifying the region of interest and confirming the region in which an obstacle exists; computing geometric information about the obstacle, obstacle-ground contact lines and bounding boxes)
In [5], an algorithm for Forward Collision Warning (FCW) is explained. This
method uses 3D information provided by the stereo vision system and ego motion of the
car to generate a collision warning. This method is designed for urban road traffic.
Figure 2.2: FCW System Architecture [5] (block diagram: the left and right cameras feed the TYZX hardware stereo machine, which reconstructs 3D points; these feed the Coarse Objects, Elevation Maps, Car Parameters, Tracked Objects, Object Delimiter and Drivable Tunnel modules, whose outputs drive the Forward Collision Detection module and the FCW output)
Figure 2.2 shows the system architecture and the basic modules of this method.
The TYZX hardware board takes the input from the two cameras and performs a 3D
reconstruction of the scene. The output of this board is used by the other modules to
detect and track objects, to determine the elevation map, etc. In the Coarse Objects
module, rough objects are extracted using the available stereo information. The objects'
position, size and speed are described in the Tracked Objects module; Kalman filtering is
used for this process. The Elevation Map module uses the stereo information provided by
the TYZX hardware to describe the scene. This map is divided into drivable points, curb
points and object points. Onboard sensors are used to collect the different car parameters,
like speed, yaw rate, etc., for each video frame. The output of the Elevation Map is used
by the Object Delimiter module to compute a set of polygons delimiting the objects. A
radial scanning algorithm is used for this process. In the Drivable Tunnel module, the
virtual area around the ego-car trajectory is represented using the mechanical and
movement characteristics of the car. The Forward Collision Detection module performs
the collision detection process, and the FCW output gives a visual indication according to
the type of obstacle [5].
The main modules in this method are the Object Delimiter module, the Drivable
Tunnel module and the Forward Collision Detection module. Delimiter detection is
performed in four steps. In the first step, the top view of the image is generated; in the
second step, each object from the elevation map is labeled. In the third step, contour
extraction is done using a border scanning algorithm. In the final step, an approximate
curve for the car motion is generated. In the border scanning algorithm, radial scanning is
done with a fixed given step, centered at the ego-car position. Scanning is performed only
in the region of interest, i.e., from $Q_{from}$ to $Q_{to}$ with a slope of $Q_{rad}$. All the detected
delimiter points from the region of interest are added to a list known as the CounterList.
For each new label this list is cleared and a new list is generated. The Drivable Tunnel
model is represented by a polyhedron as shown in figure 2.4. It consists of a number of
hexahedron cells. Every cell has two pairs of parallel faces: left and right, bottom and top.
The far and near faces are perpendicular to the car motion [5].
Figure 2.3: Border Scanner [5]
Figure 2.4: Drivable Tunnel Model [5]
In this method, a polygon clipping algorithm is used for collision detection. This
algorithm takes the vertices of a polygon as input and generates one or more polygons as
output. The tunnel projection shown in figure 2.5 is used as input for the polygon clipping
algorithm. In the projected trajectory, the right edges form a circular arc of radius R1 and
the left edges form a circular arc of radius R2. These arcs define the drivable tunnel's
boundaries and are bounded between the TopBoundry and BottomBoundry. As shown
in figure 2.5, the drivable tunnel projection is divided into four areas, defined by their
locations with respect to the tunnel edges. These areas are represented by the following
mathematical constraints, considering C(x0, y0) as the center of the circular arcs and the
angle α as the angle made by the TopBoundry with the BottomBoundry:
Figure 2.5: Drivable Tunnel Model Projection [5]
1) Right Area:
It is limited by the right edge of the tunnel. R1 is the radius of the right edge
circular arc. There are two cases for the right area.
For $x_0 < 0$:
$$(x - x_0)^2 + (y - y_0)^2 \ge R_1^2, \qquad z \le (x - x_0)\tan\alpha + z_0$$
For $x_0 > 0$:
$$(x - x_0)^2 + (y - y_0)^2 \le R_1^2, \qquad z \le (x - x_0)\tan\alpha + z_0$$
2) Left Area:
It is limited by the left edge of the tunnel. R2 is the radius of the left edge circular
arc. There are two cases for the left area.
For $x_0 < 0$:
$$(x - x_0)^2 + (y - y_0)^2 \le R_2^2, \qquad z \le (x - x_0)\tan\alpha + z_0$$
For $x_0 > 0$:
$$(x - x_0)^2 + (y - y_0)^2 \ge R_2^2, \qquad z \le (x - x_0)\tan\alpha + z_0$$
3) Inside Area:
It is the area between the two arcs, which means inside the tunnel. It is given by
For $x_0 < 0$:
$$R_2^2 \le (x - x_0)^2 + (y - y_0)^2 \le R_1^2$$
For $x_0 > 0$:
$$R_1^2 \le (x - x_0)^2 + (y - y_0)^2 \le R_2^2$$
4) Top Area: It is the area on top of the tunnel.
$$z \ge (x - x_0)\tan\alpha + z_0$$
These four areas are used to compute the position of each delimiter vertex. An edge
direction is assigned to each vertex depending upon the endpoint positions. An obstacle is
detected when these vertices intersect the right or left edge of the tunnel. The position of
the obstacle is determined by intersecting each chord of the tunnel with the current vertex.
The accuracy of this method depends upon the dense stereo information, the elevation
map result and the object detection algorithm [5].
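As an illustration, a point can be tested against these four areas with a few comparisons. The following MATLAB sketch assumes the inequality directions reconstructed above, and the function and variable names are illustrative; it is not code from [5].
function area = classify_tunnel_point(x, y, z, x0, y0, z0, R1, R2, alpha)
% Assumed area test for the drivable-tunnel projection (illustrative sketch)
r2   = (x - x0)^2 + (y - y0)^2;      % squared distance from the arc center C(x0, y0)
ztop = (x - x0)*tan(alpha) + z0;     % assumed height of the top boundary at this x
if z > ztop
    area = 'top';
elseif x0 < 0
    if r2 >= R1^2
        area = 'right';
    elseif r2 <= R2^2
        area = 'left';
    else
        area = 'inside';
    end
else
    if r2 <= R1^2
        area = 'right';
    elseif r2 >= R2^2
        area = 'left';
    else
        area = 'inside';
    end
end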
In [6], another stereovision technique for front vehicle detection is explained. This
method uses an asynchronous stereovision technique to overcome the limitations of a
synchronous stereovision system. It uses edge information to detect the region of an
obstacle. It is divided into two modules: the search module and the matching module. In
the search module, the most likely position of the obstacle is determined in the right
image. In the matching module, the corresponding pixels are found in the left image. The
disparities of these points are used to calculate the distance of the obstacle from the host
vehicle [6].
Search Module:
The Sobel edge detection algorithm is used to detect the edges of obstacles. The
change in image brightness is the key feature of this algorithm. A change in the
brightness of an image corresponds to a change in depth, a change in material,
discontinuities in the surface, etc. [7]. The complexity of the image processing is reduced
by down-sampling the input image by a factor of $2^n$. The Sobel edge detection algorithm
is applied to this image to detect the edge points of the obstacle. Horizontal and vertical
line segments are used to connect the points which are at the same level in the image. The
horizontal line segments which are near the bottom of the image are processed first,
because these points of the obstacle are the closest to the host vehicle. The other
horizontal segments are processed bottom up. The length and position of each segment is
calculated. The horizontal line segments in the region of interest are given by
$$|SL_{x1} - IL_{x1}| \le Th, \quad |SL_{x2} - IL_{x2}| \le Th, \quad \text{and} \quad SL_y \ge IL_y$$
$SL_{x1}$: left point of the seed line
$SL_{x2}$: right point of the seed line
$IL_{x1}$: left point of the input horizontal line segment
$IL_{x2}$: right point of the input horizontal line segment
$SL_y$: y coordinate of the seed line
$IL_y$: y coordinate of the input horizontal line segment [6].
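A minimal MATLAB sketch of these search-module steps is given below. The file name, the down-sampling factor, the threshold Th and the comparison operators are assumptions for illustration; they are not taken from [6].
I  = rgb2gray(imread('right_frame.jpg'));   % input right-camera frame (assumed file name)
n  = 1;
Is = I(1:2^n:end, 1:2^n:end);               % down-sample by a factor of 2^n
E  = edge(Is, 'sobel');                     % Sobel edge points of the obstacle

SL = [120 180 400];   % seed line [x1 x2 y] (example values)
IL = [118 184 396];   % candidate horizontal line segment [x1 x2 y] (example values)
Th = 10;              % pixel threshold (assumed)
inROI = abs(SL(1)-IL(1)) <= Th && abs(SL(2)-IL(2)) <= Th && SL(3) >= IL(3);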
Errors are introduced in the region of interest by shadows on the road, so time is
wasted in processing such line segments. The shadow region is removed by comparing
the horizontal and vertical projection histograms of the same image and deleting the
interference region.
Figure 2.6: Procedure to Determine ROI [6] (A: original input image; B: edge detection in the input image; C: the horizontal and vertical histograms; D: final region of interest)
Matching Module:
In the matching module, the pixels of the horizontal line segments from the region
of interest are matched with the corresponding pixels in the left image. Then, the disparity
of these pixels is found. Zero-mean SAD full searching, adaptive shifting of the search
area and tracking detection are done in the matching module [6]. Since an asynchronous
stereo system is used, there is a fixed gray-value difference between the right and left
images. This difference is nullified before the search processing. The right and left
matching blocks are compared by the zero-mean SAD as follows
SADV i, j  
N /2

N /2
| RC x  k , y  l   M  LRi  x  k , j  y  l   M  |
R
L
k   N / 2  l   N / 2 
MR 
N /2
N /2
1

 RC i, j 
N  N i  N / 2  j  N / 2 
ML 
N /2
N /2
1

 LRi, j 
N  N i  N / 2  j  N / 2 
and  p  i, j  p
Where
N  N : Block Size
MR: Mean value of the matching block in the right image
ML: Mean value of the matching block in the left image.
RC(x, y): The Image pixel in the right image.
LR(x, y): The Image pixel in the left image [6].
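The zero-mean comparison of two blocks can be written in a few lines of MATLAB. The following is only a sketch of the block comparison itself; the surrounding search over the offsets (i, j) in [-p, p] is left out, and the function name is illustrative.
function s = zero_mean_sad(blockR, blockL)
% blockR, blockL: N-by-N blocks (double) from the right and left images
MR = mean(blockR(:));                          % mean of the right matching block
ML = mean(blockL(:));                          % mean of the left matching block
s  = sum(sum(abs((blockR - MR) - (blockL - ML))));
The offset with the smallest value of s over the search range is taken as the match.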
The matched pixels of the right and the left image are not on the same line
segments; they are slightly shifted due to the asynchronous nature of the system. In the
adaptive shifting algorithm, this shift is calculated and added to the left image to reduce
the matching time. Once the vehicle is detected in one image frame, it can also be found
in the next image frame by increasing or decreasing the region of interest. This is
illustrated in figure 2.7.
Figure 2.7: Tracking of Obstacle [6] (A: nth frame; B: (n+1)th frame; C: histogram of the (n+1)th frame; D: ROI of the (n+1)th frame)
Advantages:
1) The setup of this system is much simpler than that of a synchronous stereo system.
2) The stereovision object detection algorithms provide better performance with an
asynchronous system.
3) The price and size of an asynchronous stereo system are smaller than those of a
synchronous stereo system.
In [8], an algorithm for moving object and obstacle detection is explained. In this
algorithm, a disparity map is calculated using an area-based stereo matching technique for
the 3D representation of the scene and object detection. The optical flow method is used
to determine the motion of an object. This algorithm mainly targets indoor applications.
Figure 2.8 shows the basic steps of this design. First, a 3D representation of the scene is
built using the disparity information. In the second step, resampling and quantization of
the disparity map is done. The motion vector of the obstacle is predicted using the
velocity vector of the camera in the Prediction of Optical Flow Map step, and then a blob
detection algorithm is used to determine the motion of the objects.
Figure 2.8: Moving Object and Obstacle Detection System [8] (flow diagram: system initialization (optical flow, stereovision); computation of the disparity map; computation of the optical flow map with a median filter; resampling, filtering and quantization; prediction of the optical flow map; detection of anomalous motion vectors; blob detection for moving objects with anomalous motion vectors)
3D Representation of Scenes:
This system uses the pinhole camera model for 3D reconstruction of a scene.
Figure 2.9 shows the pinhole camera model. By using the pinhole camera model, a point
W(x, y, z) can be projected on the left and right image planes. This point is denoted as
PL(xl, yl) in the left image plane and PR(xr, yr) in the right image plane. The coordinates
of point W are given by the following equations:
$$Z = \frac{b \cdot f}{x_l - x_r} = \frac{b \cdot f}{d}; \qquad X = \frac{x_l \cdot Z}{f}; \qquad Y = \frac{y_l \cdot Z}{f}$$
b: baseline, the distance between the two cameras
d = $x_l - x_r$: disparity
Figure 2.9: Pinhole Camera Model [8] (R: image plane; C: optical center; f: focal length)
The main task in this step is to calculate the disparity (d). By using epipolar
geometry, it is assumed that once camera calibration is done, a common point projected
on the two image planes lies on the same horizontal line, and that the disparity range is
[0, dmax]. The disparity is obtained from a function called the Sum of Absolute
Differences:
$$SAD(x, y, d) = \sum_{i,j=-n}^{n} \big| L(x+j, y+i) - R(x+j-d, y+i) \big|$$
L(x, y): intensity of the left image
R(x, y): intensity of the right image
n: a small integer constant
A recursive formulation is used to simplify the calculation of the SAD:
$$SAD(x, y+1, d) = SAD(x, y, d) + U(x, y+1, d)$$
Where
$$U(x, y+1, d) = -\sum_{j=-n}^{n} \big| L(x+j, y-n) - R(x-d+j, y-n) \big| + \sum_{j=-n}^{n} \big| L(x+j, y+n+1) - R(x-d+j, y+n+1) \big|$$
Furthermore
$$U(x, y+1, d) = U(x-1, y+1, d) + \big| L(x+n, y+n+1) - R(x-d+n, y+n+1) \big| + \big| L(x-n-1, y-n) - R(x-d-n-1, y-n) \big| - \big| L(x+n, y-n) - R(x-d+n, y-n) \big| - \big| L(x-n-1, y+n+1) - R(x-d-n-1, y+n+1) \big|$$
The value of d that minimizes the SAD gives the final disparity.
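For illustration, the brute-force form of this search for a single pixel can be sketched in MATLAB as below. The window size n and the disparity range dmax are assumed parameters, the recursive update is omitted, and it is assumed that the window and the disparity range stay inside the image.
function d_best = sad_disparity(L, R, x, y, n, dmax)
% L, R: rectified left and right grayscale images (double); (x, y) is the pixel of interest
best = inf; d_best = 0;
for d = 0:dmax
    s = 0;
    for i = -n:n
        for j = -n:n
            s = s + abs(L(y+i, x+j) - R(y+i, x+j-d));   % images indexed as (row, column)
        end
    end
    if s < best
        best = s; d_best = d;                           % keep the disparity with minimum SAD
    end
end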
Optical Flow Based Motion Detection:
Optical flow is determined with the help of an area-based algorithm to estimate
the motion of an object in the scene. If I1(x, y) and I2(x, y) are two consecutive images,
the motion vector (vx, vy) of the point (x, y) is given by
$$SAD(x, y, v_x, v_y) = \sum_{i,j} \big| I_1(x+i, y+j) - I_2(x+i+v_x, y+j+v_y) \big|$$
Where
$-m/2 \le i \le m/2$; $-m/2 \le j \le m/2$
m: the size of the correlation window
A threshold value for the SAD is selected and any motion vector below that value is
neglected. The local differences of the vectors are attenuated by using a median filter.
The Horn and Schunck algorithm is used to smooth the distortion in the images [8]. The
environment consists of moving and non-moving objects, and they need to be treated
differently. A moving object's motion vector (vx, vy) is different from the predicted
vector (v'x, v'y). This difference is calculated and compared with threshold values:
$$m = \sqrt{v_x^2 + v_y^2}, \qquad m' = \sqrt{v_x'^2 + v_y'^2}, \qquad |m - m'| > M_{Threshold}$$
$$\theta = \tan^{-1}\frac{v_y}{v_x}, \qquad \theta' = \tan^{-1}\frac{v_y'}{v_x'}, \qquad \Delta\theta = \cos^{-1}\frac{\langle v, v' \rangle}{|v| \cdot |v'|}, \qquad |\Delta\theta| > D_{Threshold}$$
A point which exceeds these threshold values is marked as an anomalous motion
vector. These vectors act as input to the blob detection algorithm, in which moving and
non-moving objects are detected using a histogram technique.
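The anomalous-motion test above amounts to two simple comparisons, sketched here in MATLAB; the vectors and threshold values are illustrative assumptions.
v  = [3.0, 1.0];          % measured motion vector (vx, vy)
vp = [0.5, 0.2];          % predicted motion vector (v'x, v'y)
MThreshold = 1.0;         % magnitude threshold (assumed)
DThreshold = 0.3;         % direction threshold in radians (assumed)

m   = norm(v);            % magnitude of the measured vector
mp  = norm(vp);           % magnitude of the predicted vector
dth = acos(dot(v, vp) / (norm(v) * norm(vp)));   % angle between the two vectors

isAnomalous = abs(m - mp) > MThreshold || abs(dth) > DThreshold;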
Chapter 3
CAMERA CALIBRATION
In collision detection, vision is the main navigation tool, and the camera is the
heart of the vision system. Camera calibration is the process in which the true parameters
of the camera are calculated. These parameters include intrinsic parameters, like the focal
length, the coordinates of the image center and the distortion in the image space, and
extrinsic parameters, like the position and orientation of the camera. The calibration
parameters are used to link the coordinates of pixels in an image with the corresponding
coordinates in the camera reference frame; in other words, the calibration computes the
relationship between 2D image coordinates and 3D world coordinates. It is necessary to
calculate the true values of these parameters because they affect the image processing.
The camera calibration procedure is explained in [9]. Figure 3.1 shows the place of
camera calibration in the detection system. As shown in figure 3.1, the images taken from
the two cameras are the inputs to the calibration unit, and after calibration they go to the
detection system. The calibration unit calculates the position and angle of each camera in
a 3D environment attached to a global coordinate system, i.e., the positions of the two
cameras in the world coordinate system. This unit is divided into two parts: intrinsic
calibration and extrinsic calibration [9].
Figure 3.1: Place of Camera Calibration in the Detection System [9] (block diagram: the two cameras C1 and C2 feed the digitization and calibration stages, whose output goes to lane recognition and to obstacle detection and tracking)
3.1 Intrinsic Calibration:
The intrinsic parameters of the camera are affected by the optics and the imaging
sensor chip of the camera. In intrinsic calibration, the focal lengths of the lens (αx, αy), the
coordinates of the principal point (Cx, Cy) and the lens distortion are calculated. In the
pinhole camera model, a 3D point of the world coordinate system is projected on the 2D
image plane in a linear manner. The Euclidean camera coordinate vector $\vec{X}_c$ is given by
$$\vec{X}_c = \begin{pmatrix} X \\ Y \\ Z \end{pmatrix} = \begin{pmatrix} \lambda \cdot x \\ \lambda \cdot y \\ \lambda \cdot z \end{pmatrix} = \lambda \cdot \tilde{x} \qquad (3.1)$$
Where
$\vec{X}_c$: Euclidean camera coordinate vector
$\tilde{x}$: homogeneous image coordinates
λ: ambiguity of the linear projection
Using homogeneous coordinates, the relationship between pixel coordinates and image
coordinates is
$$\begin{pmatrix} x_p \\ y_p \\ 1 \end{pmatrix} = \begin{pmatrix} \alpha_x & 0 & C_x \\ 0 & \alpha_y & C_y \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \\ 1 \end{pmatrix}, \qquad \text{i.e.,} \quad \tilde{x}_p = C \cdot \tilde{x} \qquad (3.2)$$
Where
αx, αy: focal lengths in retinal coordinates
Cx, Cy: principal point coordinates (affected by the mounting of the chip in the camera)
$\tilde{x}_p$: image (pixel) coordinates
C: the 3x3 matrix of intrinsic parameters
In an ideal camera, the lens distortion factor can be neglected [9].
3.2 Extrinsic Calibration:
Extrinsic calibration computes the mounting position and mounting angles of the
camera with respect to the static system of world coordinates. These parameters of the
camera are denoted as the position vector of the optical center ($t_{CW}$) and three rotation
angles $\phi_{WC} = (\phi_1, \phi_2, \phi_3)$. Figure 3.2 represents the system of world and camera
coordinates for one camera. The center of the rear axle, projected on the ground, is
considered as the origin. The relationship between 3D points in camera and world
coordinates is expressed as
$$\vec{X}_C = R_{WC} \cdot (\vec{X}_W - \vec{t}_{CW}) \qquad (3.3)$$
Where
$\vec{X}_W$: 3D point in the world coordinate system
$\vec{X}_C$: point in the camera coordinate system
$\vec{t}_{CW}$: position vector of the optical center
$R_{WC}$: rotation matrix of the three rotation angles
From the above equations, all points can be represented as
$$\tilde{x}_p = C \cdot \tilde{x} = C \cdot \vec{X}_C = C \cdot R_{WC} \cdot (\vec{X}_W - \vec{t}_{CW})$$
According to this equation, the relationship between $\tilde{x}_p$ and $\vec{X}_W$ is non-linear. In order
to make it linear, we can write
$$b = H \cdot p$$
Where
b: observation vector
H: Jacobian matrix of the observation
p: a 6x1 vector formed by the position vector ($t_{CW}$) and the rotation angles ($\phi_{WC}$)
Noise may be present in the observation:
$$b = \hat{\varepsilon} + H \cdot p$$
Using a Weighted Least Squares (WLS) solution for the linearized, over-determined
equation system leads to a minimum of the error measure
$$\hat{\varepsilon}^T \cdot K_{\varepsilon} \cdot \hat{\varepsilon}$$
$K_{\varepsilon}$: covariance matrix of the observation $\hat{\varepsilon}$.
An iterative method is applied to find the final extrinsic parameters. In the proposed
calibration technique, every sensor measurement is accompanied by self-assessment
parameters, like confidence and reliability measures. The covariance matrix $K_P$ can be
calculated from the raw data of the sensors, and $K_P$ is considered as a confidence
measure.
From equation (3.3), the 3D relationship between the two cameras is given by
$$\vec{X}_2 = R_{12} \cdot \vec{X}_1 + \vec{t}_{12}$$
$\vec{X}_1$: 3D point in the first camera's coordinates
$\vec{X}_2$: 3D point in the second camera's coordinates
$R_{12}$: rotation matrix between camera 1 and camera 2
$\vec{t}_{12}$: transformed translation vector between the two cameras [9].
Figure 3.2: System of World Coordinates and Camera Coordinates [9]
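To make the role of the intrinsic and extrinsic parameters concrete, the short MATLAB sketch below projects a 3D world point into pixel coordinates using equations (3.2) and (3.3). All numerical values are made-up examples, not calibration results from this project.
ax = 800; ay = 800;            % focal lengths in retinal coordinates (assumed)
Cx = 320; Cy = 240;            % principal point (assumed)
C  = [ax 0 Cx; 0 ay Cy; 0 0 1];

phi = 2*pi/180;                                            % small rotation angle (assumed)
Rwc = [1 0 0; 0 cos(phi) -sin(phi); 0 sin(phi) cos(phi)];  % rotation from world to camera frame
tcw = [0.5; 1.2; 0];                                       % optical-center position in the world frame (assumed)

Xw = [2; 1; 20];               % 3D point in world coordinates (assumed)
Xc = Rwc * (Xw - tcw);         % equation (3.3): world point in camera coordinates
xp = C * (Xc / Xc(3));         % normalize by depth, then apply equation (3.2)
pixel = xp(1:2)'               % resulting pixel coordinates (xp, yp)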
Chapter 4
STEREOVISION GEOMETRY AND TRIANGULATION SYSTEM
In a stereovision system, two cameras are used that capture pictures of the same
scene at the same time. These cameras are separated by some distance, so the images
captured by the two cameras are slightly different. As an object comes closer to the
cameras, this difference increases; in other words, the object distance is inversely
proportional to this difference [10]. This difference is called "disparity". The disparity
value can be used to find the distance of an obstacle from the host vehicle.
4.1 Stereovision Geometry:
Figure 4.1 shows the stereovision geometry. The two cameras, with optical
centers Ol and Or, have parallel optical axes and are separated by a distance 'b' called the
baseline distance. Each camera has its own image plane, located at a distance 'f' (the
focal length) from the optical center. The X axis of the reference coordinate system is
parallel to the baseline, and the Z axis of the reference coordinate system is parallel to the
optical axes of the cameras. A point P(X, Y, Z) in the reference coordinate system is
projected on the left image plane and denoted as Pl(Xl, Yl). Its projection on the right
image plane is denoted as Pr(Xr, Yr) [11].
Figure 4.1: Stereovision Geometry [11]
4.2 Epipolar Concept:
In stereovision, once a point is projected on the left image plane, the
corresponding projected point in the right image plane needs to be found. Epipolar
geometry can be used to find these corresponding points. Figure 4.1 shows that a point P
and the two optical centers OL and OR lie in one plane, which is called the "epipolar
plane". The epipolar plane intersects each image plane, where it forms the epipolar lines
Pl-EPl and Pr-EPr. For any position of the point P, the epipolar plane and the epipolar
lines pass through the epipoles. This property reduces the search area needed to find the
corresponding point in the right image. If the projection Pl of a point is known, then we
also know the epipolar line Pr-EPr, so when the point P is projected on the right image
plane, it will lie on this epipolar line [12].
4.3 Triangulation System:
Triangulation is the process in which the distance of the object from the camera is
calculated. First consider the standard model as shown in Figure 4.2, which is used when
both cameras are parallel. Points L and R are two pinhole points with parallel optical
axes. The XY plane is parallel to the image plane. The distance between the two cameras
is b. The optical center of the left camera is the origin of the world coordinate system.
Figure 4.2: Triangulation Geometry for Standard Model [13]
From figure 4.2, using simple geometry, the following equations can be derived:
$$Z = \frac{b \cdot f}{x_1 - x_2}, \qquad X = \frac{x_1 \cdot Z}{f}, \qquad Y = \frac{y_1 \cdot Z}{f}$$
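A small MATLAB sketch of this standard-model triangulation is shown below; the baseline, focal length and image coordinates are illustrative assumptions, with the image coordinates taken relative to the principal point.
b  = 0.22;                 % baseline in meters (assumed)
f  = 700;                  % focal length in pixels (assumed)
x1 = 310; y1 = 245;        % point in the left image (relative to the principal point)
x2 = 290;                  % the same point in the right image, on the same row

Z = (b * f) / (x1 - x2);   % depth from the disparity
X = (x1 * Z) / f;
Y = (y1 * Z) / f;
fprintf('Point at X = %.2f m, Y = %.2f m, Z = %.2f m\n', X, Y, Z);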
In the general model, the cameras are not parallel to each other as shown in figure
4.3. In this model it is assumed that the right camera can be moved in any direction with
respect to the left camera.
Figure 4.3: Triangulation Geometry for General Model [13]
1) Rotation around Y axis:
When the right camera is rotated around the Y axis, its optical axis will intersect
the Z axis at the point (0, 0, Z0). This point is called the fixation point. If the rotation
angle is θ, then Z0 can be calculated by the following equation [13]:
$$Z_0 = \frac{b}{\tan\theta}$$
By using this, the X, Y and Z coordinates can be found using the following equations
(a small numerical sketch of this case is given after this list):
$$Z = \frac{b \cdot f}{(x_1 - x_2) + \frac{f \cdot b}{Z_0}}, \qquad X = \frac{x_1 \cdot Z}{f}, \qquad Y = \frac{y_1 \cdot Z}{f}$$
2) Rotation around X axis:
In this case, the right camera is assumed to be rotated around the X axis by an
angle θ. The rotation around the X axis affects the Y coordinate only; the X and Z
coordinates remain the same [13]:
$$X = \frac{x_1 \cdot Z}{f}, \qquad Y = \frac{y_1 \cdot Z}{f} + Z \cdot \tan\theta, \qquad Z = \frac{b \cdot f}{x_1 - x_2}$$
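For case 1 above (rotation of the right camera around the Y axis), the fixation-point correction can be applied as in the MATLAB sketch below. The formula follows the reconstruction given in the text, and all numerical values are assumed.
b     = 0.22;                      % baseline in meters (assumed)
f     = 700;                       % focal length in pixels (assumed)
theta = 3*pi/180;                  % vergence angle of the right camera (assumed)
Z0    = b / tan(theta);            % distance of the fixation point

x1 = 310; x2 = 305; y1 = 245;      % corresponding image coordinates (assumed)
Z  = (b * f) / ((x1 - x2) + (f * b) / Z0);   % depth with the fixation-point correction
X  = (x1 * Z) / f;
Y  = (y1 * Z) / f;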
Chapter 5
EGO MOTION ESTIMATION
This chapter discusses ego motion estimation for a stereovision system. In ego
motion estimation, the 3D position of the camera is determined with respect to its
surroundings. Ego motion estimation is normally divided into three steps. In the first step,
a limited number of common features between the two stereo image pairs is extracted.
The new positions of these features in the successive images are determined in step two.
In the final step, the ego motion is computed by comparing these images. The method
discussed here uses geometric algebra to estimate the ego motion [15].
5.1 Feature Selection and Matching:
Harris and Stephen detection algorithm is used to detect the limited number of
features in the left and right images [15]. These features are denoted as x and x’. Epipolar
geometry can be used to find the corresponding features in the left and right images. The
normalized cross correlation is used for matching of the features in the left and right
image. The distance between two 3D points can be given by the following Euclidean
equation
Y  R X   t
R: Rotational operation
t: Translational operation
(4.1)
Figure 5.1 represents the stereovision geometry, from which the following equation can
be written, where $\sigma_3$ is the unit vector along the optical axis:
$$x = \frac{f}{X \cdot \sigma_3}\, X \qquad (4.2)$$
Figure 5.1: Stereo and Ego Motion Relationship [15]
x and x' are the projections of a point P in the left image and right image,
respectively. The point P in space can be represented by a vector X from the left camera
and X' from the right camera. X can be calculated from X' using the rotation matrix RC
and translation vector tC. When the whole system moves to another position, the new
vector of the point P is denoted by Y and can be calculated using the rotation matrix RM
and translation vector tM:
$$\hat{Y} = R_M(X) + t_M$$
The distances of the point P from the optical center of the camera are expressed
using the scalars λi and µi, given by
$$\lambda_i = X_i \cdot \sigma_3 \qquad \text{and} \qquad \mu_i = Y_i \cdot \sigma_3 \qquad (4.3)$$
The initial values of λi and µi are obtained by projecting the vectors Xi and Yi onto the
stereo camera images. The vectors Xi and Yi can be calculated by correlating the positions
of the features in the left and right images. However, these extracted new points will have
an error due to noise in the sensors and their limited resolution. This error is calculated by
projecting the resultant vector ($\hat{Y}_i$) of RM and tM at the location $\hat{y}_i$ and computing the
square of the distance between $y_i$ and $\hat{y}_i$. The other problem is that only the projection of
the vector from the first stereo image pair is considered, while the projection of the vector
from the second stereo image pair is neglected. The inverted motion is therefore
considered for projecting these vectors backward and minimizing the error [15].
The initial values of RM and tM are calculated using a least-squares estimator based on
(4.1). An iterative method is then used to optimize the estimated motion, the vectors Xi,
Yi and the scalars λi, µi [15].
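As an illustration of the initialization step, a common SVD-based least-squares estimator for a rotation R and translation t such that Y ≈ R·X + t is sketched below in MATLAB. This is a generic rigid-motion estimator, not the geometric-algebra formulation of [15].
function [R, t] = estimate_rigid_motion(X, Y)
% X, Y: 3-by-n matrices of corresponding 3D points before and after the motion
nPts = size(X, 2);
muX  = mean(X, 2);
muY  = mean(Y, 2);
Xc   = X - repmat(muX, 1, nPts);        % centered point sets
Yc   = Y - repmat(muY, 1, nPts);
H    = Xc * Yc';                        % cross-covariance of the centered points
[U, ~, V] = svd(H);
D    = diag([1 1 sign(det(V * U'))]);   % guard against a reflection
R    = V * D * U';
t    = muY - R * muX;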
5.2 Estimation of tM:
The translation tM can be obtained by taking the mean of the reconstruction
products for the cameras and the vectors Xi, Yi:
$$t_M = \frac{1}{4n} \sum_{i=1}^{n} \Big[ t\big(\breve{X}_i, \breve{Y}_i\big) + t\big(\breve{X}_i, Y_i\big) + t\big(X_i, \breve{Y}_i\big) + t\big(X_i, Y_i\big) \Big] \qquad (4.4)$$
Where
$$\breve{X}_i = \lambda_i \cdot x_i, \qquad \breve{Y}_i = \mu_i \cdot y_i$$
5.3 Estimation of RM:
The rotation RM can be calculated from $\breve{X}_i$, $\breve{Y}_i$, $X_i$, $Y_i$ and the translation tM.
Two rotational motions, R1 and R2, are considered. R1 is the rotation of the reconstructed
vector of the first stereo pair into the vector µ(Yi); similarly, R2 is the rotation of the
reconstructed vector of the second stereo pair into the vector µ(Xi). A geometric algebra
method is used for the calculation of the rotations. In geometric algebra, a rotation is
represented by a spinor, which is the sum of one scalar and three scalar-bivector pairs
[15]:
$$R = \alpha_1 + \alpha_2(\sigma_1 \wedge \sigma_2) + \alpha_3(\sigma_2 \wedge \sigma_3) + \alpha_4(\sigma_1 \wedge \sigma_3)$$
The final value of RM is calculated by the following equation:
$$R_M = \frac{R_1 \cdot R_2^{-1}}{\| R_1 \cdot R_2^{-1} \|}$$
5.4 Estimation of Xi and Yi:
The values of Xi and Yi are obtained by averaging the motion obtained by jumping
between the two cameras and the estimated ego motion of these values:
$$X_i = \frac{1}{2}\Big[ R_M^{-1}\big( s(Y_i, Y_i') - t_M \big) + s(X_i, X_i') \Big]$$
$$Y_i = \frac{1}{2}\Big[ R_M\big( s(X_i, X_i') \big) + t_M + s(Y_i, Y_i') \Big]$$
5.5 Estimation of λ and µ:
The values of λ and µ are obtained by projecting the vectors Xi and Yi onto the
stereo camera images:
$$\lambda_i = \frac{(Y_i - t_M) \cdot R_M(x_i) + X_i \cdot x_i}{2\,|x_i|^2}, \qquad \mu_i = \frac{(R_M(X_i) + t_M) \cdot y_i + Y_i \cdot y_i}{2\,|y_i|^2}$$
Advantages
1) The computational cost is low, since a stereovision technique is used for the
estimation of the ego motion.
2) As geometric algebra is used, this system can calculate the exact Euclidean motion
[15].
Chapter 6
SIMULATION AND CALCULATION
6.1 Introduction:
In this chapter, a collision detection system is suggested and tested. In this system,
two small RC cars are used. Two cameras are placed on Car1 at a fixed distance from
each other. These two cameras capture video of the scene in front of the car. The video is
converted into frames, an object is detected in the captured frames, and the coordinates of
the centroid of the object are found. A stereovision triangulation system is used to find
the distance of the object from the car. As the car approaches the object, this distance
decreases; a collision corresponds to a zero distance.
Figure 6.1: Stereovision System [14]
Figure 6.2: Disparity Calculation [14]
In figure 6.1, the point P is the centroid of the other car in our system. It is
projected on both the left and right image planes, and its coordinates in the images are
denoted as P1(x1, y1) and P2(x2, y2). C1 and C2 are the optical centers of the cameras,
located at a distance f (the focal length) from their image planes. The optical centers of
the cameras are separated by a distance b. The distance between the point P and the host
vehicle is calculated using the disparity of point P in the left and right images. This
disparity is equal to the horizontal shift of point P from P1 to P2 in the image plane [14].
As both cameras are mounted at the same level, there is no vertical disparity in the
images. By using stereovision triangulation geometry, we obtain
$$D = \frac{b \cdot f}{\text{Disparity}} = \frac{b \cdot f}{x_1 - x_2}$$
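A per-frame distance check in the spirit of the procedure described above can be sketched end to end in MATLAB as below, reusing the ait_centroid function listed in the Appendix. The file names, the b·f product (taken here to match the Case I value used in the Appendix) and the warning threshold are assumptions for illustration.
bf        = 0.77;                       % baseline times focal length (assumed, as in the Appendix for Case I)
threshold = 0.5;                        % warning distance (assumed)

Left_I  = imread('l5.JPEG');            % left and right frames of the same instant (assumed file names)
Right_I = imread('r5.JPEG');

[x1, ~] = ait_centroid(Left_I);         % x coordinate of the obstacle centroid in the left image
[x2, ~] = ait_centroid(Right_I);        % x coordinate of the obstacle centroid in the right image

D = bf / abs(x1 - x2);                  % distance from the disparity
if D < threshold
    disp('Warning: possible collision ahead');
end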
6.2 Case I:
In Case I, Car1 is not moving and Car2 is moving towards Car1. The focal length
of the two cameras is the same, and the distance between them is 22 cm.

Frame No | X Coordinate of Centroid in Left Image (x1) | X Coordinate of Centroid in Right Image (x2) | Distance Between the Two Cars (D)
25 | 294.57 | 294.7 | 3.85
35 | 293.23 | 294.1 | 0.85
45 | 290.9 | 293.52 | 0.30
55 | 284.5 | 294.2 | 0.08
65 | 262.6 | 312.0 | 0.01

Table 6.1: Distance Between the Two Cars in Case I
$$D = \frac{b \cdot f}{x_1 - x_2}$$
From the above equation, we can see that the disparity is inversely proportional to
the distance between the two objects: as the disparity increases, the distance between the
two cars decreases. Table 6.1 shows that as Car2 comes closer to Car1, the X coordinates
of the centroid in the left and right images shift away from each other. This causes the
disparity between the two corresponding points to increase. So, as the frame number
increases, the distance D between the two cars decreases, and around frame 65 they
almost collide with each other.
Snapshots taken at different time intervals are shown in the following figures.
The snapshots show that as the time interval increases, the difference between the
position of the car in the left and right frames increases.
Figure 6.3: Different Positions of the Car in Case I (left and right frames at frame numbers 25 and 35)
Figure 6.3: Different Positions of the Car in Case I (continued; left and right frames at frame numbers 45, 55 and 65)
6.3 Case II:
In this case, Car1 is moving and Car2 is stationary. The distance between the two
cameras is 13 cm and the focal length of the cameras is 35 mm.

Frame No | X Coordinate of Centroid in Left Image (x1) | X Coordinate of Centroid in Right Image (x2) | Distance Between the Two Cars (D)
150 | 297.20 | 306.71 | 0.047
170 | 297.10 | 307.16 | 0.045
200 | 298.14 | 308.31 | 0.044
230 | 299.58 | 309.89 | 0.044
270 | 299.55 | 316.46 | 0.026
300 | 299.0 | 333.97 | 0.013
320 | 297.94 | 345.50 | 0.009

Table 6.2: Distance Between the Two Cars in Case II
$$D = \frac{b \cdot f}{x_1 - x_2}$$
Table 6.2 shows that as the time interval increases, the X coordinate of the
centroid in the right image shifts away from its initial position. This increases the
difference between the position of the centroid in the left and right images. As this
difference increases, the disparity increases and the distance between the two cars
decreases. Around frame 320, the two cars collide.
Snapshots taken at different time intervals are shown in the following figures.
The snapshots show that as the time interval increases, the difference between the
position of the car in the left and right frames increases.
Figure 6.4: Different Positions of the Car in Case II (left and right frames at frame numbers 150 and 170)
Figure 6.4: Different Positions of the Car in Case II (continued; left and right frames at frame numbers 200 and 230)
Figure 6.4: Different Positions of the Car in Case II (continued; left and right frames at frame numbers 270 and 300)
Figure 6.4: Different Positions of the Car in Case II (continued; left and right frames at frame number 320)
6.4 Case III:
In Case III, both cars are moving in the forward direction with different speeds.

Frame No | X Coordinate of Centroid in Left Image (x1) | X Coordinate of Centroid in Right Image (x2) | Distance Between the Two Cars (D)
50 | 260.38 | 287.76 | 0.015
100 | 265.44 | 284.31 | 0.022
150 | 266.53 | 283.09 | 0.025
200 | 276.54 | 285.05 | 0.049
250 | 279.74 | 288.30 | 0.049
300 | 284.63 | 291.01 | 0.065

Table 6.3: Distance Between the Two Cars in Case III
$$D = \frac{b \cdot f}{x_1 - x_2}$$
Table 6.3 shows that as the time interval increases, the X coordinates of the
centroid in the left and right images come closer. This decreases the difference between
the position of the centroid in the left and right images. As this difference decreases, the
disparity decreases and the distance between the two cars increases. Comparing the
entries of Table 6.3 shows that with an increase in the time interval, the disparity
decreases and the distance between the two cars increases. As the distance between the
two cars is increasing, the possibility of a collision becomes smaller.
Snapshots taken at different time intervals are shown in the following figures.
The snapshots show that as the time interval increases, the difference between the
position of the car in the left and right frames decreases.
Figure 6.5: Different Positions of the Car in Case III (left and right frames at frame numbers 50 and 100)
Figure 6.5: Different Positions of the Car in Case III (continued; left and right frames at frame numbers 150 and 200)
Figure 6.5: Different Positions of the Car in Case III (continued; left and right frames at frame numbers 250 and 300)
6.5 Analysis of Results:
Figure 6.6: Graph of Results
Figure 6.6 shows the graph of the distance as a function of the disparity for all
cases. The graph shows that as the disparity (x2-x1) increases, the distance D between the
two cars decreases; the distance D is inversely proportional to the disparity of the
centroid. From figure 6.6, we can say that as objects come closer to the host vehicle, the
disparity of the centroid increases. When the disparity of the centroid reaches
approximately 50 pixels, the two cars almost collide with each other.
Chapter 7
CONCLUSION AND FUTURE WORKS
Conclusion:
In this project, collision detection using stereovision geometry and a triangulation
system was studied. An algorithm for collision detection was developed, implemented in
MATLAB, and its results were used to determine the possibility of a collision. A number
of different cases were considered to test the algorithm. This design can be used in
different areas such as automobiles, robotics, aviation, etc.
Future Work:
Collision detection is one of the most important features of ADAS. We can
integrate the stereovision collision detection system with the Adaptive Cruise Control
(ACC) feature of ADAS: when the distance between the two cars becomes less than a
threshold distance, the host car will start applying the brakes to prevent a collision. We
can also provide the output of the collision detection system to the airbag system in the
car, so that when a collision is detected the airbag deploys and protects the driver from
fatal injuries.
The proposed design can also be used for lane change decisions. When the driver
intends to change lanes, the system will detect any vehicle present behind the host
vehicle. The distance between the two vehicles is calculated, and the driver is warned if
there is a possibility of a collision.
APPENDIX
The following MATLAB code is used for collision detection.
1. To Play Video:
video = mmread('left.mpg',[1:80],[],false,true);   % read the first 80 frames of the left video
movie(video.frames)
video = mmread('right.mpg',[1:80],[],false,true);  % read the first 80 frames of the right video
movie(video.frames)
2. To Convert a Frame into an Image:
[x,map] = frame2im(video.frames(80));  % extract frame 80; map is empty for truecolor frames
a = x;
imshow(a);
3. To Write the Image into a JPEG File:
imwrite(a,'r1.jpeg');
4. To Read the Images from a File:
Left_I = imread('l5.JPEG');
5. Function to Find the Centroid of an Image:
function [meanx,meany] = ait_centroid(Left_I)
% Returns the intensity-weighted centroid (meanx, meany) of the image
[x,y,z] = size(Left_I);
if(z==1);
else
Left_I = rgb2gray(Left_I);      % convert color images to grayscale
end
im = Left_I;
[rows,cols] = size(im);
x = ones(rows,1)*[1:cols];      % column index of every pixel
y = [1:rows]'*ones(1,cols);     % row index of every pixel
area = sum(sum(im));            % total image intensity
meanx = sum(sum(double(im).*x))/area;
meany = sum(sum(double(im).*y))/area;
6. To Find the Centroid of an Image:
Left_I = imread('l5.JPEG');
[x,y] = ait_centroid(Left_I);   % x and y coordinates of the centroid
x
y
imshow(Left_I);
pixval on
7. For Analysis and Plotting the Graph of Results:
clear all;
x1=[293.23,290.9,284.5,262.6];    % X coordinate of centroid in the left image for Case I
x2=[294.1,293.53,294.2,312];      % X coordinate of centroid in the right image for Case I
y1=[297.20,297.10,298.14,299.58,299.55,299,294.94];    % left image, Case II
y2=[306.71,307.16,308.31,309.89,316.46,333.97,345.50]; % right image, Case II
z1=[260.38,265.44,266.53,276.54,279.74,284.63];        % left image, Case III
z2=[287.76,284.31,283.09,285.05,288.30,291.01];        % right image, Case III
c =(x2-x1);   % disparity for Case I
c1=(y2-y1);   % disparity for Case II
c2=(z2-z1);   % disparity for Case III
p =[0.77,0.77,0.77,0.77];                        % b*f for Case I
p1=[0.455,0.455,0.455,0.455,0.455,0.455,0.455];  % b*f for Case II
p2=[0.42,0.42,0.42,0.42,0.42,0.42];              % b*f for Case III
D1= p./c;    % calculation of distance D for Case I
D2= p1./c1;  % calculation of distance D for Case II
D3= p2./c2;  % calculation of distance D for Case III
plot(c,D1,c1,D2,'r',c2,D3,'g')   % plot distance against disparity for all cases
xlabel('Disparity(x2-x1)');
ylabel('Distance(D)');
title('Analysis of Results');
legend('Case I','Case II','Case III');
REFERENCES
[1] Advanced Driver Assistance Systems [Online]. Available: http://www.contionline.com/generator/www/de/en/continental/automotive/themes/passenger_cars/chassis_safety/adas/ov1_adas_en.html. Accessed: 2/2/2011.
[2] Ford Safety [Online]. Available: http://www.ford.com. Accessed: 2/2/2011.
[3] Car Safety [Online]. Available: http://www.tarbuiot.com/vehicle-articles/54-success-of-driver-assistancesystems-for-improving-car-safety-depends-on-acceptance-and-use-of-technologies-bydrivers. Accessed: 2/2/2011.
[4] Ming Bai, Yan Zhuang and Wei Wang, "Stereovision based obstacle detection approach for mobile robot navigation," Intelligent Control and Information Processing (ICICIP), 2010 International Conference on, pp. 328-333, 13-15 Aug. 2010. doi: 10.1109/ICICIP.2010.5565220
[5] S. Nedevschi, A. Vatavu, F. Oniga and M. M. Meinecke, "Forward collision detection using a Stereo Vision System," Intelligent Computer Communication and Processing, 2008. ICCP 2008. 4th International Conference on, pp. 115-122, 28-30 Aug. 2008. doi: 10.1109/ICCP.2008.4648362
[6] Chung-Cheng Chiu, Wen-Chung Chen, Min-Yu Ku and Yuh-Jiun Liu, "Asynchronous stereo vision system for front-vehicle detection," Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on, pp. 965-968, 19-24 April 2009. doi: 10.1109/ICASSP.2009.4959746
[7] Edge detection [Online]. Available: http://en.wikipedia.org/wiki/Edge_detection. Accessed: 3/5/2011.
[8] P. Foggia, A. Limongiello and M. Vento, "A real-time stereo-vision system for moving object and obstacle detection in AVG and AMR applications," Computer Architecture for Machine Perception, 2005. CAMP 2005. Proceedings. Seventh International Workshop on, pp. 58-63, 4-6 July 2005. doi: 10.1109/CAMP.2005.6
[9] S. Ernst, C. Stiller, J. Goldbeck and C. Roessig, "Camera calibration for lane and obstacle detection," Intelligent Transportation Systems, 1999. Proceedings. 1999 IEEE/IEEJ/JSAI International Conference on, pp. 356-361, 1999. doi: 10.1109/ITSC.1999.821081
[10] Maryum, Ahmed F., Development of a Stereo Vision System for Outdoor Mobile Robots. Rep. University of Florida, 2006. Web. <http://cimar.mae.ufl.edu/CIMAR/pages/thesis/ahmed_m.pdf>
[11] Danyan, G., Stereo Vision Algorithm Using Propagation of Correspondences Along an Array of Cameras. Rep. 2005 ed. Vol. 1. NJIT. Web. <http://archives.njit.edu/vol01/etd/2000s/2005/njit-etd2005-049/njit-etd2005-049.pdf>
[12] Epipolar geometry [Online]. Available: http://en.wikipedia.org/wiki/Epipolar_geometry. Accessed: 5/2/2011.
[13] Stereo triangulation [Online]. Available: http://www.dis.uniroma1.it/~iocchi/stereo/triang.html. Accessed: 5/10/2011.
[14] Triangulation geometrics [Online]. Available: http://disparity.wikidot.com/triangulation-geometrics. Accessed: 5/20/2011.
[15] W. van der Mark, D. Fontijne, L. Dorst and F. C. A. Groen, "Vehicle ego-motion estimation with geometric algebra," Intelligent Vehicle Symposium, 2002. IEEE, vol. 1, pp. 58-63, 17-21 June 2002. doi: 10.1109/IVS.2002.1187928. URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=1187928&isnumber=26631