COLLISION DETECTION USING STEREOVISION

Rakesh S. Patel
B.E., K.K. Wagh C.O.E., India, 2007

PROJECT

Submitted in partial satisfaction of the requirements for the degree of

MASTER OF SCIENCE

in

ELECTRICAL AND ELECTRONIC ENGINEERING

at

CALIFORNIA STATE UNIVERSITY, SACRAMENTO

SUMMER 2011

COLLISION DETECTION USING STEREOVISION

A Project

by

Rakesh S. Patel

Approved by:

__________________________________________________, Committee Chair
Fethi Belkhouche, Ph.D.

___________________________________________________, Second Reader
Preetham Kumar, Ph.D.

________________________________
Date

Student: Rakesh S. Patel

I certify that this student has met the requirements for format contained in the university format manual, and that this project report is suitable for shelving in the Library and that credit is to be awarded for the project.

_________________________, Graduate Coordinator                ________________
Preetham B. Kumar, Ph.D.                                        Date
Department of Electrical and Electronic Engineering

Abstract
of
COLLISION DETECTION USING STEREOVISION
by
Rakesh S. Patel

The Advanced Driving Assistance System (ADAS) is developed to counter the increasing number of accidents. Collision detection and warning is one of the main features of ADAS. Different techniques are available for collision detection, such as ultrasonic sensors, laser, radar, vision, etc. The main aim of this project is to design a system which has the ability to detect the possibility of a collision using cameras. The stereovision method is used to design the collision detection system. In this method, two cameras are used. First, camera calibration is performed, in which the intrinsic and extrinsic parameters of the cameras are calculated, and then the cameras are mounted on the vehicle. The cameras are mounted at the same height above the reference plane. They capture the same scene in front of the car at the same time. Videos of the scene are stored and then converted into images for further processing. The images are slightly different from each other because the two cameras are separated by some distance. One point P, normally the centroid of the obstacle, is detected and projected on the left image plane to calculate its 2D coordinates. Epipolar geometry is used to find the coordinates of the same point in the right image plane. Triangulation geometry is used to calculate the distance of the obstacle from the vehicle. Depending upon this distance, the likelihood of a collision is determined and further precautionary measures can be taken. Different cases are considered to test the functionality of the design.

_________________________________________________, Committee Chair
Fethi Belkhouche, Ph.D.

________________
Date

ACKNOWLEDGEMENTS

I would like to express my sincere gratitude to all those people who helped me during this project. These people helped me achieve an important milestone in my life. Firstly, I would like to thank my project guide, Dr. Belkhouche. He gave me continuous and valuable advice which helped me stay on the right track to complete this project. I would also like to extend my gratitude to Dr. Kumar for reviewing my report and providing valuable comments. Finally, I would like to thank my family and friends for supporting and helping me through the entire project and making this project successful.

TABLE OF CONTENTS
Pages
Acknowledgements…………………………………………………………..……….....vi
List of Tables….…………………………………………………………..……………...ix
List of Figures.…………………………………..……………………………...…….......x
Chapter
1.
INTRODUCTION…………………………………………....………………….1 2. RELATED WORK………………………..…………………..…………………3 3. CAMERA CALIBRATION…………………………………...………………..20 3.1 Intrinsic Calibration………………………………………………….21 3.2 Extrinsic Calibration…………………………………………………22 4. STEREOVISION GEOMETRY AND TRIANGULATION SYSTEM………..26 4.1 Stereovision Geometry……………………………………………....26 4.2 Epipolar Concept…………………………………………………….27 4.3 Triangulation System………………………………………………...28 5. EGO MOTION ESTIMATION…………………………………………………31 5.1 Feature Selection and Matching……………………………………..31 5.2 Estimation of tM...................................................................................34 5.3 Estimation of RM……………………………………………………..34 5.4 Estimation of Xi and Yi……………………………………………….35 5.5 Estimation of λ and µ………………………………………...………35 vii 6. SIMULATION AND CALCULATION.………………………………………...36 6.1 Introduction…………………………………………………………..36 6.2 Case I………………………………………………………………...38 6.3 Case II………………………………………………………………..41 6.4 Case III……………………………………………………………….46 6.5 Analysis of Results…………………………………………………..50 7. CONCLUSION AND FUTURE WORKS……………………………..……….51 8. Appendix……………………………………………………………………...…52 9. References………………………………………………………………………..54 viii LIST OF TABLES 1. Table 6.1: Distance Between the Two Cars in Case I……………………………38 2. Table 6.2: Distance between the Two Cars in Case II………………………...…41 3. Table 6.3: Distance between the Two Cars in Case III…………….…………….46 ix LIST OF FIGURES 1. Figure 2.1: Overview of the Approach.…………………………………………...4 2. Figure 2.2: FCW System Architecture……………………….……………………5 3. Figure 2.3: Boarder Scanner…………………………………………………………….7 4. Figure 2.4: Drivable Tunnel Model…………………………………………………........7 5. Figure 2.5: Drivable Tunnel Model Projection…………………………………….8 6. Figure 2.6: Procedure to Determine ROI………………………………………...12 7. Figure 2.7: Tracking of Obstacle………………………………………………...14 8. Figure 2.8: Moving Object and Obstacle Detection System……………………..16 9. Figure 2.9: Pinhole Camera Model……………………………………………....17 10. Figure 3.1: Place of Camera Calibration in Detection System…………………..21 11. Figure 3.2: System of World Coordinates and Camera Coordinates…………….25 12. Figure 4.1: Stereovision Geometry…………….………………………………...27 13. Figure 4.2: Triangulation Geometry for Standard Model ……………………….28 14. Figure 4.3: Triangulation Geometry for General Model………………………...29 15. Figure 5.1: Stereo and Ego Motion Relationship…………………………….......32 16. Figure 6.1: Stereovision System………………………………………………....36 17. Figure 6.2: Disparity Calculation………………………………………………...37 18. Figure 6.3: Different Positions of the Car in Case I…………………………......39 19. Figure 6.4: Different Positions of the Car in Case II…………………………….42 20. Figure 6.4: Different Positions of the Car in Case III………………....................47 x 21. Figure 6.5: Analysis of Results………………………………………………….50 xi 1 Chapter 1 INTRODUCTION The increase in the number of cars has resulted in a higher number of accidents. In order to decrease the number of accidents, we need to develop an efficient driver assistant system which will warn the driver before collision takes place. Implementation of such system will help reducing the number of accidents, improve road safety and thus saving human lives. In recent years, various Advanced Driving Assistance Systems (ADAS) were developed and implemented at faster rate to improve the safety of the drivers on road. ADAS monitors the surroundings and detects other traffic participants. It warns the driver about possible accidental situation or act autonomously, if necessary. 
It is an active safety system, since action is taken before any accident occurs. ADAS includes features like Adaptive Cruise Control (ACC), Collision Warning (CW), Blind Spot Detection (BSD), Emergency Brake Assistance (EBA), Intelligent Headlamp Control (IHC), Lane Departure Warning (LDW) and Traffic Sign Recognition (TSR) [1]. In ACC, the driver selects the speed of the car and its distance from the vehicle in front. When ACC detects that the front vehicle is slowing down, it decreases the speed of the car and maintains a safe distance, and when the traffic clears it increases the speed of the car back to the selected level. ACC is sometimes combined with CW. This system alerts the driver when it senses the possibility of a collision. In case of an emergency, it will precharge the brakes to increase their sensitivity. The BSD system alerts the driver when vehicles are in, or approaching, his blind spot. It may include a visual warning, such as a light indicator in the corresponding mirror, or an audio warning [2]. The LDW feature warns the driver when he is about to leave his current lane. This system uses a camera to detect the lanes by identifying lane markers.

The main advantage of ADAS is safety. According to a study by the Insurance Institute for Highway Safety, nearly 1.2 million accidents could be avoided due to the Collision Warning feature of ADAS. Accidents are directly related to personal injury, death and damage to cars, so avoiding accidents means saving human lives and avoiding car damage [3]. ACC helps to keep a preset speed and a safe distance between two vehicles. LDW helps the driver stay in one lane. The success of ADAS depends upon the driver's ability to learn the new technology and his willingness to use it.

This project mainly focuses on the Collision Warning (CW) feature of ADAS. The stereovision technique is used for collision detection. In stereovision, two cameras are used. The cameras record the same scene at the same time. The videos taken by the cameras are later converted into images and used to calculate the distance between the two vehicles. If this distance is less than the safe distance, then the necessary action is taken.

Chapter 2
RELATED WORK

In this chapter, we discuss different techniques used for collision detection. These techniques mainly use the stereovision method. In [4], obstacle detection is divided into two stages: detection and confirmation. In the detection stage, the real-time relationship between the obstacle and the ground is computed. In the confirmation stage, the positions of the obstacles are determined. Figure 2.1 represents the overview of this approach. In stereovision, two cameras are used to capture images of the scene. The images from the left camera are identified as left images, while the images from the right camera are identified as right images. In the first stage, left and right images are captured by the respective cameras. The trade-off between smoothing homogeneous areas and preserving the structural characteristics of the images is handled by anisotropic diffusion of the images. The U-V disparities of these images are calculated. This disparity information is used for the detection of the obstacles. Using the above information, the region of interest can be determined; obstacles are present in this area only. In the second stage, obstacles are confirmed by using the depth discontinuities of the images. The geometry and location of each obstacle are also determined [4].
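The U-V disparity idea at the core of this approach can be illustrated with a short sketch. The following MATLAB fragment is only a simplified illustration, not the implementation of [4]; it assumes a dense disparity map is already available (here a random placeholder) and builds the U- and V-disparity histograms from it, in which the road surface and the obstacles show up as characteristic line segments.

% Sketch: building U- and V-disparity histograms from a disparity map.
dmap = round(32*rand(240,320));        % placeholder disparity map (integer values)
dmax = max(dmap(:));
[M,N] = size(dmap);
vDisp = zeros(M, dmax+1);              % image row vs. disparity value
uDisp = zeros(dmax+1, N);              % disparity value vs. image column
for r = 1:M
    for c = 1:N
        d = dmap(r,c) + 1;             % shift so that disparity 0 falls in bin 1
        vDisp(r,d) = vDisp(r,d) + 1;   % accumulate along rows (V-disparity)
        uDisp(d,c) = uDisp(d,c) + 1;   % accumulate along columns (U-disparity)
    end
end
% In the V-disparity image the ground plane appears as an oblique line and
% obstacles appear as near-vertical segments, which is what delimits the
% region of interest in the detection stage.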
Advantages of this method are as follows: 1) This method can detect negative obstacles and thin obstacles also. 2) In this technique the effect of reflection can be eliminated. 3) This system can be used under all conditions and any time of the day and night. 4 Getting Left and Right images Smoothing input image and preserving shape boundaries Extracting approximately vertical or horizontal edges and horizon Indentifying region of interest and confirming region in which obstacle exists Compounding geometric information about obstacle, obstacle-ground contact lines and bounding boxes Figure 2.1: Overview of the Approach [4] 5 In [5], an algorithm for Forward Collision Warning (FCW) is explained. This method uses 3D information provided by the stereo vision system and ego motion of the car to generate a collision warning. This method is designed for urban road traffic. Left Camera Right Camera TYZX Hardware Stereo Machine Reconstruction of 3D Points Coarse Objects Elevation Maps Car Parameters Tracked Objects Object Delimiter Drivable Tunnel Forward Collision Detection Module FCW Output Figure 2.2: FCW System Architecture [5] Figure 2.2 shows the system architecture and the basic modules of this method. The TYZX hardware board takes the input from two cameras and performs 3D reconstruction of the scene. The output of this board is used by other modules to detect 6 and track objects, to determine elevation map, etc. In Coarse Object module, using available stereo information, the rough objects are extracted. The object’s position, size and speed are described in the Tracked Objects module. Kalman filtering is used for this process. Elevation map uses stereo information provided by TYZX hardware to describe the scene. This map is divided into drivable point, curb point and object point. Onboard sensors are used to collect the different car parameters like speed, yaw rate, etc for each video frame. The output of Elevation Map is used by the Object Delimiter to compute the set of unconstructed polygon. The redial scanning algorithm is used for this process. In Drivable Tunnel, the virtual area around ego-car trajectory is represented using mechanical and movement characteristics of the car. Forward Collision Detection module performs collision detection process and FCW output gives visual indication according to the type of obstacle [5]. The main modules in this method are the Delimiter Object module, the Drivable Tunnel module and the Forward Collision Detection module. Delimiter detection is performed in four steps. In the first step, Top view of the image is generated then in the second step each object from the elevation map is labeled. In the third step, contour extraction is done using boarder scanning algorithm. In the final step, approximate curve for the car motion is generated. In boarder scanning algorithm radial scanning is done by a fixed given step with center at the Ego car position. Scanning is performed only in the region of interest i.e. from Q from to Q to with Q rad slope. All the detected delimiter points from region of interest added to a list known as CounterList. For each new label this list is cleared and new list is generated. The Drivable Tunnel Model is represented by 7 polyhedron as shown in figure 2.4. It consists of number of hexahedron cells. Every cell has two parallel faces left and right, bottom and top. The far and near faces are perpendicular to the car motion [5]. 
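The radial (border) scanning step described above can be sketched in a few lines. This is only an illustrative reconstruction under simple assumptions: a binary top-view occupancy grid, an assumed ego-car cell, and a fixed angular step; the actual delimiter extraction in [5] operates on the labeled elevation map.

% Sketch: radial scan of a top-view occupancy grid around the ego position.
occ = zeros(200,200); occ(60:70,90:110) = 1;     % placeholder obstacle cells
egoR = 180; egoC = 100;                          % assumed ego-car cell
thFrom = -pi/3; thTo = pi/3; thStep = pi/180;    % scanned sector and angular step
maxRange = 150;                                  % maximum scan range in cells
contourList = [];                                % nearest occupied cell on each ray
for th = thFrom:thStep:thTo
    for rho = 1:maxRange
        r = round(egoR - rho*cos(th));           % step away from the ego car
        c = round(egoC + rho*sin(th));
        if r < 1 || r > size(occ,1) || c < 1 || c > size(occ,2)
            break;                               % the ray left the map
        end
        if occ(r,c) == 1
            contourList = [contourList; r c];    % first hit along this ray
            break;
        end
    end
end
% contourList holds one delimiter point per scanning direction; these points
% are then approximated by a polyline to form the object delimiter.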
Figure 2.3: Boarder Scanner [5] Figure 2.4: Drivable Tunnel Model [5] In this method, a collision detection polygon clipping algorithm is used. In this algorithm vertices of the polygon are taken as input and it generates output which has one or more polygon. The tunnel projection as shown in figure 2.5 is used as input for polygon clipping algorithm. In the projected trajectory, right edges form the circular are of radius R1 and left edges form circular arc of radius R2. These arcs define the drivable tunnel’s boundaries and bounded between TopBoundry and BottomBoundry. As shown 8 in figure 2.5, the drivable tunnel projection is divided into four areas and they are defined by their locations form the tunnel edges. These areas are represented by following mathematical constraints, considering C(x0, y0) as the center of circular arcs and angle α as the angle made by TopBoundry with BottomBoundry: Figure 2.5: Drivable Tunnel Model Projection [5] 1) Right Area: It is limited by the right edge of the tunnel. R1 is the radius of the right edge circular arc. There are two cases for the right area. For x0<0 9 2 R12 x x0 2 y y 0 R 22 z x x 0 tan z 0 For x0>0 x x0 2 y y 0 2 R12 z x x0 tan z 0 2) Left Area: It is limited by the left edge of the tunnel. R2 is the radius of the left edge circular arc. There are two cases for left area. For x0<0 x x 0 2 y y 0 2 R 2 2 z x x 0 tan z 0 For x0>0 x x 0 2 y y 0 2 R 2 2 z x x 0 tan z 0 3) Inside Area: It is the area between the two arcs, this means inside the tunnel. It is given by For x0<0 10 2 R 2 2 x x 0 2 y y 0 R12 For x0>0 2 R12 x x 0 2 y y 0 R 2 2 4) Top Area: It is area on top of tunnel. z x x0 tan z 0 These four areas are used to compute position of each delimiter vertex. The edge direction is assigned to each vertex depending upon the endpoint position. An obstacle is detected when these vertexes intercept right or left edge of the tunnel. The position of the obstacle is determined by intercepting each chord of the tunnel with the current vertex. Accuracy of this method depends upon dense stereo information, elevation map result and object detection algorithm [5]. In [6], another stereovision technique for front vehicle detection is explained. This method uses an asynchronous stereovision technique to overcome the limitation of synchronous stereovision system. It uses edge information to detect the region of an obstacle. It is divided into two modules. These modules are the search module and the matching module. In the search module, the most likely position of the obstacle is determined in the right image. In the matching module, the corresponding pixels are found in the left image. The disparities of these points are used to calculate the distance of an obstacle from the host vehicle [6]. 11 Search Module: Sobel edge detection algorithm is used to detect the edges of obstacles. The change in the image brightness is key feature of this algorithm. The change in the brightness of an image corresponds to change in depth, change in material, discontinuities in surface, etc [7]. The complexity of image processing is reduced by down sampling input image with factor 2n. The Sobel edge detection algorithm is applied to this image to detect the edge points of the obstacle. The horizontal and vertical line segments are used to connect the points which are at same level in the image. The horizontal line segments which are near the bottom of the image are processed first because these points of the obstacle are the closest to the host vehicle. 
The other horizontal segments are processed bottom up. The length and position of each segment is calculated. The horizontal line segments in the region of interest are given by | SLx1 ILx1 | Th,| SLx 2 ILx 2 | Th, and SLy ILy SLx1: Left point of the seed line SLx2: Right point of the seed line ILx1: Left point of the input horizontal line segment ILx2: Right point of the input horizontal line segment SLy: y coordinate of the seed line ILy: y coordinate of the input horizontal line segment [6]. The error is introduced in the region of interest due to shadows on the road, so time is wasted in processing such line segments. The region of shadow is removed by 12 comparing horizontal and vertical projection histograms of the same image and deleting interference region. A: Original Input Image C: The horizontal and vertical histogram B: Edge Detection in the Input Image D: Final Region of Interest Figure 2.6: Procedure to Determine ROI [6]. Matching Module: In the matching module, pixels of the horizontal line segments from a region of interest are matched with corresponding pixels in the left image. Then, disparity of these pixels is found. The zero –mean SAD full searching, adaptive shifting of searching area 13 and tracking detection is done in matching module [6]. As asynchronous stereo system is used, there will be fixed gray value difference between the right and left image. This difference is nullified before search processing. The right and left matching blocks are defined by zero-mean SAD as follows SADV i, j N /2 N /2 | RC x k , y l M LRi x k , j y l M | R L k N / 2 l N / 2 MR N /2 N /2 1 RC i, j N N i N / 2 j N / 2 ML N /2 N /2 1 LRi, j N N i N / 2 j N / 2 and p i, j p Where N N : Block Size MR: Mean value of the matching block in the right image ML: Mean value of the matching block in the left image. RC(x, y): The Image pixel in the right image. LR(x, y): The Image pixel in the left image [6]. The matched pixels of the right and the left image are not on the same line segments, they are slightly shifted due to the asynchronous nature of the system. In the adaptive shifting algorithm, this shifted area is calculated and added to the left image to reduce the matching time. Once the vehicle detected in one image frame, it can also be 14 found in the next image frame by increasing or decreasing the region of interest. This is illustrated in figure 2.7. A: nth frame B: n+1th Frame C: Histogram of nth+1 Frame D: ROI of nth+1 Frame Figure 2.7: Tracking of Obstacle [6] 15 Advantages: 1) The setup of this system is much simpler than synchronous stereo system. 2) The stereovision object detection algorithms provide better performance with asynchronous system. 3) Price and size of asynchronous stereo system is less than synchronous stereo system. In [8], an algorithm for moving objects and obstacle detection is explained. In this algorithm disparity map is calculated using area based stereo matching technique for 3D representation of scene and object detection. The optical flow method is used to determine the motion of an object. This algorithm mainly works for indoor applications. Figure 2.8 shows basic steps of this design. First, 3D representation of the scene is done by using disparity information. In the second step, resampling and quantization of the disparity map is done. The motion vector of the obstacle is determined using the velocity vector of the camera in the Prediction of Optical Flow Map then blob detection algorithm is used to determine the motion of the objects. 
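A minimal sketch of the zero-mean SAD block matching used in the matching module of [6] is given below. It is not the authors' implementation: a synthetic, already rectified image pair is generated so that the fragment is self-contained, the block size and search range are arbitrary example values, and the adaptive shifting and tracking steps are omitted.

% Sketch: zero-mean SAD matching of one block along a horizontal search range.
R = rand(240,320);                          % placeholder right image
trueD = 7;                                  % known disparity used to build the pair
L = [rand(240,trueD), R(:,1:end-trueD)];    % placeholder left image (shifted copy)
N = 9; half = floor(N/2);                   % N-by-N matching block
yc = 120; xc = 200;                         % block centre in the right image
searchRange = 40;                           % horizontal search range in pixels
blockR = R(yc-half:yc+half, xc-half:xc+half);
blockR = blockR - mean(blockR(:));          % remove the block mean (zero-mean SAD)
bestCost = inf; bestShift = 0;
for s = 0:searchRange
    blockL = L(yc-half:yc+half, xc+s-half:xc+s+half);
    blockL = blockL - mean(blockL(:));      % remove the mean of the candidate block
    cost = sum(abs(blockR(:) - blockL(:))); % zero-mean sum of absolute differences
    if cost < bestCost
        bestCost = cost; bestShift = s;     % keep the best-matching shift
    end
end
% bestShift is the estimated disparity; for this synthetic pair it recovers trueD.
% Removing the block means is what compensates the fixed gray-value offset
% between the two cameras of the asynchronous system.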
16 System Initialization (Optical Flow, Stereovision) Computation Of Disparity Map Computation of Optical Flow Map with Median Filter Resampling, Filtering and Quantization Prediction of Optical Flow Map Detection of anomalous motion vector Blob Detection for moving object anomalous motion vector Figure 2.8: Moving Object and Obstacle Detection System [8] 3D Representation of Scenes: This system uses the pinhole camera model for 3D reconstruction of a scene. Figure 2.9 shows the pinhole camera model. By using pinhole camera model, a point W(x, y, z) can be projected on the left and right image planes. This point is denoted as PL 17 (xl, yl) in the left image plane and PR(xr,yr) in the right image plane. The coordinates of point W are given by following equations: Z b. f b. f ; xl xr d X xl.Z ; f Y yl.Z f b: baseline, distance between two cameras d: xl xr disparity Figure 2.9: Pinhole Camera Model [8] R: Image Plane C: Optical Center f: Focal length The main task in this step is to calculate the disparity (d). By using epipolar geometry, it is assumed that once camera calibration is done, then a common point projected on the two image planes lies on the same horizontal line and the disparity range is [0, dmax]. The disparity is given by a function called Sum of Absolute Differences: 18 SAD x, y, d n | L x j, y i R x j d , y i | i, j n L(x, y): Intensity of the left image R(x, y): Intensity of the right image n: A small integer constant A recursive formulation is used to simplify the calculation of SAD. SAD x, y 1, d SAD x, y, d U x, y 1, d Where n U x, y 1, d | L x j , y n R x d j , y n | j n | L x j, y n 1 R x d j, y n 1 | Furthermore U x, y 1, d U x 1, y 1, d | L x n, y n 1 R x d n, y n 1 | | L x n 1, y n R x d n 1, y n | | L x n, y n R x d n , y n | | L x n 1, y n 1 R | x d n 1, y n 1 | The minimum value of SAD gives final disparity 19 Optical Flow Based Motion Detection: Optical flow is determined with the help of area based algorithm to estimate motion of an object in scene. The images I1(x,y) and I2(x,y) are two consecutive images, the motion vector (vx , vy) of point (x,y) is given by SAD x, y, vx, vy | I 1 x i, y j I 2 x i vx, y j vy | i, j Where - m/2 ≤ i ≤m/2; - m/2 ≤ j ≤ m/2 m: the size of the correlation window A threshold value for SAD is selected and any motion vector below that value is neglected. The local differences of vectors are attenuated by using a median filter. Horn and Schunk algorithm is used to smoothen the distortion in the images [8]. The environment consists of moving and non moving objects. They need to be treated differently. The moving object’s motion vector (vx, vy) is different from the predicted vector vx, vy . This difference is calculated and compared with the threshold value. m vx 2 vy 2 tg 1 m ' v ' x2 v ' y2 | m m | MThreshold v, v ' v'y vy cos 1 ; ' tg 1 vx v'x | x || y | DThreshold The point which exceeds this threshold value marked as anomalous motion vector. This vector acts as input to the blob detection algorithm. In this algorithm, moving and non moving objects are detected using histogram technique. 20 Chapter 3 CAMERA CALIBRATION In collision detection, vision is the main navigation tool. Camera is the heart of the vision system. Camera calibration is the process in which the true parameters of the camera are calculated. These parameters include intrinsic parameters like focal length, the coordinates of center distortion in the image space, etc., and extrinsic parameters like position and orientation of the camera. 
The calibration parameters are used to link the coordinates of pixels in an image with the corresponding camera coordinates in the camera reference frame this means the calibration computes the relationship between 2D image coordinates and 3D world coordinates. It is necessary to calculate the true value of these parameters because they affect the image processing. Camera calibration procedure is explained in [9]. Figure 3.1 shows the place of camera calibration in the detection system. As shown in fig 3.1, images taken from two cameras are the inputs to the calibration unit and after calibration, they go to the detection system. The calibration unit calculates the position and angle of the camera in a 3D environment attached to a global coordinate system. These are the position of the two cameras in world co-ordinate system. This unit is divided into two parts: intrinsic calibration and extrinsic calibration [9]. 21 C1 Digitization C2 Calibration Lane Recognition Obstacle Detection and Tracking Figure 3.1: Place of Camera Calibration in the Detection System [9]. 3.1 Intrinsic Calibration: The intrinsic parameters of the camera are affected by optics and the imaging sensor chip of camera. In intrinsic calibration, the focal lengths of the lens (αx, αy), the coordinates of principle point (Cx, Cy) and the lens distortion are calculated. In the pinhole camera model, a 3D point of the world coordinate system is projected on 2D image plane in a linear manner. The Euclidean camera coordinates vector Xc is given by X .x Xc Y . y x Z .z (3.1) 22 Where Xc : Euclidean camera coordinates vector ~ x : Homogenous Image Coordinates λ: Ambiguity of Liner Projection Using homogenous coordinate’s relationship between pixel coordinate and image coordinate xp x 0 Cx x yp = 0 y Cy y 0 0 1 1 1 xp C x Where αx, αy: Focal length in retinal coordinates Cx, Cy: Principal point coordinates (affected by mounting of chip in camera) xp : Image coordinates C : A 3x3 matrix of intrinsic parameters. In an ideal camera, the lens distortion factor can be neglected [9]. (3.2) 23 3.2 Extrinsic Calibration: Extrinsic calibration computes mounting positions and mounting angles of the camera with respect to the static system of the world coordinates. These parameters of the camera are denoted as the position vector of the optical center ( tcw ) and three rotational angles wc 1, 2, 3 . Figure 3.2 represents the system of world and camera coordinates for one camera. The center of rear axle is considered as origin and projected on ground. The relationship between 3D points in camera and world coordinates is expressed as Xc RW C XW t CW (3.3) Where XW : 3D point in the world coordinate system X C : Point in the camera coordinate system tcw : Position vector of the optical center RWC : Rotational matrix of three rotational angles From the above equations, all points can be represented as xp C x C XC C RWC XW tCW According to this equation, the relationship between xp make it linear we can write bHp and XW is non linear. In order to 24 Where b : Observational vector H : Jacobian matrix of observation p : It is 6x1 matrix of position vector ( tcw ) and rotational angle ( wc ) Noise may be present in the operation ˆ b H.p Using a Weighted Least Square (WLS) solution for the linearized, over-determined equation system, leads to a minimum of the error measure ˆT K ˆ ̂ K: convergence matrix of observation . An iterative method is applied to find the final extrinsic parameter. 
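The chain of transformations behind equations (3.1)-(3.3) can be summarised in a small numeric sketch: a 3D point in world coordinates is first expressed in the camera frame using the extrinsic parameters and then mapped to pixel coordinates with the intrinsic matrix C. All numbers below (focal lengths, principal point, camera pose, point position) are made-up example values, and the convention that tCW is the position of the optical centre in world coordinates is an assumption, not a calibration result from [9].

% Sketch: projecting a world point to pixel coordinates with example parameters.
ax = 800; ay = 800;                   % assumed focal lengths in pixel units
Cx = 320; Cy = 240;                   % assumed principal point
C  = [ax 0 Cx; 0 ay Cy; 0 0 1];       % intrinsic matrix of equation (3.2)
theta = 5*pi/180;                     % assumed camera pitch of 5 degrees
Rwc = [1 0 0;                         % world-to-camera rotation (about X only)
       0 cos(theta) -sin(theta);
       0 sin(theta)  cos(theta)];
tcw = [0; 1.2; 0];                    % assumed optical-centre position (metres)
Xw  = [2; 0.5; 20];                   % a world point about 20 m in front of the car
Xc  = Rwc*(Xw - tcw);                 % world point expressed in camera coordinates
xh  = C*(Xc/Xc(3));                   % normalise by depth, then apply C
xp  = xh(1:2);                        % resulting pixel coordinates [xp; yp]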
In the proposed calibration technique, every sensor measurement accompanies self assessment parameter, like confidence and reliability measures. The convergence matrix K P can be calculated from the raw data of sensors. K P is considered as confidence measure. From equation (3.3), 3D relationship between two cameras can be given by X 2 R12 X 1 t 12 X 1 : 3D point of first camera 25 X 2 : 3D point of second camera R12 : Rotational matrix between camera 1 and camera 2 t 12 : Transformed translation vector between two cameras [9]. Figure 3.2: System of World Coordinates and Camera Coordinates [9] 26 Chapter 4 STEREOVISION GEOMETRY AND TRIANGULATION SYSTEM In a stereovision system, two cameras are used that capture the picture of the same scene at the same time. These cameras are separated by some distance, so that images captured by these cameras are slightly different. As an object comes closer to the cameras this difference increases. It means object distance is inversely proportional to this difference [10]. This difference is called “disparity”. The disparity value can be used to find the distance of an obstacle from the host vehicle. 4.1 Stereovision Geometry: Figure 4.1 shows stereo vision geometry. The optical centers of two cameras named Ol and Or are parallel to each other. They are separated by a distance ‘b’ called the baseline distance. These cameras have their separate image planes. The image planes are located at distance ‘f’ (focal length) from the optical center. The X axis of the reference coordinate system is parallel to the base line, and the Z axis of the reference coordinate system is parallel to the optical axis of the cameras. A point P(X, Y, Z) in the reference coordinate system, is projected on the left image plane and denoted as Pl (Xl, Yl). Its projection on the right plane is denoted as Pr (Xr, Yr) [11]. 27 Figure 4.1: Stereovision Geometry [11] 4.2 Epipolar Concept: In stereo vision, once one point is projected on the left image plane, the corresponding projected point in the right image plane needs to be found. Epipolar geometry can be used to find these corresponding points. Figure 4.1 shows that a point P and two optical centers OL and OR are in one plane which is called “epipolar plane”. The epipolar plane intersects each image plane where it forms epiploar lines Pl-EPl and PrEPr. For any position of point P, epipolar plane and epipolar line intersect epipoles. This property reduces the search area to find the corresponding point in the right image. If the projection of a point Pl is known, then we also know the epipolar line Pr-EPr. So, when a point P is projected on the right image plane, it will lie on this epipolar line [12]. 28 4.3 Triangulation System: Triangulation is the process in which the distance of the object from the camera is calculated. First consider the standard model as shown in Figure 4.2, which is used when both cameras are parallel. Points L and R are two pinhole points with parallel optical axes. The XY plane is parallel to the image plane. The distance between the two cameras is b. The optical center of the left camera is the origin of the world coordinate system. Figure 4.2: Triangulation Geometry for Standard Model [13] From figure 4.2, using simple geometry, the following equations can be derived Z b f x1 x 2 X x1 Z f 29 Y y1 Z f In the general model, the cameras are not parallel to each other as shown in figure 4.3. In this model it is assumed that the right camera can be moved in any direction with respect to the left camera. 
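Before the rotated-camera cases of the general model are treated, the standard (parallel-camera) triangulation above can be checked with a small numeric sketch. The baseline, focal length and image coordinates are arbitrary example values chosen only to show how Z, X and Y follow from a measured disparity.

% Sketch: depth from disparity for the standard (parallel-camera) model.
b  = 0.22;            % baseline between the two cameras (m)
f  = 800;             % focal length expressed in pixel units
x1 = 412; y1 = 251;   % image coordinates of P in the left image (pixels)
x2 = 396;             % x coordinate of the same point in the right image
d  = x1 - x2;         % disparity (16 pixels here)
Z  = b*f/d;           % Z = b*f/(x1 - x2) = 11 m for these values
X  = x1*Z/f;          % lateral coordinate of P
Y  = y1*Z/f;          % vertical coordinate of P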
Figure 4.3: Triangulation Geometry for General Model [13]

1) Rotation around the Y axis: When the right camera is rotated around the Y axis, its optical axis intersects the Z axis at the point (0, 0, Z0). This point is called the fixation point. If the rotation angle is θ, then Z0 can be calculated by the following equation [13]:

Z0 = b / tan θ

Using this, the X, Y and Z coordinates can be found from the following equations:

X = x1·Z / f
Y = y1·Z / f
Z = b·f / ((x1 − x2) + f·b/Z0)

2) Rotation around the X axis: In this case, the right camera is assumed to be rotated around the X axis by an angle θ. The rotation around the X axis affects the Y coordinate only; the X and Z coordinates remain the same [13]:

X = x1·Z / f
Y = y1·Z / f + Z·tan θ
Z = b·f / (x1 − x2)

Chapter 5
EGO MOTION ESTIMATION

This chapter discusses ego motion estimation for a stereovision system. In ego motion estimation, the 3D position of the camera is determined with respect to its surroundings. Ego motion estimation is normally divided into three steps. In the first step, a limited number of common features between two stereo image pairs are extracted. The new positions of these features in the successive images are determined in step two. In the final step, the ego motion is computed by comparing these images. The method discussed here uses geometric algebra to estimate the ego motion [15].

5.1 Feature Selection and Matching:

The Harris and Stephens corner detection algorithm is used to detect a limited number of features in the left and right images [15]. These features are denoted as x and x′. Epipolar geometry can be used to find the corresponding features in the left and right images. Normalized cross-correlation is used to match the features in the left and right images. The Euclidean motion between the two 3D positions of a point is given by the following equation:

Y = R(X) + t    (4.1)

R: rotational operation
t: translational operation

Figure 5.1 represents the stereovision geometry, from which the following projection equation can be written:

x = f·X / (X)3    (4.2)

where (X)3 denotes the third (depth) component of X.

Figure 5.1: Stereo and Ego Motion Relationship [15]

x and x′ are the projections of a point P in the left image and the right image, respectively. A point P in space can be represented by a vector X from the left camera and X′ from the right camera. X can be calculated from X′ using the rotation matrix RC and the translation matrix tC. When the whole system moves to another position, the new vector of point P is denoted by Y and can be calculated using the rotation matrix RM and the translation matrix tM:

Ŷ = RM X + tM

The distance of point P from the optical center of the camera is described by the scalars λi and µi, given by

λi = (Xi)3 and µi = (Yi)3    (4.3)

The initial values of λi and µi are obtained by projecting the vectors Xi and Yi onto the stereo camera images. The vectors Xi and Yi can be calculated by correlating the positions of the features in the left and right images. However, these extracted points will have an error due to noise in the sensors and their limited resolution. This error is calculated by projecting the resultant vector Ŷi of RM and tM to the image location ŷi and computing the square of the distance between yi and ŷi. The other problem is that the projection of the vector from the first stereo image pair is considered, but the projection of the vector from the second stereo image pair is neglected. The inverted motion is considered for projecting these vectors backward and minimizing the error [15]. The initial values of RM and tM are calculated using a least squares estimator based on (4.1). An iterative method is used to optimize the estimated motion, the vectors Xi, Yi and the scalars λi, µi [15].
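Before the individual estimation steps are described, equation (4.1) can be illustrated with a small numeric sketch: applying an assumed rotation R and translation t to a set of 3D feature points gives the positions of the same points after the ego vehicle has moved. The rotation angle, the translation and the point coordinates are arbitrary example values, not quantities estimated from data.

% Sketch: applying a rigid ego motion Y = R*X + t to a set of 3D points.
phi = 3*pi/180;                     % assumed yaw change of 3 degrees
R = [cos(phi) 0 sin(phi);           % rotation about the vertical axis
     0        1 0;
    -sin(phi) 0 cos(phi)];
t = [0.05; 0; -1.5];                % assumed translation (1.5 m forward motion)
X = [ 2.0 -1.5  0.5;                % three 3D feature points, one per column
      0.3  0.4  0.2;
     15.0 22.0  9.0];
Y = R*X + repmat(t, 1, size(X,2));  % the same points after the motion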
5.2 Estimation of tM:

The translation matrix tM can be obtained by taking the mean of the reconstruction products for the two cameras and the vectors Xi, Yi:

tM = (1/4n) Σ i=1..n [ t(Xi, Yi) + t(Xi, Y′i) + t(X′i, Yi) + t(X′i, Y′i) ]    (4.4)

where

Xi = λi·xi
Yi = µi·yi

5.3 Estimation of RM:

The rotation matrix RM can be calculated from Xi, Yi, X′i, Y′i and the translation matrix tM. Two rotational motions, R1 and R2, are considered. R1 is the rotation of the reconstructed vector of the first stereo pair into the vector µ(Yi). Similarly, R2 is the rotation of the reconstructed vector of the second stereo pair into the vector µ(Xi). A geometric algebra method is used for the calculation of the rotations. In geometric algebra, a rotation is calculated using spinors; a spinor is the sum of one scalar and three scalar–bivector pairs [15]. The final value of RM is obtained by combining R1 with the inverse of R2 and normalizing the result so that it is a unit spinor [15].

5.4 Estimation of Xi and Yi:

The values of Xi and Yi are obtained by averaging the stereo reconstruction (the "jump" between the two cameras) with the estimated ego motion applied to the reconstruction from the other stereo pair:

Xi = ½ [ s(Xi, X′i) + RM⁻¹( s(Yi, Y′i) − tM ) ]
Yi = ½ [ s(Yi, Y′i) + RM·s(Xi, X′i) + tM ]

where s(·, ·) denotes the 3D reconstruction of a point from its projections in the two cameras.

5.5 Estimation of λ and µ:

The values of λ and µ are obtained by projecting the vectors Xi and Yi onto the stereo camera images:

λi = [ Xi·xi + ( RM⁻¹(Yi − tM) )·xi ] / ( 2·|xi|² )
µi = [ Yi·yi + ( RM·Xi + tM )·yi ] / ( 2·|yi|² )

Advantages:
1) The computational cost is low, since the stereovision technique is used for the estimation of the ego motion.
2) Since geometric algebra is used, this system can calculate the exact Euclidean motion [15].

Chapter 6
SIMULATION AND CALCULATION

6.1 Introduction:

In this chapter, a collision detection system is proposed and tested. In this system, two small RC cars are used. Two cameras are placed on Car1 at a fixed distance from each other. These two cameras capture video of the scene in front of the car. The video is converted into frames. An object is detected in the captured frames and the coordinates of the centroid of the object are found. The stereovision triangulation system is used to find the distance of the object from the car. As the car approaches the object, this distance decreases. A collision corresponds to a zero distance.

Figure 6.1: Stereovision System [14]
Figure 6.2: Disparity Calculation [14]

In figure 6.1, point P is the centroid of the other car in our system. It is projected on both the left and the right image planes. Its coordinates in the images are denoted as P1(x1, y1) and P2(x2, y2). C1 and C2 are the optical centers of the cameras, and the image planes are located at a distance f (the focal length) from them. The optical centers of the cameras are separated by the distance b. The distance between the point P and the host vehicle is calculated using the disparity of point P in the left and right images. This disparity is equal to the horizontal shift of point P from P1 to P2 in the image plane [14]. As both cameras are mounted at the same level, there is no vertical disparity in the images. Using stereovision triangulation geometry, we obtain

D = b·f / Disparity = b·f / (x2 − x1)

6.2 Case I:

In Case I, Car1 is not moving and Car2 is moving towards Car1. The focal length of the two cameras is the same, and the distance between them (the baseline) is 22 cm.

Frame No    x1 (centroid, left image)    x2 (centroid, right image)    Distance D between the two cars
25          294.57                       294.7                         3.85
35          293.23                       294.1                         0.85
45          290.9                        293.52                        0.30
55          284.5                        294.2                         0.08
65          262.6                        312.0                         0.01

D = b·f / (x2 − x1)

Table 6.1: Distance Between the Two Cars in Case I

From the above equation, we can say that the disparity is inversely proportional to the distance between the two objects; a short numerical check of Table 6.1 based on this relation is sketched below.
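The relation D = b·f/(x2 − x1) can be checked directly against the entries of Table 6.1. The fragment below mirrors the analysis code in the Appendix: it uses the same b·f constant (0.77, the value used there for Case I) together with the x1 and x2 columns for frames 35 to 65, and reproduces distances close to those listed in the table (small differences come from rounding of the tabulated coordinates).

% Sketch: recomputing the Case I distances of Table 6.1 from its x1/x2 columns.
bf = 0.77;                                  % b*f constant used for Case I in the Appendix
x1 = [293.23 290.9  284.5 262.6];           % left-image x coordinates (frames 35..65)
x2 = [294.1  293.52 294.2 312.0];           % right-image x coordinates (frames 35..65)
disparity = x2 - x1;                        % horizontal disparity of the centroid
D = bf ./ disparity;                        % D = b*f/(x2 - x1)
% D is approximately [0.89 0.29 0.08 0.02]: the disparity grows from frame to
% frame while the distance shrinks towards zero, matching the trend of Table 6.1.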
As the disparity increases, the distance between the two cars decreases. Table 6.1 shows that as Car2 comes closer to Car1, the x coordinates of the centroid in the left and the right images shift away from each other. This increases the disparity between the two corresponding points. So, as the time interval increases, the distance D between the two cars decreases, and at frame 65 the two cars almost collide with each other.

Snapshots taken at different time intervals are shown in the following figures. The snapshots show that as the time interval increases, the difference between the positions of the car in the left and right frames increases.

Figure 6.3: Different Positions of the Car in Case I (left and right frames at frame numbers 25, 35, 45, 55 and 65)

6.3 Case II:

In this case, Car1 is moving and Car2 is still. The distance between the two cameras is 13 cm and the focal length of the cameras is 35 mm.

Frame No    x1 (centroid, left image)    x2 (centroid, right image)    Distance D between the two cars
150         297.20                       306.71                        0.047
170         297.10                       307.16                        0.045
200         298.14                       308.31                        0.044
230         299.58                       309.89                        0.044
270         299.55                       316.46                        0.026
300         299.0                        333.97                        0.013
320         297.94                       345.50                        0.009

D = b·f / (x2 − x1)

Table 6.2: Distance Between the Two Cars in Case II

Table 6.2 shows that as the time interval increases, the x coordinate of the centroid in the right image shifts away from its initial position. This increases the difference between the positions of the centroid in the left and the right image. As this difference increases, the disparity increases and the distance between the two cars decreases. At frame 320, the two cars are about to collide.

Snapshots taken at different time intervals are shown in the following figures. The snapshots show that as the time interval increases, the difference between the positions of the car in the left and right frames increases.

Figure 6.4: Different Positions of the Car in Case II (left and right frames at frame numbers 150, 170, 200, 230, 270, 300 and 320)

6.4 Case III:

In Case III, both cars are moving in the forward direction with different speeds.

Frame No    x1 (centroid, left image)    x2 (centroid, right image)    Distance D between the two cars
50          260.38                       287.76                        0.015
100         265.44                       284.31                        0.022
150         266.53                       283.09                        0.025
200         276.54                       285.05                        0.049
250         279.74                       288.30                        0.049
300         284.63                       291.01                        0.065

D = b·f / (x2 − x1)

Table 6.3: Distance Between the Two Cars in Case III

Table 6.3 shows that as the time interval increases, the x coordinates of the centroid in the left and right images come closer together. This decreases the difference between the positions of the centroid in the left and right images. As this difference decreases, the disparity decreases and the distance between the two cars increases. Comparing the entries of Table 6.3 confirms that, with an increase in the time interval, the disparity decreases and the distance between the two cars increases. As the distance between the two cars is increasing, the possibility of a collision becomes smaller.

Snapshots taken at different time intervals are shown in the following figures.
The snapshots show that as the time interval increases, the difference between the positions of the car in the left and right frames decreases.

Figure 6.5: Different Positions of the Car in Case III (left and right frames at frame numbers 50, 100, 150, 200, 250 and 300)

6.5 Analysis of Results:

Figure 6.6: Graph of Results

Figure 6.6 shows the graph of the distance as a function of the disparity for all cases. The graph shows that as the disparity (x2 − x1) increases, the distance D between the two cars decreases. This shows that the distance D is inversely proportional to the disparity of the centroid. From figure 6.6, we can say that as an object comes closer to the host vehicle, the disparity of its centroid increases. When the disparity of the centroid reaches about 50 pixels, the two cars almost collide with each other.

Chapter 7
CONCLUSION AND FUTURE WORKS

Conclusion:

In this project, collision detection using stereovision geometry and a triangulation system is studied. An algorithm for collision detection is developed. The algorithm is implemented in MATLAB, and the results are used to determine the possibility of a collision. A number of different cases are considered to test the algorithm. This design can be used in different areas such as automobiles, robotics, aviation, etc.

Future Work:

Collision detection is one of the most important features of ADAS. The stereovision collision detection system can be integrated with the Adaptive Cruise Control (ACC) feature of ADAS. When the distance between the two cars becomes less than the threshold distance, the host car will start applying the brakes to prevent a collision. The output of the collision detection system can also be provided to the airbag system in the car, so that when a collision is detected the airbag deploys and protects the driver from fatal injuries. The proposed design can also be used for lane-change decisions. When the driver intends to change lanes, the system will detect any vehicle present behind the host vehicle. The distance between the two vehicles is calculated and the driver is warned if there is a possibility of a collision.

APPENDIX

The following MATLAB code is used for collision detection.

1. To play the video:

video = mmread('left.mpg',[1:80],[],false,true);
movie(video.frames)
video = mmread('right.mpg',[1:80],[],false,true);
movie(video.frames)

2. To convert a frame into an image:

[x,map] = frame2im(video.frames(80));
a = [x,map];
imshow(a);

3. To write the image into a JPEG file:

imwrite(a,'r1.jpeg');

4. To read the images from file:

Left_I = imread('l5.JPEG');

5. Function to find the centroid of the image:

function [meanx,meany] = ait_centroid(Left_I)
[x,y,z] = size(Left_I);
if(z ~= 1)
    Left_I = rgb2gray(Left_I);
end
im = Left_I;
[rows,cols] = size(im);
x = ones(rows,1)*[1:cols];
y = [1:rows]'*ones(1,cols);
area = sum(sum(im));
meanx = sum(sum(double(im).*x))/area;
meany = sum(sum(double(im).*y))/area;

6. To find the centroid of an image:

Left_I = imread('l5.JPEG');
[x,y] = ait_centroid(Left_I);
x
y
imshow(Left_I);
pixval on
7. To analyze the results and plot the graph:

clear all;
x1=[293.23,290.9,284.5,262.6];      % x coordinates of the centroid in the left image, Case I
x2=[294.1,293.53,294.2,312];        % x coordinates of the centroid in the right image, Case I
y1=[297.20,297.10,298.14,299.58,299.55,299,294.94];
y2=[306.71,307.16,308.31,309.89,316.46,333.97,345.50];
z1=[260.38,265.44,266.53,276.54,279.74,284.63];
z2=[287.76,284.31,283.09,285.05,288.30,291.01];
c  = (x2-x1);  % disparity for Case I
c1 = (y2-y1);  % disparity for Case II
c2 = (z2-z1);  % disparity for Case III
p  = [0.77,0.77,0.77,0.77];                        % b*f for Case I
p1 = [0.455,0.455,0.455,0.455,0.455,0.455,0.455];  % b*f for Case II
p2 = [0.42,0.42,0.42,0.42,0.42,0.42];              % b*f for Case III
D1 = p./c;     % calculation of distance D for Case I
D2 = p1./c1;   % calculation of distance D for Case II
D3 = p2./c2;   % calculation of distance D for Case III
plot(c,D1,c1,D2,'r',c2,D3,'g')   % plot the graph for all cases
xlabel('Disparity(x2-x1)');
ylabel('Distance(D)');
title('Analysis of Results');
legend('Case I','Case II','Case III');

REFERENCES

[1] Advanced Driver Assistance Systems [Online]. Available: http://www.contionline.com/generator/www/de/en/continental/automotive/themes/passenger_cars/chassis_safety/adas/ov1_adas_en.html. Accessed: 2/2/2011.

[2] Ford Safety [Online]. Available: http://www.ford.com. Accessed: 2/2/2011.

[3] Car Safety [Online]. Available: http://www.tarbuiot.com/vehicle-articles/54-success-of-driver-assistancesystems-for-improving-car-safety-depends-on-acceptance-and-use-of-technologies-bydrivers. Accessed: 2/2/2011.

[4] M. Bai, Y. Zhuang, and W. Wang, "Stereovision based obstacle detection approach for mobile robot navigation," in Proc. 2010 International Conference on Intelligent Control and Information Processing (ICICIP), pp. 328-333, 13-15 Aug. 2010. doi: 10.1109/ICICIP.2010.5565220.

[5] S. Nedevschi, A. Vatavu, F. Oniga, and M. M. Meinecke, "Forward collision detection using a stereo vision system," in Proc. 4th International Conference on Intelligent Computer Communication and Processing (ICCP 2008), pp. 115-122, 28-30 Aug. 2008. doi: 10.1109/ICCP.2008.4648362.

[6] C.-C. Chiu, W.-C. Chen, M.-Y. Ku, and Y.-J. Liu, "Asynchronous stereo vision system for front-vehicle detection," in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2009), pp. 965-968, 19-24 April 2009. doi: 10.1109/ICASSP.2009.4959746.

[7] Edge detection [Online]. Available: http://en.wikipedia.org/wiki/Edge_detection. Accessed: 3/5/2011.

[8] P. Foggia, A. Limongiello, and M. Vento, "A real-time stereo-vision system for moving object and obstacle detection in AVG and AMR applications," in Proc. Seventh International Workshop on Computer Architecture for Machine Perception (CAMP 2005), pp. 58-63, 4-6 July 2005. doi: 10.1109/CAMP.2005.6.

[9] S. Ernst, C. Stiller, J. Goldbeck, and C. Roessig, "Camera calibration for lane and obstacle detection," in Proc. 1999 IEEE/IEEJ/JSAI International Conference on Intelligent Transportation Systems, pp. 356-361, 1999. doi: 10.1109/ITSC.1999.821081.

[10] Maryum, Ahmed F., Development of a Stereo Vision System for Outdoor Mobile Robots, report, University of Florida, 2006 [Online]. Available: http://cimar.mae.ufl.edu/CIMAR/pages/thesis/ahmed_m.pdf.

[11] Danyan, G., Stereo Vision Algorithm Using Propagation of Correspondences Along an Array of Cameras, report, NJIT, 2005 [Online]. Available: http://archives.njit.edu/vol01/etd/2000s/2005/njit-etd2005-049/njit-etd2005-049.pdf.
[12] http://en.wikipedia.org/wiki/Epipolar_geometry [Online]. Accessed: 5/2/2011.

[13] http://www.dis.uniroma1.it/~iocchi/stereo/triang.html [Online]. Accessed: 5/10/2011.

[14] http://disparity.wikidot.com/triangulation-geometrics [Online]. Accessed: 5/20/2011.

[15] W. van der Mark, D. Fontijne, L. Dorst, and F. C. A. Groen, "Vehicle ego-motion estimation with geometric algebra," in Proc. IEEE Intelligent Vehicle Symposium, vol. 1, pp. 58-63, 17-21 June 2002. doi: 10.1109/IVS.2002.1187928. Available: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=1187928&isnumber=26631.