ECE479 Project Report
Bohr Robot
3/18/2011

Arada, JC (aradaj@pdx.edu)
Chhokar, John (jchhokar1980@gmail.com)
Dixon, Rich (radixoniv@yahoo.com)

This paper is part of a project for the Niels Bohr Robot at Portland State University.

Contents

Introduction
Face Detection and the Haar Cascade
    Integral Image
    Haar Rectangles
    Haar Detection Cascade
    Face Detection Algorithm
    Face Detection YouTube
Color Detection
    OpenCV Functions
    Color Detection Examples
    Color Detection on Niels Bohr
    Color Detection YouTube
Obstacle Avoidance
    OpenCV Functions
    Canny Edge Background
    Canny Edge Example
    Obstacle Avoidance on Bohr
Conclusion and Future Work
    SVN
References

Introduction

Image processing has become an emerging field in electronics as advances in semiconductor technology make image-capturing devices such as cameras and webcams more affordable. It is not only an emerging field but also at the core of research done at universities and companies across the world. Image processing in the medical field is at the forefront of this research, as early cancer detection continues to save millions of lives. A great deal of classified military research likewise relies on image processing to identify objects such as tanks, insurgent hideouts, and weapon stockpiles.
Image processing in robotics is a fundamental tool for creating autonomous robots that can detect obstacles or locate objects. Beyond research, a free software library originally created by Intel, called OpenCV, has become one of the primary tools in image processing. OpenCV, based on C, combines mathematical operations with several powerful image-capture tools, and it was created to take advantage of Intel's multi-core processors. The goals of OpenCV at its release were to advance vision research, make it easier for developers to build on vision software, and spread vision applications by making the library portable and free. OpenCV supports a large range of functionality: multidimensional feature toolkits, face recognition, gesture recognition, object identification, motion tracking, boosting, decision-tree learning, neural networks, segmentation, stereopsis, and more. All of these ideas were taught by Dr. Perkowski in ECE478 and ECE479, which made OpenCV even more appealing to use, since it closely models the theory taught in class.

In ECE479, the Bohr group set out to implement vision on the Niels Bohr robot. The goal is to learn OpenCV and implement a few programs that demonstrate an autonomous robot. We use a webcam attached to the Bohr robot as the primary source for capturing images. The three programs we implemented this term are: a face-detection following robot, a color-detection arm-tracking robot, and an obstacle-avoidance robot. All three programs use powerful algorithms that are described in detail later in this paper.

In the face-detection program we show that it is possible to create a robot that follows the face of a doll. Initially the robot captures frames and looks for the appearance of a face. When a face is detected, the robot computes the coordinates of the face within the frame and makes a decision that causes it to move toward the doll. The robot continues to make decisions until the user stops the program, and it has two degrees of freedom. In the color-detection demonstration, we show that the robot arm will move toward a red object; the webcam is mounted on the robot arm, and the arm tries to keep the red object in the center of the frame. In the last program we introduce the obstacle-avoidance robot, which uses the OpenCV implementations of Canny edge detection and the Hough transform. The robot makes decisions based on the slopes of lines detected in frames captured relatively close to the robot. In addition to the program descriptions, links to video demonstrations are included in this paper.

Face Detection and the Haar Cascade

OpenCV makes face detection remarkably easy, but to implement the face-detection function well, one should first learn the theory behind the algorithms. OpenCV performs face detection with a Haar cascade, which traces back to the Haar transform first published in 1909 by Alfred Haar. The Haar transform correlates a function against the Haar wavelet under several shifts and stretches. Haar's idea showed that several rows of a matrix can be sampled and the data computed to act as samples of a finer resolution. Building on this idea, Paul Viola and Michael Jones published a paper in 2001 at the Conference on Computer Vision and Pattern Recognition. Viola and Jones showed that it is possible to pre-process an image using a technique called the integral image, place Haar rectangles over the image to identify Haar-like features, analyze the information using filtering techniques, and then identify the face in an image. Their 2001 paper used a 700 MHz Intel Pentium III processor for data processing and reported that the method improved face detection by 15% compared to other methods.
To understand face detection better, we will discuss the theory behind each subsection.

Integral Image

To improve the speed of face detection, Viola and Jones proposed an intermediate representation of the image called the integral image. The integral image at any location (x, y) is the sum of the pixels above and to the left of (x, y); figure 1 shows this computation.

    ii(x, y) = sum of i(x', y') over all x' <= x, y' <= y

Figure 1. Integral image calculation

To get a better idea of how this computation works, consider one pixel in a sub-region. The integral image value at a point (x, y) in the sub-region shown in figure 2 is the sum of all pixels above and to its left. Therefore, if we place a rectangle inside an image, we can compute the sum of the pixels within it using only four array references. Labeling the regions A, B, C, D and the corner points 1-4 as in figure 2, the integral image values at the corners are:

    1 = A
    2 = A + B
    3 = A + C
    4 = A + B + C + D

so the sum of the pixels within region D is

    ii(4) - ii(2) - ii(3) + ii(1)

Because this takes only a few operations per rectangle, Haar-like features can be computed at any feature size or location.

Figure 2. The integral image

Haar Rectangles

After an image has been pre-processed into an integral image, Haar rectangles are used to find Haar-like features. The Haar rectangles shown in figure 3 are passed over the image to compute pixel sums in each region. When a Haar rectangle is placed onto an image, the sum of the pixels lying in the black regions is subtracted from the sum of the pixels in the white regions:

    f1 = sum(p, white) - sum(p, black)

    h(x) = +1 if f1 > threshold
           -1 otherwise

Figure 3. Haar rectangles

Using Haar rectangles over a pre-computed integral image is attractive because it is fast. Applying a Haar rectangle over a face quickly captures the difference between the eye region and the cheek region.
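The four-reference rectangle sum described above is the heart of the speed-up. A minimal C++ sketch of the idea (our own illustration, not OpenCV's implementation; function names are ours):

```cpp
#include <vector>

// Integral image with an extra row and column of zeros, so that
// ii[y][x] holds the sum of all pixels strictly above and to the
// left of (x, y). The padding makes the rectangle sum branch-free.
std::vector<std::vector<int>> integralImage(const std::vector<std::vector<int>>& img) {
    int h = (int)img.size(), w = (int)img[0].size();
    std::vector<std::vector<int>> ii(h + 1, std::vector<int>(w + 1, 0));
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x)
            ii[y + 1][x + 1] = img[y][x] + ii[y][x + 1] + ii[y + 1][x] - ii[y][x];
    return ii;
}

// Sum of pixels in the rectangle [x0, x1) x [y0, y1) using only four
// array references, exactly the trick Viola and Jones describe.
int rectSum(const std::vector<std::vector<int>>& ii, int x0, int y0, int x1, int y1) {
    return ii[y1][x1] - ii[y0][x1] - ii[y1][x0] + ii[y0][x0];
}
```

A two-rectangle Haar feature is then simply rectSum over the white region minus rectSum over the black region, regardless of the feature's size.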
The eye region is often darker than the cheek region. Similarly, Haar rectangles can quickly identify a face by comparing the intensity across the bridge of the nose with the eye regions on either side. The filtering stage, discussed next, places several Haar rectangles over a candidate region to decide whether a face is present. Haar rectangles are not limited to two feature rectangles; figure 3 also includes a three-rectangle feature.

Haar Detection Cascade

The idea behind the Haar detection cascade is to eliminate negative examples with very little processing. A series of classifiers is applied to every sub-region of the image. If a sub-region fails a classifier, it is rejected immediately and no further computation is spent on it. If a sub-region passes the first stage, which requires little computation, it is passed to the next stage, where slightly more computation is needed, and so on. In order for a face to be detected, a sub-region must pass all of the classifiers. One can train a classifier to improve its accuracy, but the face-detection classifier shipped with OpenCV works fine for our purpose.

Figure 4. Haar classifier cascade

Face Detection Algorithm

To implement face detection on the Niels Bohr robot, we wrote an algorithm that lets the robot move forward, turn right, turn left, or stop based on decision criteria. To improve processing speed, we begin by reducing the captured frame from 640 x 480 to 320 x 240. The smaller window cuts processing time considerably because the robot analyzes far fewer pixels. We also reduced the minimum size of the rectangle drawn on a detected face from 40 x 50 to 20 x 25.
This lets our face-recognition system follow only the face of a doll or another small face; a person's face is sometimes recognized, but it is mostly discarded by our program. Below is the pseudo code we implemented in the Visual Studio C++ programming environment. The algorithm lets the robot respond to the doll's face for 500 ms before the motors are turned off. One improvement would be to smooth the controls: with the current algorithm the robot ramps power up for 500 ms and then cuts it abruptly. Ramping the motors down more gradually would help; the ramp-up itself is necessary to give the robot enough momentum to move.

Note: we use a webcam attached to the Bohr robot to capture images. Robot commands are sent over a Bluetooth protocol we set up, which gives us control of all motors and arm movements.

    While (true) {
        Capture image
        Send image to DetectFaces(x-pos, y-pos)
        If (x-pos > 0 && x-pos < left-max)
            Move left with 20% power
            Wait 500 ms
            Stop motors (0% power)
        Else if (x-pos > left-max && x-pos < right-max)
            Move straight with 20% power
            Wait 500 ms
            Stop motors (0% power)
        Else if (x-pos > right-max && x-pos < video-width)
            Move right with 20% power
            Wait 500 ms
            Stop motors (0% power)
    }

    // Detect-face function, which includes the Haar cascade classifier XML
    DetectFaces(x-pos, y-pos) {
        Import XML Haar cascade file
        Detect face using cvHaarDetectObjects
        Draw rectangle around face in image
        Calculate (x, y) coordinate of rectangle
        Return (x, y)
    }

Face Detection YouTube

A demonstration of this algorithm has been uploaded to YouTube. It shows the Bohr robot tracking the face of a child's doll.

http://www.youtube.com/watch?v=GvhvUv7BZxI

Color Detection

The second idea we implemented on the Bohr robot was color detection. In the previous example program we showed that we could have the robot follow the face of a doll.
In this program we wanted to use the functionality of the robot arm and have it track the color of a ball. Color tracking in OpenCV is easy thanks to several built-in functions. We begin with a few powerful OpenCV functions that allowed us to implement color detection successfully.

OpenCV Functions

CvScalar

CvScalar represents a four-component vector, with each component stored as a double. We use CvScalar in our program to set the threshold values that filter out all colors except red: one scalar, hsv_min, holds the lower threshold and another, hsv_max, holds the upper threshold. The components can be any values; in our case we set them up in HSV (Hue, Saturation, Value), where H represents the color as an angle from 0 to 360 degrees, S indicates the richness of the color, and V indicates its brightness. This format is attractive because it makes color detection easier to work with. We chose red because red sits at the wrap-around point of the hue cylinder, which makes it an easy color to filter for.

Figure 5. Hue saturation

cvCvtColor(frame, hsv_frame, CV_BGR2HSV);

This OpenCV function converts a frame from one color space to another in a single line of code. In our program we convert the frame from RGB to HSV. The original frame is passed in as frame; the next argument, hsv_frame, is where the transformed image is stored after CV_BGR2HSV is applied. The same function can convert RGB to grayscale by changing the last argument from CV_BGR2HSV to CV_BGR2GRAY. The general form is:

    cvCvtColor(color image, HSV image, transform applied)

The standard equations for converting RGB to HSV (with R, G, B scaled to [0, 1]) are:

    V = max(R, G, B)
    S = (V - min(R, G, B)) / V                      (S = 0 when V = 0)
    H = 60 * (G - B) / (V - min(R, G, B))           if V = R
        60 * (2 + (B - R) / (V - min(R, G, B)))     if V = G
        60 * (4 + (R - G) / (V - min(R, G, B)))     if V = B

cvInRangeS(hsv_frame, hsv_min, hsv_max, thresholded);

The cvInRangeS function checks whether array elements lie between two scalars.
Earlier we described how to set up the scalar vectors passed into this function. The function takes hsv_frame as the input image and compares each pixel to the inclusive lower boundary and exclusive upper boundary (hsv_min, hsv_max). If a pixel's value is greater than or equal to hsv_min and less than hsv_max, the output for that pixel is set to 0xFF; otherwise the pixel is turned off with a zero. If the scalar vectors are set up correctly, only the red pixels of the image remain. The thresholded argument is the location where the result is stored after cvInRangeS is applied; the thresholded image is then passed to the next function, cvSmooth.

cvSmooth(thresholded, thresholded, CV_GAUSSIAN, 9, 9);

cvSmooth is a common transform applied to an image right before drawing a box or circle around an object in the frame. Its parameters are: cvSmooth(source image, destination image, type of smoothing, aperture width (must be odd), aperture height (must be odd)). We are not limited to Gaussian smoothing; other techniques are available, such as simple blur, median blur, and bilateral filtering. Applying a Gaussian smooth is attractive because it suppresses pixel-level noise before circle detection. In our example the kernel is 9 x 9.

cvHoughCircles(thresholded, storage, CV_HOUGH_GRADIENT, 2, thresholded->height/4, 100, 50, 10, 400);

cvHoughCircles finds circles in a single-channel grayscale image; we pass it our thresholded image. The second parameter, storage, is a place in memory where the coordinates of detected circles are stored. The third parameter, the method, must always be set to CV_HOUGH_GRADIENT.
This is specified in the OpenCV book by Bradski and Kaehler. After the method comes dp, which is used to set the resolution of the accumulator. The next parameter is the minimum distance allowed between the centers of detected circles; one does not want this parameter to be too small, or the number of false detections grows. The next two numbers are the higher threshold and the accumulator threshold for circle centers; the OpenCV book sets these at 100 and 300 respectively, and they can be modified for optimal performance. The last two parameters set the minimum and maximum radius of circles to search for; by editing them one can filter out smaller circles and search only for larger ones, or vice versa. The general form of the function is:

    cvHoughCircles(image, circle_storage, method, dp, min_dist, param1, param2, min_radius, max_radius);

After circles are drawn around an object, our program can track the object as it moves across the screen and feed its coordinates back to Bohr.

Color Detection Examples

For this explanation we begin by showing the results of running red roses through our program. The program is optimized to detect circular objects, but we were interested in how red objects in general would affect it. First we show the image after all pixels except the red ones have been filtered out: figure 6 shows the result of using the min and max thresholds to remove non-red pixels. Figure 7 then shows that we can capture a red object at a distance even though it is not a ball. Figure 8 shows that too much red, too close, is not good: it creates a considerable amount of noise in the image. To reduce the noise, we average the Hough circles when more than one is present in the image. For the demonstration, however, our group cannot wear anything red, or it will affect the program.
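The circle-averaging idea mentioned above can be sketched as follows; the Circle struct and function name are our own, and the real program reads its centers out of the cvHoughCircles storage:

```cpp
#include <vector>

// Minimal stand-in for a detected circle (the real code reads
// (x, y, r) float triples out of the cvHoughCircles result storage).
struct Circle { double x, y, r; };

// Average all detected centers so that a single noisy detection
// cannot yank the tracked coordinate across the frame.
Circle averageCircles(const std::vector<Circle>& found) {
    Circle avg = {0.0, 0.0, 0.0};
    if (found.empty()) return avg;
    for (const Circle& c : found) {
        avg.x += c.x;
        avg.y += c.y;
        avg.r += c.r;
    }
    avg.x /= found.size();
    avg.y /= found.size();
    avg.r /= found.size();
    return avg;
}
```

The averaged center, rather than any individual detection, is what gets fed back to the arm-control logic.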
Figure 9 shows the ball we use for the arm-tracking robot. The coordinates of the ball are fed back to the robot to allow arm tracking.

Figure 6. Filtering for red
Figure 7. Detecting red roses
Figure 8. Too much red, too close
Figure 9. Red ball detected

Color Detection on Niels Bohr

Using the functions discussed in the previous section, we successfully added color detection to the Niels Bohr robot. For this demonstration we mounted the webcam on the middle of the robot arm and had the arm track the movement of a red baby toy. We also show that the robot arm does not respond to a green tennis ball. The pseudo code below summarizes our algorithm. Initially we had problems with the robot sending out signals too fast: our camera captures images at 60 fps, and the robot arm was responding to the ball location in every frame. One improvement was a time-averaging array that records the location of the ball over time, so that signals sent to the arm are based on an average of recent ball locations. This helped, but we later implemented a better solution: using the clock function in C++ we can control how often signals are sent out. We combined the time-averaging array and the clock function to implement the arm-tracking robot shown in the demonstration.
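The combination of the time-averaging array and the C++ clock throttle can be sketched roughly as below; the names and the millisecond bookkeeping are illustrative, and note that std::clock measures processor time, so a wall-clock timer would also be a reasonable choice:

```cpp
#include <ctime>
#include <vector>

// Rolling average of the recent x-coordinates of the tracked ball.
double averageX(const std::vector<int>& xs) {
    if (xs.empty()) return 0.0;
    double sum = 0.0;
    for (int x : xs) sum += x;
    return sum / xs.size();
}

// Only let a new arm command through once standby_ms milliseconds have
// elapsed since the previous one, so that a 60 fps camera does not
// flood the Bluetooth link with movement commands.
bool readyToSend(std::clock_t now, std::clock_t last, long standby_ms) {
    long elapsed_ms = (long)((now - last) * 1000L / CLOCKS_PER_SEC);
    return elapsed_ms >= standby_ms;
}
```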
    While (true) {
        Capture image
        Convert image from RGB to HSV
        Filter out all non-red pixels in the HSV image
        Gaussian-smooth the image
        Draw circles around red objects
        Record location of red objects in array
        If (clock >= standby time) {
            If (average of array > 0 && average < x_lowerbound)
                Move arm right
            If (average of array > x_upperbound && average < x_max)
                Move arm left
            Update standby time
        }
    }

With this algorithm we can control the decision making of the Bohr robot by increasing or decreasing the standby time, along with the size of the array tracking the x-coordinates of the moving red object over time.

Color Detection YouTube

A demonstration of this algorithm has been uploaded to YouTube. It shows the Bohr robot tracking a baby's red ball. When the robot is presented with a green tennis ball, it does not react to any movement.

http://www.youtube.com/watch?v=hhvxku6EsVM
http://www.youtube.com/watch?v=v3ZaO9KX4Vc

Current work on the robot includes movement of the wrist and hand.

Obstacle Avoidance

The last program implemented on the Bohr robot is obstacle avoidance. We decided to use a couple of OpenCV functions that allow us to easily capture edges. After edges are identified in the frame, the slopes of the detected lines are calculated and sent to the robot, which makes a decision based on the value of the slope. Before we look at the demonstration, we introduce the OpenCV functions used in the program.

OpenCV Functions

cvCvtColor(frame, grey, CV_BGR2GRAY);

The cvCvtColor function allows easy transformation between color spaces. The three arguments represent (source, destination, code): source is the frame captured by the webcam, grey holds the transformed image, and CV_BGR2GRAY converts from color to grayscale. In one line we can convert any color image to grayscale.
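For intuition, the grey value that the CV_BGR2GRAY conversion produces for each pixel is a weighted sum of the three channels. A sketch using the standard luma weights (the function name is ours; OpenCV stores pixels in B, G, R order):

```cpp
// Per-pixel grey value as the standard luma-weighted sum of the
// three color channels: grey = 0.299 R + 0.587 G + 0.114 B.
unsigned char bgrToGrey(unsigned char b, unsigned char g, unsigned char r) {
    double grey = 0.114 * b + 0.587 * g + 0.299 * r;
    return (unsigned char)(grey + 0.5); // round to nearest
}
```

Green is weighted most heavily because the eye is most sensitive to it; a pure-blue pixel therefore comes out much darker in grayscale than a pure-green one.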
This is important to note when working with Canny edge detection: if you pass a color image to the Canny function you will get memory errors, because Canny operates only on grayscale images. After the conversion is complete we pass the grey frame to the next function.

cvSmooth(grey, grey, CV_GAUSSIAN, 9, 9);

cvSmooth was not in our program initially, but we added it to help eliminate reflections off the carpet. This popular function performs a smoothing operation on the image; you are not limited to CV_GAUSSIAN and could alternatively apply a simple blur, a median blur, or even a bilateral filter. There are five arguments, similar to the previous function: the first two are the source and destination, the third is the transformation performed on the image, and the final two are the aperture width and aperture height.

cvCanny(grey, edges, 50, 150, 3);

cvCanny is a really fun function to play with, and the speed of the transformation is very impressive; to use it well you have to fine-tune the thresholds. As in the two previous functions, the first two arguments are the input and output images. The next two arguments hold the threshold values, and the last number is the aperture parameter. Experimenting with the thresholds is important for detecting edges while removing unwanted features from a frame. The cvCanny function looks for segments of edges where the gradients are large.

Canny Edge Background

Here is a little background and theory on Canny edge detection. John F. Canny, an Australian computer scientist who now teaches at the University of California, Berkeley, developed and refined this form of edge detection.
To understand the algorithm in depth, one should read his paper on edge detection, published in 1983 and entitled "A Variational Approach to Edge Detection". The following is a short synopsis of the steps detailed in the paper.

The Canny function applies multiple mathematical operations to produce the desired output image. The first operation, performed on our 8-bit grayscale input image, is a calculation of the image gradient. This operation is built into cvCanny, but if we wanted to apply it ourselves we could do so with a 3 x 3 Sobel convolution mask (figure 10); its output is a directional derivative in the x or y direction. The next operation computes the gradient vector at each pixel; its magnitude and angle can be calculated from the equations shown in figure 11,

    G = sqrt(Gx^2 + Gy^2),   theta = atan2(Gy, Gx)

Once the gradient has been calculated for each pixel, we search the image for local maxima of the gradient magnitude; the maxima found are our candidate edge pixels. This process is known as non-maximum suppression. Finally, to form full edges from edge pixels, we apply thresholding with hysteresis. The idea behind this concept is to define the levels of intensity at which edge pixels will be displayed in the output image. Two threshold values are selected, low and high. All edge pixels below the low value are rejected, and pixels above the high value are accepted. Pixels between the low and high values are rejected unless the edge pixel is connected to a pixel above the high value; this allows the continuation of an edge or a line.

Figure 10. Sobel masks
Figure 11. Gradient magnitude and angle

Canny Edge Example

A Canny edge transform can be performed on an image with only two lines of code.
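The thresholding-with-hysteresis step can be sketched in plain C++ on a one-dimensional row of gradient magnitudes; this is our own illustration (the real algorithm works in two dimensions over all eight neighbours), and the 50/150 values in the test echo the thresholds we pass to cvCanny:

```cpp
#include <vector>

// One-dimensional thresholding with hysteresis: magnitudes at or
// above `high` become edge pixels immediately; magnitudes between
// `low` and `high` survive only if they connect to a strong pixel.
std::vector<bool> hysteresis1D(const std::vector<int>& mag, int low, int high) {
    int n = (int)mag.size();
    std::vector<bool> edge(n, false);
    for (int i = 0; i < n; ++i)
        if (mag[i] >= high) edge[i] = true;        // accept strong pixels
    bool changed = true;
    while (changed) {                              // grow weak pixels touching an edge
        changed = false;
        for (int i = 0; i < n; ++i) {
            if (edge[i] || mag[i] < low) continue; // already an edge, or rejected
            if ((i > 0 && edge[i - 1]) || (i + 1 < n && edge[i + 1])) {
                edge[i] = true;
                changed = true;
            }
        }
    }
    return edge;
}
```

The growing step is what lets a faint continuation of a strong edge survive while an isolated weak response is discarded.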
The pictures below show the Canny conversion of a hallway using the functions cvCvtColor and cvCanny. The next goal is to draw lines on the edges.

Figure 12. Canny edge
Figure 13. Color frame

cvHoughLines2(img1, storage, CV_HOUGH_PROBABILISTIC, 1, CV_PI/180, 50, 50, 10);

The cvHoughLines2 function finds lines in a binary image and enables us to draw lines over the straight edges in it. After performing the Canny edge transform, we pass the result to the Hough lines function. Walking through the arguments: the first, img1, is the frame passed to the function, and it is important that this be a single-channel binary image. The second parameter is the location where detected lines are stored. The third, the method, gives the option of three different variants of the Hough transform. The probabilistic method returns line segments instead of full lines; because we were only interested in the slope of a line segment, the probabilistic Hough transform was the ideal variant to apply. The next five parameters define rho, theta, threshold, param1, and param2. The frame in figure 14 shows the result of applying the Hough lines transform.

Figure 14. Hough lines

Obstacle Avoidance on Bohr

Obstacle avoidance on Bohr turned out to be more challenging than our group expected. The basic algorithm of our program is shown below. We detect an edge by calculating the slopes of the lines in an image. A full-size frame contains many lines, which caused the robot to make decisions well before it reached the wall.
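The slope test at the heart of our approach can be sketched as follows; the endpoints would come from the cvHoughLines2 results, and the 0.2 cutoff for "nearly horizontal" is illustrative rather than the tuned value:

```cpp
#include <cmath>

// Slope of a Hough line segment from its two endpoints; a vertical
// segment is reported as a huge slope instead of dividing by zero.
double segmentSlope(int x1, int y1, int x2, int y2) {
    if (x2 == x1) return 1e9;
    return (double)(y2 - y1) / (double)(x2 - x1);
}

// A nearly horizontal segment in the lower part of the frame is
// treated as the base of a wall or obstacle directly ahead.
bool looksLikeObstacle(double slope, double maxAbsSlope = 0.2) {
    return std::fabs(slope) < maxAbsSlope;
}
```

Steeply sloped segments (door frames, corners receding toward the vanishing point) are ignored, while flat segments close to the robot trigger the avoidance maneuver.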
    Begin with robot moving forward
    While (frame) {
        Capture frame (640 x 480)
        Convert frame to grey image
        Perform Gaussian smooth
        Perform Canny edge
        Perform Hough lines
        Calculate slopes of Hough lines
        Analyze small ROI
        If edge detected {
            Reverse robot for a set amount of time
            Randomly select right or left turn for a set amount of time
            Stop turning and resume moving straight
        }
    }

To improve our algorithm we had to fine-tune the region of interest so that the robot made decisions based on a small window. The difficulty was tuning both the camera's location on the robot and the ROI window. Figure 15 below shows three different regions of interest; we make decisions only on that subset of the captured frame. This was a big improvement on our robot.

Conclusion and Future Work

This term the Bohr robot group implemented three vision concepts: face detection, color detection, and obstacle avoidance. In addition to learning OpenCV, we learned how to apply these vision concepts to robotics. In face detection, we studied the mathematical calculations and models used in OpenCV's Haar cascade classifier. In the color-detection program, we learned how to convert between color spaces, filter colors, and perform smoothing operations to reliably identify circular objects. Finally, in the obstacle-avoidance program, we learned how to use Canny edge detection and Hough lines in unison to identify edges in an image. At the beginning of the term, OpenCV was very new to our group; after working with it for only three months, we have a solid understanding of the library and of how to apply its transforms to the vision and decision making that are fundamental to any robot. This class has also shown us that the field of vision offers many research opportunities. We hope to continue applying the principles learned in ECE479 toward a Master's Thesis topic in graduate school.
SVN

All the work for ECE478/479 from the Bohr robot group has been uploaded to our Google Code SVN repository.

http://ece478robot.googlecode.com/svn/trunk/

References

Bradski, G., & Kaehler, A. (2008). Learning OpenCV. Sebastopol: O'Reilly Media.

Canny, J. (1983). A Variational Approach to Edge Detection. MIT Artificial Intelligence Laboratory, MA.

Intel Open Source Computer Vision Library (1999-2001). Retrieved January 13, 2011, from Intel Developer Center: http://hawaiilibrary.net/eBooks/Give-Away/Technical_eBooks/OpenCVReferenceManual.pdf

Kuntz, N. (2008). Chapters 1 and 2. Retrieved January 5, 2011, from OpenCV Tutorial: http://dasl.mem.drexel.edu/~noahKuntz/openCVTut1.html

OpenCV Wiki. Retrieved February 11, 2011, from http://opencv.willowgarage.com/wiki/

Viola, P., & Jones, M. (2001). Rapid Object Detection using a Boosted Cascade of Simple Features. Conference on Computer Vision and Pattern Recognition 2001.