FINAL REPORT
By Group 2
Michael Dougher
Giovanni Sosa

TABLE OF CONTENTS
I. Abstract
II. Goals
III. Algorithms
IV. Mechanical Design
V. Results
VI. Error / Threshold Analysis
VII. Limitations / Areas for Improvement
VIII. OpenCV Initialization Steps
IX. OpenCV Functions Used
   a. Report 1
   b. Report 2
   c. Final Report
X. Code
   a. Main C++ Code
   b. NQC Code
   c. Removed FindGroups() Function
XI. Individual Contributions
XII. Robotics in Society and Economics Paper

I. Abstract
Our robotic vision system uses uncalibrated stereo images from a single camera to detect objects in the immediate path of the robot. If an object is present, the robot avoids it by turning to either the left or the right. If no object is present, or if the object is far away, the robot proceeds forward. The robot maneuvers about its environment until it finds a specific object that tells it to stop. When it is done, the vision system displays an OpenGL animation showing the path of the robot and any objects detected. The system saves every pair of stereo images it takes and can run autonomously for an unlimited period of time; the only limiting factors are the battery life of the robot and the hard drive space available for the images on the computer.

II. Goals
This is a list of the steps taken for each interim. It shows how our system was developed and what changes were made each time.
Our goals for the first interim report were to:
- Build a lego robot and mount the new fusion webcam onto it using only lego bricks.
- Install OpenCV and get it to work in Microsoft Visual C++ 2005 (see section VIII, OpenCV Initialization).
- Create a basic method for feature tracking between a set of two images.
Our goals for the second interim report were to:
- Rewrite the camera initialization and capture code.
- Update the robot design to fit the new program methods.
- Enhance the corner detection and point correspondence methods.
- Design and create an algorithm to detect the depth disparity between a stereo pair of images.
- Find objects and their locations in the images by way of corner detection, edge detection, and the depth disparity map.
For our final report, our goals are to:
- Again update the robot design to fit the new program methods.
- Further enhance the corner detection, point correspondence, and object detection methods.
- Improve camera movement, both mechanically and in NQC.
- Allow the program to loop continuously so the robot has more autonomy.
- Create an animated map in OpenGL that shows the movements of the robot as well as the detected objects in their positions relative to the robot.

III. Algorithms
The main algorithm for this interim is the feature tracking algorithm, which includes a few smaller sub-algorithms. Our algorithm is as follows:
1. Capture the first image.
2. Find features to track in image 1 by locating corners.
3. Capture the second image.
4. Find features to track in image 2 by locating corners.
5. Find corresponding points between the two sets of corners: for each corner in image 1, take a template of the corner and its surrounding pixel values and compute the normalized sum of squared differences (SSD) against each corner and its surrounding pixel values in image 2. If the SSD is less than a threshold we selected, the pair is accepted as a corresponding point. The threshold value was selected by running multiple experiments to see what would give the most accurate corresponding points. (A code sketch of this matching step appears at the end of this section.) The equation for the normalized SSD is:

R(x,y) = \frac{\sum_{x',y'} \left[ T(x',y') - I(x+x',\,y+y') \right]^2}{\sqrt{\sum_{x',y'} T(x',y')^2 \cdot \sum_{x',y'} I(x+x',\,y+y')^2}}

6. Find the fundamental matrix using the 8-point algorithm. Each pair of corresponding points, (x_i, y_i) in the first image and (x'_i, y'_i) in the second, contributes one row of the matrix A; the SVD of A is taken and F is formed from the ninth column of V:

A = \begin{bmatrix}
x'_1 x_1 & y'_1 x_1 & x_1 & x'_1 y_1 & y'_1 y_1 & y_1 & x'_1 & y'_1 & 1 \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\
x'_8 x_8 & y'_8 x_8 & x_8 & x'_8 y_8 & y'_8 y_8 & y_8 & x'_8 & y'_8 & 1
\end{bmatrix}, \qquad
U S V^T = \operatorname{SVD}(A), \qquad
F = \begin{bmatrix} V_{1,9} & V_{4,9} & V_{7,9} \\ V_{2,9} & V_{5,9} & V_{8,9} \\ V_{3,9} & V_{6,9} & V_{9,9} \end{bmatrix}

7. Rectify both images.
8. Find the stereo disparity using the Birchfield and Tomasi disparity method. This method is described in the publication: S. Birchfield and C. Tomasi, "Depth Discontinuities by Pixel-to-Pixel Stereo", IJCV, Vol. 35, pp. 269-293, 1999.
9. Compare the disparity map with the corners and edges to find objects and their locations.
10. Move the robot according to the object's presence and location.
11. End the program when a certain object is recognized. The matching is done using normalized correlation. The formula for this is:

R(x,y) = \frac{\sum_{x',y'} T(x',y')\, I(x+x',\,y+y')}{\sqrt{\sum_{x',y'} T(x',y')^2 \cdot \sum_{x',y'} I(x+x',\,y+y')^2}}

12. Animate a map of the robot's motion and the objects detected in OpenGL.

The overall flow of the program is as follows:
Main -> Capture Image #1 -> findCorners -> Capture Image #2 -> findCorners -> findCorrPoints -> getDepthMap -> ObjectDetect -> GetObjectVert -> GetObjectHoriz -> Move robot -> Reset
This method decides how the robot should react to its environment. It is repeated until the stopping conditions have been met.
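As referenced in step 5 above, the matching step can be written against the same OpenCV C API used throughout this project. The following is only a minimal sketch, not our actual findCorrPoints function: the helper names (normSSD, matchCorners), the window half-width WIN, and the use of cvGet2D for pixel access are illustrative choices, and the threshold shown simply mirrors the CP_Threshold value from our code.

// Minimal sketch of the corner-matching step (not the exact findCorrPoints code).
#include <cv.h>
#include <cmath>

const int WIN = 7;                    // half-width of the template window (illustrative)
const double SSD_THRESHOLD = 0.085;   // accept a match only below this normalized SSD

// Normalized SSD between a window around (x1,y1) in img1 and (x2,y2) in img2.
// Both images are assumed to be single-channel (grayscale) IplImage*.
double normSSD(IplImage* img1, int x1, int y1, IplImage* img2, int x2, int y2)
{
    double num = 0, sumT = 0, sumI = 0;
    for (int dy = -WIN; dy <= WIN; dy++) {
        for (int dx = -WIN; dx <= WIN; dx++) {
            double t = cvGet2D(img1, y1 + dy, x1 + dx).val[0];  // template pixel T(x',y')
            double i = cvGet2D(img2, y2 + dy, x2 + dx).val[0];  // candidate pixel I(x+x',y+y')
            num  += (t - i) * (t - i);
            sumT += t * t;
            sumI += i * i;
        }
    }
    return num / sqrt(sumT * sumI);   // matches the normalized SSD equation above
}

// For each corner in image 1, keep the best corner in image 2 whose normalized SSD
// is below the threshold. Corners too close to the image border are skipped.
int matchCorners(IplImage* gray1, CvPoint2D32f* c1, int n1,
                 IplImage* gray2, CvPoint2D32f* c2, int n2,
                 CvPoint2D32f* out1, CvPoint2D32f* out2)
{
    int matches = 0;
    for (int i = 0; i < n1; i++) {
        int x1 = cvRound(c1[i].x), y1 = cvRound(c1[i].y);
        if (x1 < WIN || y1 < WIN || x1 >= gray1->width - WIN || y1 >= gray1->height - WIN)
            continue;
        double best = SSD_THRESHOLD;
        int bestj = -1;
        for (int j = 0; j < n2; j++) {
            int x2 = cvRound(c2[j].x), y2 = cvRound(c2[j].y);
            if (x2 < WIN || y2 < WIN || x2 >= gray2->width - WIN || y2 >= gray2->height - WIN)
                continue;
            double ssd = normSSD(gray1, x1, y1, gray2, x2, y2);
            if (ssd < best) { best = ssd; bestj = j; }
        }
        if (bestj >= 0) { out1[matches] = c1[i]; out2[matches] = c2[bestj]; matches++; }
    }
    return matches;   // number of accepted correspondences
}

In the real program the two corner lists come from cvGoodFeaturesToTrack run on each frame of the stereo pair.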
IV. Mechanical Design
Initially, we wanted our robot to be very small so that it would be more maneuverable. We gave it a very compact wheel base and attached the lego camera to the top of the RCX simply by surrounding the base of the camera with lego bricks to hold it in place. [Figure 1] shows the overall size of our original robot design, while [Figure 2] shows how the camera was mounted.
Figure 1: Frontal View of Old Robot. Figure 2: Camera Mounted onto RCX on Old Robot.
Eventually, our design required us to use a stereo camera setup. To do this with only one camera, the camera had to be able to move horizontally. Therefore, we constructed a horizontal track on top of the RCX, mounted the camera to a motor, and connected the camera/motor combination to the track via a guidebar [Figures 3, 6]. The camera was attached to the motor simply by squeezing the motor into the middle of the curved base of the camera. It is further secured by additional lego bricks both underneath and above the base of the camera. We found this method of mounting the camera to be very secure. Once this was constructed, our very small wheelbase became very unstable under the increased weight. Therefore, we opted for a wider wheelbase that made use of the additional motors we had [Figures 4, 5]. We connected two motors to each side of the robot. Both of the motors on each side are connected to the same output of the RCX brick, so each output of the RCX brick controls the two motors on one side of the robot [Figures 4, 5]. After the wheelbase was finished, we ran a few test runs of our code. We found that the weight of the camera and the camera mount near the front of the robot made the robot very front heavy. When the robot would move forward and then come to a quick stop, we could see it come close to tipping over forwards.
This gave us incentive to build an extension of the robot forward to give it extra balance and prevent it from tipping [Figure 4,5]. In the end, we connected the light sensor to the B output of the RCX (the output that the camera motor is connected to). We attached the light to the front of the robot. This addition was purely cosmetic. Its only contribution to our robot is to light up when the camera moves [Figure 5]. 6 Figure 3: The Camera Mount on the New Robot. Figure 5: Frontal View of New Robot. Figure 4: The Wider Wheel Base of the New Robot. Figure 6: Top View of New Robot. When we finished the new robot, we ran the NQC programs to move the robot. We quickly found that some of the motors behaved differently than others. Upon closer inspection, we were able to notice a slight difference in color between some of the motors. We noticed that the brighter colored ones were the new motors given this semester and the more faded colored motors were the older motors from previous years. We found that the newer motors were much stiffer than the older motors. The older motors spin fairly easily when forced; however, the newer motors will only spin a few rotations before coming to a stop. This led to us noting that the newer motors offered a great deal more resistance to spinning than the older motors. To balance out the driving of the robot, we put one new motor and one old motor on each side of the wheel base. Even after this balancing, the robot would still swerve to one side when told to drive straight. This was easily fixed by lowering the power output of the appropriate RCX output in the NQC code. For the final report, we needed to continue to work in updating our robot’s design. The first thing we did was remove the light sensor from the front of the robot. Its presence was purely cosmetic. It had no real function and ended up just getting in the way of other important robot movements so it had to go. Next, after repeated running of the camera movement programs, we found that occasionally, the camera would not move all the way across the track. It was at this point that we discovered the importance of the camera to rest on a smooth base. While the gear of the motor must follow the track, the rest 7 of the motor sits on the sides of some legos that are smooth. It is VERY important that the motor move along a smooth base. If the motor hits any bit of resistance as it moves, it will be slowed down and then the camera will not have moved as far which will degrade the results. The smooth base can be seen somewhat in Figure 3 as the black, gray and yellow lego pieces. As we would move the camera, occasionally the weight would shift and cause the motor’s gear to contact less of the track. To counter this, we needed to extend the smooth base of the motor farther back above the RCX brick. However this caused an increase in the overall weight of the camera mount and so an additional support was needed at the back of the robot to keep the base level. Figure 7 shows the updated camera base design. Figure 7: Updated Camera Base Design. Figure 8: Final Version of Robot with Reinforced Camera Mount, Cable Holder and IR Tower Holder. We have also found that the lateral motion of the camera is very dependent on the webcam’s cable. If the cable is too tense, it will not be able to move far enough to get a good picture. Therefore, it is important that the computer connected to the robot to be moved along with the robot. 
To help fix this, we built an attachment on top of the motor to hold the cable in place when the camera moves [Figure 8]. After our batteries died once again, we found that our camera was moving with such great power that it would actually break the lego connections holding the camera to the motor. In order to fix this, we had to reinforce the legos around the motor [Figure 8]. In order to reduce the need to worry about the IR tower’s location relative to the robot, we decided that we needed to attach the IR tower to the robot itself. This way, the IR tower stays right behind the robot at all times and maintains a line of sight with the IR receiver inside the RCX brick. This brings us to our final version of our lego robot [Figure 8]. We are quite happy with the way it works mechanically. 8 V. Results Here are the results from our previous method of feature detection and corresponding point matching. The initial method of finding features was reasonable, yet yielded few results [Figure 9,10]. Our method of matching points by using intensity histograms was also very sub-par and managed to rarely match points correctly between images. Figure 9: Features Tracked by Previous Method, First Image. Figure 10: Features Tracked by Previous Method, Second Image. Our new code yields much better feature tracking by finding more features and also by matching the corresponding points much more accurately. We have samples from 3 different runs of our code showing 3 different scenes. Figures 11-16 show the features and correspondences for the 3 different scenes. The method shows many corresponding points. We managed to adjust the algorithm to provide a higher number of correct corresponding points, while at the same time minimizing the number of false accepts. Figure11: Run#1 Left Image. All Features Displayed. Figure 12: Run#1 Right Image. Only the Corresponding Features Displayed. 9 Figure 13: Run#2 Left Image. All Features Displayed. Figure 14: Run#2 Right Image. Only the Corresponding Features Displayed. Figure 15: Run#3 Left Image. All Features Displayed. Figure 16: Run #3 Right Image. Only the Corresponding Features Displayed The first run shows many features detected in the left image. It also shows many corresponding points in the right image. The points on the box in the background are mostly mapped correctly while some of the points in the middle of the calibration pattern are mapped incorrectly. This error is due to the fact that the local windows around each of those corners appear almost identical. This run shows how our algorithm can be confused by repeating patterns like the calibration pattern. The second run also shows a lot of features detected in the left image. However, nearly all of the corresponding points in the right image are mapped correctly. This run shows the potential of our algorithm when it finds a lot of correct matches. The third run shows a lot of features detected around the dresser, yet it does not find many features on the side of the dresser itself. The second image only finds 8 corresponding points, 1-2 of which are incorrect (the one point shown is on the wrong button on the TV while the other button is labeled off the image and cant be compared effectively). So few points are found because the right image moves so that the camera is mostly focused on the side of the dresser, where very few features are found. This run shows how our algorithm can be inaccurate. 10 After we have the corresponding points for the stereo pair, we next calculated the depth map. 
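At a high level this step is a thin wrapper around two OpenCV calls, cvFindFundamentalMat and cvFindStereoCorrespondences (both documented in section IX.b). The following is a minimal sketch rather than our actual getDepthMap function: the function name, the 2xN matrix layout, the maximum disparity of 50, and the scale factor passed to cvConvertScale are illustrative values, and gray1/gray2 stand for the rectified grayscale stereo pair.

// Illustrative sketch of the depth-map step (not the exact getDepthMap code).
// 'matches' corresponding points are passed in as two float arrays of (x, y)
// pairs; gray1/gray2 are assumed to be the rectified 320x240 grayscale images.
#include <cv.h>
#include <cvaux.h>
#include <highgui.h>

IplImage* computeDepthMap(float m1[][2], float m2[][2], int matches,
                          IplImage* gray1, IplImage* gray2)
{
    // Pack the matched points into 2xN matrices for cvFindFundamentalMat.
    CvMat* pts1 = cvCreateMat(2, matches, CV_32FC1);
    CvMat* pts2 = cvCreateMat(2, matches, CV_32FC1);
    for (int i = 0; i < matches; i++) {
        cvmSet(pts1, 0, i, m1[i][0]);  // x in left image
        cvmSet(pts1, 1, i, m1[i][1]);  // y in left image
        cvmSet(pts2, 0, i, m2[i][0]);  // x in right image
        cvmSet(pts2, 1, i, m2[i][1]);  // y in right image
    }

    // 8-point algorithm; at least 8 correspondences are required.
    // In the full pipeline this matrix feeds the scanline rectification
    // (cvMakeScanlines / cvPreWarpImage, see section IX.b).
    CvMat* fmat = cvCreateMat(3, 3, CV_32FC1);
    int found = cvFindFundamentalMat(pts1, pts2, fmat, CV_FM_8POINT, 1.0, 0.99, NULL);
    if (!found)
        return NULL;   // not enough good matches to estimate F

    // Birchfield-Tomasi disparity on the (already rectified) grayscale pair,
    // followed by a scaling so the depth map is visible when saved.
    IplImage* depth  = cvCreateImage(cvSize(320, 240), IPL_DEPTH_8U, 1);
    IplImage* scaled = cvCreateImage(cvSize(320, 240), IPL_DEPTH_8U, 1);
    cvFindStereoCorrespondences(gray1, gray2, CV_DISPARITY_BIRCHFIELD,
                                depth, 50,            // maximum disparity (illustrative)
                                25, 5, 12, 15, 25);   // default reliability parameters
    cvConvertScale(depth, scaled, 5, 0);              // stretch the disparity values
    cvSaveImage("scaleddepthimage.bmp", scaled);
    return scaled;
}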
The depth map shows the approximate depth of the points in the scene. Figures 17-19 show the depth maps for the three runs. The depth maps shown are scaled versions of their original image. Figure 17: Run#1 Depth Map Figure 18: Run#2 Depth Map Figure 19: Run#3 Depth Map All of the depth maps have one thing in common: they are noisy. This noise can come from the error in correspondence or from the approximation of the fundamental matrix or even from the images not being rectified correctly. Even through all of the noise, it is still fairly easy to determine what is in front of you in each scene. The first scene shows what looks like a checkered box in the right side of the image and also another box object farther in the background. The second scene is mostly black, which means that there is nothing very close to the robot. Reviewing the original picture of the scene, we see that this is correct. There are no objects close to the robot. The third scene shows a large object in front of the robot that encompasses all but the left third of the image. All three of these depth maps represent their original scenes fairly well. 11 In order to visually interpret the results, we can also use the canny edge detector and get an image of the edges of each scene from the initial images. Overlaying the edge image and the depth image allows us to better interpret the results. Figures 2025 show the images of the edges taken from the left image of the stereo pair and the combination of the edges and the scaled depth map images. The edge image will become useful later on, when we use it to find the location of an object in the scene. The edge image and the depth map together form the basis for our method of determining an objects presence and, if one is present, determining its location relative to the robot. Figure 20: Run#1 Edge Image. Figure 21: Run#1 Edge Image and Depth Map Combined. Figure 22: Run#2 Edge Image. Figure 23: Run#2 Edge Image and Depth Map Combined. 12 Figure 24: Run#3 Edge Image. Figure 25: Run#3 Edge Image and Depth Map Combined From Scenes 1 and 3, we can see that the edges usually correspond to regions where the depth map is changing sharply. In the second image, the distance is very great so the depth does not correspond very well to the edge image. However for the sake of object detection and avoidance, this will not be a problem since a black depth map indicates an absence of obstacles. This way, the robot is free to move anyway. To aid in the detection of obstacles, we have grouped the feature points that are located near one another. Combining this with the depth maps and the edge images of each scene can provide enough to locate any obstacles and determine a course of action around them. Figures 26-28 show the original scene with the features shown and the windows around the groups of features that we detected. Figure 26: Run#1 Group Image Figure 27: Run#2 Group Image 13 Figure 28: Run#3 Group Image For the final version of our vision system, we updated our Grouping algorithm to perform even better. In the end, however, we did not need the function to detect objects in the images. We used only the depth map and the edge image to find objects and their positions. The function will be displayed in the code section (section X.a. C++ Code). 
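Since the final system relies only on the depth map and the edge image, it is worth showing how the two are combined. The snippet below is a hedged illustration rather than our exact code: the Canny thresholds are placeholder values, and the images are simply reloaded from the files our program saves.

// Illustrative sketch: build the edge image and OR it with the scaled depth map.
// Canny thresholds (50, 150) are placeholders, not our tuned values.
IplImage* gray  = cvLoadImage("GrayImage1.bmp", 0);         // left image, grayscale
IplImage* depth = cvLoadImage("scaleddepthimage.bmp", 0);   // scaled depth map
IplImage* edges = cvCreateImage(cvGetSize(gray), IPL_DEPTH_8U, 1);
IplImage* both  = cvCreateImage(cvGetSize(gray), IPL_DEPTH_8U, 1);

cvCanny(gray, edges, 50, 150, 3);   // edge image of the left frame
cvOr(edges, depth, both, NULL);     // overlay the two grayscale images

cvSaveImage("edges.bmp", edges);
cvSaveImage("depth&edge.bmp", both);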
The methods for the corner detection, corresponding point detection, and depth map creation are all very similar in their resulting images as before, so to save space we will omit these images because the older versions of these methods still illustrate the end result very well. Using the depth map and edge detection images, we are able to determine the presence and location of any objects in the scene in the path of the robot. Figure 29 and 30 show two results of our object detection algorithm. These images were taken from two runs that were different from the scenes depicted in runs 1 through 3 above. Figure 29: Run#4 Object Detection Result. Figure 30: Run#5 Object Detection Result. From figures 29 and 30, you can immediately see one of the largest drawbacks of this method. Because of the way the depth map is obtained, there is no depth information for the leftmost side of the image. This means that we cannot detect objects in that part of the image only (this limitation is discussed in more detail in section VII. Limitations / Areas for Improvement). This limitation causes any objects that are recognized to 14 have their leftmost edge to be approximately 85 pixels from the left. Once this is taken into effect, the boxes drawn on the objects in the scene can be seen to be a fairly good approximation. Figure 30 shows the nearest box as the object from the leftmost possible limit to within a few pixels of the actual rightmost edge of the box. Figure 29, however, has the rightmost edge of the object box as the right edge of the image. This would appear to be an error, but if we observe the depth map for this image, Figure 31, we can see that there was a lot of noise in the depth map and this causes the object to appear wider than it actually is. Figure 31: Run#4 Depth Map & Edge Image Showing High Noise Levels From Figure 31, we can see the noise present in the depth map. This noise could be from a number of possible sources (see section VII. Limitations / Areas for Improvement). Because of this noise, our method for detecting the horizontal limits of the object will see the stretched part of the depth map as if it is an extension of the object itself. Therefore, we think that our method for detecting objects in the images works well given correct information. From these two object detection images, our robot should be able to determine how it should act. It is easily seen that there is an object in the robot’s immediate path in both figures 29 and 30. Therefore, we know that the robot should turn either to the right or to the left. Our method for determining which direction to turn is quite simple. We use the fact that the minimum x value that the object detection box will have will be 85 due to the limitations stated above. Therefore, if there is an object in the leftmost side of the image, the object detection box will start at 85 pixels in the x direction. However, if there is no image in the leftmost side of the image, the object detection box will start at something more than 85 pixels in the x direction. Therefore, to determine which direction to go, we do a simple check on the value of the x minimum of the object detection box. If the minimum value is 85, there is an object to the left, so the robot should turn right. If the minimum value is greater than 85, there is not an image to the left, and so the robot should turn left. We also use the value of 85 because the depth map will constantly stretch the depth to the right in the image. 
Therefore, we do not want to use a threshold value farther to the right in the 15 image. In both runs 4 and 5, our robot turned to the right. Visually inspecting the two object detection images, a right turn would be the acceptable course of action. For our stopping condition, we used a simple object recognition method to determine if one specific object is present in the scene. For this, we modified our simple correlation matching algorithm that we used for last semester to detect a box of popcorn in the scene (or more specifically, a small section the box). We modified the code to search for this region in 3 different sizes. Figure 31 shows the template section of the image to compare to the frames and figure 32 shows an input frame that caused the robot to stop. Figure 32: Template Image. Figure 33: Image Showing Stopping Condition. The downside to using just a simple correspondence match to find the stopping condition is that there is the possibility for the system to incorrectly match a different, but similar scene to the popcorn box. In order to reduce the number of incorrect matches of the stopping condition, we had to adjust the threshold values for matching the popcorn box. When we display our results in OpenGL, we use a specific coloring scheme. We represent the lego robot as a yellow square, which stays in the center of the animation. We represent all objects detected as red squares and we represent the popcorn box that will end the program as a blue square. As the program runs, the yellow robot square will rotate in the center of the animation and object squares will be shown around it. Figures 34 through 38 on the next page show our output animation at a few different time instances. These images show our basic method for displaying our robots actions and object detections in OpenGL. The objects that are drawn as red squares are shown in positions that are relative to the robot. Because our system does not use any sort of calibration, the exact distances cannot be determined and the distance between the OpenGL representation of the robot and the object are merely an approximation. 16 Figure 34: OpenGL representation of robot. Figure 35: OpenGL representation of robot with object detected. Figure 36: OpenGL representation of robot after rotation. Figure 37: OpenGL representation of robot after rotation with object detected. Figure 38: OpenGL representation of robot after detecting popcorn box. This is the last frame of the animation as the program exits when this object is detected. 17 VI. Error / Threshold Analysis The first analysis of importance to us was the setting of the GF_QL threshold value. This threshold value is for the Good Features to Track quality level input. This value determines the number of corner points that will be found in the image. Just by doing a few quick observations, we determined that a lower value for GF_QL meant more points found and a higher value meant less points found. Figure 39 shows a chart of exactly how many corners were found given a certain quality level. The points obtained were taken of an unchanging scene with all other programming values constant. Number of Corners Found GF_QL Value vs Number of Corners Found 250 200 150 100 50 0 0 0.2 0.4 0.6 0.8 1 1.2 GF_QL value Figure 39: Graph of Good Features to Track Quality Level and its effect on the number of corners found. From this graph, we can easily see our previous assumption was correct; the number of corners found decreases as GF_QL increases. 
We can see that it follows an approximately exponential curve downwards to zero corners as GF_QL approaches 1. For our purposes, we need to have a slightly larger number of total corners so that when the corners are matched, we will have more matches. We set GF_QL to be 0.04 which would find 96 corners in this scene. We ran a series of runs of our vision system. Each time our robot made an action, we recorded that action and noted whether or not we felt it should have made that action based on what we could see of the scene visually. Most of the time, the robot behaved as we would have expected it to, however sometimes it does not. Figure 40 shows how often our robot acted correctly and incorrectly. Out of a total of 88 actions recorded, it acted correctly 62 times, a 70% accuracy. However these numbers are slightly skewed. The run forward action was highly accurate with a 95% accuracy while the turn left action was mostly hit or miss with a 61% accuracy. From this we feel that our method for detecting an objects presence is very good. This can be seen from the 95% accuracy of the move forward algorithm. However, our reasoning for causing the robot to turn left leaves room for improvement. 18 Frequency Actions Taken by Robot 80 60 40 20 0 62 37 2 forward 6 2 right 14 9 left 5 3 end 16 total Actions Correct Action Taken Incorrect Action Taken Figure 40: Plot of Actions taken by Robot, Correct and Incorrect actions shown. VII. Limitations / Areas for Improvement Limitations: One of the major drawbacks to our method is in the acquisition of the depth map. Because of our mechanical design, the leftmost part of the left image of the stereo pair will contain information that is not included in the right image. This is due to the fact that the camera moves a certain distance laterally, and does not rotate to capture the same scene points. This effect means that the depth map will not be able to get any sort of approximation for the depth in that leftmost region. Therefore, the depth map will generally only display black in that region (corresponding to nothing present at that location) whether or not it should be black. This effect can be seen in Figure 19 or 25 above. You can see that the depth appears as black on the leftmost side of the image when it should be a darker gray. Our method of acquiring the depth map is also very vulnerable to motion within the scene. Differences in objects positions in the scene between the stereo images are assumed to be from the difference in the camera, not from the object being in motion. If the object is in motion, the depth map will show it as moving either more or less (depending on its direction) and so the depth for that object will be either closer or farther accordingly. Another limitation of the depth map is its noise. The depth map will almost always display a good deal of noise in the image. Most of the time, it is simply a stretching of the depth horizontally across the image. However, on rare occasions, the noise will cause small areas of the image to show as being close to the robot when they are in fact not close at all. This only gives us a problem when there is little else being represented as close in the depth map. It will then incorrectly show that something is nearby. The method is also vulnerable to shadows of objects. If an object is placed above the ground with a shadow underneath it, the stereo depth algorithm will treat that darkened shadow as another object that is lying on the ground. 
This can cause the ObjectDetect function to see the shadow and object as a single, larger image. However, this will not affect our outcome much because the object was in the robots path anyway. The only difference will be in the OpenGL mapping of the object. It will display the object as being larger than it should be. 19 Another limitation of our method is that the object detection algorithm assumes that only one object is in the scene. Assume, for example, that two objects are located in the scene and are separated by some distance. Our method finds the extreme horizontal and vertical limits of anything close enough to the robot. In this case, it will treat the two objects as one without any space in between them. This will only be a problem if the distance separating the two objects is more than the width of the robot. This will cause the robot to act incorrectly as the program will tell the robot that there is a large object directly in front of it and it should turn to one side or the other. In reality, the robot may have been able to simply go straight forward between the two objects. Our robot system is also limited in a way by physical constraints. First, our robot is tethered to the computer system. If this computer happens to be a stationary desktop computer, the robot has very little range as the cable for the camera is very short. This could be fixed by simply using a USB extension cable to give the cable enough length to maneuver wherever needed. An alternative would be to connect to a laptop computer and hold the laptop above the robot as it maneuvers about. The idea of cables leads to the second constraint. The cable for the camera and the cable for the IR tower can occasionally get tangled and cause the camera cable to receive extra tension when it is still close to the computer. This might not have been a problem if we were able to build the IR tower harness that holds the tower a few inches behind the RCX brick at all times as we built for last semester. Unfortunately, our design for the camera mount required most of those pieces and so we could not build the IR tower harness. This isn’t much of a problem as the IR tower has plenty of range; it is just something to look out for as the tower must be facing the back of the robot whenever it is trying to communicate. This is more or less just a matter of keeping track of where the tower is relative to the robot and moving it accordingly by hand. Our robot’s performance is also limited by its battery level. One would hope that the robot would perform at about the same quality no matter how much battery life was remaining. However, that is not the case for our robot. We found that at lower battery levels, the motors would actually spin less for each set period of time. This would cause the robot to move a shorter distance and, in the case of our camera, not move far enough to get a good image for the right image of the stereo pair. Also affected was the robots ability to turn. The NQC programs were initially programmed at a certain battery level and after the batteries had drained some, the robot would rotate less. After replacing the batteries, each command of the robot caused every action to have more power than required. This meant that the robot would move and turn farther and faster. This caused many problems especially with the camera motion. After this, we had to change the timing of our NQC codes to reset them to the desired levels. 
Areas for Improvement: From our results, we noticed that the amount of lateral movement of the camera is very important to get a better quality depth map. A better quality depth map will in turn give us better results. Therefore, it is important to ensure that the camera moves 20 laterally by a significant amount. We found from messing around with our camera system that a lateral movement of about 8 inches or so yielded the best depth map in our system. However, our robot design only allowed for a maximum lateral displacement of about 6 inches. Ideally, we would be able to get the maximum displacement, but our current system does not allow that range of motion. Lego axle pieces that are longer than the ones supplied would be necessary for that amount of motion. A quick fix for the above problem would be the use of two webcams. If we could mount two cameras side by side, we would be able to take two images almost simultaneously without having to move the camera at all. This would simply our design a great deal and we would be able to remove all error in our results associated with incorrect movement of the camera. OpenCV supports the usage of multiple cameras and the code would not be much more complex than the existing single camera code. The method of using two cameras for stereo vision would also allow us to assume a constant distance between the two cameras. With the single camera system, the camera may not move its full distance and so we cannot assume any single value for its distance. Coding wise, our system could be improved by allowing the recognition of multiple objects in the scene. This would allow the robot to act in a more appropriate manner in certain very specific cases such as stated above in the limitations section. This would surely be an upgrade as it would allow our robot a better opportunity to act in a much better way. Our system could also be improved by some sort of auto-calibration system. We would not want to use a manual calibration method as that would take away from the autonomy of our robot. Our current method uses no type of calibration and so all distances from the robot to the objects in the scene cannot be known. With the calibration, we would be able to know the distances between the objects in the scene and the robot as well as more exact measurements of the sizes of the object(s). This would allow our robot to make much better decisions on how to maneuver around objects placed in front of it. 21 VIII. OpenCV Initialization Files Downloaded for installation of OpenCV: OpenCV beta 5 DirectX 9.0 February 2005 SDK package DirectX 9.0 February 2005 SDK Extras package Windows Platform SDK Note: here, we assume that the respective program packages are installed to the c drive in the ‘Program Files’ directory. Special Instructions when installing: When installing the DirectX SDK Extras package, one has to place the destination directory where the installed DirectX SDK package is located. Instructions for editing Visual Studio 2005: Open Visual Studio without loading a project, so the start page is displayed. Then select the tools toolbox and then options… In the Project and Solutions tab->Vc++ Directories tab there is a drop down menu to select the different directories to add where the SDK and OpenCV files are. 
In the executables files directories option we added the following directories: C:\Program Files\Microsoft Platform SDK\bin\winnt C:\Program Files\Microsoft Platform SDK\bin C:\Program Files\Microsoft DirectX 9.0 SDK (February 2005)\Utilities\Bin\x86 C:\Program Files\OpenCV\bin In the include files directories option we add the following directories: C:\Program Files\Microsoft Platform SDK\include, C:\Program Files\OpenCV\cv\include C:\Program Files\OpenCV\otherlibs\highgui C:\Program Files\OpenCV\otherlibs\cvcam\include C:\Program Files\OpenCV\cxcore\include C:\Program Files\OpenCV\cvaux\include In the libraries directories option we added the following: C:\Program Files\Microsoft Platform SDK\lib C:\Program Files\Microsoft DirectX 9.0 SDK (February 2005)\Lib\x86 C:\Program Files\Microsoft DirectX 9.0 SDK (February 2005)\Extras\DirectShow\Samples\C++\DirectShow\BaseClasses C:\Program Files\OpenCV\lib 22 Additional Necessary Steps: Edit the Windows environmental variables. You can access this by going into system settings under control panel. Then select the Advance tab and click on Environmental Variables. Then add C:\ProgramFiles\OpenCV\bin to the system path variable. Method for executing nqc within Visual Studio. Copy the nqc.exe file to the C:\Windows\System32 directory. The program can be executed in C++ by using the system() command. For example: system(“nqc –d program1.nqc –run”) will download program1.nqc and run it. Alternatively, you can also download each program you need to a different program slot on the RCX brick. Then in C++, you can simply tell the robot to run this program without needing to download it again. This can be done by: system("nqc -pgm 1 -run") where the ‘1’ is the program in the corresponding program slot. It can be an integer 1 through 5. Side note: If a USB tower is being used, it is easier to add a variable called RCX_Port with the value usb to the environmental variables. Creation of Solution / Project: When creating a project one has to edit the links to the project and include cv.lib, cvcam.lib, highgui.lib, cxcore.lib, cvaux.lib. This is done by going to the project toolbox, and then selecting the properties option. Then select the Configuration Properties->linker->inputs tab. This .lib files are typed into the Additional Dependencies option. 23 IX. OpenCV Functions a. Functions used in Interim Report 1: CVCam camera Functions: cvcamInit() Usage: no inputs/outputs. This is used to initialize the cvcam properties defined by cvcamSetProperty() int cvcamStart(); Usage: no inputs. This is used to start the camera video feed after initialization. Outputs: returns a int value to check to see if there was an error int cvcamStop(); Usage: no inputs. This is used to stop the camera video feed. Outputs: this function always outputs a int value of 0 int cvcamExit(); Usage: no inputs. This is used to clear close all cvcam attributes on system Outputs: this function always outputs a int value of 0 int cvcamSelectCamera(int** out); Usage: This function is used to select different video capture devices that might be installed on the system. Inputs: Int** Out- this is an empty pointer used to select different device numbers Outputs: int – This is the return value of the device selected by user to use as the video capture device. int cvcamSetProperty(int camera, const char* property, void* value); Usage: This function is used set different properties for the device being used. 
Some of these properties include a callback function in which every frame is processed by this callback function. Inputs: Int camera- this is the device number which is the output of cvcamSelectProprty() Const Char* property- this is the variable in which the user state the property you want to invoke Void* Value- this is the control value for the property Outputs: Int – this is used as a way to validate that the property was executed or encountered an error Image Processing Functions (1): IplImage* cvCreateImage( CvSize size, int depth, int channels ); Usage: this function creates an empty image with the designated parameters. Inputs: CvSize size - this is the size of the image that will be created. This is best 24 used by just passing cvSize( int width, int height) or by using cvGetSize(IplImage*) for the size variable. Int depth - this is the color depth of the new image. Can be one of: IPL_DEPTH_8U - unsigned 8-bit integers IPL_DEPTH_8S - signed 8-bit integers IPL_DEPTH_16U - unsigned 16-bit integers IPL_DEPTH_16S - signed 16-bit integers IPL_DEPTH_32S - signed 32-bit integers IPL_DEPTH_32F - single precision floating-point numbers IPL_DEPTH_64F - double precision floating-point numbers. Int channels – this is the number of channels in the image. For example, grayscale images will have a channel value of 1, while full color will have 3 channels (1 each for red, green and blue). Outputs: IplImage* - this is the output image. It will be of type IplImage*. cvSaveImage( const char* filename, const CvArr* image ); Usage: this function will save an image variable to a file on the hard disk. Inputs: const char* filename - this is the filename that the saved image will have. For example, “image01.bmp” (with the quotes) is acceptable. Const CvArr* image - this is the image that will be saved to a file. IplImage* is an acceptable form of CvArr* to be passed into cvSaveImage. Outputs: none. IplImage* cvLoadImage( const char* filename, int iscolor ); Usage: this function will load an image from a saved file to a variable. Inputs: const char* filename - this is the filename of the file to be loaded. Int iscolor Specifies colorness of the loaded image: if >0, the loaded image is forced to be color 3-channel image; if 0, the loaded image is forced to be grayscale; if <0, the loaded image will be loaded as is (with number of channels depends on the file). IplImage* cvCloneImage( const IplImage* image ); Usage: this function will copy an image variable into another image variable. Inputs: const IplImage* image – this is the image variable that you want to copy. Outputs: IplImage* - output image variable is IplImage*, the same as the input. cvSetImageROI( IplImage* image, CvRect rect ); Usage: this function will set the region of interest (ROI) of an image. Using this, You can set the processing range to be a smaller portion of an image. Inputs: IplImage* image - this is the input image that you want to get a region of. CvRect rect – this is the location and size of the region to be used. The Parameters can be set as follows: CvRect rec; rec.height = 10; rec.width = 10; rec.x = x_offset; rec.y = y_offset; 25 and then passing rec into the cvSetImageROI function. cvRectangle( CvArr* img, CvPoint pt1, CvPoint pt2, CvScalar color, int thickness=1, int line_type=8, int shift=0 ); Usage: will draw a rectangle at the two vertices of the rectangle and add them to the image. Inputs: CvArr* img – input image to draw the rectangle on. CvPoint pt1 – vertex 1 of rectangle. CvPoint pt2 – vertex 2 of rectangle. 
CvScalar color – 4 object variable that has the color of the line. The four parameters are: Blue amount, Green amount, Red amount, and Alpha (have not tested the 4th component yet). You can set this by: CvScalar z; z.val[0]=300.00; z.val[1]=100.00; z.val[2]=100.00; z.val[3]=300.00; This will give you blue squares. Int thickness – thickness of border of rectangle. Default is 1. This parameter is optional. Int line_type – type of line for border. Default is 8. This parameter is optional. Int shift – Default is 0. This parameter is optional. Outputs: none. cvCvtColor( const CvArr* src, CvArr* dst, int code ); Usage: This function converts an image from one color space to another. For example, converts an image from color (3 channels) to grayscale (1 ch). Inputs: const CvArr* src – this is the input image to be converted. It will not be modified. CvArr* dst - this is the output image that the converted image will be. Int code - Color conversion operation. Possible options are: CV_RGB2YCrCb CV_RGB2HSV CV_RGB2GRAY Even though those codes are not integers, they can still be used for input arguments for the code. Outputs: none (the output image file is actually an input argument). cvFlip( const CvArr* src, CvArr* dst=NULL, int flip_mode=0); Usage: this function flips an image about one or more axes. Inputs: const CvArr* src cvGoodFeaturesToTrack( const CvArr* image, CvArr* eig_image, CvArr* temp_image, CvPoint2D32f* corners, int* corner_count, double quality_level, double min_distance, const CvArr* mask=NULL, int block_size=3, int use_harris=0, double k=0.04 ); Usage: This function finds corners within an image based on its eigenvalues. It will return an array of corner points. The maximum number of corners it finds is the total length of the array corners. If the function finds fewer 26 corners than that, the array is padded by numbers approximately zero. Inputs: const CvArr* image – this is the input image that you want to find corners in. It will not be changed. CvArr* eig_image – this is a temporary image used within the function. It can be passed as a newly created IplImage* that has not been assigned any image. CvArr* temp_image – another temporary image used similarly to eig_image. CvPoint2D32f* corners – this is an array of 2D points. The array length is the maximum number of corners you want found. Int*corner_count – this output variable is the total number of points found and located in corners. Double quality_level – this is used as a multiplier to find the maximum eigenvalue. It works to filter out ‘lower quality’ corners by “rejecting the corners with the minimal eigenvalue less than quality_level•max(eig_image(x,y)).” Double min_distance – this is the minimum distance that can be between two corners. Making this higher makes the corners be farther apart. Const CvArr* mask – this is used to set the search region to find corners. If set to NULL, the function will search the entire image. Default value is NULL. This parameter is optional. Int block_size – this is the size of the averaging block. It is used by other functions within goodFeaturesToTrack. Default value is 3. This parameter is optional. Int use_harris – this is used to tell the function whether or not to use the harris edge detector. If this value is zero, goodFeatures will use cvCornerMinEigenVal. If this value is nonzero, goodFeatures will use cvCornerHarris. The default value is zero (do not use harris detection). This parameter is optional. Double k – this is a parameter for the harris corner detector. 
It is only used when use_harris is nonzero. The default value is 0.04. This parameter is optional. Outputs: none (outputs are assigned as inputs to the function). Histogram functions: CvHistogram* cvCreateHist( int dims, int* sizes, int type, float** ranges=NULL, int uniform=1 ); Usage: this function creates a histogram of the given size and dimensions. Inputs: int dims – this is the number of dimensions for the histogram. Int* sizes – histogram dimension sizes. Int type – determines type of histogram. Can be either: CV_HIST_ARRAY or CV_HIST_SPARSE Float** ranges – array of ranges for histogram values. Default value is NULL. 27 int uniform - Uniformity flag; if not 0, the histogram has evenly spaced bins. Default is 1 (not zero). Outputs: CvHistogram* - returns a pointer to the histogram object. void cvCalcHist( IplImage** image, CvHistogram* hist, int accumulate=0, const CvArr* mask=NULL ); Usage: This function is used to calculate the histogram of images Inputs: IplImage** image – this is the image that you want to calculate the histogram for CvHistogram* hist – this is a pointer that points to the created histogram by the function cvCreateHist(). Outputs: None double cvCompareHist( const CvHistogram* hist1, const CvHistogram* hist2, int method ); Usage: this function compares two histograms using the desired method. Inputs: const CvHistogram* hist1 – first histogram to be compared. Const CvHistogram* hist2 – second histogram to be compared. Int method – the possible methods for comparison are: CV_COMP_CORREL, CV_COMP_CHISQR CV_COMP_INTERSECT, CV_COMP_BHATTACHARYYA Outputs: double – a comparison value based on the method will be output. void cvClearHist( CvHistogram* hist ); Usage: This function clears the histogram information Inputs: CvHistogram* hist - this is the histogram that you wish to clear Outputs: None Math Functions: int cvRound( double value ); Usage: Takes in a double value and rounds it to the nearest integer and returns that integer value. Inputs: double value – this is the decimal number that you want to convert to int. Outputs: int – this is the rounded integer value of the input number. 28 IX. OpenCV Functions b. Functions added for Interim Report 2: CVCapture Functions: CvCapture* cvCaptureFromCAM( int index ); Usage: assigns a variable to your camera that will be capturing images. Inputs: int index - this is the index number of the camera connected to the computer. If only one camera is used, you can set index to be NULL or -1 Outputs: CvCapture* - this is the variable for the camera that you use to capture images. IplImage* cvQueryFrame(CvCapture* capture); Usage: this function will capture a frame from the camera assigned to the capture variable. Inputs: CvCapture* capture – this is the camera variable that was assigned using cvCaptureFromCAM. Outputs: IplImage* - this is the captured image. Void cvReleaseCapture( CvCapture** capture); Usage: this function releases the camera variable from memory. Inputs: CvCapture** capture – this is a pointer to your camera’s capture variable. Because the input is type CvCapture** and your capture variable is type CvCapture*, you need to pass it &capture (assuming that ‘capture’ is your camera’s variable.) Outputs: None. Image Processing Functions (2): void cvMatchTemplate( const CvArr* image, const CvArr* templ, CvArr* result, int method ); Usage: this function scans through the input image and tries to find a match for the template image. Inputs: const CvArr* image – input image to search be searched. 
Const CvArr* templ – template image that will be compared to all regions of the input image. CvArr* result - A map of comparison results; single-channel 32-bit floatingpoint. If image is W×H and templ is w×h then result must be W-w+1×H-h+1. Int method – method to compare template to input image. Can be: CV_TM_SQDIFF, CV_TM_SQDIFF_NORMED, CV_TM_CCORR CV_TM_CCORR_NORMED, CV_TM_CCOEFF, or CV_TM_CCOEFF_NORMED Outputs: None. void cvMinMaxLoc( const CvArr* arr, double* min_val, double* max_val, CvPoint* min_loc, CvPoint* max_loc, const CvArr* mask); Usage: this function finds the minimum and maximum values for an array and their locations. 29 Inputs: const CvArr* arr – this is the array that you will be searching through to find the minimum and maximum values. Double* min_val – this is the returned minimum value of the array. Double* max_val – this is the returned maximum value of the array. CvPoint* min_loc – this is the returned location of the minimum value in the array. Default is NULL (does not return value). CvPoint* max_loc – this is the returned location of the maximum value in the array. Default is NULL (does not return value). Const CvArr* mask – mask used to select a subarray. Default is NULL. (no subarray used = = whole array is searched). Outputs: None. Void cvReleaseImage( IplImage** image ); Usage: this function releases the image variable from memory. Inputs: IplImage** image – this is a pointer to the image variable you want to release. Because it is of type IplImage** and your image is of type IplImage*, you must pass it &image (assuming that ‘image’ is your image variable.) Outputs: None. CvMat* cvCreateMat( int rows, int cols, int type ); Usage: this function creates a matrix variable of the given size and type. The matrix will store the data as a single vector of length (n_rows * n_cols). The elements are assigned row by row. Inputs: int rows – this is the number of rows of the matrix. int cols – this is the number of columns of the matrix. int type – this is the type of data being stored in the matrix. For example, CV_8UC1 means an 8-bit unsigned single-channel matrix. Outputs: CvMat* - this is the matrix variable. int cvFindFundamentalMat(const CvMat* points1, const CvMat* points2, CvMat* fundamental_matrix, int method, double param1, double param2, CvMat* status); Usage: this function calculates the fundamental matrix for a set of images from a set of corresponding points between those images. Inputs: Const CvMat* points1 – this is an array of the point (x,y) coordinates for the first image. Const CvMat* points2 – this is an array of the point (x,y) coordinates for the second image. CvMat* fundamental_matrix – this is the variable where the output fundamental matrix will be saved into a [9x1] column vector. Int method – this is the method that the function will use to calculate the fundamental matrix. We used CV_FM_8POINT which is the 8-Point algorithm. Double param1 - the maximum distance from point to epipolar line. This is only used if method is CV_FM_RANSAC or CV_FM_LMEDS. Default value is 1.0. 30 Double param2 – the desirable level of confidence that the matrix is correct. This is only used if method is CV_FM_RANSAC or CV_FM_LMEDS. Default value is 0.99. CvMat* status - The optional output array of N elements, every element of which is set to 0 for outliers and to 1 for the other points. This is only used if method is CV_FM_RANSAC or CV_FM_LMEDS. Default value is NULL. 
Outputs: int – this output variable is the number of fundamental matrices found (1 or 3 depending on method or 0 if no matrix is found). Void cvMakeScanlines( const CvMatrix3* matrix, CvSize img_size, int* scanlines1, int* scanlines2, int* lengths1, int* lengths2, int* line_count); Usage: this function has two main uses. When you input scanlines1=scanlines2=lenghts1=lengths2=0, the function will simply count the number of scanlines and save that number into line_count. If those parameters are not zero, the function will calculate the scanlines (epilines) for the two images. The information from the two images comes from the fundamental matrix passed into it by ‘matrix’. NOTE: MEMORY MUST BE ALLOCATED BEFORE CALLING FUNCTION. This is done by first calling cvMakeScanlines to find the total number of scanlines and save them to line_count. You then allocate the memory by: scanlines1 = (int*)(calloc( line_count* 2, sizeof(int) * 4)); scanlines2 = (int*)(calloc( line_count * 2, sizeof(int) * 4)); lengths1 =(int*)(calloc( line_count* 2, sizeof(int) * 4)); lengths2 = (int*)(calloc( line_count* 2, sizeof(int) * 4)); After the memory is allocated for those 4 variables, the function can be called again to find the scanlines. Inputs: const CvMatrix3* matrix – this is the input fundamental matrix. NOTE: the type CvMatrix3 is different from a CvMat of size (3x3). The CvMatrix3 fundamental matrix can be found from the CvMat fundamental matrix as follows: CvMatrix3 FMAT; //f_mat is CvMat fundamental matrix already calculated. for (int i=0;i<3;i++) { FMAT.m[i][0]=f_mat->data.fl[i*3]; FMAT.m[i][1]=f_mat->data.fl[i*3+1]; FMAT.m[i][2]=f_mat->data.fl[i*3+2]; } CvSize img-size – this is a cvSize variable for the size of the image. The simplest way to handle this parameter if the image height and width are known constants, you can pass cvSize(width,height) as this input. Int* scanlines1 – this is the array of scanlines of the first image. (see note in usage for this input.) Int* scanlines2 – this is the array of scanlines of the second image. (see note in usage for this input.) Int* lengths1 – this is the array of the lengths of the scanlines in the first image. (see note in usage for this input.) Int* lengths2 – this is the array of the lengths of the scanlines in the second image. (see note in usage for this input.) Int* line_count – this is the total number of scanlines found. 31 Void cvPreWarpImage( int line_count, IplImage*img, uchar*dst, int*dst_nums, int*scanlines ); Usage: this function will warp the input image so that its scanlines are horizontal, rectifying it. NOTE: MEMORY MUST BE ALLOCATED BEFORE CALLING FUNCTION. This can be done as follows: buffer = (uchar*)(malloc(320*line_count*3)); where line_count is found from cvMakeScanlines. Inputs: int line_count – this is the total number of scanlines in the image. This parameter can be obtained from cvMakeScanlines. IplImage* img – this is the input image to be rectified. Uchar* dst- the output data buffer. Memory must be allocated to this before calling the function as noted above. Int* dst_nums – this is the array of scanline lengths from cvMakeScanlines. Int* scanlines – this is the array of scanlines from cvMakeScanlines. Outputs: None. 
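Putting cvMakeScanlines and cvPreWarpImage together, a typical call sequence looks like the sketch below. Variable names such as FMAT, img1, img2, and buffer1/buffer2 are illustrative; the allocation sizes follow the notes above.

// Illustrative call sequence for cvMakeScanlines / cvPreWarpImage.
// FMAT is the CvMatrix3 fundamental matrix built as shown above;
// img1 and img2 are the 320x240 stereo pair.
int line_count = 0;
int *scanlines1 = 0, *scanlines2 = 0, *lengths1 = 0, *lengths2 = 0;

// First call with zeroed array pointers: only line_count is filled in.
cvMakeScanlines(&FMAT, cvSize(320, 240), 0, 0, 0, 0, &line_count);

// Allocate the arrays exactly as described in the note above, then recompute.
scanlines1 = (int*)calloc(line_count * 2, sizeof(int) * 4);
scanlines2 = (int*)calloc(line_count * 2, sizeof(int) * 4);
lengths1   = (int*)calloc(line_count * 2, sizeof(int) * 4);
lengths2   = (int*)calloc(line_count * 2, sizeof(int) * 4);
cvMakeScanlines(&FMAT, cvSize(320, 240), scanlines1, scanlines2,
                lengths1, lengths2, &line_count);

// Allocate the destination buffers and pre-warp (rectify) each image.
uchar* buffer1 = (uchar*)malloc(320 * line_count * 3);
uchar* buffer2 = (uchar*)malloc(320 * line_count * 3);
cvPreWarpImage(line_count, img1, buffer1, lengths1, scanlines1);
cvPreWarpImage(line_count, img2, buffer2, lengths2, scanlines2);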
Void cvFindStereoCorrespondences(const CvArr* leftImage, const CvArr* rightImage, int mode, CvArr* depthImage, int maxDisparity, double param1, double param2, double param3, double param4, double param5 ); Usage: this function will input two rectified grayscale images and will calculate a depth map based on their disparities. Inputs: const CvArr* leftImage – the input image of the left part of the stereo pair. Const CvArr* rightImage – the input image of the right part of the stereo pair. int mode – this is the method to find the depth map. Currently, this value must be: CV_DISPARITY_BIRCHFIELD. CvArr* depthImage – this is the output depth map image. Int maxDisparity – this is the maximum level of disparity there will be between the two images. If there are objects in the foreground and background, this value will need to be high. Double param1 – ‘constant occlusion penalty’. Default value is 25. double param2 – ‘constant match reward’. Default value is 5. Double param3 – ‘highly reliable region’. Default value is 12. Double param4 – ‘moderately reliable region’. Default value is 15. Double param5 – ‘slightly reliable region’. Default value is 25. Outputs: None. Void cvConvertScale( const CvArr* src, CvArr* dst, double scale, double shift); Usage: this function scales and/or shifts the input array and puts that information into the output array. Inputs: const CvArr* src – input image/array to be scaled or shifted. CvArr*dst – output image/array. Double scale – amount by which the array will be scaled. Default value is 1 (for no scaling). Double shift – amount by which the array will be shifted. Default value is 0 (for no shifting). Outputs: None. 32 Font Functions: Void cvInitFont( CvFont* font, int font_face, double hscale, double vscale, double shear, int thickness, int line_type ); Usage: this function initializes the font so that you can insert text onto images. This must be called before any other font commands. The CvFont* font variable must first be created by: CvFont* font=new CvFont; Inputs: CvFont* font – pointer to font structure that will be initialized. Int font_face – integer value for the font type. Can be one of the following: CV_FONT_HERSHEY_SIMPLEX CV_FONT_HERSHEY_PLAIN CV_FONT_HERSHEY_DUPLEX CV_FONT_HERSHEY_COMPLEX CV_FONT_HERSHEY_TRIPLEX CV_FONT_HERSHEY_COMPLEX_SMALL CV_FONT_HERSHEY_SCRIPT_SIMPLEX CV_FONT_HERSHEY_SCRIPT_COMPLEX Double hscale – horizontal scale factor for font size. Double vscale – vertical scale factor for font size. Double shear – slope factor for font shape. Default is 0. (normal text). Int thickness – thickness of the text lines. Default is 1. Int line_type – type of stroke. Default is 8. Outputs: None. Void cvPutText( CvArr* img, const char* text, CvPoint org, const CvFont* font, CvScalar color ); Usage: this function will draw text on top of the input image. The font must be initialized before anything can be written. Inputs: CvArr* img – input image to be drawn on. Const char* text – text to be drawn on the image. CvPoint org – coordinates for the origin of the text. The origin is set to the bottom left corner of the first letter of the text string. Const CvFont* font – this is the font that was initialized with cvInitFont. CvScalar color – this is a CvScalar value containing the color information that determines the color of the text that will be drawn. Outputs None. Window Functions: Int cvNamedWindow( const char* name, int flags); Usage: this function creates a window designated with the name given. 
Window Functions:

int cvNamedWindow( const char* name, int flags );

Usage: this function creates a window designated with the given name.

Inputs:
const char* name – the name that the window will be given. It should be passed into the function in quotes, for example, "New Window".
int flags – the only option available is CV_WINDOW_AUTOSIZE. Default value is 0.
Outputs: int – this returns a '1' if the window was created successfully or a '0' if it was not.

void cvShowImage( const char* name, const CvArr* image );

Usage: this function displays the given image in a named window. cvNamedWindow must be called to assign the name to the window before this function is called.

Inputs:
const char* name – the name of the window, assigned with cvNamedWindow, in which you wish to show the image.
const CvArr* image – the image that you want to display in the window.
Outputs: None.

IX. OpenCV Functions
c. Functions added for Final Report:

Image Processing Functions (3):

CvScalar cvGet2D( const CvArr* arr, int idx0, int idx1 );

Usage: this function gets a value from a 2-D array. In our case, we use it to get the pixel data out of the images.

Inputs:
const CvArr* arr – the input array (or image) that you want to get the data from.
int idx0 – the first index for the data element.
int idx1 – the second index for the data element.
NOTE: The data in the images are stored as image(y,x), where x and y are the pixel coordinates you wish to access. For example, if you want the image value at point (100,250), you would call cvGet2D(image,250,100).
Outputs: CvScalar – the function returns the value of the array at the given indices as a CvScalar.

void cvSet2D( CvArr* arr, int idx0, int idx1, CvScalar value );

Usage: this function is used to set a value in a 2-D array. In our case, we use it to set a pixel value in an image.

Inputs:
CvArr* arr – the input array (or image) in which you want to set a value.
int idx0 – the first index for the data element.
int idx1 – the second index for the data element.
CvScalar value – the value you wish to write into the array. It must be of type CvScalar; to pass another data type, for example an integer, wrap it as cvScalar(value).
Outputs: None.

void cvOr( const CvArr* src1, const CvArr* src2, CvArr* dst, const CvArr* mask=NULL );

Usage: this function performs a bitwise 'OR' of the two input arrays. In our case, we use it to combine two grayscale images.

Inputs:
const CvArr* src1 – the first array/image to be combined.
const CvArr* src2 – the second array/image to be combined.
CvArr* dst – the destination in which the resulting image is saved.
const CvArr* mask – the area of the images to be combined. Default value is NULL (the whole image is used).
Outputs: None.

X. CODE a.
C++ Code #include #include #include #include #include #include #include #include #include #include #include <windows.h> <cvcam.h> <cv.h> <highgui.h> <stdlib.h> <string> <iostream> <fstream> <cmath> <cvaux.h> <GL/glut.h> using namespace std; double GF_QL = 0.033; //GoodFeaturesToTrack quality_level parameter -- smaller=> more points double CP_Threshold = 0.085; //CorrPoints threshold parameter -- smaller=>better, but less matches double D_Threshold = 0.1; //threshold for matching points on depth map int G_Threshold=150; //threshold for depth image pixel check int Move_Threshold=85; //threshold for moving the robot left or right double PC1_Threshold = 0.85; //threshold for correspondence for checking of stopping condition double PC2_Threshold = 0.85; double PC3_Threshold = 0.85; string file1=("FrameImage1corners.bmp");//filenames for saved images string file2=("FrameImage2corners.bmp"); string file3=("GrayImage1.bmp"); string file4=("GrayImage2.bmp"); string file5=("scaleddepthimage.bmp"); string file6=("edges.bmp"); string file7=("depth&edge.bmp"); string file8=("FrameImage1.bmp"); string file9=("objdetect.bmp"); const char* f1; const char* f2; const char* f3; const char* f4; const char* f5; const char* f6; const char* f7; const char* f8; const char* f9; char n='1'; //initial frame number for saving file purposes int cc=500; //number of corners to find in image CvPoint2D32f cr1[500]; //initialization of corner variables to be read in from output file CvPoint2D32f cr2[500]; int corners1,corners2; //number of corners ofstream outFile("corner.dat",ios::out);//create file for output of corner values int openglval[500][5]; //OpenGL variable char b[50]; //used by cvPutText CvFont* font=new CvFont; //font info to number corners float cornersMatched1[500][2]; //matching corners float cornersMatched2[500][2]; int k=0; //matching corner counter int yval[2]; //variable that is used to find roi of depth image that contains pixels greater than 165 35 int xval[2]; //variable that is use to determine the min /max pixel location of the obj in ROI set by yval in edge image //file name variables string run=("Run"); int runtime=1; IplImage* edgeim = cvCreateImage(cvSize(320,240),IPL_DEPTH_8U,1);//edge image used by multiple functions //*********************OPENGL VARS***************************************** int time=0; int x_translate=0; int y_translate=0; int translate_x=0; int translate_y=0; static GLfloat spinL =45.0; static GLfloat spinR=90.0; static GLfloat spin=0.0; GLubyte imagedata[320][240][3]; GLubyte imagedata2[320][240][3]; //function prototypes void findCorners(IplImage*); int findCorrPoints(); void getDepthMap(); int ObjectDetect(); void GetObjVert(IplImage *); void GetObjHoriz(IplImage *); int isPopcorn(IplImage* image1); void Reset(); void init(void); void loadTextures(void); void robot(void); void obj(void); void display(void); void action(void); void reshape(int, int); void mouse(int, int , int , int ); void spinDisplay(void); int opengl(); //main function int main(int argc,char** argv) { cvInitFont(font,2, 0.25,0.25 );//initialize font for drawing on images CvCapture *cam1 = cvCaptureFromCAM(NULL);//select camera CvCapture *cam2 = cvCaptureFromCAM(NULL);//only used if a second camera is connected IplImage* Grabimage=cvCreateImage(cvSize(320,240),IPL_DEPTH_8U,3); while(true) { //set all variables for filenames for saving/loading images run.insert(3,itoa(runtime,b,10)); file1.insert(0,run); file2.insert(0,run); file3.insert(0,run); file4.insert(0,run); file5.insert(0,run); 
file6.insert(0,run); file7.insert(0,run); file8.insert(0,run); file9.insert(0,run); f1=file1.c_str(); f2=file2.c_str(); f3=file3.c_str(); 36 f4=file4.c_str(); f5=file5.c_str(); f6=file6.c_str(); f7=file7.c_str(); f8=file8.c_str(); f9=file9.c_str(); int move = 0; cout<<"RUN "<<runtime<<endl; //capture first image and find corners Grabimage=cvQueryFrame( cam1 ); cvSaveImage(f8,Grabimage); findCorners(Grabimage); //move camera, capture 2nd image, move camera back system("nqc -pgm 1 -run"); Sleep(300); Grabimage=cvQueryFrame( cam1 ); system("nqc -pgm 2 -run"); //find corners in 2nd image findCorners(Grabimage); //process both images int cornercheck; cornercheck=findCorrPoints(); if (cornercheck==0) { openglval[runtime-1][0]=0; openglval[runtime-1][1]=0; openglval[runtime-1][2]=320; openglval[runtime-1][3]=240; openglval[runtime-1][4]=5; Reset(); move=0; cout<<"corners less than 8"<<endl; cout<<"Robot moved left"<<endl; system("nqc -pgm 5 -run"); Sleep(1000); runtime++; continue; } getDepthMap(); move = ObjectDetect(); //move robot based on information from ObjectDetect if (move ==1) { system("nqc -pgm 3 -run"); cout<<"Robot moved forward"<<endl<<endl; } else if (move==2) { system("nqc -pgm 5 -run"); cout<<"Robot moved left"<<endl<<endl; } else if (move==3) { system("nqc -pgm 4 -run"); cout<<"Robot moved right"<<endl<<endl; } else { cout << "Popcorn Box Found. Program Completed." << endl; break; } //release images, reset used global variables for next set of images Reset(); move=0; 37 //pause before next run Sleep(1000); runtime++; } //release images and camera cvReleaseCapture(&cam1); cvReleaseCapture(&cam2); opengl(); } void findCorners(IplImage* image) { //function findCorners will find the corners in an image cout<<"picture taken"<<endl; IplImage* image1 = cvCloneImage(image); IplImage* gray=cvCreateImage(cvSize(image1->width,image1>height),IPL_DEPTH_8U,1); IplImage* eigImage = cvCreateImage(cvGetSize(image1),IPL_DEPTH_32F,1); IplImage* tempImage = cvCreateImage(cvGetSize(image1),IPL_DEPTH_32F,1); CvPoint2D32f corner[500]; //find corners in image cvCvtColor(image1,gray,CV_BGR2GRAY); cvFlip(gray,NULL,0); cvGoodFeaturesToTrack(gray,eigImage,tempImage,corner,&cc,GF_QL,10.0,NULL); //draws square boxes around corners for( int i=0;i<cc;i++) { CvPoint pt1,pt2; pt1.x=corner[i].x-5; pt1.y=240-corner[i].y-5; pt2.x=corner[i].x+5; pt2.y=240-corner[i].y+5; CvScalar y,z; z.val[0]=300.00; y.val[0]=50.0; z.val[1]=100.00; y.val[1]=50.0; z.val[2]=100.00; y.val[2]=500.0; z.val[3]=300.00; y.val[3]=300.0; if (n=='1')//only draw all corners on FIRST frame, COLOR image { cvRectangle(image1,pt1,pt2,z); cvPutText(image1,itoa(i+1, b,10),cvPoint(pt1.x,pt1.y),font,y); } if (corner[i].x<1 ) { corner[i].x=corner[i].y=0; } outFile<<corner[i].x<<" "<<corner[i].y<<endl; //writes corner vector to output file } outFile<<-100<<endl; //creates marker for visual check of addtional output file outFile<<endl; //file.insert(9,1,f); if (n=='1') { cvSaveImage(f1,image1); cvSaveImage(f3,gray); } else { cvSaveImage(f2,image1); cvSaveImage(f4,gray); } n++; //release temporary images 38 cvReleaseImage(&image1);cvReleaseImage(&gray);cvReleaseImage(&eigImage);cvRelea seImage(&tempImage); } int findCorrPoints() { //function findCorrPoints finds corresponding points between two images. 
IplImage* im1=cvLoadImage(f3,0); //loads frame 1 IplImage* im2=cvLoadImage(f4,0); //loads frame 2 IplImage* tempim1=cvCreateImage(cvSize(im1->width,im1->height),IPL_DEPTH_8U,1); IplImage* tempim2=cvCreateImage(cvSize(im1->width,im1->height),IPL_DEPTH_8U,1); IplImage* tim1=cvCreateImage(cvSize(10,10),IPL_DEPTH_8U,1); IplImage* tim2=cvCreateImage(cvSize(10,10),IPL_DEPTH_8U,1); IplImage* result=cvCreateImage(cvSize(1,1),IPL_DEPTH_32F,1); tempim1=cvCloneImage(im1); //copy frame 1 to tempim1 tempim2=cvCloneImage(im2); CvRect rec,rec2; //initialization of rectangle region to scan for in images rec.height=10;rec.width=10; // rec2.height=10;rec2.width=10;// const int index=500; //rectangular region 1 to use for comparision of regions of image 2 double min[index],min2[index]; // int min_corners1[index]; int min_corners2[index]; int min_corners[index]; CvScalar y,z,crnclr; //colors used in drawing on images z.val[0]=300.00; y.val[0]=50.0; z.val[1]=100.00; y.val[1]=50.0; z.val[2]=100.00; y.val[2]=500.0; z.val[3]=300.00; y.val[3]=300.0; crnclr.val[0]=100.00; crnclr.val[1]=300.00; crnclr.val[2]=300.00; crnclr.val[3]=300.00; //detect edges of first frame cvCanny(im1,edgeim,30,500); cvSaveImage(f6,edgeim); //reads in corner file ifstream inFile; inFile.open("corner.dat"); int i=0; int check=0; while(check==0) { inFile>>cr1[i].x; if (cr1[i].x==-100) { cr1[i].x=NULL; check=1; break; } inFile>>cr1[i].y; corners1 = i; i++; } i=0; check=0; while(check==0) { inFile>>cr2[i].x; inFile>>cr2[i].y; corners2 = i; 39 if (cr2[i].x==-100) { check=1; } i++; } //initialize array min and min_corners to 1000, and -1 for(int i=0; i<index;i++) { min[i]=1000; min2[i]=1000; min_corners1[i]=-1; min_corners2[i]=-1; min_corners[i]=-1; } //match corners of first image to those of second image: for (int i=0;i<=corners1-1;i++) { rec.x=cr1[i].x-5;rec.y=cr1[i].y-5; if(rec.x<=6) rec.x=6; if(rec.x>=310) rec.x=310; if(rec.y<=6) rec.y=6; if(rec.y>=230) rec.y=230; CvScalar z; z.val[0]=100.00;z.val[1]=100.00;z.val[2]=100.00;z.val[3]=300.00; cvSetImageROI(tempim1,rec); cvCopy(tempim1,tim1); //find best matches of a given template image ,in this case the corner we want to find its corresponding pair for(int j=0;j<=corners2-1;j++) { rec2.x=cr2[j].x-5;rec2.y=cr2[j].y-5; if(rec2.x<=6) rec2.x=6; if(rec2.x>=310) rec2.x=310; if(rec2.y<=6) rec2.y=6; if(rec2.y>=230) rec2.y=230; cvSetImageROI(tempim2,rec2); cvCopy(tempim2,tim2); //cvMatchTemplate is used to find corresponding points using square difference method cvMatchTemplate(tim2,tim1,result,CV_TM_SQDIFF_NORMED); double min_val; CvPoint loc; cvMinMaxLoc(result,&min_val,NULL,&loc,NULL); if (min_val<min[i] && min_val<CP_Threshold) { min[i]=min_val; min_corners1[i]=j+1; } } //compare min corners for best match if more than one corner in the second image is selected as a matching corner in the first image int val1,val2; val1=min_corners1[i]; for(int p=0; p<i;p++) { val2=min_corners1[p]; if (val1==val2) { if (min[i]<min[p]) min_corners1[p]=-1; } } } 40 //now match corners of second image to those in first image for (int i=0;i<=corners2-1;i++) { rec.x=cr2[i].x-5;rec.y=cr2[i].y-5; if(rec.x<=6) rec.x=6; if(rec.x>=310) rec.x=310; if(rec.y<=6) rec.y=6; if(rec.y>=230) rec.y=230; CvScalar z; z.val[0]=100.00;z.val[1]=100.00;z.val[2]=100.00;z.val[3]=300.00; cvSetImageROI(tempim2,rec); cvCopy(tempim2,tim2); //find best matches of a given template image ,in this case the corner we want to find its corresponding pair for(int j=0;j<=corners1-1;j++) { rec2.x=cr1[j].x-5;rec2.y=cr1[j].y-5; if(rec2.x<=6) rec2.x=6; 
if(rec2.x>=310) rec2.x=310; if(rec2.y<=6) rec2.y=6; if(rec2.y>=230) rec2.y=230; cvSetImageROI(tempim1,rec2); cvCopy(tempim1,tim1); //cvMatchTemplate is used to find corresponding points using square difference method cvMatchTemplate(tim1,tim2,result,CV_TM_SQDIFF_NORMED); double min_val; CvPoint loc; cvMinMaxLoc(result,&min_val,NULL,&loc,NULL); if (min_val<min2[i] && min_val<CP_Threshold) { min2[i]=min_val; min_corners2[i]=j+1; } } //compare min corners for best match if more than one corner in the second image is selected as a matching corner in the first image int val1,val2; val1=min_corners2[i]; for(int p=0; p<i;p++) { val2=min_corners2[p]; if (val1==val2) { if (min2[i]<min2[p]) min_corners2[p]=-1; } } } for (int i=0;i<corners1;i++) { int val1,val2; val1=min_corners1[i]; val2=min_corners2[val1-1]; if (val2==i+1) min_corners[i]=val1; } //draw corresponding points on second frame IplImage* im2out=cvLoadImage(f2,3); CvPoint pt1,pt2; for( int i=0;i<=corners2;i++) { if(min_corners[i]>0)//if corner is valid { 41 //CvPoint pt1,pt2; pt1.x=cr2[min_corners[i]-1].x-5; pt1.y=cr2[min_corners[i]-1].y-5; pt2.x=cr2[min_corners[i]-1].x+5; pt2.y=cr2[min_corners[i]-1].y+5; cvRectangle(im2out,pt1,pt2,z); cvPutText(im2out,itoa(i+1, b,10),cvPoint(pt1.x,pt1.y),font,crnclr); cornersMatched1[k][0]=cr1[i].x; cornersMatched1[k][1]=cr1[i].y; cornersMatched2[k][0]=cr2[i].x; cornersMatched2[k][1]=cr2[i].y; k++; } } cvSaveImage(f2,im2out); if (cornersMatched2[8][0]== 0 && cornersMatched2[8][1]==0) return 0; return 1; } void getDepthMap() { //function getDepthMap gets the depth map from the two rectified images and the fundamental matrix CvMatrix3 FMAT; CvMat* points1=cvCreateMat(1,2*k,CV_32FC2); CvMat* points2=cvCreateMat(1,2*k,CV_32FC2);; CvMat* f_mat=cvCreateMat(3,3,CV_32FC1); int *scanlines1; int *scanlines2; int *lengths1=0; int *lengths2=0; int line_count=0; uchar* buffer_1; uchar* buffer_2; IplImage* im1=cvCreateImage(cvSize(320,240),IPL_DEPTH_8U,3); IplImage* im2=cvCreateImage(cvSize(320,240),IPL_DEPTH_8U,3); IplImage* gim1=cvCreateImage(cvSize(320,240),IPL_DEPTH_8U,1); IplImage* gim2=cvCreateImage(cvSize(320,240),IPL_DEPTH_8U,1); IplImage* depthim=cvCreateImage(cvSize(320,240),IPL_DEPTH_8U,1); IplImage* depthImageScaled=cvCreateImage(cvSize(320,240),IPL_DEPTH_8U,1); IplImage* mergedim=cvCreateImage(cvSize(320,240),IPL_DEPTH_8U,1); im1=cvLoadImage(f1,1); im2=cvLoadImage(f2,1); //calculate fundamental matrix from corresponding points for (int i=0;i<k;i++) { points1->data.fl[2*i]=cornersMatched1[i][0]; points1->data.fl[2*i+1]=cornersMatched1[i][1]; points2->data.fl[2*i]=cornersMatched2[i][0]; points2->data.fl[2*i+1]=cornersMatched2[i][1]; } int f_check=cvFindFundamentalMat( points1,points2,f_mat,CV_FM_8POINT); //convert fundamental matrix of type cvMat into type cvMatrix3 for (int i=0;i<3;i++) { FMAT.m[i][0]=f_mat->data.fl[i*3]; FMAT.m[i][1]=f_mat->data.fl[i*3+1]; FMAT.m[i][2]=f_mat->data.fl[i*3+2]; } //get scanlines of image from fundamental matrix cvMakeScanlines( &FMAT, cvSize(320,240), 0, 0, 0, 0, &line_count);//this gets the number of scanlines 42 scanlines1 = (int*)(calloc( line_count* 2, sizeof(int) * 4)); // | scanlines2 = (int*)(calloc( line_count * 2, sizeof(int) * 4)); //allocating memory for parameters lengths1 =(int*)(calloc( line_count* 2, sizeof(int) * 4)); //of cvMakeScanlines lengths2 = (int*)(calloc( line_count* 2, sizeof(int) * 4)); // | cvMakeScanlines(&FMAT,cvSize(320,240),scanlines1, scanlines2,lengths1,lengths2,&line_count); //rectify images buffer_1 = (uchar*)(malloc(320*line_count*3)); 
//allocate memory for parameters of buffer_2 = (uchar*)(malloc(320*line_count*3)); //cvPreWarpImage cvPreWarpImage(line_count,im1,buffer_1,lengths1,scanlines1); cvPreWarpImage(line_count,im2,buffer_2,lengths2,scanlines2); //find depthmap from rectified images cvCvtColor(im1,gim1,CV_BGR2GRAY); //cvFindStereoCorrespondence requires cvCvtColor(im2,gim2,CV_BGR2GRAY); //rectified grayscale images cvFindStereoCorrespondence(gim1,gim2,CV_DISPARITY_BIRCHFIELD,depthim, 255, 25, 5, 12, 12, 25); cvCopyImage(depthim,depthImageScaled,0); cvConvertScale(depthim, depthImageScaled, 2.5); cvSaveImage(f5,depthImageScaled); //combine edge image from canny edge detector and the depth map image into new image cvOr(depthImageScaled,edgeim,mergedim); cvSaveImage(f7,mergedim); //release images cvReleaseImage(&im1); cvReleaseImage(&im2); cvReleaseImage(&gim1); cvReleaseImage(&gim2); cvReleaseImage(&depthim); cvReleaseImage(&depthImageScaled); cvReleaseImage(&mergedim); } int ObjectDetect() { //function ObjectDetect will use information from the edge lines from the canny //edge detector and the depth map to detect objects, their locations and also //determine a plan of action. IplImage* depthim=cvCreateImage(cvSize(320,240),IPL_DEPTH_8U,1); IplImage* im=cvCreateImage(cvSize(320,240),IPL_DEPTH_8U,0); CvRect rec; CvPoint pt1; CvPoint pt2; CvScalar z; //color for drawing on image frame z.val[0]=100.00; z.val[1]=400.00; z.val[2]=100.00; z.val[3]=300.00; //check for image with no edges. //image with no edges --> very large object //is right in front of camera. int edgecheck=0; for (int j=0;j<240;j++){ for (int i=0;i<320;i++){ CvScalar val=cvGet2D(edgeim,j,i); if (val.val[0]==255) edgecheck++; } } if (edgecheck <20) //less than 20 edge pixels { openglval[runtime-1][0]=0; openglval[runtime-1][1]=0; openglval[runtime-1][2]=320; 43 openglval[runtime-1][3]=240; openglval[runtime-1][4]=2; return 2; } im=cvLoadImage(f1,1); depthim=cvLoadImage(f5,0); //find an object and determine its size/location GetObjVert(depthim); if (yval[0]==1000 && yval[1]==-1) return 1; rec.x=0; rec.y=yval[0]; rec.width=320; rec.height=yval[1]-yval[0]; cvSetImageROI(edgeim,rec); GetObjHoriz(depthim); //if one of object corner points is invalid, no object in path, move forward if ( (xval[0]==1000) || (xval[1]==-1) || (yval[0]==1000) ||(yval[1]==-1) || (xval[1]-xval[0]<20)) { openglval[runtime-1][5]=1;//save variables to map in opengl return 1;//send move to be 1 } //object corner points are valid, therefore object is in path //draws rectangle of the obj that is detected pt1.x=xval[0]; pt1.y=yval[0]; pt2.x=xval[1]; pt2.y=yval[1]; cvRectangle(im,pt1,pt2,z,3); cvSaveImage(f9,im); openglval[runtime-1][0]=xval[0];//save variables to map in opengl openglval[runtime-1][1]=yval[0]; openglval[runtime-1][2]=xval[1]; openglval[runtime-1][3]=yval[1]; //first check for stopping condition IplImage* Frame1 = cvLoadImage(f8,1); int end = isPopcorn(Frame1); if (end) { openglval[runtime-1][4]=0; return 0; } //robot should not stop, determine action if (xval[0] >Move_Threshold) { openglval[runtime-1][4]=2; return 2;//send move to be 2 } else { openglval[runtime-1][4]=3; return 3;//send move to be 3 } cvResetImageROI(edgeim); cvReleaseImage(&im); cvReleaseImage(&depthim); } void GetObjVert(IplImage * dim) { //function GetObjVert gets the vertical bounds of an object //in the image in pixels and saves it to yval[]. //this function is called by ObjectDetect(). 
int min_val,max_val; CvScalar temp; 44 min_val=1000; max_val=-1; for(int i=0; i<320;i++) { for(int j=0;j<240;j++) { temp=cvGet2D(dim,j,i); //check depthmap to see if object is close to robot if( temp.val[0] >= G_Threshold) { min_val=min(min_val,j); max_val=max(max_val,j); } } } yval[0]=min_val; yval[1]=max_val; } void GetObjHoriz(IplImage * dim) { //function GetObjHoriz gets the horizontal bounds of an object //in the image in pixels and saves it to xval[]. //this function is called by ObjectDetect(). int min_val,max_val; min_val=1000; max_val=-1; CvScalar temp,temp2; for(int i=0; i<320;i++) { for(int j=0;j<edgeim->roi->height;j++) { temp=cvGet2D(edgeim,j,i); temp2=cvGet2D(dim,j,i); //if there is an edge if (temp.val[0]==255) { //check depthmap to see if object is close to robot if(temp2.val[0]>=G_Threshold) { min_val=min(min_val,i); max_val=max(max_val,i); } //erase edges on edge image that are not close //to robot. else if(temp2.val[0]<100) cvSet2D(edgeim,j,i,cvScalar(0)); } } xval[0]=min_val; xval[1]=max_val; } } void Reset() { //function Reset will reinitialize certain global variables //to their appropriate initial values to be used in the next //run of the program. for (int i=0; i<500;i++) { cr1[i].x=0; cr1[i].y=0; cr2[i].x=0; 45 cr2[i].y=0; cornersMatched1[i][0]=0; cornersMatched1[i][1]=0; cornersMatched2[i][0]=0; cornersMatched2[i][1]=0; } k=0; corners1=0; corners2=0; xval[0]=1000; xval[1]=-1; yval[0]=1000; yval[1]=-1; n='1'; file1=("FrameImage1corners.bmp"); file2=("FrameImage2corners.bmp"); //filenames for saved images file3=("GrayImage1.bmp"); file4=("GrayImage2.bmp"); file5=("scaleddepthimage.bmp"); file6=("edges.bmp"); file7=("depth&edge.bmp"); file8=("FrameImage1.bmp"); file9=("objdetect.bmp"); run=("Run"); cvReleaseImage(&edgeim); edgeim = cvCreateImage(cvSize(320,240),IPL_DEPTH_8U,1); } int isPopcorn(IplImage* image1) { //function isPopcorn will scan through the input image and try to match //a region to a template image to determine whether or not the popcorn //box is present in front of the robot. IplImage* popcorn = cvLoadImage("Image/popcornsmall.bmp",3); IplImage* result = cvCreateImage(cvSize(219,204),IPL_DEPTH_32F,1); //first scan for smallest template. cvMatchTemplate(image1,popcorn,result,CV_TM_CCORR_NORMED); double max_val; cvMinMaxLoc(result,NULL,&max_val,NULL,NULL); if (max_val>PC1_Threshold) { cvReleaseImage(&popcorn); cvReleaseImage(&result); return 1;//return 1 --> popcorn found. } //scan for medium template. popcorn = cvLoadImage("Image/popcornmedium.bmp",1); result = cvCreateImage(cvSize(176,189),IPL_DEPTH_32F,1); cvMatchTemplate(image1,popcorn,result,CV_TM_CCORR_NORMED); cvMinMaxLoc(result,NULL,&max_val,NULL,NULL); if (max_val>PC2_Threshold) { cvReleaseImage(&popcorn); cvReleaseImage(&result); return 1;//return 1 --> popcorn found. } //scan for large template. popcorn = cvLoadImage("Image/popcornlarge.bmp",3); result = cvCreateImage(cvSize(165,185),IPL_DEPTH_32F,1); cvMatchTemplate(image1,popcorn,result,CV_TM_CCORR_NORMED); cvMinMaxLoc(result,NULL,&max_val,NULL,NULL); if (max_val>PC3_Threshold) { cvReleaseImage(&popcorn); cvReleaseImage(&result); return 1;//return 1 --> popcorn found. } cvReleaseImage(&popcorn); cvReleaseImage(&result); return 0;//return 0 --> popcorn not found. 
} 46 int opengl() { //int* argc=0; char** argv=opengl; glutInit(&__argc,__argv); glutInitDisplayMode (GLUT_DOUBLE | GLUT_RGB); glutInitWindowSize (640, 480); glutInitWindowPosition (100, 100); glutCreateWindow (__argv[0]); init (); glutDisplayFunc(display); glutReshapeFunc(reshape); glutMouseFunc(mouse); if(runtime==time) exit(0); glutMainLoop(); return 0; /* ANSI C requires main to return int. */ } /* void loadTextures(void) { IplImage* im=cvLoadImage("Image/robot.bmp",1); int imageWidth=320; int imageHeight=240; for(int m=0; m<320;m++) { for(int n=0; n<240;n++) { CvScalar q=cvGet2D(im,n,m); imagedata[m][n][0]=q.val[0]; imagedata[m][n][1]=q.val[1]; imagedata[m][n][2]=q.val[2]; } } glBindTexture (GL_TEXTURE_2D, 0); glPixelStorei (GL_UNPACK_ALIGNMENT, 1); glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_REPEAT); glTexParameteri (GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_REPEAT); glTexParameteri (GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR); glTexParameteri (GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR); glTexEnvf(GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE, GL_DECAL); glTexImage2D (GL_TEXTURE_2D, 0, GL_RGB, imageWidth, imageHeight, 0, GL_RGB, GL_UNSIGNED_BYTE, imagedata); for (int i=1; i<=runtime;i++) { run.insert(3,itoa(i,b,10)); file8.insert(0,run); f8=file8.c_str(); im=cvLoadImage(f8,1); CvRect roi; roi.x=openglval[i-1][0]; roi.width=openglval[i-1][2]-openglval[i-1][0]; roi.y=openglval[i-1][1]; roi.height=openglval[i-1][3]-openglval[i-1][1]; for(int m=0; m<roi.width;m++) { for(int n=0; n<roi.height;n++) 47 { cvSetImageROI(im,roi); CvScalar q=cvGet2D(im,n,m); imagedata2[m][n][0]=q.val[0]; imagedata2[m][n][1]=q.val[1]; imagedata2[m][n][2]=q.val[2]; } } imageWidth=openglval[i-1][2]-openglval[i-1][0]; imageHeight=openglval[i-1][3]-openglval[i-1][1]; glBindTexture (GL_TEXTURE_2D, i); glPixelStorei (GL_UNPACK_ALIGNMENT, 1); glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_REPEAT); glTexParameteri (GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_REPEAT); glTexParameteri (GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR); glTexParameteri (GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR); glTexEnvf(GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE, GL_DECAL); glTexImage2D (GL_TEXTURE_2D, 0, GL_RGB, imageWidth, imageHeight, 0, GL_RGB, GL_UNSIGNED_BYTE,imagedata2 ); run=("Run"); file8=("FrameImage1.bmp"); } }*/ void robot(void) { glPushMatrix(); glColor3f(1,1,0); glRotatef(spin,0.0,0.0,0.1); glTranslatef(x_translate,y_translate,0); glRectf(-5.0, -5.0, 5.0, 5.0); glPopMatrix(); } void obj(void) { glPushMatrix(); glTranslatef(translate_x,translate_y,0); //glBindTexture (GL_TEXTURE_2D, time); glRectf(-5.0, -5.0, 5.0, 5.0); glPopMatrix(); } void display(void) { glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT); //glEnable(GL_TEXTURE_2D); //loadTextures(); if(time==0) { robot(); glFlush(); glutSwapBuffers(); } else if(time==runtime+1) exit(1000); else { if (openglval[time+1][4]==1) { glPushMatrix(); glColor3f(1,1,0); 48 robot(); glPopMatrix(); glFlush(); glutSwapBuffers(); } if (openglval[time+1][4]==2) { translate_x=x_translate-15; translate_y=y_translate+15; glColor3f(1.0,0, 0); obj(); spinDisplay(); glColor3f(1,1,0); robot(); glFlush(); glutSwapBuffers(); } if (openglval[time+1][4]==3) { translate_x=x_translate+10; translate_y=y_translate+15; glColor3f(1.0,0, 0); obj(); spinDisplay(); glColor3f(1,1,0); robot(); glFlush(); glutSwapBuffers(); } if (openglval[time+1][4]==0) //end of program, popcorn found { robot(); translate_y=y_translate+15; glColor3f(0,0,1); obj(); glFlush(); glutSwapBuffers(); } if 
(openglval[time+1][4]==5) //robot didnt find 8 corners { spinDisplay(); robot(); glFlush(); glutSwapBuffers(); } } } void mouse(int button, int state, int x, int y) { switch (button) { case GLUT_LEFT_BUTTON: if (state == GLUT_DOWN) { time++; display(); } break; default: break; } } 49 void spinDisplay(void) { if (openglval[time][4]==2 || openglval[time][4]==5) { spin=spinL; glutPostRedisplay(); } else if (openglval[time][4]==3) { spin=spinR; glutPostRedisplay(); } else spin=0.0; } void init(void) { glClearColor(0,0,0,0); run=("Run"); file8=("FrameImage1.bmp"); } void reshape(int w, int h) { glViewport (0, 0, (GLsizei) w, (GLsizei) h); glMatrixMode(GL_PROJECTION); glLoadIdentity(); glOrtho(-50.0, 50.0, -50.0, 50.0, -1.0, 1.0); glMatrixMode(GL_MODELVIEW); glLoadIdentity(); } 50 X. CODE b. NQC Code start.nqc: moves robot in a straight line forward a set distance. #define left OUT_C //defines the variable left as the output C of the rcx #define right OUT_A //defines the variable left as the output A of the rcx task main() { SetPower(left,10); SetPower(right,6); On(left); //turns on the left motor of the rcx On(right); //turns on the right motor of the rcx Wait(100); Off(left+right); } moveleft.nqc: turn robot to the left by 90 degrees. #define left OUT_C //defines the variable left as the output C of the rcx #define right OUT_A //defines the variable left as the output A of the rcx task main() { SetPower(right,7); SetPower(left,10); OnRev(left); //turns on the left motor of the rcx On(right); //turns on the right motor of the rcx Wait(65); Off(left+right); } moveright.nqc: turn robot to the right by 90 degrees. #define left OUT_C //defines the variable left as the output C of the rcx #define right OUT_A //defines the variable left as the output A of the rcx task main() { SetPower(right,7); SetPower(left,9); On(left); //turns on the left motor of the rcx OnRev(right); //turns on the right motor of the rcx Wait(65); Off(left+right); } camright.nqc: move camera on robot to the right a set distance. task main() { SetPower (OUT_B,10); On (OUT_B); Wait(35); Off(OUT_B); } camleft.nqc: move camera on robot to the left a set distance. task main() { SetPower (OUT_B,10); OnRev (OUT_B); Wait(35); Off(OUT_B); } 51 X. CODE c. Removed FindGroups() C++ Function Code void FindGroups() //note: function no longer used. { // This function groups neighboring corners by a certain distance togethter // in a group and finds the min and max x and y value for that group double max_dist=40.0; //max distance between grouped corners int g=1; int checker=0; CvPoint2D32f min_val; CvPoint2D32f max_val; for(int i=0; i<=500;i++) { min_val.x=cr1[i].x; min_val.y=cr1[i].y; max_val.x=cr1[i].x; max_val.y=cr1[i].y; for(int j=i+1;j<500;j++) { if( cr1[j].x!=0 && cr1[j].y!=0) { //compare distance between corners if (sqrt( pow( (cr1[j].x-cr1[i].x) ,2)+ pow( (cr1[j].ycr1[i].y) ,2) )< max_dist) { if (group[j]<=0) { group[i]=g; group[j]=g; checker=1; //sets min max values for each group window min_val.x=min(min_val.x,cr1[j].x); min_val.y=min(min_val.y,cr1[j].y); max_val.x=max(max_val.x,cr1[j].x); max_val.y=max(max_val.y,cr1[j].y); } } } } if (checker==1) { min_max_of_group[g-1][0]=min_val; min_max_of_group[g-1][1]=max_val; g++; checker=0; } } } 52 XI. Individual Contributions About 80-90% of our code was written with one of us sitting at the computer typing with the other person looking over his shoulder pointing out all the little typos made. 
We took turns at this, so we are about 50-50 when it comes to the amount of code written (aside from the OpenGL work done by Giovanni). Here is a list of the parts of the project that were done separately; everything not mentioned should be assumed to have been done cooperatively.

Code:
Giovanni created the original capture method using the cvCam functions and the appropriate callback functions.
Mike created the revised capture method using the cvCaptureFromCAM functions.
Giovanni created the original FindGroups() function.
Mike wrote the isPopcorn() function, which Giovanni later revised.
Giovanni wrote all code involving OpenGL.

Report:
Giovanni typed up the goals, the OpenCV initialization, and the algorithms used.
Mike typed up the mechanical design, the results, the limitations / areas for improvement, and the error / threshold analysis sections.
Mike commented most of the code to make it look neat and fit into our report.
Mike also typed up the second and third parts of the OpenCV function descriptions.

OpenCV Installation:
Giovanni found out what needed to be installed.
Mike downloaded some installation files from the Microsoft website.

Robot Construction and Design:
Giovanni came up with the initial idea of how to mount the camera onto the robot.
Mike designed and built the new wheel base.
Giovanni designed and built the reinforcements for the camera mount and the camera cable holder.
Mike designed and built the IR tower holder.

XII. Robots in Society and Economics Paper

The robotic field has been around for quite some time now, but has remained one of those little-known fields. Usually, when one thinks of robots, one thinks of either a sci-fi movie robot or a children's toy. Robots have now hit the mainstream and are showing up in different sectors of the market today. A few of these sectors are the military, medical, and manufacturing/consumer sectors.

World militaries have always relied upon human forces to get the job done. However, today's army is spending more and more money on each soldier sent to war. First, soldiers must be trained, at whatever cost necessary, before they can do anything. While in the field, the soldiers must be equipped with the latest gear that will help keep the good guys alive and help eliminate the bad guys. As technology progresses, so does the cost of this equipment. At the end of the day, these soldiers can be very expensive for the taxpaying citizens. Robots can do the more dangerous tasks at a different price. The main cost of a robot comes from its research and development; after that, all that is left is the individual price of the unit and its maintenance costs. These robots do not need to be trained or fed. After all this, the military may have realized that it would be cheaper in the long run to lose a robot to a sniper or to a bomb than it would be to lose a soldier. Recently, "iRobot Corp. …won a $95.6 million contract modification to increase the number of available MTRS [Man Transportable Robotic System bomb-disposal robot] units from 250 to 1,200" while "Foster-Miller Inc. …won a similar $96.1 million contract modification to provide an additional 250 to 1,200 MTRS production units" from the U.S. Department of Defense [6]. This shows how the government is willing to spend more money upfront in order to help save the lives of its soldiers. Because the robots do not require any sort of training period, they can be sent to war as soon as they are manufactured, thus working to save more soldiers' lives.
The government will be able to see a quick return on this investment as fewer soldiers are killed in war, reducing the need to train replacements. The United States government places a high value on human life. It realizes that these soldiers are not just tools used to win a war; they also have families of their own. Not only will a family be devastated by the loss of a loved one, but it will also have to make up for the lost income. This indirectly affects the local economy, as the family is forced to save more and more money just to make ends meet and thus returns less money to the community. The family may also be forced to seek government financial assistance, which would be paid for by the taxpayers. No matter what happens, society as a whole will feel the burden of the lost soldier(s).

The U.S. government also has a strong degree of national pride in its citizens and soldiers in foreign lands. The idea of 'no man left behind' has caused the whole nation to go to great lengths for a handful of people in need. We would be willing to make great sacrifices just to release one or two prisoners of war, which could even cause more loss of life. One way to prevent the unnecessary loss of human life on the battlefield would be to use robots as replacement soldiers or to clear the way for human soldiers. The iRobot Corp. has devised a system that is "a remote, deployable sensor suite designed to provide early warning information, gunshot detection, intelligence, surveillance, and targeting capabilities to military forces." [7] This sniper-detection robot would be able to assist soldiers in locating and attacking enemy soldiers, effectively keeping themselves alive. "Snipers have had the advantage of being effectively invisible, making them a deadly threat on the battlefield and in urban settings." [7] The sniper robot will put U.S. soldiers at more of an advantage with the additional intelligence. While the military may lose a few robots here and there at war, most of the country would gladly pay the price of a few pounds of metal over the price of losing a loved one. At the same time, iRobot's bomb-detection robot can achieve the same goal of saving human lives. In cases where the situation is simply too dangerous to risk a human life, a robot can excel. Their MTRS robot will be used for "remote reconnaissance of unexploded ordinance and improvised explosive devices." [6] This robot will be able to do the jobs that no human would be willing to do, soldier or otherwise.

One of the major contributors to the advancement of robotic technology for military applications has been the United States government. The Defense Advanced Research Projects Agency (DARPA) held a $1 million competition for any research team that could design and build an automated vehicle to drive across a desert track. "The point of the Grand Challenge was not to produce a robot that the military could move directly to mass production" [5]; its real goal was to motivate researchers to think in new directions and further improve the technology available. Participating in the Grand Challenge were teams from universities across the country such as Carnegie Mellon University, Stanford, Princeton, Cornell, UCLA, MIT, Caltech, and Virginia Tech. The winning team was from Stanford, showing that universities can advance technology just as well as private companies.

In the industrial sectors, robots have become an accepted part of the work day. A sector that is new to robotics is the medical sector.
One of the driving forces of this change is economics. While the robots carry a very high price tag, "about $1.3 million apiece" [1], they are able to perform surgeries with such greater precision that the price becomes a bargain. Intuitive Surgical's da Vinci robotic system has "transformed the field of prostate surgery, for which it was approved in May 2001. That year it was used in less than 1 percent of all prostatectomies. This year more than 20 percent will be done with the robot. And that figure is expected to double next year." [1] Doing that many operations each year, it is only a matter of time before the hospital turns a large profit. The additional precision provided by the robotic system allows for a less intrusive surgery, which is less dangerous and also has a shorter recovery time; both of these lower the hospital's costs.

The downside to robots entering the medical field is an unwillingness to accept these techniques. The difference between the U.S. and Europe, where people are "less squeamish about such things," is that in Europe "robots have assisted in operations on the brain, the prostate and the inner ear." [8] While part of the problem here in the States is the patients, some doctors think that "once a hospital invests in such an expensive system, surgeons might feel pressured to use it and steer patients toward surgery over other treatment options," and others worry that "some surgeons are reluctant to commit the hours necessary to learn robotic techniques." [1] We can see that the problem lies with the country as a whole and its prejudice against new techniques. Other countries have already accepted robotics as simply a tool to perform a job better, yet the U.S. seems to overlook this fact.

The sector with the most robot market penetration would have to be the industrial or manufacturing sector. Robots of one kind or another have been used in the automation of assembly lines for years. Companies such as Adept Technology and CRS Robotics now sell robots that can automate drug discovery and the assembly of fiber-optic equipment [2]. At the same time, companies like Honda and Sony are taking robots in a new direction. Honda, with its ASIMO, and Sony, with its QRIO, have developed more humanoid robots, designed to be more lifelike in appearance with visible features like eyes, mechanical lungs and artificial lips [3]. These companies are driving Japan in particular to be the world leader in robotics. It has been estimated that the market for 'service robots' will reach $10 billion within a decade. These robots will be used to replace workers in Japan's shrinking workforce. Japan is a country that would be able to adjust to such a change in the workforce, perhaps because "few Japanese have the fear of robots that seems to haunt westerners in seminars and Hollywood films." [4] In our culture, robots are often portrayed as evil monsters, while in Japanese culture robots are seen in a more positive light. This difference in perspective could be what brings the technology to Japan faster than to the United States.

One can see all the areas in which robots are being introduced or expanded upon today, and how the robotics field will continue to change in the future. These changes will bring social and economic effects that involve humans in one way or another, whether personal, governmental, or industrial.
1. Barrett, Jennifer. "Cutting Edge." Newsweek; 12/12/2005, Vol. 146 Issue 24, p50-54.
2. Chediak, Mark. "Robotics Sector Profiles." Red Herring; Dec 2001, Issue 108, p94-95.
3. "Humanoids on the March." Economist; 3/12/2005, Vol. 374 Issue 8417, Special Section p3-4.
4. "Better than People." Economist; 12/24/2005, Vol. 377 Issue 8458, p58-59.
5. Gibbs, W. Wayt. "Innovations from a Robot Rally." Scientific American; Jan 2006, Vol. 294 Issue 1, p64-71.
6. Keller, John. "DOD Issues Two Quick Urgent Orders for Antiterrorism Bomb-Disposal Robots." Military & Aerospace Electronics; Nov 2005, Vol. 16 Issue 11, p22.
7. McHale, John. "iRobot and Photonics Center Create Sniper Detection System for Land Robots." Military & Aerospace Electronics; Nov 2005, Vol. 16 Issue 11, p6-8.
8. "And Now, Robodoc!" Time; 11/23/92, Vol. 140 Issue 21, p23.