FINAL REPORT
By Group 2
Michael Dougher
Giovanni Sosa
TABLE OF CONTENTS
I. Abstract
II. Goals
III. Algorithms
IV. Mechanical Design
V. Results
VI. Error / Threshold Analysis
VII. Limitations / Areas for Improvement
VIII. OpenCV Initialization Steps
IX. OpenCV Functions Used
   a. Report 1
   b. Report 2
   c. Final Report
X. Code
   a. Main C++ Code
   b. NQC Code
   c. Removed FindGroups() Function
XI. Individual Contributions
XII. Robotics in Society and Economics Paper
I. Abstract
Our robotic vision system utilizes uncalibrated stereo images from a single
camera to detect the presence of any objects in the immediate path of the
robot. If an object is present, our robot will decide to avoid this object by
turning to either the left or right. If no object is present or if the object is far
away from the robot, the robot will proceed forward. The robot will
maneuver about its environment until it finds a certain object which will tell
the robot to stop. When it is done, our vision system will display an
animation showing the path of the robot and any objects detected using
OpenGL.
Our system will save every pair of stereo images it takes. It can run
autonomously for an unlimited period of time. The only limiting factors
would be battery life of the robot and hard drive space for the images on the
computer.
II. Goals
This is a list of the steps taken for each interim. It shows how our system was
developed and what changes were made each time.
Our goals for the first interim report were to:
 Build a lego robot and mount the new fusion webcam onto it using
only lego bricks.
 Install OpenCV and get it to work in Microsoft Visual C++ 2005 (see
section VIII. OpenCV Initialization).
 Create a basic method for feature tracking between a set of two images.
Our goals for the second interim report were to:
 Rewrite camera initialization and capture code.
 Update robot design to fit new program methods.
 Enhance corner detection and point correspondence methods.
 Design and create an algorithm to detect depth disparity between a
stereo pair of images.
 Find objects and their locations in the images by way of corner
detection, edge detection and the depth disparity map.
For our final report, our goals are:
 Again update robot design to fit new program methods.
 Further enhance corner detection, point correspondence, and object
detection methods.
 Improve upon camera movement, both mechanically and in NQC.
 Allow program to continuously loop to grant robot more autonomy.
 Create an animated map in OpenGL that will show the movements of
the robot as well as the objects detected in the environment in their
positions relative to the robot.
III. Algorithms
The main algorithm for this interim is the feature tracking algorithm. This
algorithm includes a few smaller sub-algorithms. Our algorithm is as follows:
 Capture first image.
 Find features to track in image 1 by locating corners.
 Capture second image.
 Find features to track in image 2 by locating corners.
 Find corresponding points between the two sets of corners by taking a template of each corner in image 1 and its surrounding pixel values and computing the sum of squared differences (SSD) against each corner and its surrounding pixel values in the second image. If the SSD is less than a threshold we selected, the pair is taken as a corresponding point. The SSD used is the normalized value. The threshold value was selected by running multiple experiments to see what would give the most accurate corresponding points (a short code sketch of this computation appears after this list). The equation for the normalized SSD is as follows:
T( x, y) - I(x + x, y + y)2

R x, y  = x, y 
 T( x, y)2 •  I(x + x, y + y)2
x , y 

x , y 
 Find the fundamental matrix by using the 8-point algorithm. The method for finding the fundamental matrix with the 8-point algorithm is:
A = \begin{bmatrix}
X_1' X_1 & Y_1' X_1 & X_1 & X_1' Y_1 & Y_1' Y_1 & Y_1 & X_1' & Y_1' & 1 \\
X_2' X_2 & Y_2' X_2 & X_2 & X_2' Y_2 & Y_2' Y_2 & Y_2 & X_2' & Y_2' & 1 \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\
X_8' X_8 & Y_8' X_8 & X_8 & X_8' Y_8 & Y_8' Y_8 & Y_8 & X_8' & Y_8' & 1
\end{bmatrix}

where (X_i, Y_i) and (X_i', Y_i') are the i-th pair of corresponding points in the first and second image.

U \, S \, V^{T} = \mathrm{SVD}(A)

F = \begin{bmatrix}
V(1,9) & V(4,9) & V(7,9) \\
V(2,9) & V(5,9) & V(8,9) \\
V(3,9) & V(6,9) & V(9,9)
\end{bmatrix}

where V(:,9) is the last column of V.
 Rectify both images.
 Find the stereo disparity by using the Birchfield and Tomasi disparity method. This method can be seen in the publication: “Depth Discontinuities by Pixel-to-Pixel Stereo”, IJCV, Volume 35, pages 269-293, 1999.
 Compare the disparity map with the corners and edges to find objects and their locations.
 Move the robot according to the object’s presence and location.
 End the program when a certain object is recognized. The matching will be done using normalized correlation. The formula for this is:

R(x,y) = \frac{\sum_{x',y'} T(x',y') \cdot I(x+x',\, y+y')}{\sqrt{\sum_{x',y'} T(x',y')^{2} \cdot \sum_{x',y'} I(x+x',\, y+y')^{2}}}

 Animate a map of the robot’s motion and the objects detected in OpenGL.
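As an illustration of the corner-matching step, here is a minimal sketch of the normalized SSD score defined above. It is not the code from our appendix: the function name, the window half-width w, and the assumption of single-channel grayscale IplImage* inputs are ours, and the loop that tries every corner pair and applies the threshold is omitted.

#include <cv.h>
#include <math.h>

// Hypothetical sketch: normalized SSD between the (2w+1)x(2w+1) window around
// corner (x1,y1) in image 1 and the window around corner (x2,y2) in image 2.
// A smaller score means a better match; a pair is accepted when the score
// falls below the experimentally chosen threshold.
double normalizedSSD(IplImage* img1, IplImage* img2,
                     int x1, int y1, int x2, int y2, int w)
{
    double num = 0.0, sumT = 0.0, sumI = 0.0;
    for (int dy = -w; dy <= w; dy++) {
        for (int dx = -w; dx <= w; dx++) {
            double t = cvGet2D(img1, y1 + dy, x1 + dx).val[0];  // template pixel
            double i = cvGet2D(img2, y2 + dy, x2 + dx).val[0];  // candidate pixel
            num  += (t - i) * (t - i);
            sumT += t * t;
            sumI += i * i;
        }
    }
    return num / sqrt(sumT * sumI);  // the normalized SSD from the formula above
}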
The overall flow of the program is as follows:
Main → Capture Image #1 → findCorners → Capture Image #2 → findCorners → findCorrPoints → getDepthMap → ObjectDetect → GetObjectVert → GetObjectHoriz → Move robot → Reset
This sequence decides how the robot should react to its environment and is repeated until the stopping condition has been met. An illustrative skeleton of this loop is sketched below.
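The following is only a hedged skeleton of the main loop, not our actual main(): the call order follows the flow chart above, the helper calls are left as comments because their real signatures are in the code appendix (section X.a), and the NQC program slot used to slide the camera is an assumption.

#include <cv.h>
#include <highgui.h>
#include <cstdlib>

int main()
{
    CvCapture* capture = cvCaptureFromCAM(-1);                   // single webcam
    bool done = false;
    while (!done) {
        IplImage* left  = cvCloneImage(cvQueryFrame(capture));   // capture image #1
        system("nqc -pgm 2 -run");                                // slide camera along the track (assumed slot)
        IplImage* right = cvCloneImage(cvQueryFrame(capture));   // capture image #2
        // findCorners(left);   findCorners(right);
        // findCorrPoints();    getDepthMap();
        // ObjectDetect();      GetObjectVert();   GetObjectHoriz();
        // done = moveRobot();  // issue NQC move commands, check for the stopping object
        done = true;             // placeholder so this sketch terminates
        cvReleaseImage(&left);
        cvReleaseImage(&right);
    }
    cvReleaseCapture(&capture);
    return 0;
}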
IV. Mechanical Design
Initially, we wanted our robot to be very small in size so that it would be
more maneuverable. We gave it a very compact wheel base and attached the
lego camera to the top of the RCX simply by surrounding the base of the
camera with lego bricks to hold it in place. [Figure 1] shows the overall size of
our original robot design while [Figure 2] shows how the camera was
mounted.
Figure 1: Frontal View of Old Robot.
Figure 2: Camera Mounted onto RCX on Old Robot.
Eventually, our design required us to use a stereo camera setup. To do this
with only one camera required the camera to be able to move horizontally.
Therefore, we constructed a horizontal track on top of the RCX and mounted
the camera to a motor and connected the camera/motor combination to the
track via a guidebar [Figure 3,6]. The camera was attached to the motor
simply by squeezing the motor into the middle of the curved base of the
camera. It is further secured by additional lego bricks both underneath and
above the base of the camera. We found this method of mounting the camera
to be very secure.
Once this was constructed, our very small wheelbase became very
unstable under the increased weight. Therefore, we opted for a wider
wheelbase that made use of the additional motors we had [Figure 4,5]. We
connected two motors to each side of the robot. Both of the motors on each
side of the robot are connected to the same output of the RCX brick. This
way, each of the outputs of the RCX brick would control two motors on one
side of the robot [Figure 4,5].
After the wheelbase was finished, we ran a few test runs of our code. We
found that the weight of the camera and the camera mount near the front of
the robot caused the robot to be very front heavy. When the robot would move
forward and then come to a quick stop, we could see the robot come close to
tipping over forwards. This gave us incentive to build an extension of the
robot forward to give it extra balance and prevent it from tipping [Figure 4,5].
In the end, we connected the light sensor to the B output of the RCX (the
output that the camera motor is connected to). We attached the light to the
front of the robot. This addition was purely cosmetic. Its only contribution to
our robot is to light up when the camera moves [Figure 5].
Figure 3: The Camera Mount on the New Robot.
Figure 5: Frontal View of New Robot.
Figure 4: The Wider Wheel Base of the New Robot.
Figure 6: Top View of New Robot.
When we finished the new robot, we ran the NQC programs to move the
robot. We quickly found that some of the motors behaved differently than
others. Upon closer inspection, we were able to notice a slight difference in
color between some of the motors. We noticed that the brighter colored ones
were the new motors given this semester and the more faded colored motors
were the older motors from previous years. We found that the newer motors were much stiffer than the older motors: the older motors spin fairly easily when forced, whereas the newer motors will only spin a few rotations before coming to a stop, offering a great deal more resistance to spinning. To balance out the
driving of the robot, we put one new motor and one old motor on each side of
the wheel base. Even after this balancing, the robot would still swerve to one
side when told to drive straight. This was easily fixed by lowering the power
output of the appropriate RCX output in the NQC code.
For the final report, we needed to continue updating our robot’s design. The first thing we did was remove the light sensor from the front of the robot. Its presence was purely cosmetic; it had no real function and ended up getting in the way of other important robot movements, so it had to go. Next, after repeatedly running the camera movement programs, we found that the camera would occasionally not move all the way across the track.
It was at this point that we discovered how important it is for the camera to rest on a smooth base. While the gear of the motor must follow the track, the rest of the motor sits on the sides of some smooth lego pieces. It is VERY important that the motor move along a smooth base: if the motor hits any bit of resistance as it moves, it will be slowed down, the camera will not move as far, and the results will degrade. The smooth base can be seen somewhat in Figure 3 as the black, gray and yellow lego pieces.
As we would move the camera, occasionally the weight would shift and
cause the motor’s gear to contact less of the track. To counter this, we needed
to extend the smooth base of the motor farther back above the RCX brick.
However this caused an increase in the overall weight of the camera mount
and so an additional support was needed at the back of the robot to keep the
base level. Figure 7 shows the updated camera base design.
Figure 7: Updated Camera Base Design.
Figure 8: Final Version of Robot with Reinforced
Camera Mount, Cable Holder and IR Tower Holder.
We have also found that the lateral motion of the camera is very dependent on the webcam’s cable. If the cable is too taut, the camera will not be able to move far enough to get a good picture. Therefore, it is important that the computer connected to the robot be moved along with the robot. To help with this, we built an attachment on top of the motor to hold the cable in place when the camera moves [Figure 8].
After our batteries died and were replaced once again, we found that our camera was moving with such power that it would actually break the lego connections holding the camera to the motor. In order to fix this, we had to reinforce the legos around the motor [Figure 8].
In order to reduce the need to worry about the IR tower’s location relative
to the robot, we decided that we needed to attach the IR tower to the robot
itself. This way, the IR tower stays right behind the robot at all times and
maintains a line of sight with the IR receiver inside the RCX brick. This
brings us to our final version of our lego robot [Figure 8]. We are quite happy
with the way it works mechanically.
V. Results
Here are the results from our previous method of feature detection and
corresponding point matching. The initial method of finding features was
reasonable, yet yielded few results [Figure 9,10]. Our method of matching points by using intensity histograms was also very sub-par and rarely matched points correctly between images.
Figure 9: Features Tracked by Previous Method, First Image.
Figure 10: Features Tracked by Previous Method, Second
Image.
Our new code yields much better feature tracking by finding more features and
also by matching the corresponding points much more accurately. We have samples
from 3 different runs of our code showing 3 different scenes. Figures 11-16 show the
features and correspondences for the 3 different scenes. The method shows many
corresponding points. We managed to adjust the algorithm to provide a higher
number of correct corresponding points, while at the same time minimizing the
number of false accepts.
Figure 11: Run#1 Left Image. All Features Displayed.
Figure 12: Run#1 Right Image. Only the Corresponding
Features Displayed.
Figure 13: Run#2 Left Image. All Features Displayed.
Figure 14: Run#2 Right Image. Only the Corresponding
Features Displayed.
Figure 15: Run#3 Left Image. All Features Displayed.
Figure 16: Run #3 Right Image. Only the Corresponding
Features Displayed
The first run shows many features detected in the left image. It also shows many
corresponding points in the right image. The points on the box in the background are
mostly mapped correctly while some of the points in the middle of the calibration
pattern are mapped incorrectly. This error is due to the fact that the local windows
around each of those corners appear almost identical. This run shows how our
algorithm can be confused by repeating patterns like the calibration pattern.
The second run also shows a lot of features detected in the left image. However,
nearly all of the corresponding points in the right image are mapped correctly. This
run shows the potential of our algorithm when it finds a lot of correct matches.
The third run shows a lot of features detected around the dresser, yet it does not
find many features on the side of the dresser itself. The second image only finds 8
corresponding points, 1-2 of which are incorrect (one point shown is on the wrong button of the TV, while the other button is labeled off the image and cannot be
compared effectively). So few points are found because the right image moves so
that the camera is mostly focused on the side of the dresser, where very few features
are found. This run shows how our algorithm can be inaccurate.
After we have the corresponding points for the stereo pair, we next calculate the
depth map. The depth map shows the approximate depth of the points in the scene.
Figures 17-19 show the depth maps for the three runs. The depth maps shown are
scaled versions of their original image.
Figure 17: Run#1 Depth Map
Figure 18: Run#2 Depth Map
Figure 19: Run#3 Depth Map
All of the depth maps have one thing in common: they are noisy. This noise can
come from the error in correspondence or from the approximation of the fundamental
matrix or even from the images not being rectified correctly. Even with all of the
noise, it is still fairly easy to determine what is in front of you in each scene.
The first scene shows what looks like a checkered box in the right side of the
image and also another box object farther in the background. The second scene is
mostly black, which means that there is nothing very close to the robot. Reviewing
the original picture of the scene, we see that this is correct. There are no objects close
to the robot. The third scene shows a large object in front of the robot that
encompasses all but the left third of the image. All three of these depth maps
represent their original scenes fairly well.
In order to visually interpret the results, we can also use the Canny edge detector to get an image of the edges of each scene from the initial images. Overlaying the edge image and the depth image allows us to better interpret the results. Figures 20-25 show the images of the edges taken from the left image of the stereo pair and the combination of the edges and the scaled depth map images.
The edge image will become useful later on, when we use it to find the location of an object in the scene. The edge image and the depth map together form the basis for our method of determining an object’s presence and, if one is present, determining its location relative to the robot.
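The overlay itself is straightforward to produce; the following is a minimal sketch of the idea, assuming both inputs are 8-bit, single-channel and the same size. The Canny thresholds of 50 and 150 are placeholders, not our tuned values, and the helper name is ours.

#include <cv.h>
#include <highgui.h>

// Hypothetical helper: overlay Canny edges from the left frame onto the scaled
// depth map for visual inspection (as in Figures 21, 23 and 25).
IplImage* makeEdgeDepthOverlay(IplImage* leftGray, IplImage* scaledDepth)
{
    IplImage* edges   = cvCreateImage(cvGetSize(leftGray), IPL_DEPTH_8U, 1);
    IplImage* overlay = cvCreateImage(cvGetSize(leftGray), IPL_DEPTH_8U, 1);
    cvCanny(leftGray, edges, 50, 150, 3);   // edge image (Figures 20, 22, 24)
    cvMax(edges, scaledDepth, overlay);     // keep the brighter of edge / depth at each pixel
    cvReleaseImage(&edges);
    return overlay;                         // caller releases with cvReleaseImage
}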
Figure 20: Run#1 Edge Image.
Figure 21: Run#1 Edge Image and Depth Map Combined.
Figure 22: Run#2 Edge Image.
Figure 23: Run#2 Edge Image and Depth Map Combined.
Figure 24: Run#3 Edge Image.
Figure 25: Run#3 Edge Image and Depth Map Combined
From Scenes 1 and 3, we can see that the edges usually correspond to regions where the depth map is changing sharply. In the second image, the distance is very great, so the depth does not correspond very well to the edge image. However, for the sake of object detection and avoidance, this is not a problem since a black depth map indicates an absence of obstacles, leaving the robot free to move forward.
To aid in the detection of obstacles, we have grouped the feature points that are
located near one another. Combining this with the depth maps and the edge images
of each scene can provide enough information to locate any obstacles and determine a course of
action around them. Figures 26-28 show the original scene with the features shown
and the windows around the groups of features that we detected.
Figure 26: Run#1 Group Image
Figure 27: Run#2 Group Image
Figure 28: Run#3 Group Image
For the final version of our vision system, we updated our Grouping algorithm to
perform even better. In the end, however, we did not need the function to detect
objects in the images. We used only the depth map and the edge image to find
objects and their positions. The function will be displayed in the code section (section
X.a. C++ Code). The corner detection, corresponding point detection, and depth map creation methods produce images very similar to those shown before, so to save space we omit these images; the older versions of these methods still illustrate the end result very well.
Using the depth map and edge detection images, we are able to determine the
presence and location of any objects in the scene in the path of the robot. Figure 29
and 30 show two results of our object detection algorithm. These images were taken
from two runs that were different from the scenes depicted in runs 1 through 3 above.
Figure 29: Run#4 Object Detection Result.
Figure 30: Run#5 Object Detection Result.
From figures 29 and 30, you can immediately see one of the largest drawbacks of
this method. Because of the way the depth map is obtained, there is no depth
information for the leftmost side of the image. This means that we cannot detect objects that appear only in that part of the image (this limitation is discussed in more detail in section VII. Limitations / Areas for Improvement). This limitation causes any object that is recognized to have its leftmost edge at approximately 85 pixels from the left. Once this is taken into account, the boxes drawn on the objects in the scene can be seen to be a fairly good approximation. Figure 30 shows the box on the nearest object running from the leftmost possible limit to within a few pixels of the actual rightmost edge of the box. Figure 29, however, has the rightmost edge of the object box at the right edge of the image. This would appear to be an error, but if we observe the depth map for this image, Figure 31, we can see that there was a lot of noise in the depth map, and this causes the object to appear wider than it actually is.
Figure 31: Run#4 Depth Map & Edge Image
Showing High Noise Levels
From Figure 31, we can see the noise present in the depth map. This noise could
be from a number of possible sources (see section VII. Limitations / Areas for Improvement).
Because of this noise, our method for detecting the horizontal limits of the object will
see the stretched part of the depth map as if it is an extension of the object itself.
Therefore, we think that our method for detecting objects in the images works well
given correct information.
From these two object detection images, our robot should be able to determine
how it should act. It is easily seen that there is an object in the robot’s immediate
path in both figures 29 and 30. Therefore, we know that the robot should turn either
to the right or to the left. Our method for determining which direction to turn is quite
simple. We use the fact that the minimum x value that the object detection box will
have will be 85 due to the limitations stated above. Therefore, if there is an object in
the leftmost side of the image, the object detection box will start at 85 pixels in the x
direction. However, if there is no object in the leftmost side of the image, the object
detection box will start at something more than 85 pixels in the x direction.
Therefore, to determine which direction to go, we do a simple check on the minimum x value of the object detection box. If the minimum value is 85, there is an object to the left, so the robot should turn right. If the minimum value is greater than 85, there is no object to the left, and so the robot should turn left. We also use the value of 85 because the depth map will constantly stretch the depth to the right in the image; therefore, we do not want to use a threshold value farther to the right in the image. In both runs 4 and 5, our robot turned to the right. Visually inspecting the two object detection images, a right turn would be the acceptable course of action.
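The decision itself reduces to a few lines. The following is a hedged sketch only: objMinX stands for the x coordinate of the left edge of the object detection box, and the NQC program slots used for the turns are assumptions.

#include <cstdlib>

// Hypothetical sketch of the turn decision described above.
void chooseTurn(int objMinX)
{
    const int LEFT_LIMIT = 85;          // leftmost column with valid depth information
    if (objMinX <= LEFT_LIMIT)
        system("nqc -pgm 3 -run");      // object reaches the left limit: turn right (assumed slot)
    else
        system("nqc -pgm 4 -run");      // nothing at the left limit: turn left (assumed slot)
}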
For our stopping condition, we used a simple object recognition method to determine if one specific object is present in the scene. For this, we modified the simple correlation matching algorithm that we used last semester to detect a box of popcorn in the scene (or more specifically, a small section of the box). We modified the code to search for this region at 3 different sizes. Figure 32 shows the template section of the image to compare to the frames and Figure 33 shows an input frame that caused the robot to stop.
Figure 32: Template Image.
Figure 33: Image Showing Stopping Condition.
The downside to using just a simple correlation match to find the stopping condition is that the system can incorrectly match a different but similar scene to the popcorn box. In order to reduce the number of incorrect matches of the stopping condition, we had to adjust the threshold values for matching the popcorn box.
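A hedged sketch of this check is shown below. It is not our exact code: the 0.95 acceptance threshold and the three scale factors are placeholders, and the function name and grayscale inputs are assumptions.

#include <cv.h>

// Hypothetical sketch: normalized correlation of the popcorn template against the
// frame at three sizes; returns true if any scale matches strongly enough.
bool popcornVisible(IplImage* frameGray, IplImage* templGray)
{
    const double scales[3] = { 0.75, 1.0, 1.25 };
    for (int s = 0; s < 3; s++) {
        CvSize tsz = cvSize((int)(templGray->width  * scales[s]),
                            (int)(templGray->height * scales[s]));
        if (tsz.width >= frameGray->width || tsz.height >= frameGray->height)
            continue;                                    // scaled template too large for the frame
        IplImage* t = cvCreateImage(tsz, IPL_DEPTH_8U, 1);
        cvResize(templGray, t, CV_INTER_LINEAR);         // resize template to this scale

        CvSize rsz = cvSize(frameGray->width  - tsz.width  + 1,
                            frameGray->height - tsz.height + 1);
        IplImage* result = cvCreateImage(rsz, IPL_DEPTH_32F, 1);
        cvMatchTemplate(frameGray, t, result, CV_TM_CCORR_NORMED);

        double minVal, maxVal;
        cvMinMaxLoc(result, &minVal, &maxVal, NULL, NULL, NULL);
        cvReleaseImage(&t);
        cvReleaseImage(&result);
        if (maxVal > 0.95)                               // placeholder threshold
            return true;                                 // stopping object found
    }
    return false;
}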
When we display our results in OpenGL, we use a specific coloring scheme. We
represent the lego robot as a yellow square, which stays in the center of the
animation. We represent all objects detected as red squares and we represent the
popcorn box that will end the program as a blue square. As the program runs, the
yellow robot square will rotate in the center of the animation and object squares will
be shown around it. Figures 34 through 38 below show our output animation at a few different time instances. These images show our basic method for displaying our robot’s actions and object detections in OpenGL. The objects that are drawn as red squares are shown in positions that are relative to the robot. Because our system does not use any sort of calibration, the exact distances cannot be determined, and the distance between the OpenGL representation of the robot and the object is merely an approximation.
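The squares in the animation are ordinary flat OpenGL quads. The sketch below shows how one frame could be drawn, assuming a GLUT window with a 2D orthographic projection; the sizes, the object position, and the function names are ours, not taken from our actual OpenGL code.

#include <GL/glut.h>

// Hypothetical sketch of one frame: a yellow robot square at the center and a
// red square for one detected object at an assumed relative position.
static void drawSquare(float cx, float cy, float half, float r, float g, float b)
{
    glColor3f(r, g, b);
    glBegin(GL_QUADS);
    glVertex2f(cx - half, cy - half);
    glVertex2f(cx + half, cy - half);
    glVertex2f(cx + half, cy + half);
    glVertex2f(cx - half, cy + half);
    glEnd();
}

static void display(void)
{
    glClear(GL_COLOR_BUFFER_BIT);
    drawSquare(0.0f, 0.0f, 0.10f, 1.0f, 1.0f, 0.0f);   // robot: yellow, always centered
    drawSquare(0.0f, 0.5f, 0.08f, 1.0f, 0.0f, 0.0f);   // detected object: red, relative position
    glutSwapBuffers();
}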
Figure 34: OpenGL representation of robot.
Figure 35: OpenGL representation of robot with object
detected.
Figure 36: OpenGL representation of robot after rotation.
Figure 37: OpenGL representation of robot after rotation
with object detected.
Figure 38: OpenGL representation of robot after detecting popcorn box. This is the
last frame of the animation as the program exits when this object is detected.
VI. Error / Threshold Analysis
The first analysis of importance to us was the setting of the GF_QL threshold
value. This threshold value is for the Good Features to Track quality level input.
This value determines the number of corner points that will be found in the image.
Just by doing a few quick observations, we determined that a lower value for GF_QL meant more points found and a higher value meant fewer points found. Figure 39 shows a chart of exactly how many corners were found for a given quality level. The data points were taken of an unchanging scene with all other program values held constant.
[Chart: GF_QL value (0 to 1.2) vs. number of corners found (0 to 250).]
Figure 39: Graph of Good Features to Track Quality Level and its effect on the number of corners found.
From this graph, we can easily see that our previous assumption was correct; the number of corners found decreases as GF_QL increases. It follows an approximately exponential decay toward zero corners as GF_QL approaches 1. For our purposes, we need a slightly larger number of total corners so that when the corners are matched, we will have more matches. We set GF_QL to 0.04, which found 96 corners in this scene.
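For completeness, a minimal sketch of the corner detection call with this quality level is shown below; the maximum corner count and the 10-pixel minimum distance are assumptions, not our tuned values.

#include <cv.h>

// Hypothetical sketch: find corners in a grayscale image with GF_QL = 0.04.
// Returns the number of corners written into the caller-supplied array.
int detectCorners(IplImage* gray, CvPoint2D32f* corners, int maxCorners)
{
    int cornerCount = maxCorners;
    IplImage* eig  = cvCreateImage(cvGetSize(gray), IPL_DEPTH_32F, 1);
    IplImage* temp = cvCreateImage(cvGetSize(gray), IPL_DEPTH_32F, 1);
    cvGoodFeaturesToTrack(gray, eig, temp, corners, &cornerCount,
                          0.04,    // GF_QL quality level chosen above
                          10.0);   // assumed minimum distance between corners
    cvReleaseImage(&eig);
    cvReleaseImage(&temp);
    return cornerCount;
}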
We ran a series of runs of our vision system. Each time our robot made an action, we recorded that action and noted whether or not we felt it should have made that action based on what we could see of the scene visually. Most of the time, the robot behaved as we would have expected it to; however, sometimes it did not. Figure 40 shows how often our robot acted correctly and incorrectly. Out of a total of 88 actions recorded, it acted correctly 62 times, a 70% accuracy. However, these numbers are slightly skewed: the forward action was highly accurate at 95%, while the turn left action was mostly hit or miss at 61%. From this we feel that our method for detecting an object’s presence is very good, as seen from the 95% accuracy of the move forward algorithm. However, our reasoning for causing the robot to turn left leaves room for improvement.
Actions Taken by Robot:
Action     Correct Action Taken    Incorrect Action Taken
forward            37                        2
right               6                        2
left               14                        9
end                 5                        3
total              62                       16
Figure 40: Plot of Actions taken by Robot, Correct and Incorrect actions shown.
VII. Limitations / Areas for Improvement
Limitations:
One of the major drawbacks to our method is in the acquisition of the depth map.
Because of our mechanical design, the leftmost part of the left image of the stereo
pair will contain information that is not included in the right image. This is due to the
fact that the camera moves a certain distance laterally, and does not rotate to capture
the same scene points. This effect means that the depth map will not be able to get
any sort of approximation for the depth in that leftmost region. Therefore, the depth
map will generally only display black in that region (corresponding to nothing present
at that location) whether or not it should be black. This effect can be seen in Figure
19 or 25 above. You can see that the depth appears as black on the leftmost side of
the image when it should be a darker gray.
Our method of acquiring the depth map is also very vulnerable to motion within
the scene. Differences in objects’ positions in the scene between the stereo images are assumed to come from the movement of the camera, not from the object being in motion.
If the object is in motion, the depth map will show it as moving either more or less
(depending on its direction) and so the depth for that object will be either closer or
farther accordingly.
Another limitation of the depth map is its noise. The depth map will almost
always display a good deal of noise in the image. Most of the time, it is simply a
stretching of the depth horizontally across the image. However, on rare occasions,
the noise will cause small areas of the image to show as being close to the robot when
they are in fact not close at all. This only gives us a problem when there is little else
being represented as close in the depth map. It will then incorrectly show that
something is nearby.
The method is also vulnerable to shadows of objects. If an object is placed above
the ground with a shadow underneath it, the stereo depth algorithm will treat that
darkened shadow as another object that is lying on the ground. This can cause the
ObjectDetect function to see the shadow and object as a single, larger object. However, this will not affect our outcome much because the object was in the robot’s
path anyway. The only difference will be in the OpenGL mapping of the object. It
will display the object as being larger than it should be.
Another limitation of our method is that the object detection algorithm assumes
that only one object is in the scene. Assume, for example, that two objects are
located in the scene and are separated by some distance. Our method finds the
extreme horizontal and vertical limits of anything close enough to the robot. In this
case, it will treat the two objects as one without any space in between them. This will
only be a problem if the distance separating the two objects is more than the width of
the robot. This will cause the robot to act incorrectly as the program will tell the
robot that there is a large object directly in front of it and it should turn to one side or
the other. In reality, the robot may have been able to simply go straight forward
between the two objects.
Our robot system is also limited in a way by physical constraints. First, our robot
is tethered to the computer system. If this computer happens to be a stationary
desktop computer, the robot has very little range as the cable for the camera is very
short. This could be fixed by simply using a USB extension cable to give the cable
enough length to maneuver wherever needed. An alternative would be to connect to a
laptop computer and hold the laptop above the robot as it maneuvers about.
The idea of cables leads to the second constraint. The cable for the camera and
the cable for the IR tower can occasionally get tangled and cause the camera cable to
receive extra tension when it is still close to the computer. This might not have been
a problem if we were able to build the IR tower harness that holds the tower a few
inches behind the RCX brick at all times as we built for last semester. Unfortunately,
our design for the camera mount required most of those pieces and so we could not
build the IR tower harness. This isn’t much of a problem as the IR tower has plenty
of range; it is just something to look out for as the tower must be facing the back of
the robot whenever it is trying to communicate. This is more or less just a matter of
keeping track of where the tower is relative to the robot and moving it accordingly by
hand.
Our robot’s performance is also limited by its battery level. One would hope that
the robot would perform at about the same quality no matter how much battery life
was remaining. However, that is not the case for our robot. We found that at lower
battery levels, the motors would actually spin less for each set period of time. This
would cause the robot to move a shorter distance and, in the case of our camera, not
move far enough to get a good image for the right image of the stereo pair. Also
affected was the robot’s ability to turn. The NQC programs were initially programmed
at a certain battery level and after the batteries had drained some, the robot would
rotate less.
After replacing the batteries, every command caused the robot to act with more power than required. This meant that the robot would move and turn farther and faster, which caused many problems, especially with the camera motion. After this, we had to change the timing in our NQC code to reset the movements to the desired levels.
Areas for Improvement:
From our results, we noticed that the amount of lateral movement of the camera is
very important to get a better quality depth map. A better quality depth map will in
turn give us better results. Therefore, it is important to ensure that the camera moves
laterally by a significant amount. We found from messing around with our camera
system that a lateral movement of about 8 inches or so yielded the best depth map in
our system. However, our robot design only allowed for a maximum lateral
displacement of about 6 inches. Ideally, we would be able to get the maximum
displacement, but our current system does not allow that range of motion. Lego axle
pieces that are longer than the ones supplied would be necessary for that amount of
motion.
A quick fix for the above problem would be the use of two webcams. If we could
mount two cameras side by side, we would be able to take two images almost
simultaneously without having to move the camera at all. This would simplify our design a great deal and we would be able to remove all error in our results associated
with incorrect movement of the camera. OpenCV supports the usage of multiple
cameras and the code would not be much more complex than the existing single
camera code. The method of using two cameras for stereo vision would also allow us
to assume a constant distance between the two cameras. With the single camera
system, the camera may not move its full distance and so we cannot assume any
single value for its distance.
On the software side, our system could be improved by allowing the recognition of multiple objects in the scene. This would allow the robot to act more appropriately in certain very specific cases such as the one stated above in the limitations section, giving it a better opportunity to choose the best course of action.
Our system could also be improved by some sort of auto-calibration system. We
would not want to use a manual calibration method as that would take away from the
autonomy of our robot. Our current method uses no type of calibration and so all
distances from the robot to the objects in the scene cannot be known. With the
calibration, we would be able to know the distances between the objects in the scene
and the robot as well as more exact measurements of the sizes of the object(s). This
would allow our robot to make much better decisions on how to maneuver around
objects placed in front of it.
VIII. OpenCV Initialization
Files Downloaded for installation of OpenCV:
 OpenCV beta 5
 DirectX 9.0 February 2005 SDK package
 DirectX 9.0 February 2005 SDK Extras package
 Windows Platform SDK
Note: here, we assume that the respective program packages are installed to the C: drive in the ‘Program Files’ directory.
Special Instructions when installing:
When installing the DirectX SDK Extras package, one has to place the
destination directory where the installed DirectX SDK package is located.
Instructions for editing Visual Studio 2005:
Open Visual Studio without loading a project, so the start page is
displayed. Then select the tools toolbox and then options… In the Project
and Solutions tab->Vc++ Directories tab there is a drop down menu to
select the different directories to add where the SDK and OpenCV files
are.
 In the executables files directories option we added the following
directories:
C:\Program Files\Microsoft Platform SDK\bin\winnt
C:\Program Files\Microsoft Platform SDK\bin
C:\Program Files\Microsoft DirectX 9.0 SDK (February
2005)\Utilities\Bin\x86
C:\Program Files\OpenCV\bin
 In the include files directories option we add the following directories:
C:\Program Files\Microsoft Platform SDK\include,
C:\Program Files\OpenCV\cv\include
C:\Program Files\OpenCV\otherlibs\highgui
C:\Program Files\OpenCV\otherlibs\cvcam\include
C:\Program Files\OpenCV\cxcore\include
C:\Program Files\OpenCV\cvaux\include
 In the libraries directories option we added the following:
C:\Program Files\Microsoft Platform SDK\lib
C:\Program Files\Microsoft DirectX 9.0 SDK (February 2005)\Lib\x86
C:\Program Files\Microsoft DirectX 9.0 SDK (February
2005)\Extras\DirectShow\Samples\C++\DirectShow\BaseClasses
C:\Program Files\OpenCV\lib
Additional Necessary Steps:
 Edit the Windows environment variables. You can access these by going into System settings under Control Panel. Then select the Advanced tab and click on Environment Variables. Then add C:\Program Files\OpenCV\bin to the system path variable.
 Method for executing nqc within Visual Studio: copy the nqc.exe file to the C:\Windows\System32 directory. The program can then be executed from C++ by using the system() command. For example, system("nqc -d program1.nqc -run") will download program1.nqc and run it. Alternatively, you can download each program you need to a different program slot on the RCX brick. Then in C++, you can simply tell the robot to run that program without needing to download it again. This can be done with system("nqc -pgm 1 -run"), where the ‘1’ is the corresponding program slot. It can be an integer 1 through 5.
Side note: If a USB tower is being used, it is easier to add a variable called
RCX_Port with the value usb to the environmental variables.
Creation of Solution / Project:
 When creating a project, one has to edit the linker settings for the project and include cv.lib, cvcam.lib, highgui.lib, cxcore.lib, and cvaux.lib. This is done by going to the Project menu and selecting the Properties option, then selecting Configuration Properties -> Linker -> Input. These .lib files are typed into the Additional Dependencies option.
IX. OpenCV Functions
a. Functions used in Interim Report 1:
CVCam camera Functions:
cvcamInit()
Usage: no inputs/outputs. This is used to initialize the cvcam properties defined by
cvcamSetProperty()
int cvcamStart();
Usage: no inputs. This is used to start the camera video feed after initialization.
Outputs: returns an int value to check whether there was an error.
int cvcamStop();
Usage: no inputs. This is used to stop the camera video feed.
Outputs: this function always outputs an int value of 0.
int cvcamExit();
Usage: no inputs. This is used to close all cvcam resources on the system.
Outputs: this function always outputs an int value of 0.
int cvcamSelectCamera(int** out);
Usage: This function is used to select different video capture devices that might be
installed on the system.
Inputs: Int** Out- this is an empty pointer used to select different device numbers
Outputs: int – This is the return value of the device selected by user to use as the video
capture device.
int cvcamSetProperty(int camera, const char* property, void* value);
Usage: This function is used to set different properties for the device being used. Some of
these properties include a callback function through which every frame is processed.
Inputs: Int camera – this is the device number, which is the output of cvcamSelectCamera().
Const Char* property – this is the variable in which the user states the property to invoke.
Void* value – this is the control value for the property.
Outputs: Int – this is used as a way to validate that the property was executed or
encountered an error.
Image Processing Functions (1):
IplImage* cvCreateImage( CvSize size, int depth, int channels );
Usage: this function creates an empty image with the designated parameters.
Inputs: CvSize size - this is the size of the image that will be created. This is best
used by just passing cvSize( int width, int height) or by using
cvGetSize(IplImage*) for the size variable.
Int depth - this is the color depth of the new image. Can be one of:
IPL_DEPTH_8U - unsigned 8-bit integers
IPL_DEPTH_8S - signed 8-bit integers
IPL_DEPTH_16U - unsigned 16-bit integers
IPL_DEPTH_16S - signed 16-bit integers
IPL_DEPTH_32S - signed 32-bit integers
IPL_DEPTH_32F - single precision floating-point numbers
IPL_DEPTH_64F - double precision floating-point numbers.
Int channels – this is the number of channels in the image. For example,
grayscale images will have a channel value of 1, while full color will have
3 channels (1 each for red, green and blue).
Outputs: IplImage* - this is the output image. It will be of type IplImage*.
cvSaveImage( const char* filename, const CvArr* image );
Usage: this function will save an image variable to a file on the hard disk.
Inputs: const char* filename - this is the filename that the saved image will have.
For example, “image01.bmp” (with the quotes) is acceptable.
Const CvArr* image - this is the image that will be saved to a file.
IplImage* is an acceptable form of CvArr* to be passed into
cvSaveImage.
Outputs: none.
IplImage* cvLoadImage( const char* filename, int iscolor );
Usage: this function will load an image from a saved file to a variable.
Inputs: const char* filename - this is the filename of the file to be loaded.
Int iscolor Specifies colorness of the loaded image:
if >0, the loaded image is forced to be color 3-channel image;
if 0, the loaded image is forced to be grayscale;
if <0, the loaded image will be loaded as is (with number of channels
depends on the file).
IplImage* cvCloneImage( const IplImage* image );
Usage: this function will copy an image variable into another image variable.
Inputs: const IplImage* image – this is the image variable that you want to copy.
Outputs: IplImage* - output image variable is IplImage*, the same as the input.
cvSetImageROI( IplImage* image, CvRect rect );
Usage: this function will set the region of interest (ROI) of an image. Using this,
You can set the processing range to be a smaller portion of an image.
Inputs: IplImage* image - this is the input image that you want to get a region of.
CvRect rect – this is the location and size of the region to be used. The
Parameters can be set as follows:
CvRect rec; rec.height = 10; rec.width = 10;
rec.x = x_offset; rec.y = y_offset;
and then passing rec into the cvSetImageROI function.
cvRectangle( CvArr* img, CvPoint pt1, CvPoint pt2, CvScalar color,
int thickness=1, int line_type=8, int shift=0 );
Usage: will draw a rectangle defined by the two given vertices onto the image.
Inputs: CvArr* img – input image to draw the rectangle on.
CvPoint pt1 – vertex 1 of rectangle.
CvPoint pt2 – vertex 2 of rectangle.
CvScalar color – 4 object variable that has the color of the line.
The four parameters are: Blue amount, Green amount, Red
amount, and Alpha (have not tested the 4th component yet).
You can set this by:
CvScalar z;
z.val[0]=300.00; z.val[1]=100.00;
z.val[2]=100.00; z.val[3]=300.00;
This will give you blue squares.
Int thickness – thickness of border of rectangle. Default is 1. This
parameter is optional.
Int line_type – type of line for border. Default is 8. This parameter
is optional.
Int shift – Default is 0. This parameter is optional.
Outputs: none.
cvCvtColor( const CvArr* src, CvArr* dst, int code );
Usage: This function converts an image from one color space to another. For
example, converts an image from color (3 channels) to grayscale (1 ch).
Inputs: const CvArr* src – this is the input image to be converted. It will not be
modified.
CvArr* dst - this is the output image that the converted image will be.
Int code - Color conversion operation. Possible options are:
CV_RGB2YCrCb CV_RGB2HSV CV_RGB2GRAY
Even though those codes are not integers, they can still be used for
input arguments for the code.
Outputs: none (the output image file is actually an input argument).
cvFlip( const CvArr* src, CvArr* dst=NULL, int flip_mode=0);
Usage: this function flips an image about one or more axes.
Inputs: const CvArr* src – the source array (image) to flip.
CvArr* dst – the destination array. If NULL, the flip is done in place. Default is NULL.
Int flip_mode – specifies how to flip the array: 0 flips around the x-axis, a positive
value flips around the y-axis, and a negative value flips around both axes. Default is 0.
Outputs: none.
cvGoodFeaturesToTrack( const CvArr* image, CvArr* eig_image, CvArr* temp_image,
CvPoint2D32f* corners, int* corner_count, double quality_level, double
min_distance, const CvArr* mask=NULL, int block_size=3, int use_harris=0,
double k=0.04 );
Usage: This function finds corners within an image based on its eigenvalues. It
will return an array of corner points. The maximum number of corners it
finds is the total length of the array corners. If the function finds fewer
corners than that, the array is padded by numbers approximately zero.
Inputs: const CvArr* image – this is the input image that you want to find
corners in. It will not be changed.
CvArr* eig_image – this is a temporary image used within the function.
It can be passed as a newly created IplImage* that has not been
assigned any image.
CvArr* temp_image – another temporary image used similarly to
eig_image.
CvPoint2D32f* corners – this is an array of 2D points. The array length
is the maximum number of corners you want found.
Int*corner_count – this output variable is the total number of points
found and located in corners.
Double quality_level – this is used as a multiplier to find the maximum
eigenvalue. It works to filter out ‘lower quality’ corners by
“rejecting the corners with the minimal eigenvalue less than
quality_level•max(eig_image(x,y)).”
Double min_distance – this is the minimum distance that can be between
two corners. Making this higher makes the corners be farther apart.
Const CvArr* mask – this is used to set the search region to find corners.
If set to NULL, the function will search the entire image. Default
value is NULL. This parameter is optional.
Int block_size – this is the size of the averaging block. It is used by other
functions within goodFeaturesToTrack. Default value is 3. This
parameter is optional.
Int use_harris – this is used to tell the function whether or not to use
the harris edge detector. If this value is zero, goodFeatures will use
cvCornerMinEigenVal. If this value is nonzero, goodFeatures will
use cvCornerHarris. The default value is zero (do not use harris
detection). This parameter is optional.
Double k – this is a parameter for the harris corner detector. It is only used
when use_harris is nonzero. The default value is 0.04. This
parameter is optional.
Outputs: none (outputs are assigned as inputs to the function).
Histogram functions:
CvHistogram* cvCreateHist( int dims, int* sizes, int type, float** ranges=NULL, int
uniform=1 );
Usage: this function creates a histogram of the given size and dimensions.
Inputs: int dims – this is the number of dimensions for the histogram.
Int* sizes – histogram dimension sizes.
Int type – determines type of histogram. Can be either:
CV_HIST_ARRAY or CV_HIST_SPARSE
Float** ranges – array of ranges for histogram values. Default value is
NULL.
int uniform - Uniformity flag; if not 0, the histogram has evenly spaced
bins. Default is 1 (not zero).
Outputs: CvHistogram* - returns a pointer to the histogram object.
void cvCalcHist( IplImage** image, CvHistogram* hist, int accumulate=0, const CvArr*
mask=NULL );
Usage: This function is used to calculate the histogram of images
Inputs: IplImage** image – this is the image that you want to calculate the histogram
for
CvHistogram* hist – this is a pointer that points to the created histogram by the
function cvCreateHist().
Outputs: None
double cvCompareHist( const CvHistogram* hist1, const CvHistogram* hist2, int
method );
Usage: this function compares two histograms using the desired method.
Inputs: const CvHistogram* hist1 – first histogram to be compared.
Const CvHistogram* hist2 – second histogram to be compared.
Int method – the possible methods for comparison are:
CV_COMP_CORREL, CV_COMP_CHISQR
CV_COMP_INTERSECT, CV_COMP_BHATTACHARYYA
Outputs: double – a comparison value based on the method will be output.
void cvClearHist( CvHistogram* hist );
Usage: This function clears the histogram information
Inputs: CvHistogram* hist - this is the histogram that you wish to clear
Outputs: None
Math Functions:
int cvRound( double value );
Usage: Takes in a double value and rounds it to the nearest integer and returns that
integer value.
Inputs: double value – this is the decimal number that you want to convert to int.
Outputs: int – this is the rounded integer value of the input number.
IX. OpenCV Functions
b. Functions added for Interim Report 2:
CVCapture Functions:
CvCapture* cvCaptureFromCAM( int index );
Usage: assigns a variable to your camera that will be capturing images.
Inputs: int index - this is the index number of the camera connected to the computer. If
only one camera is used, you can set index to be NULL or -1
Outputs: CvCapture* - this is the variable for the camera that you use to capture images.
IplImage* cvQueryFrame(CvCapture* capture);
Usage: this function will capture a frame from the camera assigned to the capture
variable.
Inputs: CvCapture* capture – this is the camera variable that was assigned using
cvCaptureFromCAM.
Outputs: IplImage* - this is the captured image.
Void cvReleaseCapture( CvCapture** capture);
Usage: this function releases the camera variable from memory.
Inputs: CvCapture** capture – this is a pointer to your camera’s capture variable.
Because the input is type CvCapture** and your capture variable is type
CvCapture*, you need to pass it &capture (assuming that ‘capture’ is your
camera’s variable.)
Outputs: None.
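As a usage note, the three capture functions above combine with cvSaveImage (documented in part a) to grab and store a single frame, which is how our system saves every stereo image. The sketch below is an illustration only; the function name and filename handling are assumptions.

#include <cv.h>
#include <highgui.h>

// Hypothetical sketch: grab one frame from the webcam and write it to disk.
void grabAndSave(const char* filename)
{
    CvCapture* capture = cvCaptureFromCAM(-1);   // single webcam
    IplImage* frame = cvQueryFrame(capture);     // internal buffer; do not release it
    if (frame)
        cvSaveImage(filename, frame);
    cvReleaseCapture(&capture);
}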
Image Processing Functions (2):
void cvMatchTemplate( const CvArr* image, const CvArr* templ,
CvArr* result, int method );
Usage: this function scans through the input image and tries to find a match for the
template image.
Inputs: const CvArr* image – input image to be searched.
Const CvArr* templ – template image that will be compared to all regions of the
input image.
CvArr* result – a map of comparison results; single-channel, 32-bit floating-point. If image is W×H and templ is w×h, then result must be (W-w+1)×(H-h+1).
Int method – method to compare template to input image. Can be:
CV_TM_SQDIFF, CV_TM_SQDIFF_NORMED, CV_TM_CCORR
CV_TM_CCORR_NORMED, CV_TM_CCOEFF, or CV_TM_CCOEFF_NORMED
Outputs: None.
void cvMinMaxLoc( const CvArr* arr, double* min_val, double* max_val,
CvPoint* min_loc, CvPoint* max_loc, const CvArr* mask);
Usage: this function finds the minimum and maximum values for an array and their
locations.
Inputs: const CvArr* arr – this is the array that you will be searching through to find the
minimum and maximum values.
Double* min_val – this is the returned minimum value of the array.
Double* max_val – this is the returned maximum value of the array.
CvPoint* min_loc – this is the returned location of the minimum value in the
array. Default is NULL (does not return value).
CvPoint* max_loc – this is the returned location of the maximum value in the
array. Default is NULL (does not return value).
Const CvArr* mask – mask used to select a subarray. Default is NULL. (no
subarray used = = whole array is searched).
Outputs: None.
Void cvReleaseImage( IplImage** image );
Usage: this function releases the image variable from memory.
Inputs: IplImage** image – this is a pointer to the image variable you want to release.
Because it is of type IplImage** and your image is of type IplImage*, you must
pass it &image (assuming that ‘image’ is your image variable.)
Outputs: None.
CvMat* cvCreateMat( int rows, int cols, int type );
Usage: this function creates a matrix variable of the given size and type. The matrix will
store the data as a single vector of length (n_rows * n_cols). The elements are
assigned row by row.
Inputs: int rows – this is the number of rows of the matrix.
int cols – this is the number of columns of the matrix.
int type – this is the type of data being stored in the matrix. For example,
CV_8UC1 means an 8-bit unsigned single-channel matrix.
Outputs: CvMat* - this is the matrix variable.
int cvFindFundamentalMat(const CvMat* points1, const CvMat* points2, CvMat*
fundamental_matrix, int method, double param1, double param2,
CvMat* status);
Usage: this function calculates the fundamental matrix for a set of images from a set of
corresponding points between those images.
Inputs: Const CvMat* points1 – this is an array of the point (x,y) coordinates for the
first image.
Const CvMat* points2 – this is an array of the point (x,y) coordinates for the
second image.
CvMat* fundamental_matrix – this is the variable where the output
fundamental matrix will be saved into a [9x1] column vector.
Int method – this is the method that the function will use to calculate the
fundamental matrix. We used CV_FM_8POINT which is the 8-Point algorithm.
Double param1 - the maximum distance from point to epipolar line. This is only
used if method is CV_FM_RANSAC or CV_FM_LMEDS. Default value is 1.0.
Double param2 – the desirable level of confidence that the matrix is correct. This
is only used if method is CV_FM_RANSAC or CV_FM_LMEDS. Default value
is 0.99.
CvMat* status - The optional output array of N elements, every element of
which is set to 0 for outliers and to 1 for the other points. This is only used if
method is CV_FM_RANSAC or CV_FM_LMEDS. Default value is NULL.
Outputs: int – this output variable is the number of fundamental matrices found (1 or 3
depending on method or 0 if no matrix is found).
Void cvMakeScanlines( const CvMatrix3* matrix, CvSize img_size, int* scanlines1, int*
scanlines2, int* lengths1, int* lengths2, int* line_count);
Usage: this function has two main uses. When you input
scanlines1=scanlines2=lengths1=lengths2=0, the function will simply count the
number of scanlines and save that number into line_count. If those parameters are
not zero, the function will calculate the scanlines (epilines) for the two images.
The information from the two images comes from the fundamental matrix passed
into it by ‘matrix’. NOTE: MEMORY MUST BE ALLOCATED BEFORE
CALLING FUNCTION. This is done by first calling cvMakeScanlines to find
the total number of scanlines and save them to line_count. You then allocate the
memory by:
scanlines1 = (int*)(calloc( line_count* 2, sizeof(int) * 4));
scanlines2 = (int*)(calloc( line_count * 2, sizeof(int) * 4));
lengths1 =(int*)(calloc( line_count* 2, sizeof(int) * 4));
lengths2 = (int*)(calloc( line_count* 2, sizeof(int) * 4));
After the memory is allocated for those 4 variables, the function can be called
again to find the scanlines.
Inputs: const CvMatrix3* matrix – this is the input fundamental matrix. NOTE: the
type CvMatrix3 is different from a CvMat of size (3x3). The CvMatrix3
fundamental matrix can be found from the CvMat fundamental matrix as follows:
CvMatrix3 FMAT; //f_mat is CvMat fundamental matrix already calculated.
for (int i=0;i<3;i++)
{
FMAT.m[i][0]=f_mat->data.fl[i*3];
FMAT.m[i][1]=f_mat->data.fl[i*3+1];
FMAT.m[i][2]=f_mat->data.fl[i*3+2];
}
CvSize img_size – this is a CvSize variable for the size of the image. The simplest way to handle this parameter, if the image height and width are known constants, is to pass cvSize(width, height) as this input.
Int* scanlines1 – this is the array of scanlines of the first image. (see note in
usage for this input.)
Int* scanlines2 – this is the array of scanlines of the second image. (see note in
usage for this input.)
Int* lengths1 – this is the array of the lengths of the scanlines in the first image.
(see note in usage for this input.)
Int* lengths2 – this is the array of the lengths of the scanlines in the second
image. (see note in usage for this input.)
Int* line_count – this is the total number of scanlines found.
Void cvPreWarpImage( int line_count, IplImage*img, uchar*dst, int*dst_nums,
int*scanlines );
Usage: this function will warp the input image so that its scanlines are horizontal,
rectifying it. NOTE: MEMORY MUST BE ALLOCATED BEFORE
CALLING FUNCTION. This can be done as follows:
buffer = (uchar*)(malloc(320*line_count*3));
where line_count is found from cvMakeScanlines.
Inputs: int line_count – this is the total number of scanlines in the image. This parameter
can be obtained from cvMakeScanlines.
IplImage* img – this is the input image to be rectified.
Uchar* dst- the output data buffer. Memory must be allocated to this before
calling the function as noted above.
Int* dst_nums – this is the array of scanline lengths from cvMakeScanlines.
Int* scanlines – this is the array of scanlines from cvMakeScanlines.
Outputs: None.
Void cvFindStereoCorrespondences(const CvArr* leftImage, const CvArr* rightImage, int
mode, CvArr* depthImage, int maxDisparity, double
param1, double param2, double param3, double param4,
double param5 );
Usage: this function will input two rectified grayscale images and will calculate a depth
map based on their disparities.
Inputs: const CvArr* leftImage – the input image of the left part of the stereo pair.
Const CvArr* rightImage – the input image of the right part of the stereo pair.
int mode – this is the method to find the depth map. Currently, this value must be:
CV_DISPARITY_BIRCHFIELD.
CvArr* depthImage – this is the output depth map image.
Int maxDisparity – this is the maximum level of disparity there will be between
the two images. If there are objects in the foreground and background, this value
will need to be high.
Double param1 – ‘constant occlusion penalty’. Default value is 25.
double param2 – ‘constant match reward’. Default value is 5.
Double param3 – ‘highly reliable region’. Default value is 12.
Double param4 – ‘moderately reliable region’. Default value is 15.
Double param5 – ‘slightly reliable region’. Default value is 25.
Outputs: None.
void cvConvertScale( const CvArr* src, CvArr* dst, double scale, double shift);
Usage: this function scales and/or shifts the input array and puts that information into the
output array.
Inputs: const CvArr* src – input image/array to be scaled or shifted.
CvArr* dst – output image/array.
double scale – amount by which the array will be scaled. Default value is 1 (for
no scaling).
double shift – amount by which the array will be shifted. Default value is 0 (for
no shifting).
Outputs: None.
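For example, in our code the raw depth map is brightened by a factor of 2.5 before being saved. A sketch of that call, assuming depthim already holds the depth map:
IplImage* depthImageScaled = cvCreateImage( cvSize(320,240), IPL_DEPTH_8U, 1 );
//scale every pixel by 2.5 with no shift to brighten the depth map
cvConvertScale( depthim, depthImageScaled, 2.5, 0 );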
Font Functions:
void cvInitFont( CvFont* font, int font_face, double hscale, double vscale, double shear, int
thickness, int line_type );
Usage: this function initializes the font so that you can insert text onto images. This must
be called before any other font commands. The CvFont* font variable must first
be created by:
CvFont* font=new CvFont;
Inputs: CvFont* font – pointer to font structure that will be initialized.
int font_face – integer value for the font type. Can be one of the following:
CV_FONT_HERSHEY_SIMPLEX
CV_FONT_HERSHEY_PLAIN
CV_FONT_HERSHEY_DUPLEX
CV_FONT_HERSHEY_COMPLEX
CV_FONT_HERSHEY_TRIPLEX
CV_FONT_HERSHEY_COMPLEX_SMALL
CV_FONT_HERSHEY_SCRIPT_SIMPLEX
CV_FONT_HERSHEY_SCRIPT_COMPLEX
double hscale – horizontal scale factor for font size.
double vscale – vertical scale factor for font size.
double shear – slope factor for font shape. Default is 0 (normal text).
int thickness – thickness of the text lines. Default is 1.
int line_type – type of stroke. Default is 8.
Outputs: None.
void cvPutText( CvArr* img, const char* text, CvPoint org, const CvFont* font, CvScalar color
);
Usage: this function will draw text on top of the input image. The font must be initialized
before anything can be written.
Inputs: CvArr* img – input image to be drawn on.
const char* text – text to be drawn on the image.
CvPoint org – coordinates for the origin of the text. The origin is set to the
bottom left corner of the first letter of the text string.
const CvFont* font – this is the font that was initialized with cvInitFont.
CvScalar color – this is a CvScalar value containing the color information that
determines the color of the text that will be drawn.
Outputs: None.
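A small sketch putting the two font functions together. The text string, position and color here are only illustrative values, not the ones used in our program, and image is assumed to be an IplImage* that has already been loaded.
CvFont* font = new CvFont;
//small simplex font: half-size, no shear, thickness 1, 8-connected strokes
cvInitFont( font, CV_FONT_HERSHEY_SIMPLEX, 0.5, 0.5, 0, 1, 8 );
//draw a white label near the top-left corner of the image
cvPutText( image, "corner 1", cvPoint(10,20), font, cvScalar(255,255,255) );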
Window Functions:
int cvNamedWindow( const char* name, int flags);
Usage: this function creates a window designated with the name given.
Inputs: const char* name – this is the name that the window will be given. It should
be passed into the function in quotes, for example, “New Window”.
int flags – the only flag currently available is CV_WINDOW_AUTOSIZE. Default value is
0.
Outputs: int – this returns a ‘1’ if the window was created successfully or a ‘0’ if it was not.
void cvShowImage( const char* name, const CvArr* image);
Usage: this function displays the given image onto a named window. cvNamedWindow
must be called to assign the name to the window before this function is called.
Inputs: const char* name – this is the name of the window, assigned with cvNamedWindow,
in which you wish to show the image.
const CvArr* image – this is the image that you want to display in the window.
Outputs: None.
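A short sketch of how the two window functions fit together; cvWaitKey and cvDestroyWindow are additional highgui calls (not described above) used here only to hold the window open and then close it, and depthImageScaled is assumed to exist already.
cvNamedWindow( "Depth Map", CV_WINDOW_AUTOSIZE );
cvShowImage( "Depth Map", depthImageScaled );  //display the scaled depth map
cvWaitKey( 0 );                                //wait for a key press
cvDestroyWindow( "Depth Map" );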
IX. OpenCV Functions
c. Functions added for Final Report:
Image Processing Functions (3):
CvScalar cvGet2D(const CvArr* arr, int idx0, int idx1);
Usage: this function will get a value from a 2-D array. In our case, we use it to get the
pixel data out of the images.
Inputs: const CvArr* arr – this is the input array (or image) that you want to get the data
from.
int idx0 – the first index for the data element.
int idx1 – the second index for the data element.
NOTE: The data in the images are stored as image(y,x) where x and y are the x
and y pixel values you wish to access. For example, if you want the image value
at point (100,250), you would call cvGet2D(image,250,100).
Outputs: CvScalar – the function returns the value of the array at the given points as a
data type CvScalar.
void cvSet2D( CvArr* arr, int idx0, int idx1, CvScalar value );
Usage: This function is used to set a value in a 2-D array. In our case, we use it to set a
pixel value in an image.
Inputs: CvArr* arr – this is the input array (or image) that you want to set a value in.
int idx0 – this is the first index for the data element.
int idx1 – this is the second index for the data element.
CvScalar value – this is the value you wish to set into the array. It must be of
type CvScalar. To pass another data type, for example an integer, you can wrap
the value as cvScalar(value).
Outputs: None.
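A small sketch combining the two functions in the way our object detection uses them, remembering the (row, column) index order. Here depthim stands for the depth map image and G_Threshold for our depth threshold; both are assumed to exist already.
//read the depth value at pixel (x,y) = (100,250)
CvScalar d = cvGet2D( depthim, 250, 100 );
//if the pixel is not close enough to the robot, zero it out
if ( d.val[0] < G_Threshold )
    cvSet2D( depthim, 250, 100, cvScalar(0) );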
void cvOr( const CvArr* src1, const CvArr* src2, CvArr* dst, const CvArr* mask=NULL );
Usage: This function performs a bitwise ‘OR’ operation on the two input arrays. In our
case, we use it to combine two grayscale images.
Inputs: CvArr* src1 – this is the first array/image to be compared.
CvArr* src2 – this is the second array/image to be compared.
CvArr* dst – this is the destination for the resulting image to be saved.
const CvArr* mask – area of images to be compared. Default value is NULL
(whole image is compared).
Outputs: None.
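For example, our code merges the Canny edge image with the scaled depth map. A sketch of that step, assuming both images are 8-bit single-channel and the same size:
IplImage* mergedim = cvCreateImage( cvSize(320,240), IPL_DEPTH_8U, 1 );
//keep a pixel if it is set in either the depth map or the edge image
cvOr( depthImageScaled, edgeim, mergedim, NULL );
cvSaveImage( "depth&edge.bmp", mergedim );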
X. CODE
a. C++ Code
#include <windows.h>
#include <cvcam.h>
#include <cv.h>
#include <highgui.h>
#include <stdlib.h>
#include <string>
#include <iostream>
#include <fstream>
#include <cmath>
#include <cvaux.h>
#include <GL/glut.h>
using namespace std;
double GF_QL = 0.033;          //GoodFeaturesToTrack quality_level parameter -- smaller=> more points
double CP_Threshold = 0.085;   //CorrPoints threshold parameter -- smaller=>better, but less matches
double D_Threshold = 0.1;      //threshold for matching points on depth map
int G_Threshold=150;           //threshold for depth image pixel check
int Move_Threshold=85;         //threshold for moving the robot left or right
double PC1_Threshold = 0.85;   //threshold for correspondence for checking of stopping condition
double PC2_Threshold = 0.85;
double PC3_Threshold = 0.85;
string file1=("FrameImage1corners.bmp");//filenames for saved images
string file2=("FrameImage2corners.bmp");
string file3=("GrayImage1.bmp");
string file4=("GrayImage2.bmp");
string file5=("scaleddepthimage.bmp");
string file6=("edges.bmp");
string file7=("depth&edge.bmp");
string file8=("FrameImage1.bmp");
string file9=("objdetect.bmp");
const char* f1; const char* f2;
const char* f3; const char* f4;
const char* f5; const char* f6;
const char* f7; const char* f8;
const char* f9;
char n='1';                     //initial frame number for saving file purposes
int cc=500;                     //number of corners to find in image
CvPoint2D32f cr1[500];          //initialization of corner variables to be read in from output file
CvPoint2D32f cr2[500];
int corners1,corners2;          //number of corners
ofstream outFile("corner.dat",ios::out);//create file for output of corner values
int openglval[500][5];          //OpenGL variable
char b[50];                     //used by cvPutText
CvFont* font=new CvFont;        //font info to number corners
float cornersMatched1[500][2];  //matching corners
float cornersMatched2[500][2];
int k=0;                        //matching corner counter
int yval[2];                    //variable used to find ROI of depth image that contains pixels greater than 165
int xval[2];                    //variable used to determine the min/max pixel location of the obj in the ROI set by yval in the edge image
//file name variables
string run=("Run");
int runtime=1;
IplImage* edgeim = cvCreateImage(cvSize(320,240),IPL_DEPTH_8U,1);//edge image used by multiple functions
//*********************OPENGL VARS*****************************************
int time=0;
int x_translate=0;
int y_translate=0;
int translate_x=0;
int translate_y=0;
static GLfloat spinL =45.0;
static GLfloat spinR=90.0;
static GLfloat spin=0.0;
GLubyte imagedata[320][240][3];
GLubyte imagedata2[320][240][3];
//function prototypes
void findCorners(IplImage*);
int findCorrPoints();
void getDepthMap();
int ObjectDetect();
void GetObjVert(IplImage *);
void GetObjHoriz(IplImage *);
int isPopcorn(IplImage* image1);
void Reset();
void init(void);
void loadTextures(void);
void robot(void);
void obj(void);
void display(void);
void action(void);
void reshape(int, int);
void mouse(int, int , int , int );
void spinDisplay(void);
int opengl();
//main function
int main(int argc,char** argv)
{
cvInitFont(font,2, 0.25,0.25 );//initialize font for drawing on images
CvCapture *cam1 = cvCaptureFromCAM(NULL);//select camera
CvCapture *cam2 = cvCaptureFromCAM(NULL);//only used if a second camera is connected
IplImage* Grabimage=cvCreateImage(cvSize(320,240),IPL_DEPTH_8U,3);
while(true)
{
//set all variables for filenames for saving/loading images
run.insert(3,itoa(runtime,b,10));
file1.insert(0,run);
file2.insert(0,run);
file3.insert(0,run);
file4.insert(0,run);
file5.insert(0,run);
file6.insert(0,run);
file7.insert(0,run);
file8.insert(0,run);
file9.insert(0,run);
f1=file1.c_str();
f2=file2.c_str();
f3=file3.c_str();
f4=file4.c_str();
f5=file5.c_str();
f6=file6.c_str();
f7=file7.c_str();
f8=file8.c_str();
f9=file9.c_str();
int move = 0;
cout<<"RUN "<<runtime<<endl;
//capture first image and find corners
Grabimage=cvQueryFrame( cam1 );
cvSaveImage(f8,Grabimage);
findCorners(Grabimage);
//move camera, capture 2nd image, move camera back
system("nqc -pgm 1 -run");
Sleep(300);
Grabimage=cvQueryFrame( cam1 );
system("nqc -pgm 2 -run");
//find corners in 2nd image
findCorners(Grabimage);
//process both images
int cornercheck;
cornercheck=findCorrPoints();
if (cornercheck==0)
{
openglval[runtime-1][0]=0;
openglval[runtime-1][1]=0;
openglval[runtime-1][2]=320;
openglval[runtime-1][3]=240;
openglval[runtime-1][4]=5;
Reset();
move=0;
cout<<"corners less than 8"<<endl;
cout<<"Robot moved left"<<endl;
system("nqc -pgm 5 -run");
Sleep(1000);
runtime++;
continue;
}
getDepthMap();
move = ObjectDetect();
//move robot based on information from ObjectDetect
if (move ==1)
{
system("nqc -pgm 3 -run");
cout<<"Robot moved forward"<<endl<<endl;
}
else if (move==2)
{
system("nqc -pgm 5 -run");
cout<<"Robot moved left"<<endl<<endl;
}
else if (move==3)
{
system("nqc -pgm 4 -run");
cout<<"Robot moved right"<<endl<<endl;
}
else {
cout << "Popcorn Box Found. Program Completed." << endl;
break;
}
//release images, reset used global variables for next set of images
Reset();
move=0;
//pause before next run
Sleep(1000);
runtime++;
}
//release images and camera
cvReleaseCapture(&cam1);
cvReleaseCapture(&cam2);
opengl();
}
void findCorners(IplImage* image)
{
//function findCorners will find the corners in an image
cout<<"picture taken"<<endl;
IplImage* image1 = cvCloneImage(image);
IplImage* gray=cvCreateImage(cvSize(image1->width,image1->height),IPL_DEPTH_8U,1);
IplImage* eigImage = cvCreateImage(cvGetSize(image1),IPL_DEPTH_32F,1);
IplImage* tempImage = cvCreateImage(cvGetSize(image1),IPL_DEPTH_32F,1);
CvPoint2D32f corner[500];
//find corners in image
cvCvtColor(image1,gray,CV_BGR2GRAY);
cvFlip(gray,NULL,0);
cvGoodFeaturesToTrack(gray,eigImage,tempImage,corner,&cc,GF_QL,10.0,NULL);
//draws square boxes around corners
for( int i=0;i<cc;i++)
{
CvPoint pt1,pt2;
pt1.x=corner[i].x-5;
pt1.y=240-corner[i].y-5;
pt2.x=corner[i].x+5;
pt2.y=240-corner[i].y+5;
CvScalar y,z;
z.val[0]=300.00; y.val[0]=50.0;
z.val[1]=100.00; y.val[1]=50.0;
z.val[2]=100.00; y.val[2]=500.0;
z.val[3]=300.00; y.val[3]=300.0;
if (n=='1')//only draw all corners on FIRST frame, COLOR image
{
cvRectangle(image1,pt1,pt2,z);
cvPutText(image1,itoa(i+1, b,10),cvPoint(pt1.x,pt1.y),font,y);
}
if (corner[i].x<1 )
{
corner[i].x=corner[i].y=0;
}
outFile<<corner[i].x<<" "<<corner[i].y<<endl; //writes corner vector to output file
}
outFile<<-100<<endl; //creates marker for visual check of additional output file
outFile<<endl;
//file.insert(9,1,f);
if (n=='1')
{
cvSaveImage(f1,image1);
cvSaveImage(f3,gray);
}
else
{
cvSaveImage(f2,image1);
cvSaveImage(f4,gray);
}
n++;
//release temporary images
cvReleaseImage(&image1);cvReleaseImage(&gray);cvReleaseImage(&eigImage);cvReleaseImage(&tempImage);
}
int findCorrPoints()
{
//function findCorrPoints finds corresponding points between two images.
IplImage* im1=cvLoadImage(f3,0); //loads frame 1
IplImage* im2=cvLoadImage(f4,0); //loads frame 2
IplImage* tempim1=cvCreateImage(cvSize(im1->width,im1->height),IPL_DEPTH_8U,1);
IplImage* tempim2=cvCreateImage(cvSize(im1->width,im1->height),IPL_DEPTH_8U,1);
IplImage* tim1=cvCreateImage(cvSize(10,10),IPL_DEPTH_8U,1);
IplImage* tim2=cvCreateImage(cvSize(10,10),IPL_DEPTH_8U,1);
IplImage* result=cvCreateImage(cvSize(1,1),IPL_DEPTH_32F,1);
tempim1=cvCloneImage(im1); //copy frame 1 to tempim1
tempim2=cvCloneImage(im2);
CvRect rec,rec2;               //initialization of rectangle region to scan for in images
rec.height=10;rec.width=10;    //
rec2.height=10;rec2.width=10;  //
const int index=500;           //rectangular region 1 to use for comparison of regions of image 2
double min[index],min2[index]; //
int min_corners1[index];
int min_corners2[index];
int min_corners[index];
CvScalar y,z,crnclr;           //colors used in drawing on images
z.val[0]=300.00; y.val[0]=50.0;
z.val[1]=100.00; y.val[1]=50.0;
z.val[2]=100.00; y.val[2]=500.0;
z.val[3]=300.00; y.val[3]=300.0;
crnclr.val[0]=100.00;
crnclr.val[1]=300.00;
crnclr.val[2]=300.00;
crnclr.val[3]=300.00;
//detect edges of first frame
cvCanny(im1,edgeim,30,500);
cvSaveImage(f6,edgeim);
//reads in corner file
ifstream inFile;
inFile.open("corner.dat");
int i=0;
int check=0;
while(check==0)
{
inFile>>cr1[i].x;
if (cr1[i].x==-100)
{
cr1[i].x=NULL;
check=1;
break;
}
inFile>>cr1[i].y;
corners1 = i;
i++;
}
i=0;
check=0;
while(check==0)
{
inFile>>cr2[i].x;
inFile>>cr2[i].y;
corners2 = i;
if (cr2[i].x==-100)
{
check=1;
}
i++;
}
//initialize array min and min_corners to 1000, and -1
for(int i=0; i<index;i++)
{
min[i]=1000;
min2[i]=1000;
min_corners1[i]=-1;
min_corners2[i]=-1;
min_corners[i]=-1;
}
//match corners of first image to those of second image:
for (int i=0;i<=corners1-1;i++)
{
rec.x=cr1[i].x-5;rec.y=cr1[i].y-5;
if(rec.x<=6) rec.x=6;
if(rec.x>=310) rec.x=310;
if(rec.y<=6) rec.y=6;
if(rec.y>=230) rec.y=230;
CvScalar z;
z.val[0]=100.00;z.val[1]=100.00;z.val[2]=100.00;z.val[3]=300.00;
cvSetImageROI(tempim1,rec);
cvCopy(tempim1,tim1);
//find best matches of a given template image, in this case the corner we want to find its corresponding pair
for(int j=0;j<=corners2-1;j++)
{
rec2.x=cr2[j].x-5;rec2.y=cr2[j].y-5;
if(rec2.x<=6) rec2.x=6;
if(rec2.x>=310) rec2.x=310;
if(rec2.y<=6) rec2.y=6;
if(rec2.y>=230) rec2.y=230;
cvSetImageROI(tempim2,rec2);
cvCopy(tempim2,tim2);
//cvMatchTemplate is used to find corresponding points using square difference method
cvMatchTemplate(tim2,tim1,result,CV_TM_SQDIFF_NORMED);
double min_val;
CvPoint loc;
cvMinMaxLoc(result,&min_val,NULL,&loc,NULL);
if (min_val<min[i] && min_val<CP_Threshold) {
min[i]=min_val;
min_corners1[i]=j+1;
}
}
//compare min corners for best match if more than one corner in the second image is selected as a matching corner in the first image
int val1,val2;
val1=min_corners1[i];
for(int p=0; p<i;p++)
{
val2=min_corners1[p];
if (val1==val2)
{
if (min[i]<min[p])
min_corners1[p]=-1;
}
}
}
//now match corners of second image to those in first image
for (int i=0;i<=corners2-1;i++)
{
rec.x=cr2[i].x-5;rec.y=cr2[i].y-5;
if(rec.x<=6) rec.x=6;
if(rec.x>=310) rec.x=310;
if(rec.y<=6) rec.y=6;
if(rec.y>=230) rec.y=230;
CvScalar z;
z.val[0]=100.00;z.val[1]=100.00;z.val[2]=100.00;z.val[3]=300.00;
cvSetImageROI(tempim2,rec);
cvCopy(tempim2,tim2);
//find best matches of a given template image, in this case the corner we want to find its corresponding pair
for(int j=0;j<=corners1-1;j++)
{
rec2.x=cr1[j].x-5;rec2.y=cr1[j].y-5;
if(rec2.x<=6) rec2.x=6;
if(rec2.x>=310) rec2.x=310;
if(rec2.y<=6) rec2.y=6;
if(rec2.y>=230) rec2.y=230;
cvSetImageROI(tempim1,rec2);
cvCopy(tempim1,tim1);
//cvMatchTemplate is used to find corresponding points using square difference method
cvMatchTemplate(tim1,tim2,result,CV_TM_SQDIFF_NORMED);
double min_val;
CvPoint loc;
cvMinMaxLoc(result,&min_val,NULL,&loc,NULL);
if (min_val<min2[i] && min_val<CP_Threshold) {
min2[i]=min_val;
min_corners2[i]=j+1;
}
}
//compare min corners for best match if more than one corner in the second image is selected as a matching corner in the first image
int val1,val2;
val1=min_corners2[i];
for(int p=0; p<i;p++)
{
val2=min_corners2[p];
if (val1==val2)
{
if (min2[i]<min2[p])
min_corners2[p]=-1;
}
}
}
for (int i=0;i<corners1;i++)
{
int val1,val2;
val1=min_corners1[i];
val2=min_corners2[val1-1];
if (val2==i+1)
min_corners[i]=val1;
}
//draw corresponding points on second frame
IplImage* im2out=cvLoadImage(f2,3);
CvPoint pt1,pt2;
for( int i=0;i<=corners2;i++)
{
if(min_corners[i]>0)//if corner is valid
{
//CvPoint pt1,pt2;
pt1.x=cr2[min_corners[i]-1].x-5;
pt1.y=cr2[min_corners[i]-1].y-5;
pt2.x=cr2[min_corners[i]-1].x+5;
pt2.y=cr2[min_corners[i]-1].y+5;
cvRectangle(im2out,pt1,pt2,z);
cvPutText(im2out,itoa(i+1,b,10),cvPoint(pt1.x,pt1.y),font,crnclr);
cornersMatched1[k][0]=cr1[i].x;
cornersMatched1[k][1]=cr1[i].y;
cornersMatched2[k][0]=cr2[i].x;
cornersMatched2[k][1]=cr2[i].y;
k++;
}
}
cvSaveImage(f2,im2out);
if (cornersMatched2[8][0]== 0 && cornersMatched2[8][1]==0)
return 0;
return 1;
}
void getDepthMap()
{
//function getDepthMap gets the depth map from the two rectified images and the fundamental matrix
CvMatrix3 FMAT;
CvMat* points1=cvCreateMat(1,2*k,CV_32FC2);
CvMat* points2=cvCreateMat(1,2*k,CV_32FC2);
CvMat* f_mat=cvCreateMat(3,3,CV_32FC1);
int *scanlines1;
int *scanlines2;
int *lengths1=0;
int *lengths2=0;
int line_count=0;
uchar* buffer_1;
uchar* buffer_2;
IplImage* im1=cvCreateImage(cvSize(320,240),IPL_DEPTH_8U,3);
IplImage* im2=cvCreateImage(cvSize(320,240),IPL_DEPTH_8U,3);
IplImage* gim1=cvCreateImage(cvSize(320,240),IPL_DEPTH_8U,1);
IplImage* gim2=cvCreateImage(cvSize(320,240),IPL_DEPTH_8U,1);
IplImage* depthim=cvCreateImage(cvSize(320,240),IPL_DEPTH_8U,1);
IplImage* depthImageScaled=cvCreateImage(cvSize(320,240),IPL_DEPTH_8U,1);
IplImage* mergedim=cvCreateImage(cvSize(320,240),IPL_DEPTH_8U,1);
im1=cvLoadImage(f1,1);
im2=cvLoadImage(f2,1);
//calculate fundamental matrix from corresponding points
for (int i=0;i<k;i++)
{
points1->data.fl[2*i]=cornersMatched1[i][0];
points1->data.fl[2*i+1]=cornersMatched1[i][1];
points2->data.fl[2*i]=cornersMatched2[i][0];
points2->data.fl[2*i+1]=cornersMatched2[i][1];
}
int f_check=cvFindFundamentalMat( points1,points2,f_mat,CV_FM_8POINT);
//convert fundamental matrix of type cvMat into type cvMatrix3
for (int i=0;i<3;i++)
{
FMAT.m[i][0]=f_mat->data.fl[i*3];
FMAT.m[i][1]=f_mat->data.fl[i*3+1];
FMAT.m[i][2]=f_mat->data.fl[i*3+2];
}
//get scanlines of image from fundamental matrix
cvMakeScanlines( &FMAT, cvSize(320,240), 0, 0, 0, 0, &line_count);//this gets the number of scanlines
scanlines1 = (int*)(calloc( line_count * 2, sizeof(int) * 4)); //allocating memory for
scanlines2 = (int*)(calloc( line_count * 2, sizeof(int) * 4)); //parameters of
lengths1 = (int*)(calloc( line_count * 2, sizeof(int) * 4));   //cvMakeScanlines
lengths2 = (int*)(calloc( line_count * 2, sizeof(int) * 4));
cvMakeScanlines(&FMAT,cvSize(320,240),scanlines1,scanlines2,lengths1,lengths2,&line_count);
//rectify images
buffer_1 = (uchar*)(malloc(320*line_count*3)); //allocate memory for parameters of
buffer_2 = (uchar*)(malloc(320*line_count*3)); //cvPreWarpImage
cvPreWarpImage(line_count,im1,buffer_1,lengths1,scanlines1);
cvPreWarpImage(line_count,im2,buffer_2,lengths2,scanlines2);
//find depthmap from rectified images
cvCvtColor(im1,gim1,CV_BGR2GRAY); //cvFindStereoCorrespondence requires
cvCvtColor(im2,gim2,CV_BGR2GRAY); //rectified grayscale images
cvFindStereoCorrespondence(gim1,gim2,CV_DISPARITY_BIRCHFIELD,depthim, 255, 25,
5, 12, 12, 25);
cvCopyImage(depthim,depthImageScaled,0);
cvConvertScale(depthim, depthImageScaled, 2.5);
cvSaveImage(f5,depthImageScaled);
//combine edge image from canny edge detector and the depth map image into new image
cvOr(depthImageScaled,edgeim,mergedim);
cvSaveImage(f7,mergedim);
//release images
cvReleaseImage(&im1); cvReleaseImage(&im2); cvReleaseImage(&gim1);
cvReleaseImage(&gim2);
cvReleaseImage(&depthim); cvReleaseImage(&depthImageScaled);
cvReleaseImage(&mergedim);
}
int ObjectDetect()
{
//function ObjectDetect will use information from the edge lines from the canny
//edge detector and the depth map to detect objects, their locations and also
//determine a plan of action.
IplImage* depthim=cvCreateImage(cvSize(320,240),IPL_DEPTH_8U,1);
IplImage* im=cvCreateImage(cvSize(320,240),IPL_DEPTH_8U,0);
CvRect rec;
CvPoint pt1;
CvPoint pt2;
CvScalar z; //color for drawing on image frame
z.val[0]=100.00;
z.val[1]=400.00;
z.val[2]=100.00;
z.val[3]=300.00;
//check for image with no edges.
//image with no edges --> very large object
//is right in front of camera.
int edgecheck=0;
for (int j=0;j<240;j++){
for (int i=0;i<320;i++){
CvScalar val=cvGet2D(edgeim,j,i);
if (val.val[0]==255)
edgecheck++;
}
}
if (edgecheck <20) //less than 20 edge pixels
{
openglval[runtime-1][0]=0;
openglval[runtime-1][1]=0;
openglval[runtime-1][2]=320;
openglval[runtime-1][3]=240;
openglval[runtime-1][4]=2;
return 2;
}
im=cvLoadImage(f1,1);
depthim=cvLoadImage(f5,0);
//find an object and determine its size/location
GetObjVert(depthim);
if (yval[0]==1000 && yval[1]==-1)
return 1;
rec.x=0;
rec.y=yval[0];
rec.width=320;
rec.height=yval[1]-yval[0];
cvSetImageROI(edgeim,rec);
GetObjHoriz(depthim);
//if one of object corner points is invalid, no object in path, move forward
if ( (xval[0]==1000) || (xval[1]==-1) || (yval[0]==1000) ||(yval[1]==-1) ||
(xval[1]-xval[0]<20))
{
openglval[runtime-1][4]=1;//save variables to map in opengl
return 1;//send move to be 1
}
//object corner points are valid, therefore object is in path
//draws rectangle of the obj that is detected
pt1.x=xval[0];
pt1.y=yval[0];
pt2.x=xval[1];
pt2.y=yval[1];
cvRectangle(im,pt1,pt2,z,3);
cvSaveImage(f9,im);
openglval[runtime-1][0]=xval[0];//save variables to map in opengl
openglval[runtime-1][1]=yval[0];
openglval[runtime-1][2]=xval[1];
openglval[runtime-1][3]=yval[1];
//first check for stopping condition
IplImage* Frame1 = cvLoadImage(f8,1);
int end = isPopcorn(Frame1);
if (end) {
openglval[runtime-1][4]=0;
return 0;
}
//robot should not stop, determine action
if (xval[0] >Move_Threshold)
{
openglval[runtime-1][4]=2;
return 2;//send move to be 2
}
else
{
openglval[runtime-1][4]=3;
return 3;//send move to be 3
}
cvResetImageROI(edgeim);
cvReleaseImage(&im); cvReleaseImage(&depthim);
}
void GetObjVert(IplImage * dim)
{
//function GetObjVert gets the vertical bounds of an object
//in the image in pixels and saves it to yval[].
//this function is called by ObjectDetect().
int min_val,max_val;
CvScalar temp;
min_val=1000;
max_val=-1;
for(int i=0; i<320;i++)
{
for(int j=0;j<240;j++)
{
temp=cvGet2D(dim,j,i);
//check depthmap to see if object is close to robot
if( temp.val[0] >= G_Threshold)
{
min_val=min(min_val,j);
max_val=max(max_val,j);
}
}
}
yval[0]=min_val;
yval[1]=max_val;
}
void GetObjHoriz(IplImage * dim)
{
//function GetObjHoriz gets the horizontal bounds of an object
//in the image in pixels and saves it to xval[].
//this function is called by ObjectDetect().
int min_val,max_val;
min_val=1000;
max_val=-1;
CvScalar temp,temp2;
for(int i=0; i<320;i++)
{
for(int j=0;j<edgeim->roi->height;j++)
{
temp=cvGet2D(edgeim,j,i);
temp2=cvGet2D(dim,j,i);
//if there is an edge
if (temp.val[0]==255)
{
//check depthmap to see if object is close to robot
if(temp2.val[0]>=G_Threshold)
{
min_val=min(min_val,i);
max_val=max(max_val,i);
}
//erase edges on edge image that are not close
//to robot.
else if(temp2.val[0]<100)
cvSet2D(edgeim,j,i,cvScalar(0));
}
}
xval[0]=min_val;
xval[1]=max_val;
}
}
void Reset()
{
//function Reset will reinitialize certain global variables
//to their appropriate initial values to be used in the next
//run of the program.
for (int i=0; i<500;i++)
{
cr1[i].x=0;
cr1[i].y=0;
cr2[i].x=0;
cr2[i].y=0;
cornersMatched1[i][0]=0;
cornersMatched1[i][1]=0;
cornersMatched2[i][0]=0;
cornersMatched2[i][1]=0;
}
k=0;
corners1=0;
corners2=0;
xval[0]=1000;
xval[1]=-1;
yval[0]=1000;
yval[1]=-1;
n='1';
file1=("FrameImage1corners.bmp");
file2=("FrameImage2corners.bmp");
//filenames for saved images
file3=("GrayImage1.bmp");
file4=("GrayImage2.bmp");
file5=("scaleddepthimage.bmp");
file6=("edges.bmp");
file7=("depth&edge.bmp");
file8=("FrameImage1.bmp");
file9=("objdetect.bmp");
run=("Run");
cvReleaseImage(&edgeim);
edgeim = cvCreateImage(cvSize(320,240),IPL_DEPTH_8U,1);
}
int isPopcorn(IplImage* image1)
{
//function isPopcorn will scan through the input image and try to match
//a region to a template image to determine whether or not the popcorn
//box is present in front of the robot.
IplImage* popcorn = cvLoadImage("Image/popcornsmall.bmp",3);
IplImage* result = cvCreateImage(cvSize(219,204),IPL_DEPTH_32F,1);
//first scan for smallest template.
cvMatchTemplate(image1,popcorn,result,CV_TM_CCORR_NORMED);
double max_val;
cvMinMaxLoc(result,NULL,&max_val,NULL,NULL);
if (max_val>PC1_Threshold) {
cvReleaseImage(&popcorn); cvReleaseImage(&result);
return 1;//return 1 --> popcorn found.
}
//scan for medium template.
popcorn = cvLoadImage("Image/popcornmedium.bmp",1);
result = cvCreateImage(cvSize(176,189),IPL_DEPTH_32F,1);
cvMatchTemplate(image1,popcorn,result,CV_TM_CCORR_NORMED);
cvMinMaxLoc(result,NULL,&max_val,NULL,NULL);
if (max_val>PC2_Threshold) {
cvReleaseImage(&popcorn); cvReleaseImage(&result);
return 1;//return 1 --> popcorn found.
}
//scan for large template.
popcorn = cvLoadImage("Image/popcornlarge.bmp",3);
result = cvCreateImage(cvSize(165,185),IPL_DEPTH_32F,1);
cvMatchTemplate(image1,popcorn,result,CV_TM_CCORR_NORMED);
cvMinMaxLoc(result,NULL,&max_val,NULL,NULL);
if (max_val>PC3_Threshold) {
cvReleaseImage(&popcorn); cvReleaseImage(&result);
return 1;//return 1 --> popcorn found.
}
cvReleaseImage(&popcorn); cvReleaseImage(&result);
return 0;//return 0 --> popcorn not found.
}
int opengl()
{
//int* argc=0; char** argv=opengl;
glutInit(&__argc,__argv);
glutInitDisplayMode (GLUT_DOUBLE | GLUT_RGB);
glutInitWindowSize (640, 480);
glutInitWindowPosition (100, 100);
glutCreateWindow (__argv[0]);
init ();
glutDisplayFunc(display);
glutReshapeFunc(reshape);
glutMouseFunc(mouse);
if(runtime==time)
exit(0);
glutMainLoop();
return 0;
/* ANSI C requires main to return int. */
}
/*
void loadTextures(void)
{
IplImage* im=cvLoadImage("Image/robot.bmp",1);
int imageWidth=320;
int imageHeight=240;
for(int m=0; m<320;m++)
{
for(int n=0; n<240;n++)
{
CvScalar q=cvGet2D(im,n,m);
imagedata[m][n][0]=q.val[0];
imagedata[m][n][1]=q.val[1];
imagedata[m][n][2]=q.val[2];
}
}
glBindTexture (GL_TEXTURE_2D, 0);
glPixelStorei (GL_UNPACK_ALIGNMENT, 1);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_REPEAT);
glTexParameteri (GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_REPEAT);
glTexParameteri (GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
glTexParameteri (GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
glTexEnvf(GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE, GL_DECAL);
glTexImage2D (GL_TEXTURE_2D, 0, GL_RGB, imageWidth, imageHeight, 0,
GL_RGB, GL_UNSIGNED_BYTE, imagedata);
for (int i=1; i<=runtime;i++)
{
run.insert(3,itoa(i,b,10));
file8.insert(0,run);
f8=file8.c_str();
im=cvLoadImage(f8,1);
CvRect roi;
roi.x=openglval[i-1][0];
roi.width=openglval[i-1][2]-openglval[i-1][0];
roi.y=openglval[i-1][1];
roi.height=openglval[i-1][3]-openglval[i-1][1];
for(int m=0; m<roi.width;m++)
{
for(int n=0; n<roi.height;n++)
{
cvSetImageROI(im,roi);
CvScalar q=cvGet2D(im,n,m);
imagedata2[m][n][0]=q.val[0];
imagedata2[m][n][1]=q.val[1];
imagedata2[m][n][2]=q.val[2];
}
}
imageWidth=openglval[i-1][2]-openglval[i-1][0];
imageHeight=openglval[i-1][3]-openglval[i-1][1];
glBindTexture (GL_TEXTURE_2D, i);
glPixelStorei (GL_UNPACK_ALIGNMENT, 1);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_REPEAT);
glTexParameteri (GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_REPEAT);
glTexParameteri (GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
glTexParameteri (GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
glTexEnvf(GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE, GL_DECAL);
glTexImage2D (GL_TEXTURE_2D, 0, GL_RGB, imageWidth, imageHeight,
0, GL_RGB, GL_UNSIGNED_BYTE,imagedata2 );
run=("Run");
file8=("FrameImage1.bmp");
}
}*/
void robot(void)
{
glPushMatrix();
glColor3f(1,1,0);
glRotatef(spin,0.0,0.0,0.1);
glTranslatef(x_translate,y_translate,0);
glRectf(-5.0, -5.0, 5.0, 5.0);
glPopMatrix();
}
void obj(void)
{
glPushMatrix();
glTranslatef(translate_x,translate_y,0);
//glBindTexture (GL_TEXTURE_2D, time);
glRectf(-5.0, -5.0, 5.0, 5.0);
glPopMatrix();
}
void display(void)
{
glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
//glEnable(GL_TEXTURE_2D);
//loadTextures();
if(time==0)
{
robot();
glFlush();
glutSwapBuffers();
}
else if(time==runtime+1)
exit(1000);
else
{
if (openglval[time+1][4]==1)
{
glPushMatrix();
glColor3f(1,1,0);
robot();
glPopMatrix();
glFlush();
glutSwapBuffers();
}
if (openglval[time+1][4]==2)
{
translate_x=x_translate-15;
translate_y=y_translate+15;
glColor3f(1.0,0, 0);
obj();
spinDisplay();
glColor3f(1,1,0);
robot();
glFlush();
glutSwapBuffers();
}
if (openglval[time+1][4]==3)
{
translate_x=x_translate+10;
translate_y=y_translate+15;
glColor3f(1.0,0, 0);
obj();
spinDisplay();
glColor3f(1,1,0);
robot();
glFlush();
glutSwapBuffers();
}
if (openglval[time+1][4]==0) //end of program, popcorn found
{
robot();
translate_y=y_translate+15;
glColor3f(0,0,1);
obj();
glFlush();
glutSwapBuffers();
}
if (openglval[time+1][4]==5) //robot didn't find 8 corners
{
spinDisplay();
robot();
glFlush();
glutSwapBuffers();
}
}
}
void mouse(int button, int state, int x, int y)
{
switch (button) {
case GLUT_LEFT_BUTTON:
if (state == GLUT_DOWN)
{
time++;
display();
}
break;
default:
break;
}
}
void spinDisplay(void)
{
if (openglval[time][4]==2 || openglval[time][4]==5)
{
spin=spinL;
glutPostRedisplay();
}
else if (openglval[time][4]==3)
{
spin=spinR;
glutPostRedisplay();
}
else
spin=0.0;
}
void init(void)
{
glClearColor(0,0,0,0);
run=("Run");
file8=("FrameImage1.bmp");
}
void reshape(int w, int h)
{
glViewport (0, 0, (GLsizei) w, (GLsizei) h);
glMatrixMode(GL_PROJECTION);
glLoadIdentity();
glOrtho(-50.0, 50.0, -50.0, 50.0, -1.0, 1.0);
glMatrixMode(GL_MODELVIEW);
glLoadIdentity();
}
X. CODE
b. NQC Code
start.nqc: moves robot in a straight line forward a set distance.
#define left OUT_C   //defines the variable left as the output C of the rcx
#define right OUT_A  //defines the variable right as the output A of the rcx
task main()
{
SetPower(left,10);
SetPower(right,6);
On(left);   //turns on the left motor of the rcx
On(right);  //turns on the right motor of the rcx
Wait(100);
Off(left+right);
}
moveleft.nqc: turn robot to the left by 90 degrees.
#define left OUT_C   //defines the variable left as the output C of the rcx
#define right OUT_A  //defines the variable right as the output A of the rcx
task main()
{
SetPower(right,7);
SetPower(left,10);
OnRev(left);  //turns on the left motor of the rcx
On(right);    //turns on the right motor of the rcx
Wait(65);
Off(left+right);
}
moveright.nqc: turn robot to the right by 90 degrees.
#define left OUT_C   //defines the variable left as the output C of the rcx
#define right OUT_A  //defines the variable right as the output A of the rcx
task main()
{
SetPower(right,7);
SetPower(left,9);
On(left);      //turns on the left motor of the rcx
OnRev(right);  //turns on the right motor of the rcx
Wait(65);
Off(left+right);
}
camright.nqc: move camera on robot to the right a set distance.
task main()
{
SetPower (OUT_B,10);
On (OUT_B);
Wait(35);
Off(OUT_B);
}
camleft.nqc: move camera on robot to the left a set distance.
task main()
{
SetPower (OUT_B,10);
OnRev (OUT_B);
Wait(35);
Off(OUT_B);
}
X. CODE
c. Removed FindGroups() C++ Function Code
void FindGroups() //note: function no longer used.
{
// This function groups neighboring corners by a certain distance together
// in a group and finds the min and max x and y value for that group
double max_dist=40.0; //max distance between grouped corners
int g=1;
int checker=0;
CvPoint2D32f min_val;
CvPoint2D32f max_val;
for(int i=0; i<=500;i++)
{
min_val.x=cr1[i].x;
min_val.y=cr1[i].y;
max_val.x=cr1[i].x;
max_val.y=cr1[i].y;
for(int j=i+1;j<500;j++)
{
if( cr1[j].x!=0 && cr1[j].y!=0)
{
//compare distance between corners
if (sqrt( pow( (cr1[j].x-cr1[i].x) ,2)+ pow( (cr1[j].y-cr1[i].y) ,2) )< max_dist)
{
if (group[j]<=0)
{
group[i]=g;
group[j]=g;
checker=1;
//sets min max values for each group window
min_val.x=min(min_val.x,cr1[j].x);
min_val.y=min(min_val.y,cr1[j].y);
max_val.x=max(max_val.x,cr1[j].x);
max_val.y=max(max_val.y,cr1[j].y);
}
}
}
}
if (checker==1)
{
min_max_of_group[g-1][0]=min_val;
min_max_of_group[g-1][1]=max_val;
g++;
checker=0;
}
}
}
XI. Individual Contributions
About 80-90% of our code was written with one of us sitting at the computer typing while
the other looked over his shoulder pointing out any little typos made. We took turns at
this, so we are about 50-50 when it comes to the amount of code written (aside from
the OpenGL code, which was done by Giovanni).
Here is a list of the parts of the project that were done separately;
everything that is not mentioned should be assumed to have been done cooperatively.
Code:
Giovanni created the original capture method using cvCam functions and the appropriate
callback functions.
Mike created the revised capture method using cvCaptureFromCAM functions.
Giovanni created the original FindGroups() function.
Mike wrote the isPopcorn() function, which Giovanni later revised.
Giovanni wrote all code involving OpenGL.
Report:
Giovanni typed up the goals, OpenCV initialization, and the algorithms used.
Mike typed up the mechanical design, the results, the limitations / areas for improvement,
and the error / threshold analysis sections.
Mike commented most of the code to make it look nice and neat and fit into our report.
Mike also typed up the second and third parts of the OpenCV function descriptions.
OpenCV Installation:
Giovanni found out what was needed to install.
Mike downloaded some installation files from the Microsoft website.
Robot Construction and Design
Giovanni came up with the initial idea to mount the camera onto the robot.
Mike designed and built the new wheel base.
Giovanni designed and built the reinforcements for the camera mount and camera cable
holder.
Mike designed and built the IR tower holder.
XII. Robots in Society and Economics Paper
The robotics field has been around for quite some time now, but it has remained one of those
little-known fields. Usually when one thinks of robots, they think of either some sci-fi movie
robot or a children’s toy. Robots have now hit the mainstream and are showing up in different
sectors of the market today. A few of these sectors are the military, medical, and
manufacturing/consumer sectors.
World militaries have always relied upon human forces to get the job done. However,
today’s army is spending more and more money on each soldier sent to war. First, soldiers in
the military must be trained, at whatever cost necessary, before they can do anything. While in
the field, the soldiers must be equipped with the latest gear that will help keep the good guys
alive and help eliminate the bad guys. As technology progresses, so does the cost of this
equipment. At the end of the day, the cost of these soldiers can be very high for the taxpaying citizens.
Robots can do the more dangerous tasks at a different price. The main cost of a robot
comes from its research and development costs. After that, all that is left is the individual price
of the unit and its maintenance costs. These robots do not need to be trained or fed. After all
this, the military may have realized that it would be cheaper in the long run to lose a robot to a
sniper or to a bomb than it would be to lose a soldier. Recently, “iRobot Corp. …won a $95.6
million contract modification to increase the number of available MTRS [Man Transportable
Robotic System bomb-disposal robot] units from 250 to 1,200” while “Foster-Miller Inc. …won
a similar $96.1 million contract modification to provide an additional 250 to 1,200 MTRS
production units” from the U.S. Department of Defense [6]. This shows how the government is
willing to spend more upfront money in order to help save the lives of its soldiers. Because the
robots do not require any sort of training period, they can be sent to war as soon as they are
manufactured, thus working to save more soldiers’ lives. The government will be able to see a
quick return on this investment as fewer soldiers are killed in war, thus reducing the need to train
more replacement soldiers.
The United States government places a high value on human life. They realize that these
soldiers are not just tools used to win a war; they also have families of their own. Not only will
the family be devastated by the loss of a loved one, but they will also have to make up for a new
lack of income. This will indirectly affect the local economy as the family is forced to save more
and more money just to make ends meet and thus return less money to the community. The family
may also be forced to seek government financial assistance which would be paid by the
taxpayers. No matter what happens, society as a whole will feel the burden of the lost soldier(s).
The U.S. government also has a strong degree of national pride for its citizens and soldiers in
foreign lands. The idea of ‘no man left behind’ has caused the whole nation to go to great
lengths for a handful of people in need. We would be willing to make more sacrifices just to
release one or two prisoners of war, which could even cause more loss of life.
One way to prevent the unnecessary loss of human life on the battlefield would be to use
robots as replacement soldiers or to clear the way for human soldiers. The iRobot Corp. has
devised a system that is “a remote, deployable sensor suite designed to provide early warning
information, gunshot detection, intelligence, surveillance, and targeting capabilities to military
forces.”[7] This sniper detection robot would be able to assist soldiers in locating and attacking
the enemy soldiers, effectively keeping themselves alive. “Snipers have had the advantage of
being effectively invisible, making them a deadly threat on the battlefield and in urban
settings.”[7] The sniper robot will put U.S. soldiers at more of an advantage with the additional
intelligence. While the military may lose a few robots here and there at war, most of the country
would gladly pay the price of a few pounds of metal over the price of losing a loved one. At the
same time, iRobot’s bomb-detection robot can achieve the same goals of saving human lives. In
the cases where the situation is just too dangerous to risk a human life, a robot can excel. Their
MTRS robot will be used for “remote reconnaissance of unexploded ordinance and improvised
explosive devices.”[6] This robot will be able to do the jobs that no human would be willing to
do, soldier or otherwise.
One of the major contributors to the increase in technology of robots in military
applications has been the United States government. The Defense Advanced Research Projects
Agency (DARPA) held a competition for $1 million for any research team that could design and
build an automated vehicle to drive across a desert track. “The point of the Grand Challenge was
not to produce a robot that the military could move directly to mass production,”[5] its real goal
was to motivate researchers to think in new directions and further improve the technology
available. Participating in the Grand Challenge were teams from universities across the country
such as Carnegie Mellon University, Stanford, Princeton, Cornell, UCLA, MIT, Caltech, and
Virginia Tech. The winning team was the Stanford team, showing that universities can advance
technology just as well as private companies.
In industrial sectors, robots have become an accepted part of the work day. A sector that
is new to robotics is the medical sector. One of the driving forces of this change is economics.
While the robots have a very high price tag, “about $1.3 million apiece”[1], the robots are able to
perform surgeries with such greater precision that the price becomes a bargain. Intuitive
Surgical’s da Vinci robotic system has “transformed the field of prostate surgery, for which it
was approved in May 2001. That year it was used in less than 1 percent of all prostatectomies.
This year more than 20 percent will be done with the robot. And that figure is expected to
double next year.”[1] Doing that many operations each year, it is only a matter of time before
the hospital turns a large profit. The additional precision provided by the robotic system allows
for a less intrusive surgery, which is less dangerous and also has a shorter recovery time. Both of
those contribute to lowered costs of the hospital.
The downside to robots entering the medical field is the unwillingness to accept these
techniques. The difference between the U.S. and Europe, where people are “less squeamish
about such things”, is that in Europe, “robots have assisted in operations on the brain, the
prostate and the inner ear.”[8] While only part of the problem is the patients here in the states,
some doctors think that “once a hospital invests in such an expensive system, surgeons might
feel pressured to use it and steer patients toward surgery over other treatment options” and others
worry that “some surgeons are reluctant to commit the hours necessary to learn robotic
techniques.”[1] We can see that the problem is with the country as a whole and its prejudice
against new techniques. Other countries have already accepted robotics as simply a tool to
perform a job better, yet the U.S. seems to overlook this fact.
The sector with the most robot market penetration would have to be the industrial or
manufacturing sector. Robots of a type have been used in the automation of assembly lines for
years. Companies now such as Adept Technology and CRS Robotics are selling robots that can
automate drug discovery and assembly of fiber-optic equipment [2].
At the same time, companies like Honda and Sony are taking the robots in a new
direction. Honda, with its ASIMO, and Sony, with its QRIO, have developed more humanoid
robots. These robots are designed to be more lifelike in the way they look by having visible
features like eyes, mechanical lungs and artificial lips[3]. These companies are driving Japan in
particular to be the world leader in robotics. It has been estimated that the market for ‘service
robots’ will reach $10 billion within a decade. These robots will be used to replace workers from
the shrinking Japanese economy.
Japan is a country that would be able to adjust to such a change in the workforce. This
could be because “few Japanese have the fear of robots that seems to haunt westerners in
seminars and Hollywood films.”[4] In our culture, robots are often seen as the evil monster in
most situations. However, in Japanese culture, robots are seen in more of a positive way. This
difference in perspective is what could be the difference in what brings technology to Japan
faster than the United States.
One can see all the areas in which robots are being introduced or expanded upon today
and how the robotics field will continue to change in the future. This change will have social and
economic effects that will involve humans in one way or another, whether they be personal,
governmental, or industrial.
1. Barrett, Jennifer. “Cutting Edge.” Newsweek; 12/12/2005, Vol. 146 Issue 24, p50-54, 4p, 2c
2. Chediak, Mark. “ROBOTICS SECTOR PROFILES.” Red Herring; Dec2001, Issue 108, p94-95, 2p
3. “Humanoids on the march.” Economist; 3/12/2005, Vol. 374 Issue 8417, Special Section p3-4, 2p, 2c
4. “Better than people.” Economist; 12/24/2005, Vol. 377 Issue 8458, p58-59, 2p, 3c
5. Gibbs, W. Wayt. “INNOVATIONS FROM A ROBOT RALLY.” Scientific American; Jan2006, Vol. 294 Issue 1, p64-71, 8p, 19 diagrams, 9c
6. Keller, John. “DOD issues two quick urgent orders for antiterrorism bomb-disposal robots.” Military & Aerospace Electronics; Nov2005, Vol. 16 Issue 11, p22-22, 1/4p
7. McHale, John. “iRobot and Photonics Center create sniper detection system for land robots.” Military & Aerospace Electronics; Nov2005, Vol. 16 Issue 11, p6-8, 2p, 1bw
8. “And now, robodoc!” Time; 11/23/92, Vol. 140 Issue 21, p23, 1/5p, 1c