Harris Corner Detection

Ankitha Miryala
Computer Vision - CMPEN 454
Project 1 Report
a)
This first project lays the foundation for becoming comfortable with several types of feature extraction and identification for finding corresponding matching points on two images. It involves taking the mathematical theories we have learned in class thus far and implementing these algorithms and mathematical models in Matlab, or in our case, C++. Through this project, we learn how to combine grayscale conversion with Harris corner detection to efficiently detect points of interest and identify local features. Since we will be matching these points between images, we combine different matching methods, such as NCC and SSD scores, to find the corresponding feature on the other image. After finding matches, we apply projective transformations, specifically homographies, to map points between the two images' coordinate systems. Last, we evaluate the performance of our Harris corner detection method along with the matchability of the feature descriptors we used. A more detailed description is provided in the following section.
By the completion of this project, all team members should obtain some level of proficiency in C++ programming and in developing code for computer vision algorithms.
b)
Since our code is in C++, we first include the basic standard libraries and the OpenCV headers. We define the values for k and the SSD threshold; the reasoning behind these design decisions is explained later. We declare a function called myHarris, which returns a matrix of Harris responses and is explained below. In the main function, we assign the first image to a matrix called 'img' and the second image to a matrix called 'img1', and we declare the matrices dst_norm, dst_norm_scaled, dst_norm1, and dst_norm_scaled1. The R-score threshold is set to 75.
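For concreteness, the following is a minimal sketch of this setup; the image file names and the SSD threshold value are placeholder assumptions for illustration, not our exact code.

// Sketch of the setup described above. File names and the SSD threshold
// value are placeholders (assumptions), not the exact values from our code.
#include <opencv2/opencv.hpp>
#include <algorithm>
#include <iostream>
#include <vector>

const float k = 0.5f;                 // Harris k, as chosen in part (e)
const float ssd_threshold = 1000.0f;  // assumed value for illustration
const int   thresh = 75;              // R-score threshold for this image pair

cv::Mat myHarris(const cv::Mat& src); // our Harris function, sketched below

int main() {
    cv::Mat img  = cv::imread("image1.png"); // first image (name assumed)
    cv::Mat img1 = cv::imread("image2.png"); // second image (name assumed)
    cv::Mat dst_norm, dst_norm_scaled, dst_norm1, dst_norm_scaled1;
    // ... pipeline described in the rest of this section ...
    return 0;
}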
As previously mentioned, we wrote a function, myHarris, to compute the Harris corners. First, we convert the image to grayscale, which collapses all pixels into a single channel. We blur the image with a 3x3 kernel and a sigma value of 1; after trying various combinations of kernel sizes and sigma values, this combination seemed to yield the best results. We then apply the Sobel operator to compute the horizontal and vertical derivatives, making the edges of the smoothed image more prominent. Next, we compute the gradient products grad_xx, grad_xy, and grad_yy that will eventually populate the autocorrelation matrix A. We blur these gradient products with a 3x3 kernel and a sigma value of 2 and denote the results as shown in the following equation: A = [S_xx, S_xy; S_xy, S_yy]. We then calculate the determinant and trace of the A matrix in order to compute the Harris response R. Plugging the computed values into the equation gives R = det(A) - k * trace(A)^2. At the end of the myHarris function, we normalize the R-score values, which we denote dst_norm, so that they lie between 0 and 255 for convenience.
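A sketch of myHarris following the steps just described; variable names other than myHarris and dst_norm are our own choices for illustration.

// Sketch of the myHarris function described above.
cv::Mat myHarris(const cv::Mat& src) {
    cv::Mat gray, blurred;
    cv::cvtColor(src, gray, cv::COLOR_BGR2GRAY);        // single channel
    cv::GaussianBlur(gray, blurred, cv::Size(3, 3), 1); // 3x3 kernel, sigma 1

    cv::Mat Ix, Iy;
    cv::Sobel(blurred, Ix, CV_32F, 1, 0);               // horizontal derivative
    cv::Sobel(blurred, Iy, CV_32F, 0, 1);               // vertical derivative

    // Gradient products, smoothed with a 3x3 kernel and sigma 2
    cv::Mat Sxx, Sxy, Syy;
    cv::GaussianBlur(Ix.mul(Ix), Sxx, cv::Size(3, 3), 2);
    cv::GaussianBlur(Ix.mul(Iy), Sxy, cv::Size(3, 3), 2);
    cv::GaussianBlur(Iy.mul(Iy), Syy, cv::Size(3, 3), 2);

    // R = det(A) - k * trace(A)^2, computed per pixel
    cv::Mat det   = Sxx.mul(Syy) - Sxy.mul(Sxy);
    cv::Mat trace = Sxx + Syy;
    cv::Mat R     = det - k * trace.mul(trace);

    cv::Mat dst_norm;
    cv::normalize(R, dst_norm, 0, 255, cv::NORM_MINMAX); // scores into [0, 255]
    return dst_norm;
}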
Now the Harris responses of the first image are assigned to dst_norm and the Harris responses of the second image to dst_norm1. These values are then scaled to 8-bit absolute values using the function 'convertScaleAbs', and the scaled versions of dst_norm and dst_norm1 are assigned to dst_norm_scaled and dst_norm_scaled1, respectively.
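In code, this step looks roughly like the following (assuming the names introduced above):

dst_norm  = myHarris(img);
dst_norm1 = myHarris(img1);
cv::convertScaleAbs(dst_norm,  dst_norm_scaled);   // scale to 8-bit absolute values
cv::convertScaleAbs(dst_norm1, dst_norm_scaled1);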
Four vectors of type int are defined to hold the rows and columns of the corner points in the two images; we chose type int because the row and column indices are themselves integers. The vectors for the first image are pointX and pointY, and those for the second image are pointX1 and pointY1. Next, we define two special vectors, pointXY and pointXY1, of type Point2f, which store a point of a matrix in (x, y) format. We also define two further vectors, XY and XY1, which are used later in the code.
The next part of the code draws circles where corners are found in the images. A nested two-level for loop checks each row from zero to the last row and each column from zero to the last column of dst_norm (image 1). Here, each point is compared to the threshold value; if it is greater than the threshold, we draw a red circle using the command 'circle( dst_norm_scaled, Point( i, j ), 1, Scalar(0,0,255), 1, 8, 0 );' and push the point into the vector we defined called pointXY. The same code is repeated for the second image, pushing the values into the vector pointXY1. We now have two vectors, one per image, holding all the detected corner points. Finally, the sizes of these vectors are printed out, which is simply the number of detected corners in each image.
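A sketch of this loop for the first image follows; the same loop runs on dst_norm1 and dst_norm_scaled1 for the second image. Note that dst_norm holds float scores after normalization.

// Thresholding loop for the first image.
std::vector<cv::Point2f> pointXY;
for (int j = 0; j < dst_norm.rows; j++) {
    for (int i = 0; i < dst_norm.cols; i++) {
        if ((int)dst_norm.at<float>(j, i) > thresh) {
            // mark the detected corner (call quoted from our code)
            cv::circle(dst_norm_scaled, cv::Point(i, j), 1,
                       cv::Scalar(0, 0, 255), 1, 8, 0);
            pointXY.push_back(cv::Point2f((float)i, (float)j));
        }
    }
}
std::cout << "corners in image 1: " << pointXY.size() << std::endl;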
Vectors called SSD, mat, and mat1 are initialized with type float. The minimum of the two point-vector sizes is computed and called 'bound'; this gives the number of point pairs for which we can compute SSD. Vector pointXY is assigned to XY and vector pointXY1 to XY1. Then, for each point up to the bound value, we extract a 5x5 patch and save it into the vector 'mat' for the first image and the vector 'mat1' for the second image.
Here, we calculate the SSD value at each point. A float called temp_ssd is initialized to accumulate the SSD for each corner pair up to the bound value. For each of the 25 entries in the patches mat and mat1, the corresponding values are assigned to variables var and var1 and subtracted, and the squared difference is added to temp_ssd. After all 25 iterations, temp_ssd holds the final SSD value for that point, which is then pushed back into the vector 'SSD'. Hence, we have calculated the SSD value for each corner pair up to the bound value and stored it in the vector 'SSD'.
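A sketch of the SSD computation, combining the patch extraction and accumulation steps above; the patch offsets are an assumption, and border checks are omitted for brevity.

// SSD over 5x5 patches around each pair of points, up to 'bound' pairs.
std::vector<float> SSD;
int bound = (int)std::min(pointXY.size(), pointXY1.size());
for (int p = 0; p < bound; p++) {
    cv::Point2f a = pointXY[p], b = pointXY1[p];
    float temp_ssd = 0.0f;
    for (int dy = -2; dy <= 2; dy++) {       // 5x5 window around each point
        for (int dx = -2; dx <= 2; dx++) {
            float var  = dst_norm.at<float>((int)a.y + dy, (int)a.x + dx);
            float var1 = dst_norm1.at<float>((int)b.y + dy, (int)b.x + dx);
            float d = var - var1;
            temp_ssd += d * d;               // accumulate squared differences
        }
    }
    SSD.push_back(temp_ssd);                 // final SSD score for this pair
}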
Two vectors of type Point2f are created, newXY and newXY1. We now compare the SSD value obtained at each point to the SSD threshold defined at the beginning. If the calculated SSD value is less than the threshold, we copy that point from pointXY into newXY; similarly, under the same condition, the corresponding point is copied from pointXY1 into newXY1. We now have two vectors, newXY and newXY1, holding the locations that satisfy the SSD condition. We then print out the size of these two vectors, which gives the number of matched points under SSD.
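The filtering step then looks roughly like this:

// Keep only point pairs whose SSD falls below the threshold.
std::vector<cv::Point2f> newXY, newXY1;
for (int p = 0; p < bound; p++) {
    if (SSD[p] < ssd_threshold) {
        newXY.push_back(pointXY[p]);
        newXY1.push_back(pointXY1[p]);
    }
}
std::cout << "SSD matches: " << newXY.size() << std::endl;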
For Part 4, repeatability and matchability, we first found the homography matrix, H, from the images. The findHomography function finds the 3x3 homography matrix that transforms the matched keypoints; it takes the coordinates of the points from the original image and from the destination image. We chose CV_LMEDS, the least-median robust method, because the recommended papers suggested that it would yield a more accurate result than the widely used RANSAC.
Next, we used perspectiveTransform to map these interest points from image 1 to image i, and repeated the process to map interest points from image i back to image 1; the only difference is that the latter requires the inverse of the homography matrix. We were then able to count the number of interest points transferred to image i or image 1 by the homography or inverse homography matrix.
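A sketch of both mappings, assuming newXY and newXY1 hold the SSD-matched points:

// Homography between the matched point sets, then transfer in both
// directions (CV_LMEDS selects the least-median robust method).
cv::Mat H = cv::findHomography(newXY, newXY1, CV_LMEDS);
cv::Mat Hinv = H.inv();

std::vector<cv::Point2f> transferred, transferredBack;
cv::perspectiveTransform(newXY,  transferred,     H);    // image 1 -> image i
cv::perspectiveTransform(newXY1, transferredBack, Hinv); // image i -> image 1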
Now we are able to compute the repeatability rate. We chose the distance epsilon (eps in the code) between a transferred interest point and an interest point in image i to be 1.5 pixels, after testing values ranging from 0 to 100, because the Schmid paper seems to favor 1.5 pixels.
We then transferred the image 1 interest points to image i using the homography matrix and counted the number R_i that lie within eps distance of one of the interest points on the destination image i. With this value, we can compute the repeatability rate (denoted in the code as r_i) using the equation r_i(eps) = R_i(eps) / min(n_1, n_i). The repeatability rate is commonly viewed as the fraction of interest points that have been re-located on another image, creating a collection of possible matching features; as such, it can take any value from 0 to 1.
We calculated the matchability rate in a similar way, except that we now counted the number M_i of matched interest points in image i that fell within the eps distance defined above of their predicted locations. We then computed the matchability rate using the equation m_i(eps) = M_i(eps) / min(n_1, n_i). The matchability rate serves as a check on the accuracy of our feature descriptor and matching process. To verify that we calculated the repeatability and matchability rates correctly, we checked that the matchability rate was less than or equal to the repeatability rate.
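A sketch of the repeatability count follows; the matchability count M_i is computed the same way over the matched points only. The variable names here are assumptions.

// Counting R_i: transferred image-1 points that land within eps of some
// detected interest point in image i.
const float eps = 1.5f;
int R_i = 0;
for (const cv::Point2f& t : transferred) {     // H applied to image-1 points
    for (const cv::Point2f& q : pointXY1) {    // detections in image i
        float dx = t.x - q.x, dy = t.y - q.y;
        if (dx * dx + dy * dy <= eps * eps) {  // within eps pixels
            R_i++;
            break;
        }
    }
}
// r_i = R_i / min(n_1, n_i), the repeatability rate
float r_i = R_i / std::min((float)pointXY.size(), (float)pointXY1.size());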
Because we did not get good results for the feature descriptor part of our own implementation, we also wanted to implement the algorithms in C++ using the OpenCV library directly. We therefore provide a second version of the code that uses built-in functions from OpenCV. This version uses the KeyPoint and descriptor types implemented in the library, which makes it considerably more accurate: a KeyPoint contains not only the coordinates of the point but also its size, angle, response, and octave. We also used the FlannBasedMatcher, which can be trained on a set of computed descriptors, so the more matches we attempt, the better the results become. In doing this, we gained a good working knowledge of newer feature detection algorithms, such as SURF, and also obtained much more accurate results.
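A sketch of this alternative version is below; it assumes an OpenCV 2.x-style build where SURF is available in the nonfree module, and the detector parameter is an assumed value.

// SURF keypoints/descriptors matched with FlannBasedMatcher.
// Requires the nonfree module (OpenCV 2.x); parameter values are assumed.
#include <opencv2/features2d/features2d.hpp>
#include <opencv2/nonfree/features2d.hpp>

std::vector<cv::KeyPoint> keypoints1, keypoints2;
cv::Mat descriptors1, descriptors2;

cv::SurfFeatureDetector detector(400);         // Hessian threshold (assumed)
detector.detect(img, keypoints1);
detector.detect(img1, keypoints2);

cv::SurfDescriptorExtractor extractor;
extractor.compute(img, keypoints1, descriptors1);
extractor.compute(img1, keypoints2, descriptors2);

cv::FlannBasedMatcher matcher;                 // FLANN-based nearest neighbours
std::vector<cv::DMatch> matches;
matcher.match(descriptors1, descriptors2, matches);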
c) Run the different portions of your code and show pictures of intermediate and final results
Harris Corner Images (R-score threshold used for each image set):
set 1, 2: 75
set 1, 3: 65
set 1, 4: 75
set 1, 5: 95
set 1, 6: 105
[Figures: Harris corner results, labeled Harris_image#, Set#; the result for image 1 in each set is shown alongside the result for the other image of that set, e.g. Harris_1 of set 1,2 with Harris_2 of set 1,2, Harris_1 of set 1,3 with Harris_3 of set 1,3, etc.]
d) Run repeatability and matchability experiments on the images given and plot or tabulate the
quantitative results.
e) Experimental observations:
The first code works, but we wanted to see whether the functions from the OpenCV library were more accurate or robust than our original code. As we predicted, the OpenCV functions in our alternative code yielded better results.
If we apply too much Gaussian smoothing, we eventually lose the ability to detect corners using Harris corner detection, because the pixel-to-pixel contrast becomes less prominent. We did not visually notice much difference when varying the parameters k and Rlowthreshold, but we settled on k = 0.5 because it was suggested in the project description, and we selected threshold values ranging from 65 to 105 depending on the relative number of corners being detected. For images where we detected a substantially larger number of corners, we used a higher threshold to limit the count. If we increased or decreased k, we needed to change the threshold to maintain a reasonable number of detections, because these parameters are coupled.
For repeatability and matchability, the only design consideration we had to take into account was the choice of the epsilon value. At first, we got a count of zero unless we increased epsilon to nearly 100. After comparing the keypoints more carefully before computing the homography, we were finally able to use the epsilon values suggested as appropriate distances in the project description and in Schmid's paper.
For our original implementation, we would not be comfortable relying on our feature descriptors, but we are comfortable with our Harris corner method and SSD computations. We would also be comfortable using the SURF feature descriptor from OpenCV in future projects.