Project 3 Report

Trevor McCasland Arch Kelley
Project 3 Report - Dynamic Programming for Stereo Reconstruction

Brief summary of project

Students were tasked with one primary goal in project 2: taking two distinct images depicting the same real-world location or situation and deriving the spatial relationship between them. This task was to be completed in a sequence of steps, the first of which was feature detection using the Harris corner-detection algorithm. The detected corners were then matched between the images using a bidirectional matching algorithm, which made it possible to compute a homography matrix describing the transformation from one image to the other. The homography matrix was used to warp one of the images into the coordinate system of the other image. The fitness of the matrix (or the relationship between the images themselves) was then tested by visually comparing the warped image to the actual second image and judging how closely the two matched.

Brief outline of the algorithmic approach

The steps of this project required work completed in previous homework assignments and two overlapping images taken with a camera. First, the images were loaded into the program and passed to two functions written in previous homework assignments. The first function, an implementation of the Harris corner detection algorithm, scanned the image and returned a set of identified corners, which acted as ‘features’ later in the process. The features were then matched between images based on similarity and relative location by a separate function written in another previous homework assignment. Once matches between the images were determined, an implementation of the RANSAC algorithm was used to classify each matching feature pair as either an outlier (ultimately ignored) or an inlier.
RANSAC relied on random samplings of three matches (which we referred to as s=3) and ran a calculated number of times, N, to find the function of best fit, thereby gathering the largest possible pool of inliers. Using the newly found set of inliers, two 3x3 homography matrices were created under two different conditions. The first matrix was constructed under the assumption that the final element in the matrix, h33, was always equal to 1, which allowed for a simpler calculation that only had to produce eight of the nine h values. The second matrix was computed assuming that all nine of its values were variable but that the magnitude (norm) of the matrix itself was always equal to one, which made it possible to find a nontrivial solution to the system. Finally, forward and backward warping techniques were used to warp one image to the coordinate system of the other image using either of the calculated homography matrices, which were mathematically similar.

Pictures of intermediate results

Images of the results of our program are shown below.

Design decisions

By and large, we created our image warping program by following the lecture notes and instructions given to us by Professor Yin. We modified exactly one existing file, project2Main.m (the main program), which included calls to the functions written for homeworks 3 and 4 and contained the code for homography matrix formulation and image warping. Figures displaying intermediate and final results of the entire warping process were generated inside this main program file. We chose not to implement any special or bonus features, including the creation of image mosaics, due to time and effort constraints.

One design decision that we made was to assume that the probability of a match being an outlier was 50%. This decision, which mainly influenced the RANSAC implementation, was made to ensure that RANSAC would run enough times to arrive at an acceptably accurate result.
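
To make the effect of this assumption concrete, the sketch below computes the number of RANSAC iterations N from the sample size s, the assumed outlier probability e, and the 99% accuracy probability p reported in this section. It assumes the program used the standard textbook relation N = log(1 - p) / log(1 - (1 - e)^s); the report does not state the exact formula, and the function name `ransac_iterations` is hypothetical.

```python
import math


def ransac_iterations(p, e, s):
    """Return the iteration count N such that, with probability p, at least
    one random sample of s matches is free of outliers, assuming each match
    is an outlier with probability e (assumed textbook formula)."""
    if not (0.0 < p < 1.0 and 0.0 <= e < 1.0 and s >= 1):
        raise ValueError("expected 0 < p < 1, 0 <= e < 1, s >= 1")
    clean_sample_prob = (1.0 - e) ** s  # chance that one sample is all inliers
    if clean_sample_prob >= 1.0:
        return 1  # no outliers assumed, so a single sample suffices
    return math.ceil(math.log(1.0 - p) / math.log(1.0 - clean_sample_prob))


# Compare a lower outlier assumption with the 50% assumption from the report.
for e in (0.3, 0.5):
    print(e, ransac_iterations(p=0.99, e=e, s=3))
```

Under these assumptions, with p = 0.99 and s = 3, the loop runs 11 times for e = 0.3 and 35 times for e = 0.5, which is consistent with a lower outlier assumption producing fewer iterations.
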
If we assumed that there were fewer than 50% outliers, the algorithm tended to run too few times, which made our results less accurate. For similar reasons, we chose a RANSAC accuracy probability of 99% to further increase the number of times that the algorithm’s loop ran, and we only considered matches within a certain distance, which we determined to be roughly 5.

In terms of performance and robustness, we made a key choice while developing the program that we believe increased its overall performance and ‘niceness’. We avoided creating a completely new set of points, with its own storage, labelled as ‘inliers’; doing so would have required a segment of boilerplate loop code to copy the inliers into a newly resized container each time a new inlier was found. Instead, we created an Mx1 boolean array in which all values were initialized to zero. If RANSAC found a pair of matched points to be an inlier, the entry of the boolean array at that match's index in the matrix of matches was set to 1. In this fashion, we could construct the homography matrix by iterating over all points but only using those that belonged to a match marked with a 1 in the boolean array. This meant that we avoided the computational cost of creating a resized inlier container and copying all of the inlier information from the old storage to the new storage each time another inlier was found.

Experimental observations

When using images related by a translation in a certain direction, there were clear differences between the outcomes of forward and backward warping. Forward warping preserved all overlapping pixels in the first image, while backward warping preserved all overlapping pixels in the second image. When the warping was complete, the black pixels in the warped image indicated the direction of the translation.
For example, if there was a one-dimensional translation from left to right, then there would be black pixels on the left of the image for forward warping and black pixels on the right for backward warping. This is because the pixels from one image are fitted into the area of the matched features in the second image, which is only the overlapping part. It is also worth mentioning that the forward warping operations tended to run faster than both variations of the backward warping operations.

Adjusting parameters related to RANSAC often led to drastically improved or drastically worsened results. Generally, we found that tweaking parameters to maximize the number of times RANSAC’s loop ran produced better results; this made sense because more iterations make it statistically more likely that a better set of inliers for the resulting homography matrix is found. Specifically, larger values of e (the predicted percentage of outliers) and larger values of p (the target percentage of inliers that RANSAC should find) typically led to a larger N (the number of RANSAC loop iterations), which then typically led to a larger number of correctly identified inliers.

We observed that, after the computation of our two versions of the homography matrix, the matrices were indeed mathematically similar. We included code that multiplied the matrix derived by assuming h33=1 by the actual h33 value from the magnitude-derived matrix, and discovered that the resulting matrix (which we named Htest) was very close to the magnitude-based matrix. Our application printed the sum of squared differences between the two matrices to the console.

Though our application largely worked as expected, forward warping with and without for-loops would sometimes leave black pixels in the shape of a warped grid on the warped image.
In this grid, the cells would shrink as they got closer to the top left corner of the image and expand as they got closer to the bottom right corner of the image. However, this behavior was to be expected: warping an image also scales it in certain situations, which leaves gaps in the resulting image that have no sibling pixel in the other image to derive intensity from. Alternatively, the act of borrowing intensity values from the sibling image could have missed a set of pixels as a result of rounding operations such as floor, ceiling, and averaging.
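
The grid of black pixels can be reproduced on a toy example. The sketch below is not the code from project2Main.m: it assumes nearest-pixel rounding, uses a hypothetical pure-scaling homography rather than one estimated from real matches, and all function names are illustrative. It forward-warps a 4x4 white image by a factor of 2 and compares the result with backward warping through the inverse matrix.

```python
import math


def apply_h(H, x, y):
    """Map the point (x, y) through a 3x3 homography given as nested lists."""
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return xh / w, yh / w


def nearest(value):
    """Round to the nearest integer, with halves rounded up."""
    return int(math.floor(value + 0.5))


def forward_warp(src, H, out_w, out_h):
    """Push every source pixel to its rounded destination location."""
    out = [[0] * out_w for _ in range(out_h)]
    for y, row in enumerate(src):
        for x, value in enumerate(row):
            u, v = apply_h(H, x, y)
            ui, vi = nearest(u), nearest(v)
            if 0 <= ui < out_w and 0 <= vi < out_h:
                out[vi][ui] = value
    return out


def backward_warp(src, H_inv, out_w, out_h):
    """Pull every destination pixel from its rounded source location."""
    src_h, src_w = len(src), len(src[0])
    out = [[0] * out_w for _ in range(out_h)]
    for v in range(out_h):
        for u in range(out_w):
            x, y = apply_h(H_inv, u, v)
            xi, yi = nearest(x), nearest(y)
            if 0 <= xi < src_w and 0 <= yi < src_h:
                out[v][u] = src[yi][xi]
    return out


def count_black(img):
    """Count pixels that were never assigned an intensity."""
    return sum(value == 0 for row in img for value in row)


src = [[255] * 4 for _ in range(4)]  # 4x4 all-white test image
H = [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 1.0]]      # scale by 2
H_inv = [[0.5, 0.0, 0.0], [0.0, 0.5, 0.0], [0.0, 0.0, 1.0]]  # its inverse

fwd = forward_warp(src, H, 7, 7)
bwd = backward_warp(src, H_inv, 7, 7)
print(count_black(fwd), count_black(bwd))
```

In this toy case the forward-warped image keeps only every other row and column, leaving 33 of the 49 output pixels black in a regular grid, while backward warping fills every output pixel. A homography with a perspective component would scale different regions by different amounts, which would be consistent with the grid cells changing size across the image.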
