
Brian Hook
Kyle Halbach
Mike Treffert
Semi-Automated Joiners
Figure 0: Joiner by David Hockney
Abstract
Joiners – see Figure 0 – have rapidly gained popularity ever since artist David Hockney introduced them
and coined the term "joiners." Simply searching for joiners on an image-sharing site such as Flickr shows
how popular this artistic style has become in the digital age. This paper focuses on semi-automating the
creation of a preliminary joiner, which an artist can use to decide whether a given ordering of the
images is the one they would want in their actual joiner.
Introduction
The idea of a joiner arose from the fact that a single viewpoint of a scene may not fully capture the
scene or represent it as desired. A joiner is an image generated by layering multiple photographs of the
same object or scene taken from various viewpoints. Because images from different viewpoints are
combined, the result can give a sense of the 3D structure of the subject. While joiners can be very
appealing, the time it takes to create them is not: even a simple, appealing joiner of fewer than 20
photographs can take anywhere from 30 minutes to an hour to assemble by hand. With advances in
computational image processing, this time can be reduced by running a program that generates a joiner
for the user. Even though computer-generated joiners can be very appealing, they are often not exactly
what the user desires. They are still useful, however, because they give the user a basic starting point
for their own rendition of the joiner.
Motivation
The basic motivation behind this project is that there already exists an easy way to smoothly piece
together images taken from a single viewpoint: panorama generation. However, the methods used to
generate panoramas cannot be used to piece together photos taken from multiple viewpoints, which is
exactly the problem joiners set out to solve. The motivation for this project came from a paper by Lihi
Zelnik-Manor and Pietro Perona called "Automating Joiners" [1], in which the authors explain their
method of fully automating the joiner-creation process; more about this paper is discussed in the
following section. Another motivation
for this project was an implementation of Zelnik-Manor and Perona's algorithm by Ekapol
Chuangsuwanich [2]. In this implementation, Chuangsuwanich followed the algorithm laid out by
Zelnik-Manor and Perona but added a slight twist: users select the ordering of the images in the joiner.
“Automating Joiners” Summary
This paper details the algorithm the authors used to automate the process of generating joiners. The
algorithm has four main steps: feature detection, global alignment, ordering, and iterative refinement.
The first step of their algorithm used the SIFT feature detector to find corresponding points
between the set of input images.
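The report's code is in MATLAB, but the matching step can be illustrated with a short Python/NumPy sketch of nearest-neighbor descriptor matching with Lowe's ratio test (the function name and the 0.8 threshold here are our own choices, not taken from the paper):

```python
import numpy as np

def match_descriptors(desc_a, desc_b, ratio=0.8):
    """Nearest-neighbor matching with Lowe's ratio test.

    desc_a, desc_b: (N, 128) arrays of SIFT descriptors.
    Returns a list of (index_in_a, index_in_b) pairs.
    """
    matches = []
    for i, d in enumerate(desc_a):
        dists = np.linalg.norm(desc_b - d, axis=1)
        order = np.argsort(dists)
        best, second = order[0], order[1]
        # Accept only if the best match is clearly better than the runner-up,
        # which discards ambiguous (likely false) correspondences.
        if dists[best] < ratio * dists[second]:
            matches.append((i, best))
    return matches
```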
Once the authors obtained a set of matching feature points between each pair of images, they
moved on to the second step of their algorithm. This step, global alignment, can be explained in two
sub-steps: RANSAC and solving for similarity matrices. The first sub-step executes the RANSAC
algorithm to discard feature-point matches that are false positives between each pair of input images.
The second sub-step computes similarity transforms that map each set of feature points in one
image to the corresponding feature points in another image. A similarity transformation can rotate,
resize, and translate an image while maintaining its shape, which avoids distorting the images in the
joiner. In the paper, the authors briefly explain their method of computing the transformations by
solving a specific optimization problem. The optimization problem weights each feature point
differently, which allows them to give more precedence to specific points when needed.
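The paper only sketches its weighted optimization, but as a rough illustration of what solving for a non-reflective similarity amounts to, here is a weighted least-squares fit in Python/NumPy (the parameterization x' = a·x − b·y + tx, y' = b·x + a·y + ty and the function name are our own assumptions, not the paper's formulation):

```python
import numpy as np

def fit_similarity(src, dst, w=None):
    """Weighted least-squares fit of a non-reflective similarity
    x' = a*x - b*y + tx,  y' = b*x + a*y + ty.

    src, dst: (N, 2) arrays of matched points; w: optional (N,) weights.
    Returns the 3x3 homogeneous transform matrix.
    """
    n = len(src)
    w = np.ones(n) if w is None else np.asarray(w, float)
    A = np.zeros((2 * n, 4))
    b = np.zeros(2 * n)
    A[0::2] = np.c_[src[:, 0], -src[:, 1], np.ones(n), np.zeros(n)]
    A[1::2] = np.c_[src[:, 1],  src[:, 0], np.zeros(n), np.ones(n)]
    b[0::2], b[1::2] = dst[:, 0], dst[:, 1]
    sw = np.sqrt(np.repeat(w, 2))        # apply each weight to both equations
    a_, b_, tx, ty = np.linalg.lstsq(A * sw[:, None], b * sw, rcond=None)[0]
    return np.array([[a_, -b_, tx], [b_, a_, ty], [0, 0, 1]])
```

Raising a point's weight pulls the fitted transform toward matching that point more exactly, which is the lever the paper's refinement step uses.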
The third step of their general algorithm orders the images in the output joiner, i.e., it
determines which image goes on top of which others. The paper explains two different methods of
layering the images. The first is a Photoshop-like method that treats each image as a layer and orders
those layers; an example would be A over B, B over D, and D over C. The second method chooses a
layering in each region of overlap, which allows orderings such as A over B, B over C, and C over A. To
determine the ordering of the images, the paper describes three approaches, of which only two proved
very successful: minimizing the gradient at the boundary between two overlapping images, and
minimizing the sum of color differences between the two images. While the paper states that both
methods work, it notes that minimizing the gradient appeared to work better.
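As an illustration of the color-difference criterion (this is our own reading of it, assuming grayscale images already warped onto a common canvas), the cost of one candidate ordering could be scored by summing the color difference along the seam the top image would leave:

```python
import numpy as np

def seam_cost(top, bottom, top_mask, bottom_mask):
    """Color-difference cost of placing `top` over `bottom`.

    Sums |top - bottom| along the pixels of `top`'s boundary that fall
    inside `bottom`, i.e. where the visible seam would appear.
    top, bottom: (H, W) float images on a common canvas;
    masks: boolean arrays marking where each image has data.
    """
    # Boundary of the top image: mask pixels with a non-mask 4-neighbor.
    interior = (np.roll(top_mask, 1, 0) & np.roll(top_mask, -1, 0) &
                np.roll(top_mask, 1, 1) & np.roll(top_mask, -1, 1))
    boundary = top_mask & ~interior
    seam = boundary & bottom_mask
    return np.abs(top[seam] - bottom[seam]).sum()
```

Comparing seam_cost(a, b, …) against seam_cost(b, a, …) and keeping the cheaper ordering would implement the second method; the gradient criterion would replace the color difference with an image-gradient term along the same seam.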
The final step of the paper's algorithm is iterative refinement, whose purpose is essentially to fix
the boundaries of overlap in the output joiner. The boundaries need fixing because the layering of the
images can occlude content at the boundaries, creating inconsistencies there; for example, a pole in one
image may appear 10 pixels to the left at the boundary between two images. The paper's approach to
solving this is to weight feature points near these boundaries more heavily and feature points further
from them less. The algorithm then recomputes the similarity matrices and remaps the images onto
the joiner.
Finally, the authors discuss the idea of blending in the joiner. Although it is not a step in their
algorithm, the authors state that blending could be implemented at the end to hide the seams that
remain after iterative refinement, which has the potential to make a joiner look more realistic.
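Blending is not specified in the paper, but a minimal feathering sketch in Python/NumPy shows the idea: fade the top image out near its edge instead of cutting hard. The erosion-based distance here is our own crude stand-in for a proper distance transform:

```python
import numpy as np

def edge_distance(mask, max_d=10):
    """Approximate distance-to-edge: count how many 4-neighbor erosions
    a pixel survives (a cheap stand-in for a true distance transform)."""
    d = np.zeros(mask.shape)
    m = mask.copy()
    for _ in range(max_d):
        m = (m & np.roll(m, 1, 0) & np.roll(m, -1, 0)
               & np.roll(m, 1, 1) & np.roll(m, -1, 1))
        d += m
    return d

def feather_blend(top, bottom, top_mask, bottom_mask, width=5):
    """Blend `top` over `bottom`, fading `top` out within `width` pixels
    of its edge so the seam disappears gradually."""
    d = edge_distance(top_mask, max_d=width)
    alpha = np.clip(d / width, 0.0, 1.0)
    out = np.where(bottom_mask, bottom, 0.0)
    both = top_mask & bottom_mask
    only_top = top_mask & ~bottom_mask
    out[both] = alpha[both] * top[both] + (1 - alpha[both]) * bottom[both]
    out[only_top] = top[only_top]
    return out
```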
Other Related Work
A work related to joiner creation is the creation of panoramas. In panorama creation, the
objective is to piece together multiple images of a scene taken from a single viewpoint to recreate a
representation of the scene. Since all of the images are taken from a single viewpoint, they can be
smoothly pieced together by projecting them onto a common output plane using perspective
projection. These same ideas do not apply to joiner creation, because joiners allow pictures from
multiple viewpoints. This creates the unique case where things occluded in one image may not be
occluded in another. That is not the case in panoramas: with all images coming from one viewpoint,
what is occluded in one image is occluded in every image – assuming a stationary scene.
Problem Statement
Our goal is to implement an application that semi-automatically generates preliminary joiners from
photos taken at multiple viewpoints. The output of this application is not a true joiner but a preliminary
joiner that gives the artist a general idea, or starting point, for the joiner they will actually create.
Method
Our method is a hybrid algorithm based on the algorithm in "Automating Joiners" [1]. In the
first step we use the SIFT feature detector [3] to obtain matching feature points between a set of
images. For this step, we assume that the images in the source directory are listed in order of overlap:
image one overlaps with image two, image two overlaps with image three, and so on.
In the second step, we use RANSAC to compute the "best" similarity transformation. On each
iteration we pick six random points out of a matching feature-point set and use the built-in MATLAB
function cp2tform to compute a non-reflective similarity transformation between them. We then check
how many of the feature points this transformation maps correctly; if it is the best similarity
transformation seen so far, we keep it, and otherwise we move on to the next iteration. We run
RANSAC for one hundred iterations, which we judged sufficient to obtain a similarity transformation
that properly maps the feature points.
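This loop can be sketched in Python/NumPy as follows; the least-squares fit stands in for MATLAB's cp2tform, and the 3-pixel inlier tolerance is our own assumption:

```python
import numpy as np

def ransac_similarity(src, dst, iters=100, tol=3.0, sample=6, seed=0):
    """RANSAC over non-reflective similarities, per the report's recipe:
    sample 6 matches, fit a similarity, keep the fit with most inliers."""
    def fit(s, d):
        # Least-squares similarity x' = a*x - b*y + tx, y' = b*x + a*y + ty.
        n = len(s)
        A = np.zeros((2 * n, 4))
        b = np.zeros(2 * n)
        A[0::2] = np.c_[s[:, 0], -s[:, 1], np.ones(n), np.zeros(n)]
        A[1::2] = np.c_[s[:, 1],  s[:, 0], np.zeros(n), np.ones(n)]
        b[0::2], b[1::2] = d[:, 0], d[:, 1]
        a, bb, tx, ty = np.linalg.lstsq(A, b, rcond=None)[0]
        return np.array([[a, -bb, tx], [bb, a, ty], [0, 0, 1]])

    rng = np.random.default_rng(seed)
    best_T, best_inliers = None, -1
    pts = np.c_[src, np.ones(len(src))]        # homogeneous coordinates
    for _ in range(iters):
        idx = rng.choice(len(src), size=sample, replace=False)
        T = fit(src[idx], dst[idx])
        err = np.linalg.norm((pts @ T.T)[:, :2] - dst, axis=1)
        inliers = int((err < tol).sum())
        if inliers > best_inliers:
            best_T, best_inliers = T, inliers
    return best_T, best_inliers
```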
In the next step, we select a reference image and compute the similarity transformation that
maps each image to the reference image by composing the pairwise similarity matrices between the
current image and the reference image. Once we obtain these transformations, we map each image
onto its own output-sized image plane.
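Under the sequential-overlap assumption, this chaining amounts to multiplying 3×3 matrices outward from the reference image; a sketch (our own, in Python/NumPy):

```python
import numpy as np

def to_reference(pairwise, ref):
    """Chain pairwise similarity matrices into image->reference transforms.

    pairwise[i] is the 3x3 matrix mapping image i into image i+1's frame
    (images assumed ordered by overlap, as in the report).
    Returns a list where entry i maps image i into image `ref`'s frame.
    """
    n = len(pairwise) + 1
    out = [None] * n
    out[ref] = np.eye(3)
    for i in range(ref - 1, -1, -1):      # images before the reference:
        out[i] = out[i + 1] @ pairwise[i]     # i -> i+1 -> ... -> ref
    for i in range(ref + 1, n):           # images after the reference:
        out[i] = out[i - 1] @ np.linalg.inv(pairwise[i - 1])
    return out
```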
For the layering step of the algorithm, we chose not to use an automated method. Instead of
minimizing the gradient across the edges, we accept user input specifying the desired layering of the
images. The main reason for this choice is that we are not implementing an application that creates a
true joiner, but rather a tool an artist can use to see what their joiner might look like with a given
layering. Another reason is that the "best" ordering according to gradient minimization may not have
the look the artist is aiming for. For example, if the artist wants one photo in the center containing
writing or a specific object that must not be occluded by overlapping images, that image should be the
very top layer; the "best" ordering, however, may cause another image to partially overlap that text or
object. See Images 1a and 1b for an example of such a case.
Image 1a: The top layer is the image that has the text on the
wall. This makes the text completely readable; none of it is
occluded in the joiner.
Image 1b: The text on the wall is a combination of two
overlapping images. In this image the text is not as readable as
it is in Image 1a.
Limitations
A big limitation of our application is that we do not implement bundle adjustment to determine
which images overlap with which. This is what forces us to assume that the input images are ordered in
the directory by order of overlap. Due to this limitation, the quality of our output is not as good as it
would be if we performed bundle adjustment and computed similarity transforms for each image based
on all the images it overlaps with.
Another limitation of our implementation is that we did not implement iterative refinement.
During the implementation phase of our project, we made five different attempts at this step of the
algorithm, but each of them seemed to make the output worse than it was without the step. Our
approach was to select the N points closest to the overlap edges in the joiner and use those points to
recalculate the similarity transformations. We had to do this instead of weighting the feature points as
the paper does because the way we compute similarity transforms does not allow us to weight feature
points. Our justification for this method is that the feature points along the edges of the overlap
boundaries are the most important points, because matching them will fix discontinuities between
images in the joiner.
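The point-selection idea we attempted can be sketched as follows (a Python/NumPy illustration; the brute-force distance computation is fine for the point counts involved):

```python
import numpy as np

def closest_to_boundary(points, boundary_pts, n):
    """Pick the n feature points nearest to any overlap-boundary pixel,
    mirroring the refinement attempt (select points rather than weight them).

    points: (P, 2) feature locations; boundary_pts: (B, 2) boundary pixels.
    Returns the indices of the n closest feature points.
    """
    # Distance from every feature point to its nearest boundary pixel.
    d = np.linalg.norm(points[:, None, :] - boundary_pts[None, :, :], axis=2)
    nearest = d.min(axis=1)
    return np.argsort(nearest)[:n]
```

The selected indices would then feed the similarity re-fit in place of the full match set.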
Results
Even without the iterative refinement step implemented, we were still able to get some decent results.
Figures 2, 3, and 4 show good results we obtained with our application.
Figure 2
Figure 3
Figure 4
Clearly, all of these examples still contain discontinuities along the edges, because there is no iterative
refinement to clean them up. Moreover, since our layering is based on user input instead of a computed
"best" layering, there were many failures when the user did not supply a good ordering. Figures 5 and 6
show examples of such cases.
Figure 5
Figure 6
Concluding Remarks
In conclusion, even with the two major limitations our application has, we were still able to
generate decent joiners. Considering that the intent of our application is simply to give the artist an
idea of how to go about creating their joiner, we see this as the desired result. Even the bad results
produced by certain orderings are useful, because they tell the user that, when they create the actual
joiner, they should not layer their images that way.
Even though we believe our application meets its purpose, there are many aspects we would
like to improve in the future. One is to implement some form of bundle adjustment in order to
eliminate the assumption that images are given in order of overlap. This would let us compute better
joiners, because each similarity transform could combine the transforms mapping an image to all of the
images it overlaps, in contrast to our method, which only maps each image to one other image.
Another improvement is getting the iterative refinement step to work; it was excluded from this version
of our application only because we were constrained by time. With enough time, we believe we could
get this step functioning properly, which would greatly improve our output and allow the application to
generate actual joiners instead of just preliminary ones. A final improvement we hope to make is an
easy, simple user interface. Currently everything is implemented as a MATLAB function that takes the
ordering of the images as a parameter. Our original intention was to have a website user interface, but
that became infeasible as we decided to focus on different aspects of our project (i.e., we focused more
on photomosaic generation). We feel that a user interface would be beneficial to our application
because it would allow the user to graphically manage the images
that they are using. It would also allow the user to easily reorder the images and preview how each
image might impact the result.
References
[1] Zelnik-Manor, Lihi, and Pietro Perona. "Automating Joiners."
http://www.vision.caltech.edu/lihi/Demos/AutoJoiners.html
[2] Chuangsuwanich, Ekapol. Joiner implementation.
http://www.cs.cmu.edu/afs/andrew/scs/cs/15-463/f07/proj_final/www/echuangs/#Readjustment
[3] Lowe, David. SIFT feature detector.
http://www.cs.ubc.ca/~lowe/keypoints/