
The Narcissus Project
searching for reflections in troubled waters
Team Members:
Gabe Moy
Thinh Nguyen
Chamberlain Fong
Gang Zhou
David Chinnery
Project Web Page (includes demo query results):
http://www-cad.eecs.berkeley.edu/~chinnery/ee225b
1 Introduction
A database of roughly 17,000 images in Kodak Photo CD (PCD) format, typically three
to four megabytes each, was provided. The images varied widely, from photos of
animals, plants and people to buildings and scenery. The images were preprocessed
into 24-bit color, 256×256 bitmaps to simplify computation and to provide thumbnails
for display.
The Narcissus Project implemented different color and texture searches of the images
in the bitmap database, returning the best matches for a given query. A graphical user
interface (GUI) was provided to display search results and to allow the user to choose
different weightings for the search methods.
2 Initial Database
In eight hours, the 65 GB of PCD images were downsized with Paint Shop Pro to
256×256 24-bit color bitmaps, without preserving the aspect ratio. This provided a
database that could be stored on two of the computers. It did not require slow network
access for processing, and as the database was significantly smaller, the computation
time to calculate indexes for the various algorithms was reduced. In addition, these
bitmaps were used for thumbnails in the GUI.
3 Algorithms
For all the metrics used for the different image comparisons, a smaller difference meant
a better match.
3.1 Histograms
Color RGB, YIQ and HSV histograms of the images were computed, and the indexes
for each stored in separate files. YIQ and HSV color representations can be readily
computed from the RGB color representation used in the bitmaps [1].
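As an illustration, the RGB-to-YIQ conversion is a single linear transform per pixel; a
minimal sketch using the standard NTSC coefficients (the function name is illustrative,
and the constants actually used by the project may differ):

#include <array>

// Convert an RGB triple (components in [0, 255]) to YIQ using the
// standard NTSC matrix. Y is luminance; I and Q carry chrominance.
std::array<double, 3> rgbToYiq(double r, double g, double b) {
    double y = 0.299 * r + 0.587 * g + 0.114 * b;
    double i = 0.596 * r - 0.274 * g - 0.322 * b;
    double q = 0.211 * r - 0.523 * g + 0.312 * b;
    return {y, i, q};
}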
For all the color spaces, separate histograms were calculated for the top and bottom
halves of each image, because most images had a distinct top part, typically sky. This
gave better matches than a single histogram over the whole image.
The RGB and YIQ histograms were quantized in three dimensions, with eight bins in
each color dimension, giving a total of 512 bins (8×8×8). The HSV histograms were
quantized into 512 bins in the H (hue) dimension only.
Metric
For all the histograms, the metric was the least squares distance between the
histograms.
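With each index stored as a flat array of bin counts (the top-half histogram followed by
the bottom-half histogram), the metric reduces to a squared Euclidean distance. A
minimal sketch, with illustrative names:

#include <cstddef>
#include <vector>

// A histogram index: 512 bins for the top half of the image followed
// by 512 bins for the bottom half (1024 values in total).
using HistIndex = std::vector<double>;

// Least squares distance between two histogram indexes; a smaller
// value means a better match.
double histogramDistance(const HistIndex& a, const HistIndex& b) {
    double d = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        double diff = a[i] - b[i];
        d += diff * diff;
    }
    return d;
}

Since only the relative ordering of the matches matters, there is no need to take the
square root of the sum.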
Results
In general, the RGB and YIQ comparisons worked very well, and better than HSV. HSV
returned good matches on query images where matches with varying background
intensity were sought, for which RGB and YIQ did not perform as well.
3.2 32×32 Difference
The images were further downsized to 32×32 bitmaps, with 3 bits representing each
color dimension. Assuming sufficient stationarity in the regions of the query image,
directly comparing the pixels of the downsized images can return a good match.
Downsizing to 8×8 was tried initially, but this was too coarse and gave poor results – the
color of the images varied significantly over regions of this size. 16×16 returned good
matches, but 32×32 performed as well and sometimes better.
Metric
The absolute differences between the colors were computed and summed over the
32×32 regions. The absolute difference was used because it is faster to compute than
the least squares distance.
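A sketch of this metric, assuming each index stores the 32 × 32 × 3 quantized color
values as small integers (the names are illustrative):

#include <cstddef>
#include <cstdlib>
#include <vector>

// A downsized-image index: 32*32 pixels with 3 quantized color
// components each (values 0..7, i.e. 3 bits per dimension).
using ThumbIndex = std::vector<int>;

// Sum of absolute differences between corresponding quantized values;
// cheaper than least squares, as no multiplication is needed.
int thumbDistance(const ThumbIndex& q, const ThumbIndex& t) {
    int d = 0;
    for (std::size_t i = 0; i < q.size(); ++i)
        d += std::abs(q[i] - t[i]);
    return d;
}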
Results
This method performed very well on paintings in the database, due to the black border
around their edges, and generally gave good matches except on images with little local
stationarity in color.
3.3 32×32 Standard Deviation
Following the success of the 32×32 difference, comparing the standard deviation of
each RGB color dimension within each of the 32×32 regions was tried as a measure of
the texture.
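A sketch of how these standard deviations might be computed, assuming the 256×256
bitmap is divided into a 32×32 grid of 8×8-pixel regions (the data layout and names are
assumptions for illustration):

#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// Standard deviation of one color channel over an 8x8 block of a
// 256x256 bitmap stored row-major as interleaved R,G,B bytes.
double blockStdDev(const std::vector<std::uint8_t>& rgb,
                   int x0, int y0, int channel) {
    const int width = 256, block = 8;
    double sum = 0.0, sumSq = 0.0;
    for (int y = y0; y < y0 + block; ++y)
        for (int x = x0; x < x0 + block; ++x) {
            double v = rgb[3 * (y * width + x) + channel];
            sum += v;
            sumSq += v * v;
        }
    double n = block * block;
    double mean = sum / n;
    return std::sqrt(std::max(0.0, sumSq / n - mean * mean));
}

// Texture index: one standard deviation per color channel for each of
// the 32x32 regions (32*32*3 = 3072 values per image).
std::vector<double> buildStdDevIndex(const std::vector<std::uint8_t>& rgb) {
    std::vector<double> idx;
    idx.reserve(32 * 32 * 3);
    for (int by = 0; by < 32; ++by)
        for (int bx = 0; bx < 32; ++bx)
            for (int c = 0; c < 3; ++c)
                idx.push_back(blockStdDev(rgb, bx * 8, by * 8, c));
    return idx;
}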
Metric
Again, the absolute difference between the standard deviations was used for speed.
Results
Comparing the standard deviations alone seldom returned better matches than a color
histogram method. However, the method worked well for bricks and other query images
where matches should be based predominantly on texture rather than color.
3.4 Wavelet Transform
The wavelet transform was used as a measure of the texture within an image, as
wavelet decompositions effectively store the edges and shape of the image. The
coefficients of a wavelet decomposition provide information that is independent of the
original image resolution, allowing some compensation for the changes in aspect ratio
introduced in computing the 256×256 bitmap database and in comparison with the
query image.
A separable two-dimensional wavelet transform of the YIQ color space, using the Haar
wavelet [2], was used (the code for the wavelet transform used is listed in Appendix A).
The signs of the forty coefficients of largest magnitude were stored as the index for
each image (a two-bin quantization). The average YIQ color was also stored, to
improve the matches returned.
Metric
\[ \mathit{Distance} = w_{\mathrm{color}}\,\lvert C_Q - C_T \rvert \;+\; w_{\mathrm{wavelet}} \sum_{Q[i,j] \neq 0} \big( Q[i,j] \neq T[i,j] \big) \]
where Q[i, j] and T[i, j] are the quantized wavelet coefficients of the query and target
images respectively, and C_Q and C_T are the average colors of the query and target
images respectively. The expression (Q[i, j] ≠ T[i, j]) evaluates to one if the coefficients
are not equal and to zero if they are equal; the sum runs over the coefficients stored for
the query, i.e. those with Q[i, j] ≠ 0.
The wavelet and color distances are weighted by w_wavelet and w_color respectively.
The weights were adjusted during testing to the values that gave the best matches for
the query images tested.
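A sketch of this metric (the data layout and names are illustrative assumptions, not the
project's actual code): the index holds the signed positions of the forty retained
coefficients plus the average YIQ color.

#include <array>
#include <cmath>
#include <map>
#include <utility>

// Wavelet index: the sign (+1 or -1) of each of the forty retained
// Haar coefficients, keyed by position, plus the average YIQ color.
struct WaveletIndex {
    std::map<std::pair<int, int>, int> signs;  // (i, j) -> +1 or -1
    std::array<double, 3> avgColor;            // average Y, I, Q
};

// Weighted distance between a query index q and a target index t. The
// sum runs over the query's stored (nonzero) coefficients, counting
// positions where the target's quantized coefficient differs.
double waveletDistance(const WaveletIndex& q, const WaveletIndex& t,
                       double wColor, double wWavelet) {
    double colorDist = 0.0;
    for (int c = 0; c < 3; ++c)
        colorDist += std::abs(q.avgColor[c] - t.avgColor[c]);

    int mismatches = 0;
    for (const auto& entry : q.signs) {
        auto it = t.signs.find(entry.first);
        // A position absent from the target index is treated as zero,
        // i.e. a mismatch with the query's stored sign.
        if (it == t.signs.end() || it->second != entry.second)
            ++mismatches;
    }
    return wColor * colorDist + wWavelet * mismatches;
}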
Results
RGB and YIQ color histograms generally performed much better than the wavelet
transform, but there were some images for which the wavelet transform worked well (for
example, the third test image in the demo).
The discrete cosine transform (DCT) was also tried to compare textures of images, but
the wavelet transform gave significantly better matches to the query images.
4 Database
The indexes for each image for each algorithm were stored in a simple linear database,
for linear searching (i.e. a comparison with the index of the query is made for each
image index in the database). As the number of images to process was relatively small,
a faster approach was not necessary. The index databases are loaded into memory
when the program starts, to increase speed.
The best twenty matches for each comparison method were returned, giving up to 120
candidate images over which a weighted combination of all the methods selects the
best matches for display. Finding the best matches for all of the algorithms above takes
about 45 seconds. Once the best matches for each method have been computed, the
user can change the weight of each algorithm to find the best weighted matches (the
weight for each algorithm is normalized to between zero and one at this stage).
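A sketch of how such a weighted combination might work (the data structures and
names are assumptions, not the project's actual code): normalize each method's
distances to [0, 1], charge candidates missing from a method's top-twenty list the
maximum normalized distance of one, and rank by the weighted sum.

#include <algorithm>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// One method's best matches: image id -> raw distance (smaller is better).
using MethodResults = std::map<int, double>;

// Combine the per-method candidate lists under user-chosen weights;
// weights and perMethod must have the same length.
std::vector<std::pair<int, double>> combineResults(
        const std::vector<MethodResults>& perMethod,
        const std::vector<double>& weights) {
    // Collect every candidate returned by any method.
    std::map<int, double> combined;
    for (const auto& results : perMethod)
        for (const auto& entry : results)
            combined[entry.first] = 0.0;
    for (std::size_t m = 0; m < perMethod.size(); ++m) {
        double maxD = 0.0;
        for (const auto& entry : perMethod[m])
            maxD = std::max(maxD, entry.second);
        for (auto& candidate : combined) {
            auto it = perMethod[m].find(candidate.first);
            // Candidates this method did not return get the worst
            // normalized distance of 1.
            double norm = (it == perMethod[m].end() || maxD == 0.0)
                              ? 1.0 : it->second / maxD;
            candidate.second += weights[m] * norm;
        }
    }
    // Sort ascending: the smallest combined distance is the best match.
    std::vector<std::pair<int, double>> ranked(combined.begin(), combined.end());
    std::sort(ranked.begin(), ranked.end(),
              [](const std::pair<int, double>& a,
                 const std::pair<int, double>& b) { return a.second < b.second; });
    return ranked;
}

With six methods returning twenty matches each, ranking the at most 120 candidates
this way is effectively instantaneous next to the index searches themselves.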
4.1 Time Taken for Searches by the Algorithms
The user can query by a specific comparison method, which is significantly faster, and
the weights for each image are displayed. The time taken to build the index databases
and query search times are listed below.
Comparison Algorithm        Search Time (s)   Time to Build Index Database (min)
RGB histogram               4                 20
YIQ histogram               5                 25
HSV histogram               5                 25
32×32 difference            11                60
32×32 standard deviation    7                 240
Wavelet transform           11                40
Table 1: Time taken to build the index database and search times for the algorithms.
The 3232 difference and variance comparisons take longer as there are many
differences to compute and sum for each image comparison (210 rather than 28 for the
color histograms, but some computation time is saved by taking the absolute
difference). The wavelet comparison takes more time due to the initial overhead to
calculate the wavelet transformation of the query image. YIQ and HSV take slightly
longer than RGB, to compute the color transformation of the RGB query image.
Building the color histogram index databases was fast. Having transformed to the
appropriate color space, the color components for RGB and YIQ were quantized into
eight bins in each color dimension by a five-bit shift; the histogram was then formed by
summing the number of occurrences of each three-dimensional color vector.
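A sketch of that binning step for RGB, assuming a 256×256 bitmap stored as
interleaved 8-bit R, G, B bytes (illustrative, not the project's code):

#include <cstdint>
#include <vector>

// Build the two-half RGB histogram index for a 256x256 24-bit bitmap.
// A five-bit right shift keeps the top three bits of each channel,
// giving eight bins per color dimension (8*8*8 = 512 bins per half).
std::vector<double> buildRgbIndex(const std::vector<std::uint8_t>& rgb) {
    const int width = 256, height = 256;
    std::vector<double> bins(2 * 512, 0.0);
    for (int y = 0; y < height; ++y) {
        int half = (y < height / 2) ? 0 : 512;  // top or bottom histogram
        for (int x = 0; x < width; ++x) {
            const std::uint8_t* p = &rgb[3 * (y * width + x)];
            int r = p[0] >> 5, g = p[1] >> 5, b = p[2] >> 5;
            bins[half + (r << 6) + (g << 3) + b] += 1.0;
        }
    }
    return bins;
}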
5 Conclusion
The different algorithms implemented for texture and color comparisons were able to
find similar images for most query images tried. Generally, the RGB and YIQ
histograms and 3232 difference returned better matches than the other algorithms.
The wavelet and variance comparisons performed well on images where finding
matches of similar texture was important.
The HSV color histogram seemed mostly redundant – RGB and YIQ almost always
performed as well or better. However, HSV was able to find matches of different
intensity, as only the hue component was considered, which RGB and YIQ could not do
(as only a direct comparison between color bins was made).
6 Possible Improvements
Some of the algorithms return the query image itself if it is in the database, and some
do not. The consistency of the software could be improved by not returning the original
image when it is found in the database. In addition, allowing further queries on one of
the thumbnails returned would improve the GUI.
Reducing the images to 256×256 bitmaps removes some of the high frequency
components in the images, and may bias texture comparisons. This may be why the
wavelet comparison seldom performs well. The texture indexes could instead be
calculated from the larger images.
Occasionally, there are problems with the images returned by the integrated query,
which allows the user to change the weight of each algorithm. In the demo, only three
planes were returned for the ninth test image of a plane, although the YIQ comparison
alone finds far better matches; some bug changes what is returned by the integrated
query even when only YIQ is weighted. Each algorithm works correctly alone, and the
matches for the ninth image were the only ones significantly affected by the bug.
6.1 More Sophisticated Metrics for Image Comparison
Implementing a similarity matrix to compare color histograms would improve the
matches returned by RGB and YIQ when trying to find images with different lighting
and intensity. Because similarity matrix comparisons require many computations, such
a metric could be applied to only a subset of the better image matches in the database,
reducing the amount of computation required.
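One common formulation of such a cross-bin metric is the quadratic-form histogram
distance (used, for example, in IBM's QBIC system): for histograms $h$ and $g$ and a
similarity matrix $A$ whose entry $a_{ij}$ encodes how perceptually close bins $i$ and
$j$ are,

\[ d(h, g) = (h - g)^{\mathsf{T}} A \,(h - g). \]

Because $A$ couples every pair of bins, a naive evaluation costs $O(n^2)$ in the
number of bins, which is why applying it only to a pre-filtered subset of candidates is
attractive.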
The wavelet comparison could be improved by quantizing coefficients into more than
two levels and storing more coefficients, which would give better matches but be slower.
6.2 Shape
More than just texture and color is required to find similar images; comparing shapes
and objects within images is also important. Texture comparisons of spatial frequencies
in the image do include, indirectly, a comparison between shapes and objects, but this
is not a direct method of finding similar objects in an image.
Finite element analysis to find similar shapes, allowing for distortion and rotation, would
enable comparison of the shapes within an image. Implementing finite element analysis
algorithms to search for similar shapes in a window of the image would detect important
objects that a user might want to search for.
It may also be important to include some rotational invariance in texture comparisons
before selecting a subset of images on which to perform finite element analysis
comparisons, as finite element analysis is computationally very expensive.
6.3 Improving the Index Database Searches
Larger image databases would require more sophisticated searching, to reduce the
number of images on which comparisons need to be made. An appropriate database
structure for the indices would be a quad-tree, with successive searching at greater
depths when the representative image for a branch of the tree is sufficiently similar.
The search speed could also be approximately doubled by using the dual Pentium
processors with multi-threaded software.
References
[1] Eugene Vishnevsky, “Color Conversion Algorithms,” 1998.
http://www.cs.rit.edu/~ncs/color/t_convert.html
[2] Charles E. Jacobs, Adam Finkelstein and David H. Salesin, “Fast Multiresolution
Image Querying,” Proceedings of SIGGRAPH 95, ACM, 1995.
Bibliography
S. Maruzzi, The Microsoft Windows 95 Developer's Guide, Ziff-Davis Press, New
York, 1996.
J.R. Parker, Algorithms for Image Processing and Computer Vision, Wiley
Computer Publishing, New York, 1997.
W. Press et al., Numerical Recipes in C, Cambridge University Press,
Cambridge, 1992.
E.J. Stollnitz, T.D. DeRose and D.H. Salesin, Wavelets for Computer
Graphics, Morgan Kaufmann Publishers, 1996.
Appendix A: Code for the Haar Wavelet Transform
One-dimensional Haar wavelet transform (the array length must be a power of two):

#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// In-place one-dimensional Haar decomposition of one color channel.
void decomposeArray(std::vector<double>& a) {
    std::size_t h = a.size();
    double norm = std::sqrt(static_cast<double>(h));
    for (double& v : a) v /= norm;  // normalize the input
    std::vector<double> next(a.size());
    while (h > 1) {
        h /= 2;
        for (std::size_t i = 0; i < h; ++i) {
            next[i]     = (a[2*i] + a[2*i+1]) / std::sqrt(2.0);  // averages
            next[h + i] = (a[2*i] - a[2*i+1]) / std::sqrt(2.0);  // details
        }
        // Only the first 2h entries change this pass; the detail
        // coefficients beyond them are already final.
        std::copy(next.begin(), next.begin() + 2 * h, a.begin());
    }
}

Two-dimensional wavelet transform (standard decomposition: every row, then every
column):

// Decompose an r x r image; each inner vector is one row of a single
// color channel.
void decomposeImage(std::vector<std::vector<double>>& t) {
    const std::size_t r = t.size();
    for (std::size_t row = 0; row < r; ++row)
        decomposeArray(t[row]);
    for (std::size_t col = 0; col < r; ++col) {
        std::vector<double> column(r);
        for (std::size_t i = 0; i < r; ++i) column[i] = t[i][col];
        decomposeArray(column);
        for (std::size_t i = 0; i < r; ++i) t[i][col] = column[i];
    }
}