The Narcissus Project
Searching for reflections in troubled waters

Team Members: Gabe Moy, Thinh Nguyen, Chamberlain Fong, Gang Zhou, David Chinnery

Project Web Page (includes demo query results): http://www-cad.eecs.berkeley.edu/~chinnery/ee225b

1 Introduction

A database of roughly 17,000 images in Kodak Photo CD (PCD) format, typically three to four megabytes each, was provided. The images covered a wide variety of subjects, including animals, plants, people, buildings and scenery. The images were preprocessed into 24-bit color, 256×256 bitmaps to simplify computation and to provide thumbnails for display. The Narcissus Project implemented different color and texture searches of the images in the bitmap database, returning the best matches for a given query. A graphical user interface (GUI) displays the search results and allows the user to choose different weightings for the search methods.

2 Initial Database

In eight hours, the 65 GB of PCD images were downsized with Paint Shop Pro to 256×256 24-bit color bitmaps, without preserving the aspect ratio. This produced a database small enough to be stored on two of the computers, so slow network access was not required for processing, and as the database was significantly smaller, the computation time to calculate indexes for the various algorithms was reduced. In addition, these bitmaps were used as thumbnails in the GUI.

3 Algorithms

For all the metrics used in the image comparisons, a smaller difference means a better match.

3.1 Histograms

Color RGB, YIQ and HSV histograms of the images were computed, and the indexes for each were stored in separate files. The YIQ and HSV color representations can be readily computed from the RGB representation used in the bitmaps [1]. For each color space, separate histograms were calculated for the top and bottom halves of the image, because most images had a distinct top part, typically sky.
This gave better matches than comparisons without separate histograms for the top and bottom. The RGB and YIQ histograms were quantized in three dimensions, with eight bins in each color dimension, giving a total of 512 bins (8×8×8). The HSV histograms were quantized into 512 bins in the H (hue) dimension only.

Metric

For all the histograms, the metric was the least squares distance between the histograms.

Results

In general the RGB and YIQ comparisons worked very well, and better than HSV. HSV returned good matches on query images where matches with varying background intensity were sought, for which RGB and YIQ did not perform as well.

3.2 32×32 Difference

The images were further downsized to 32×32 bitmaps, with 3 bits representing each color dimension. Assuming sufficient stationarity in the regions of the query image, directly comparing the pixels of the downsized images can return a good match. Downsizing to 8×8 was tried initially, but this was too coarse and gave poor results – the color of the images varied significantly over regions of this size. 16×16 returned good matches, but 32×32 performed as well and sometimes better.

Metric

The absolute difference between the colors was computed and summed over the 32×32 regions. Absolute difference was used because it is faster to compute than the least squares distance.

Results

This method performed very well on paintings in the database, due to the black border around their edges, and generally gave good matches except on images with little local stationarity in color.

3.3 32×32 Standard Deviation

Following the success of the 32×32 difference, comparing the standard deviation of each RGB color dimension within each of the 32×32 regions was tried as a measure of texture.

Metric

Again, the absolute difference between the standard deviations was used for speed.

Results

Comparing the standard deviations alone seldom returned better matches than a color histogram method.
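The standard-deviation texture index described above can be sketched as follows. This is a minimal sketch, assuming a 256×256 RGB image split into a 32×32 grid of 8×8 blocks; the function names are illustrative, not the project's actual code.

```python
import numpy as np

def stddev_index(img):
    """Texture index: the standard deviation of each RGB channel within
    each 8x8 block of a 256x256 image, giving a 32x32x3 array
    (a sketch of the method described above)."""
    blocks = img.reshape(32, 8, 32, 8, 3)   # view the image as 8x8 blocks
    return blocks.std(axis=(1, 3))          # std dev per block, per channel

def stddev_distance(index_q, index_t):
    """Sum of absolute differences between two indexes (smaller is better)."""
    return np.abs(index_q - index_t).sum()
```

The absolute difference is used here, as in the report, because it is cheaper than a least squares distance.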
This method worked well for bricks and other query images where matches should be based predominantly on texture rather than color.

3.4 Wavelet Transform

The wavelet transform was used as a measure of the texture within the image, as wavelet decompositions effectively store the edges and shape of the image. The coefficients of a wavelet decomposition provide information that is independent of the original image resolution, giving some compensation for the change in aspect ratio introduced when computing the 256×256 bitmap database and comparing with the query image. A separable two-dimensional wavelet transform of the YIQ color space, using the Haar wavelet [2], was used (pseudo-code for the wavelet transform is listed in Appendix A). The signs of the forty coefficients of largest magnitude were stored as an index for each image (two-bin quantization). The average YIQ color was also stored, to improve the matches returned.

Metric

    Distance = w_color * |C_Q - C_T| + w_wavelet * sum over (i,j) with Q[i,j] != 0 of (Q[i,j] != T[i,j])

where Q[i,j] and T[i,j] are the quantized wavelet coefficients of the query and target images respectively, and C_Q and C_T are the average colors of the query and target images respectively. The term (Q[i,j] != T[i,j]) is one if the coefficients are not equal, and zero if they are equal. The wavelet and color distances are weighted by w_wavelet and w_color respectively. The weights were adjusted during testing to the values that best matched the query images tested.

Results

The RGB and YIQ color histograms generally performed much better than the wavelet transform, but there were some images for which the wavelet transform worked well (for example, the third test image in the demo). The discrete cosine transform (DCT) was also tried for comparing image textures, but the wavelet transform gave significantly better matches to the query images.

4 Database

The indexes for each image for each algorithm were stored in a simple linear database, for linear searching (i.e. a comparison with the index of the query is made for each image index in the database). As the number of images to process was relatively small, it was not necessary to use a faster approach. The index databases are loaded into memory when the program starts, to increase speed. The best twenty matches for each comparison method are returned, giving up to 120 images over which the weighted combination of all the methods selects the best matches for display. Finding the best matches for all the algorithms above takes about 45 seconds. Once the best matches for each method have been computed, the user can change the weight of each algorithm to find the best weighted matches (the weight for each algorithm being normalized to between zero and one at this stage).

4.1 Time Taken for Searches by the Algorithms

The user can query by a specific comparison method, which is significantly faster, and the weights for each image are displayed. The time taken to build the index databases and the query search times are listed below.

    Comparison Algorithm        Search Time (s)    Time to Build Index Database (minutes)
    RGB histogram               4                  20
    YIQ histogram               5                  25
    HSV histogram               5                  25
    32×32 difference            11                 60
    32×32 standard deviation    7                  240
    Wavelet transform           11                 40

Table 1: Time taken to build the index databases and search times for the algorithms.

The 32×32 difference and standard deviation comparisons take longer to search, as there are many differences to compute and sum for each image comparison (2^10 rather than 2^8 for the color histograms, though some computation time is saved by taking the absolute difference). The wavelet comparison takes more time due to the initial overhead of calculating the wavelet transform of the query image. YIQ and HSV take slightly longer than RGB, to compute the color transformation of the RGB query image. Building the color histogram index databases was fast.
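The 512-bin histogram index construction can be sketched as follows. This is a minimal sketch assuming 8-bit RGB bitmaps; the five-bit right shift quantizes each channel into eight bins, and the function names are illustrative.

```python
import numpy as np

def rgb_histogram(img):
    """512-bin RGB histogram of an 8-bit color image: each channel is
    quantized to 8 bins by a five-bit right shift, then occurrences of
    each three-dimensional (r, g, b) bin vector are counted."""
    q = img.astype(np.uint16) >> 5                           # 0..255 -> 0..7 per channel
    bins = (q[..., 0] << 6) | (q[..., 1] << 3) | q[..., 2]   # 3-D bin -> 0..511
    return np.bincount(bins.ravel(), minlength=512)

def halved_histograms(img):
    """Separate histograms for the top and bottom halves of the image,
    as most images have a distinct top part (typically sky)."""
    h = img.shape[0] // 2
    return rgb_histogram(img[:h]), rgb_histogram(img[h:])
```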
Having transformed to the appropriate color space, the color components for RGB and YIQ were quantized into eight bins in each color dimension by a five-bit right shift, and the histogram was taken by summing the number of occurrences of each three-dimensional color vector.

5 Conclusion

The different algorithms implemented for texture and color comparisons were able to find similar images for most of the query images tried. Generally, the RGB and YIQ histograms and the 32×32 difference returned better matches than the other algorithms. The wavelet and standard deviation comparisons performed well on images where finding matches of similar texture was important. The HSV color histogram seemed mostly redundant – RGB and YIQ almost always performed as well or better. HSV was, however, able to find matches of different intensity, as only the hue component was considered, which RGB and YIQ could not (as only a direct comparison between color bins was made).

6 Possible Improvements

Some of the algorithms return the query image itself if it is in the database, and some do not. The consistency of the software could be improved by never returning the original image when it is found in the database. In addition, allowing further queries on one of the thumbnails returned would improve the GUI. Reducing the images to 256×256 bitmaps removes some of the high-frequency components of the images, and may bias texture comparisons; this may be why the wavelet comparison seldom performs well. The texture indexes could instead be calculated from the larger images. Occasionally, there are problems with the images returned by the integrated query, which allows the user to change the weight of each algorithm. In the demo, only three planes were returned for the ninth test image of a plane, but the YIQ comparison actually finds far better matches; some bug changes what is returned by the integrated query when only YIQ is used.
Each algorithm works correctly alone, and matches for the ninth image were the only ones significantly affected by the bug.

6.1 More Sophisticated Metrics for Image Comparison

Implementing a similarity matrix to compare color histograms would improve the matches returned by RGB and YIQ when trying to find images with different lighting and intensity. Because similarity matrix comparisons require many computations, such a metric could be applied only to a subset of the better image matches in the database, reducing the amount of computation required. The wavelet comparison could be improved by quantizing the coefficients into more than two levels and storing more coefficients, which would give better matches but be slower.

6.2 Shape

More than just texture and color is required to find similar images; comparing shapes and objects within images is also important. Texture comparison of spatial frequencies in the image does include, indirectly, a comparison between shapes and objects, but it is not a direct method of finding similar objects. Finite element analysis to find similar shapes, allowing distortion and rotation, permits comparison of the shapes within an image. Implementing finite element analysis algorithms to search for similar shapes in a window of the image would detect important objects that a user might want to search for. As finite element analysis is computationally very expensive, it may also be important to include some rotational invariance in the texture comparisons used to find the subset of images on which to perform the finite element analysis.

6.3 Improving the Index Database Searches

Larger image databases would require more sophisticated searching, to reduce the number of images on which comparisons need to be made. An appropriate database structure for the indices would be a quad-tree, with searching continuing to greater depths when the representative image for a branch of the tree is sufficiently similar.
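The quad-tree pruning idea could look like the following. This is only a sketch of the suggested improvement, not an implemented feature; the node fields, the scalar index, and the distance function are all illustrative assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """A quad-tree node; all fields here are illustrative assumptions."""
    representative_index: float            # index of the branch's representative image
    image_id: int = -1                     # image identifier, meaningful at leaves
    children: list = field(default_factory=list)

def quadtree_search(node, query_index, distance, threshold, results):
    """Descend into a branch only when its representative image is
    sufficiently similar to the query; otherwise prune the branch."""
    d = distance(query_index, node.representative_index)
    if d > threshold:
        return                             # representative too dissimilar: prune
    if not node.children:                  # leaf: record a candidate match
        results.append((d, node.image_id))
        return
    for child in node.children:
        quadtree_search(child, query_index, distance, threshold, results)
```

Whole branches whose representative is too far from the query are skipped, so the number of index comparisons grows with the tree depth rather than the database size.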
The search speed could also be approximately doubled by using the dual Pentium processors with multi-threaded software.

References

[1] Eugene Vishnevsky, "Color Conversion Algorithms," 1998. http://www.cs.rit.edu/~ncs/color/t_convert.html
[2] Charles E. Jacobs, Adam Finkelstein, David H. Salesin, "Fast Multiresolution Image Querying," Proceedings of SIGGRAPH 95.

Bibliography

S. Maruzzi, The Microsoft Windows 95 Developer's Guide, Ziff Davis Press, New York, 1996.
J.R. Parker, Algorithms for Image Processing and Computer Vision, Wiley Computer Publishing, New York, 1997.
W. Press et al., Numerical Recipes in C, Cambridge University Press, Cambridge, 1992.
Eric J. Stollnitz, Tony D. DeRose, David H. Salesin, Wavelets for Computer Graphics, Morgan Kaufmann Publishers, 1996.

Appendix A: Pseudo-Code for the Haar Wavelet Transform

One-dimensional Haar wavelet decomposition of an array of h colors:

proc DecomposeArray(A: array[0..h-1] of color):
    A = A / sqrt(h)
    while h > 1 do:
        h = h / 2
        for i = 0 to h-1 do:
            A'[i]   = (A[2i] + A[2i+1]) / sqrt(2)
            A'[h+i] = (A[2i] - A[2i+1]) / sqrt(2)
        end for
        A[0..2h-1] = A'[0..2h-1]
    end while
end proc

Two-dimensional (separable) wavelet transform of an r×r image:

proc DecomposeImage(T: array[0..r-1, 0..r-1] of color):
    for row = 0 to r-1 do:
        DecomposeArray(T[row, 0..r-1])
    end for
    for col = 0 to r-1 do:
        DecomposeArray(T[0..r-1, col])
    end for
end proc
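For reference, the Appendix A pseudo-code can be written as a short runnable sketch. This operates on a single channel and assumes the array length and image side are powers of two; the function names are illustrative.

```python
import numpy as np

def decompose_array(a):
    """1-D Haar decomposition of a length-h array (h a power of two),
    following the Appendix A pseudo-code, including the initial
    normalization by sqrt(h)."""
    h = len(a)
    a = a / np.sqrt(h)                       # normalize (makes a working copy)
    while h > 1:
        h //= 2
        pairs = a[:2 * h].reshape(h, 2)      # adjacent pairs of the working region
        averages = (pairs[:, 0] + pairs[:, 1]) / np.sqrt(2)
        details = (pairs[:, 0] - pairs[:, 1]) / np.sqrt(2)
        a[:h] = averages                     # averages move to the front
        a[h:2 * h] = details                 # detail coefficients follow
    return a

def decompose_image(t):
    """Separable 2-D Haar transform: decompose every row, then every column."""
    t = t.astype(float).copy()
    for row in range(t.shape[0]):
        t[row, :] = decompose_array(t[row, :])
    for col in range(t.shape[1]):
        t[:, col] = decompose_array(t[:, col])
    return t
```

With this normalization a constant image concentrates all its energy in the single coefficient at (0, 0), which is why the signs of the largest-magnitude coefficients capture the image's dominant edges.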