A modern and simplified approach for CONTENT BASED IMAGE RETRIEVAL

Nitesh Jain
Bachelor of Technology, Department of Electronics and Communication
Bhagwan Parshuram Institute of Technology, New Delhi, India
[email protected]

Ojaswi Gupta
Bachelor of Technology, Department of Electronics and Communication
Bhagwan Parshuram Institute of Technology, New Delhi, India
[email protected]

Abstract – This paper presents a simple approach to Content Based Image Retrieval (CBIR), in which images are retrieved on the basis of their own content. With the increasing amount of image data, CBIR has become increasingly important. CBIR is primarily a part of image processing. It has applications in domains such as weather forecasting, medical imaging, surveillance, remote sensing, and criminal record images. To retrieve images, the HSV Histogram, Color Moments (CM), and Autocorrelogram parameters are calculated, and these values are then compared with the values supplied by the user's query. Similarity is measured by Euclidean Distance, Manhattan Distance, Chebyshev Distance, and Relative Deviation.

Index Terms – Content Based Image Retrieval (CBIR), Color Moments (CM), Euclidean Distance, Manhattan Distance, Hue Saturation Value (HSV).

I. INTRODUCTION
Content-based Image Retrieval (CBIR) is a method for retrieving images similar to a given query image from a database. It has many applications. In medicine, it is used to diagnose disease by comparing a case with previously recorded diagnostic information. In weather forecasting, it is used to derive weather information by comparison with past records. In surveillance, it is used to determine whether a suspect seen in different images is the same person. In crime prevention, it helps police identify suspicious people from large image databases. In the military, it is used for detecting enemy soldiers.

Fig. 1 Basic CBIR Working

II. LITERATURE SURVEY
Many CBIR systems and tools have been developed to make queries based on visual content. IBM developed the Query By Image Content (QBIC) system, which lets users make queries of large image databases based on visual image content properties such as example images or user-constructed sketches and drawings. Several online content-based web search engines can also be mentioned. “WebSEEk” was developed by the Image and Advanced Television Lab, Columbia University. It allows queries by example and by desired color composition. Global Memory Net (GMNet) was launched for public access in late June 2006. It is a digital library of cultural, historical, and heritage image collections.

Different CBIR systems use different types of user queries. Typically, tools for content-based image retrieval consist of a query statement and a result presentation. The query can be made by providing an example image or a sketch, or by choosing desired colors for the image. Results are presented as the top several most similar images according to the similarity measure.

III. BASIC CONTENT AND PROPOSED WORK
A typical CBIR system has two main functionalities: data insertion and query processing. Data insertion procedures are performed independently of user interaction and are applied to all the data. The purpose of this process is to extract visual features from the images in the database. These features are much smaller than the actual image, and they are stored as a characterization of each image for easy comparison. Query processing starts with a user-specific request. The request can be made in several ways: by an example image, or by giving a desired pattern, object, or colour distribution. The query processing module obtains the visual features from the given request, and a similarity metric is defined for comparing them with the stored features. Feature extraction itself involves selecting the features to be extracted, which depends on the type of user query.
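The two functionalities described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`colour_histogram`, `insert_data`, `process_query`), the 8-bin per-channel RGB histogram, and the synthetic random images are all assumptions chosen for illustration. Data insertion extracts one feature vector per database image ahead of time, and query processing extracts the same features from the query and ranks the stored vectors by distance.

```python
import math
import random

def colour_histogram(pixels, bins=8):
    """Normalised per-channel histogram of a list of (r, g, b) pixels in 0..255."""
    hist = [0.0] * (3 * bins)
    for pixel in pixels:
        for channel, value in enumerate(pixel):
            # Map 0..255 onto bin indices 0..bins-1.
            hist[channel * bins + value * bins // 256] += 1.0
    total = sum(hist)
    return [h / total for h in hist]

def l1_distance(a, b):
    """Manhattan (L1) distance: sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def l2_distance(a, b):
    """Euclidean (L2) distance: straight-line distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def insert_data(images, bins=8):
    """Data insertion: extract and store one feature vector per database image."""
    return [colour_histogram(img, bins) for img in images]

def process_query(query_image, stored_features, distance=l2_distance, top_k=3, bins=8):
    """Query processing: extract query features and return the top_k (distance, index) pairs."""
    q = colour_histogram(query_image, bins)
    scored = sorted((distance(q, f), idx) for idx, f in enumerate(stored_features))
    return scored[:top_k]

def random_image(n_pixels=256):
    """Hypothetical stand-in for a real image: a flat list of random RGB pixels."""
    return [tuple(random.randrange(256) for _ in range(3)) for _ in range(n_pixels)]

random.seed(0)
database = [random_image() for _ in range(5)]
features = insert_data(database)                 # done once, offline
results = process_query(database[2], features)   # done per user request
```

Because the query here is itself a database image, the first entry of `results` is that image at distance zero. Passing `distance=l1_distance` switches the ranking to the Manhattan metric without changing the stored features.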
The feature extraction algorithm is chosen to create the feature vector from the selected features.

A. Colour Comparison Technique
Image content comparison by colour is based on matching images by their colour distribution. In this case the image feature identifies the proportion of pixels of a specific colour or colours within an image. One can therefore search by colour, either by indicating the desired concentration of colours or by supplying an example image with the desired colour distribution, and obtain similar images. Colour histograms are widely used to extract colour distribution descriptors from an image. A colour histogram is a statistic of the colours of the pixels in the image. First the colour distribution is represented by an appropriate colour histogram, and then a colour vector is formed from that histogram.

B. Texture Extraction
Retrieval by image texture is similar to colour-based feature extraction, but it looks for visual patterns in images rather than colours. It therefore looks at homogeneity that is not the result of the presence of a single colour or of the intensity of a pixel value. Sometimes it also provides more spatial information. The most basic method of extracting a texture descriptor from an image is based on the Fourier transform. As the method works on digital images, the Discrete Fourier Transform (DFT) is used. The DFT converts an image from the spatial domain into the frequency domain, where all the spatial frequencies of the original image are represented. In other words, the transformed image shows intensity variations over a number of pixels. The transformed data is grouped to obtain several measures from it.

C. RGB to HSV Feature Extraction
We convert the image to HSV because isolating colours is much easier when working with HSV values. In the HSV representation of colour, hue determines the colour, saturation determines how intense the colour is, and value determines the lightness of the image. As can be seen in Fig. 2, 0 on the wheel specifies a red colour and 240 specifies a blue colour. In MATLAB, the hue ranges from 0 to 1 instead of 0 to 360.

Fig. 2 HSV Representation of a particular color

D. L1 Distance Similarity Extraction
Taxicab geometry is a form of geometry in which the usual distance function or metric of Euclidean geometry is replaced by a new metric in which the distance between two points is the sum of the absolute differences of their Cartesian coordinates. The taxicab metric is also known as rectilinear distance, L1 distance, snake distance, city block distance, Manhattan distance or Manhattan length, with corresponding variations in the name of the geometry.

E. L2 Distance Similarity Extraction
In mathematics, the Euclidean distance or Euclidean metric is the "ordinary" straight-line distance between two points in Euclidean space. With this distance, Euclidean space becomes a metric space. The associated norm is called the Euclidean norm. A generalized term for the Euclidean norm is the L2 norm or L2 distance.

IV. EXPERIMENTAL RESULTS
James Z. Wang et al. have provided a database known as the Wang database, which is used to test the proposed method. The database consists of 1,000 images in 10 classes, with 100 images per class. The 10 classes are African people, sea, building, dinosaur, bus, elephant, flower, horse, mountain, and food.

A. Image 1
Similarity : L2
Retrieved Images : 20
Correct Output : 100%

B. Image 2
Similarity : L1
Retrieved Images : 20
Correct Output : 90%

C. Image 3
Similarity : Relative Deviation
Retrieved Images : 15
Correct Output : 86.67%

V. CONCLUSIONS AND FUTURE SCOPE
This paper reviewed the main components of a content based image retrieval system, including image feature representation, indexing, query processing, query-image matching, and user interaction, while highlighting the current state of the art and the key challenges. The challenging task of developing, implementing, and integrating various novel algorithms into a GUI-based system, with selectable multi-modal processing of a selectable single query image for the retrieval of similar images, has been achieved successfully. These algorithms include:
• Edges and prominent boundaries detection
• Foreground separation
• Image retrieval based on
  o Colour codes of entire image
  o Foreground colour codes
  o Foreground shape correlation
  o Combination of foreground colour codes and shape correlation

Suggested future enhancements are as follows:
• Analysis of the performance of the prominent boundaries detection method with other wavelets.
• Utilization of well-localized thin edges to further reduce artifacts produced by the intrinsic characteristics of the watershed algorithm.
• Incorporation of indexing technique(s) for faster query response.
• Incorporation of database management modules for image and image feature databases.
• Incorporation of relevance feedback from the user to increase the retrieval performance of the system.
• Incorporation of multiple queries to refine results for improved retrieval performance.

REFERENCES
R. Brunelli and O. Mich, “Image retrieval by examples,” IEEE Trans. Multimedia, vol. 2, pp. 164-171, Sep. 2000.
Young Deok Chun and Nam Chul Kim, “Content Based Image Retrieval Using Multi-resolution Color and Texture Features,” IEEE Trans. Multimedia, vol. 10, no. 6, Oct. 2008.
S. Liapis and G. Tziritas, “Color and Texture image retrieval using chromaticity histograms and wavelet frames,” IEEE Trans. Multimedia, vol. 6, pp. 676-686, Oct. 2004.
G. Pass and R. Zabih, “Histogram refinement for content-based image retrieval,” IEEE Workshop on Applications of Computer Vision, 1996, pp. 96-102.
Jia Li and James Z. Wang, “Automatic linguistic indexing of pictures by a statistical modeling approach,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 25, no. 9, pp. 1075-1088, 2003.
Y. Gong, H. Zhang, H.C. Chuan, and M. Sakauchi, “An image database system with content capturing and fast image indexing abilities,” Proceedings of IEEE International Conference on Multimedia Computing and Systems, Boston, May 1994, pp. 121-130.
J.R. Smith and S.-F. Chang, “Single color extraction and image query,” Proceedings of IEEE International Conference on Image Processing (ICIP-95), Washington, DC, Oct. 1995.
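As a supplementary note to the RGB to HSV conversion and the colour features named in the abstract, the following Python sketch may help make the hue scale concrete. It is an illustration under stated assumptions, not the authors' code: it uses Python's standard `colorsys` module (which, like MATLAB, reports hue in the range 0 to 1), the helper names are hypothetical, and "colour moments" is taken here to mean the per-channel mean and standard deviation, since the paper does not give its own definition. A Chebyshev distance, named in the abstract, is included for comparing the resulting vectors.

```python
import colorsys
import math

def to_hsv(pixels):
    """Convert (r, g, b) pixels in 0..255 to (h, s, v) tuples, each in 0..1."""
    return [colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0) for r, g, b in pixels]

def hue_degrees(hsv_pixel):
    """Rescale hue from the 0..1 range to the 0..360 colour wheel."""
    return hsv_pixel[0] * 360.0

def colour_moments(hsv_pixels):
    """Assumed definition: mean and standard deviation of each HSV channel."""
    n = len(hsv_pixels)
    moments = []
    for channel in range(3):
        values = [p[channel] for p in hsv_pixels]
        mean = sum(values) / n
        std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
        moments.extend([mean, std])
    return moments

def chebyshev_distance(a, b):
    """Chebyshev distance: the largest absolute coordinate difference."""
    return max(abs(x - y) for x, y in zip(a, b))

# Pure red sits at 0 degrees on the wheel and pure blue at 240 degrees.
red, blue = to_hsv([(255, 0, 0), (0, 0, 255)])
red_moments = colour_moments([red])
blue_moments = colour_moments([blue])
gap = chebyshev_distance(red_moments, blue_moments)
```

For these two single-pixel "images" the only differing moment is the mean hue, so `gap` equals the hue difference of 240/360 on the 0 to 1 scale.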