International Journal of Engineering Trends and Technology (IJETT) – Volume 23 Number 8- May 2015 Fuzzy Color Histogram Based Content Based Image Retrieval of Query Images Jaykrishna Joshi#1, Dattatray Bade*2, Kashyap Joshi#3 #1 Student, Department of Electronics & Telecommunication, ARMIET COE, Mumbai University, India * Professor, Department of Electronics & Telecommunication, Vidhyalankar COE, Mumbai University, India. #3 Professor, Department of Electronics & Telecommunication, MPSTME, NMIMS University, India 2 ABSTRACT The need for efficient content-based image retrieval system has increased hugely. Efficient and effective retrieval techniques of images are desired because of the explosive growth of digital images. Content based image retrieval (CBIR) is a promising approach because of its automatic indexing retrieval based on their semantic features and visual appearance. In this proposed system we use the the Fuzzy Color Histogram(FCH) that uses a small number of bins produced by linking the triplet from the L*a*b* color space into a single histogram by means of a fuzzy expert system. The a* and b*components are considered to have more weight than L* as it is mostly the combination of the two which provides the color information of an image. One of the reasons why the L*a*b* color space was selected is that it is a perceptually uniform color space which approximates the way that humans perceive color. Keywords - Content based image retrieval (CBIR), Fuzzy Color Histogram, Retrieval and Query Image. that multimedia databases deal with text, audio, video and image data which could provide enormous amount of information and which has affected life style of human for the better. CBIR is the application of computer vision to the image retrieval problem, that is, the problem of searching for digital images in large databases[2]. Content-based image retrieval also known as query by image content (QBIC) and content-based visual information retrieval (CBVIR). "Content-based" means that the search will analyze the actual contents of the image. The term 'content' in this context might refer colors, shapes, textures, or any other information that can be derived from the image itself. There are some types of feature used for Image retrieval such as color retrieval, textual retrieval, shape retrieval and so on. Figure 1 show diagram fundamental of content-based image retrieval system. I. INTRODUCTION With the advances in computer technologies and the advent of the World Wide Web, there has been an explosion in the amount and complexity of digital data being generated, stored, transmitted, analyzed, and accessed. Much of this information is multimedia in nature, including digital images, video, audio, graphics, and text data[1]. In order to make use of this vast amount of data, efficient and effective techniques to retrieve multimedia information based on its content need to be developed. Among the various media types, images are of prime importance. An image retrieval system is a computer system for browsing, searching and retrieving images from a large database of digital images. Most traditional and common methods of image retrieval utilize some method of adding metadata such as captioning, keywords, or descriptions to the images so that retrieval can be performed over the annotation words. The reason behind research on multimedia systems and content-based image retrieval (CBIR) is the fact ISSN: 2231-5381 Figure.1 Block Diagram of Basic Content Based Image Retrieval System II. CONTENT-BASED IMAGE RETRIEVAL (CBIR) Content-based image retrieval (CBIR) has become one of the most active research areas in the past few years. Thus, many visual feature representations have been explored and many CBIR systems have been built. However, there are several problems and challenges need to be consider in attempt to apply CBIR systems. Firstly, the gap between high-level semantic concept and low-level http://www.ijettjournal.org Page 421 International Journal of Engineering Trends and Technology (IJETT) – Volume 23 Number 8- May 2015 visual features is great. In the CBIR context, an image is represented by a set of low-level visual features which are the features have no direct correlation with high-level semantic concept[3]. Human prefer to retrieve images according to the “semantic” or “concept” of an image. But, CBIR depends on the absolute distance of image features to retrieve similar images Thereby, appear the gap between high-level concepts and low-level features which is the major difficulty that hinders further development of CBIR systems. In other sentence, the semantic gap problem is the lack of coincidence between the image representation and the human interpretation for an image. There are many existing feature selection techniques such as distribution based approaches, Kullback-Leibler divergence (K-LD), boosting manner, discriminant analysis (DA) method and others. However, these feature selection techniques remains a challenging problem for image retrieval. In recent year, there are a lot of discriminant analysis method had been proposed and used as a feature selection method to improve relevance feedback. These methods included multiple discriminant analysis (MDA), biased discriminant analysis (BDA), kernel-biased discriminant analysis (KBDA) and nonparametric discriminant analysis (NDA). The goal of discriminant analysis is to find a weight matrix such that the distances between the two scatter class matrixes are maximized. However, these methods have their own drawback that must be solved to improve the performance of CBIR[4]. Basic single Gaussian assumption which proposed by MDA and BDA usually doesn‟t hold, since the few training samples are always scattered in the high dimensional feature space, and their effectiveness will be suffer. Moreover, single Gaussian distribution means all positive samples should be similar with similar view angle and similar illumination, which are not the case for CBIR. To overcome the problem of single Gaussian distribution assumption, KBDA had been introduced. But, this kernel based method has two major drawbacks which is regularization approach is often unstable and it is rely on parameter tuning. Then, NDA had been proposed to solve the problem in MDA, BDA and also KBDA. This approach can only barely match the accuracy performance of KBDA. As a conclusion, many feature selection methods can not satisfy the requirements in CBIR even though there are many method has been apply in content-based image retrieval[5]. III. METHODOLOGY The term Content-based image retrieval [CBIR] describes the process of retrieving desired images from a large collection on the basis of features (such as colour, texture and shape) that can be automatically ISSN: 2231-5381 extracted from the images themselves.Content-based image retrieval, also known as query by image content and content-based visual information retrieval is the application of computer vision to the image retrieval problem, that is, the problem of searching for digital images in large databases[6]. „CONTENT BASED‟ means that the search makes use of the contents of the images themselves, rather than relying on humaninput metadata such as captions or keywords. The surrounding world is composed of images. There are some notable studies on fuzzy color histogram and its application to content-based image retrieval. Han an Ma introduce a fast approach for computing fuzzy color histogram using fuzzy c-means algorithm in They find a correspondence between conventional color histogram (CCH) and fuzzy color histogram (FCH), and compute theFCH of an image without dealing with membership functions. It is claimed that FCH improves robustness, efficiency, and computation. Konstantinidis, Gasteratos and Andreadis propose a fuzzy linking method for color histogram creation in L*a*b* color space. They evaluated their method on a limited dataset, and achieve higheraccuracy values for several image retrieval tests. Comparedto our work, fuzzy color histogram of contains 9 bins.We extracted new colors and find fuzzy rules. Our experiments also focus on retrieving the original image, whereas they evaluate the relevancy of retrieved images[7]. The classic histogram is a global statistical feature, which describes the intensity distribution for a given image. Its main advantage is that it is fast to manipulate, store and compare and insensitive to rotation and scale. On theother hand, it is also quite unreliable as it is sensitive to even small changes in the scene of the image. In color image processing, the histogram consists of three components, respect to the three components of the color space[8]. A histogram is created by dividing a color space into a number of bins and then by counting the number of pixels of the image that belongs to each bin. It is usually thought that in order for an image retrieval system to performsatisfyingly, the number of regions that the color space is divided into is quite large and thus the colors represented by neighboring regions have relatively small differences[9]. As a result, the perceptually similar colors problem appears, images which are similar to each other but have small differences in scene or contain noise will produce histograms with dissimilar adjacent bins and vice versa due to the small distance that the regions are separated from each other. In order to present a solution to this problem , the FCH uses a small number of bins produced by linking the triplet from the L*a*b* color space into a single histogram by means of a fuzzy expert system[7]. The a* and b*components are considered to have more weight than L* as it is mostly the combination of the two http://www.ijettjournal.org Page 422 International Journal of Engineering Trends and Technology (IJETT) – Volume 23 Number 8- May 2015 which provides the color information of an image. IV. DESIGN STEPS One of the reasons why the L*a*b* color space was selected is that it is a perceptually uniform color space which approximates the way that humans perceive color. In L*a*b*, L* stands for luminance, a* represents relative greenness-redness and b* represents relative blueness-yellowness. All colorsand grey levels can be expressed throughout a combination of the three components. However, L* does not contribute in providing any unique color but for shades of colors, white, black and grey. Thus, the L* component receives a lower weight with respect to the other two components of the triplet. According to the work mentioned in [7] ,in order for the CBIR system to work effectively the a* and b* components should be subdivided intofive regions representing green, greenish, the middle of the component, reddish and red for a*, blue, bluish, the middle of the component, yellowish and yellow for b*, whereas L* should be subdivided into only three regions: dark dim and bright areas. The Fuzzification of the input is accomplished by using triangular shaped builtinmembership functions (MF) for the three input components (L*, a*, b*) as shown in fig .2. The Mamdani type of fuzzy inference is used in which the fuzzy sets from the output MFs of each rule are combined through the aggregation operator which is set to max and the resulting fuzzy set is defuzzified to produce the output of the system. The implication factor which determines the process of shaping the fuzzy set in the output Membership Functions(MFs) based on the results of the input MFs is set to min and the OR and AND operators are set to max and min, respectively. The followinf Figure 2 shows the fuzzy color histogram process. This process is divided five steps: Step-1: RGB to L*a*b* conversion: Convert RGB color space image components (pixel values) to L*a*b* color components called fuzzy sets.On sorting out the top 5 images afterfeature extraction fuzzy logic is applied to these images.RGB Color space is converted to first L*a*b* color space.L*a*b* is commonly preferred over RGB or other colorspaces, because it is one of the perceptually uniform color spaces which approximates the way that human perceivecolor. In L*a*b* color space, L* represents luminance, a*represent greenness-redness and b* represents bluenessyellowness. a* and b* components have more weights thanL*component. Step-2: Fuzzification: Fuzzification of each input pixel into l*a*b* using triangular membership function ’trimf’. Take the inputs and determine the degree to which these inputs belong to each of the appropriate fuzzy sets. Membership functions are shown in Figure 3,4 and 5. L* is divided into 3 regions which are Black, Gray and White. Figure 3. Triangular membership function for L* a* is dividedinto 5 regions which are and red, reddish,a_middle, greenish and green Figure 2 Structure of fuzzy color histogram ISSN: 2231-5381 http://www.ijettjournal.org Page 423 International Journal of Engineering Trends and Technology (IJETT) – Volume 23 Number 8- May 2015 Figure 4. Triangular membership function for a* B* is divided into 5 regions which are Blue, Bluish, b_middle, Yellowish, Yellow Figure 5. Triangular membership function for b* Step-3: Aggregation: Then the fuzzy rules are evaluated and their outputs are combined by MAX aggregation operator. This is done to change the aggregate output fuzzy set into a single crisp number. Aggregation is the process of unification of the outputs of all rules. We take the membership functions of all rule consequents previously clipped or scaled and combine them into a single fuzzy set. The input of the aggregation process is the list of clipped or scaled consequent membership functions, and the output is one fuzzy set for each output variable. Table 1. Some of the Fuzzy Rules Rule No. L*Colors a* colors b*colors Fuzzy colors V. RESULTS The process involved in the proposed method is Fuzzy Histogram Error calculation. The Fuzzy histogram is applied only to the top five extracted images to get better retrieval accuracy. After plotting the fuzzy histogram, the fuzzy error is calculated between the given query image and the top five extracted images which is shown in the table.(3) below. Fuzzy error is calculated as: 1 Black Red Blue Black 2 3 Black Gray Red Red Yellow Blue Blue Red 4 Grey Green Blue Yellow 5 White Red Yellowish 6 White Red Blue Dark Pink Red Step-4: Defuzzification: The process of the deffuzification is to change the aggregate output fuzzyset in crisp number. It involvesfuzzy linking using fuzzy rules. After Fuzzification, each fuzzified output (l*a*b* values) of pixel is verified under 75 fuzzy color rules and a single resultant fuzzy color is obtained for a pixel i.e. these components are then linked using seventy five fuzzy rules obtained by using all possible combinations of L*, a* and b* i.e.the 75 rules are created using 3 L*, 5 a* and 5 b* color space combinations (3×5×5=75). The following table summarizes some colors and their fuzzy correspondences.Because some colors (olive, purple, dark pink, lime, bluish- green) reside very close to others in 3-DL*a*b* space, we selected the remaining 9 colors. Therefore, the final fuzzy color histogram contains 9 bins only. Step-5: Calculation of fuzzy histogram and histogram error: Fuzzy histogram is then calculated wherein the pixel is approximately put in the corresponding color bin. After this we calculate the fuzzy histogram error between the Query image and the top 5 sorted images. image. From the table 2 it should be noted that the fuzzy error calculated for Pepper_1.png which is rotated version of query image is 0.0025 which is approximately close to query image. Still it is not the perfect match and hence not the retrieved image. Then table 5 also shows that the fuzzy error for fabric.png is 0.1150 which signifies that it is not at all a match to the query image Peppers.png. This shows the accuracy of the proposed Fuzzy system.A higher successful rate in retrieving a target image is thus obtained. = The Fuzzy error for Peppers.png is zero which is perfect match to the given query image. Hence finally obtain Peppers.png as the retrieved version of query ISSN: 2231-5381 http://www.ijettjournal.org Page 424 International Journal of Engineering Trends and Technology (IJETT) – Volume 23 Number 8- May 2015 Table.2 Fuzzy Histogram Error Sr no Image name Fuzzy error 1 Peppers.png 0 2 Pepper_1.png 0.0025 3 Pepper_2.png 0.0100 4 Pepper_3.png 0.0087 5 Fabric.png 0.1150 which use of gradients in different channels that weight the influence of a pixel on the histogram to cancel out the changes induced by deformations. When a rotated image is given as the query, the original image is retrieved as the closest match as shown in the results. To reduce the large variations between neighboring bins of conventional color histograms, fuzzy color histograms are adopted which consists of only nine bins. It is less sensitive to various changes in the images such lightning variations and noise. REFERENCES [1] “Content-Based Microscopic Image Retrieval System [2] [3] [4] Query Image: Pepper.png [5] [6] [7] Retrieved Image:Pepper.png Figure 6. Query Image and Retrieved Image [8] The retrieved image is “peppers.png” as shown in Figure 6. VI. CONCLUSION The conventional color histogram with Euclidean Distance as similarity measure and the fuzzy color histogram are almost similar in their performance. But the latter couldn‟t respond well to shifted or translated images. In order to overcome this problem invariant color histogram technique is used makes ISSN: 2231-5381 [9] for Multi-Image Queries”, Akakin.H.C., Gurcan. M.N, IEEE Transactions on Information Technology in Biomedicine, Volume:16, Issue:4 DOI: 10. 1109/TITB.2012.2185829, 2012. “Generalized Biased Discriminant Analysis for Content-Based Image Retrieval”, Lining Zhang, Lipo Wang , Weisi Lin, IEEE Transactions on Systems, Man and Cybernetics, Part B: Cybernetics, Volume: 42 , Issue: 1 , DOI: 10.1109/TSMCB.2011. 2165335, 2012. “Content Based Image Retrieval Using Unclean Positive Examples”, Jun Zhang , Lei Ye, IEEE Transactions on Image Processing, Volume: 18 , Issue: 10 , DOI: 10.1109/TIP.2009.2026669,2009. “Interactive content-based texture image retrieval”, Patil. P.B., Kokare. M.B., 2nd International IEEE Conference on Computer and Communication Technology,2011, DOI: 10.1109/ICCCT.2011.6075199. “Content-Based Image Retrieval for Blood Cells”, Zare, M.R., Ainon, R.N. , Woo Chaw.Seng Third Asia International IEEE Conference on Modelling & Simulation, 2009, DOI:10.1109/AMS.2009.103. “Image Retrieval Based on Fuzzy Color Histogram”, Kai-xing Wu, Qiang Xu International IEEE Conference on Intelligent Information Hiding and Multimedia Signal Processing(IIHMSP),2008,DOI:10.1109/IIHMSP.2008.57. “A color based fuzzy algorithm for CBIR”, Rahmani, M.K.I. , Ansari. M.A, 4th International Conference on The Next Generation Information Technology Confluenc, 2013. DOI:10.1049/cp.2013.23 42. “Comparative Study on Content-Based Image Retrieval (CBIR)”,Khan.S.M.H., Hussain.A.,Alshaikhli.I.F.T., International IEEE Conference on Advanced Computer Science Applications and Technologies (ACSAT),2012, DOI:10.1109/ACSAT.2012.40. “A survey of methods for colour image indexing and retrieval in image databases”, R. Schettini, G. Ciocca, S Zuffi, Color Imaging Science: Exploiting Digital Media, (R. Luo, L. MacDonald eds.), J. Wiley, 2001. http://www.ijettjournal.org Page 425