International Journal of Engineering Trends and Technology (IJETT) – Volume 19 Number 4 – Jan 2015
Review of Content Based Image Retrieval Systems
Saurav Seth, Prashant Upadhyay, Ruchit Shroff, Rupali Komatwar
Department of Computer Engineering
Mumbai, Maharashtra, India
Abstract - Content-based image retrieval (CBIR) has been an
active research field for the past two decades. In contrast to a
traditional system, in which images are retrieved based on
keywords, a CBIR system retrieves images based on their
visual content. In this paper, we start with an introduction to a
simple CBIR system and proceed to review a few of the techniques
used to develop CBIR systems. We also study the visual features
used for feature extraction and the components of a
conventional system. A review of a relevance feedback system
and a fuzzy logic system is presented. The query performance of
each technique is studied.
Keywords— review, content based image retrieval, visual
features, relevance feedback, fuzzy logic
I. INTRODUCTION
We are at the start of the age of digital information. With
ever-increasing access to the Internet, acquiring digital
information has become increasingly popular in recent years
[1]. Digital libraries and multimedia databases contain
many types of information, including text, audio, image
and video. Content-based image retrieval (CBIR) lies at the
crossroads of multiple disciplines such as databases, artificial
intelligence, image processing, statistics, computer vision,
high-performance computing, and intelligent human-computer
interaction.
Several limitations are inherent in a retrieval system based on
metadata, and CBIR, which has a wide range of uses for efficient
image retrieval, reduces these problems. Text-based
information about images can be searched easily using
existing technology, but this requires a manual description of
each image in the database. The first difficulty is that text
annotation often remains incomplete: a complete annotation is
rare because it would require describing every color, texture,
shape, and object within the image. The second difficulty is
that manual annotation of a large image collection requires a
great deal of labor. In CBIR, instead of searching manually
annotated keywords, images are retrieved
based on their own visual features, such as color, texture and
shape; hence researchers turned their attention to content-based
retrieval techniques [2].
Relevance feedback is a powerful technique that can effectively
improve the performance of CBIR systems. It offers researchers
an open area of work on reducing the semantic gap between
low-level features and high-level concepts. The basic concept is
that, after the retrieval results are obtained, feedback provided
by the user is used to check whether the results are relevant or
irrelevant. If the results are irrelevant, the feedback loop is
repeated until the user is satisfied.
A fuzzy system uses fuzzy logic, which, in contrast to crisp
logic, is useful for representing uncertainty in complex problems
[3]. Image data can be characterized as having a fuzzy nature
for the following reasons [4]:
1. Descriptions of images usually involve inexact and
subjective concepts.
2. Imprecision and vagueness are present in descriptions of the
images and in some of the visual features.
3. Users' needs when retrieving images may themselves be fuzzy.
Fuzzy logic can reduce the semantic gap between high-level
semantics and low-level image features [5]. Fagin [6] and
Ortega et al. [7] were pioneers in integrating
fuzzy logic models into CBIR systems. Their proposed
algorithms evaluated fuzzy queries, and their experimental
results demonstrated the effectiveness of the approach. Medasani
and Krishnapuram [8] proposed a fuzzy linguistic query in their
CBIR system.
The paper is organized as follows: the first section provides
a brief introduction. The second section describes a general
system architecture and its components, followed by a review
of various visual features in the third section. In the fourth
section we review different techniques, including the
conventional system and its disadvantages, a relevance feedback
system and a fuzzy logic system. The conclusion and future
research directions are presented in the last section.
ISSN: 2231-5381
http://www.ijettjournal.org
Page 178
Fig. 1. Model of content-based image retrieval (block diagram;
blocks: input image, image database, feature extraction, image
feature database, query image, feature extraction, similarity
matching, retrieved image(s))
II. CBIR COMPONENTS
A. Query image
This is the image supplied by the user. It undergoes
feature extraction, and similarity matching is then used to
retrieve similar images from the feature database.
B. Image database
The image database holds all the images available for retrieval.
Each image is subjected to the feature extraction process, and
the resulting information is stored in a feature database.
C. Similarity matching
Matching images directly, that is, comparing their pixel
values, is often used in object recognition. Several methods
have been proposed for this; one widely used method, which can
also be applied in an image retrieval system, is presented here.
Euclidean distance: Probably the most common approach to
comparing images directly is the Euclidean distance, a
geometrical concept that takes into consideration the
coordinate values of the points
between which the distance is to be found [9]. To
compare images using the Euclidean distance, the images have
to be of the same size, which can be achieved easily with
scaling algorithms. The Euclidean distance has been used
successfully, e.g. in optical character recognition, and has been
extended by different methods.
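As a minimal sketch of the direct comparison described above, the
Euclidean distance between two equally sized grayscale images can be
computed as follows. The helper names and the tiny 2x2 images are
hypothetical and serve only as an illustration; images are assumed to
have been rescaled to the same size beforehand.

```python
import math


def flatten(image):
    """Turn a 2D list of gray values (rows of pixels) into one flat list."""
    return [pixel for row in image for pixel in row]


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length sequences of numbers."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length (rescale images first)")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Two hypothetical 2x2 grayscale images of the same size.
img_a = [[0, 0], [0, 0]]
img_b = [[3, 0], [0, 4]]

print(euclidean_distance(flatten(img_a), flatten(img_b)))  # 5.0
```

The same function applies unchanged to feature vectors, which is how
it is normally used once feature extraction has been performed.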
D. Feature extraction
Feature extraction is the process of computing a numerical or
alphanumerical representation of some attribute of a digital
image in order to describe its contents. A feature is directly
related to the visual characteristics of the image.
III. VISUAL FEATURES
A. Color
Color features in content-based image retrieval are computed in
various color spaces such as RGB, XYZ, YIQ, L*a*b*, U*V*W*,
YUV and HSV. Among these, the HSV color space is reported to
give the best color histogram feature [10]-[13].
In the HSV color space, color is represented by three
components, hue (H), saturation (S) and value (V), and the
space is based on cylindrical coordinates. In the L*a*b*
color space, L* stands for luminance, a* represents relative
greenness-redness and b* represents relative
blueness-yellowness. It achieves device independence [14].
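A color histogram in HSV space can be sketched as below. This is an
illustrative example only: the quantization into 8 x 3 x 3 bins and the
function name are assumptions, not values taken from the cited works.
The RGB-to-HSV conversion uses the standard-library colorsys module.

```python
import colorsys


def hsv_histogram(pixels, h_bins=8, s_bins=3, v_bins=3):
    """Normalized HSV histogram of an iterable of (r, g, b) pixels in 0..255."""
    hist = [0.0] * (h_bins * s_bins * v_bins)
    count = 0
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        # Each component lies in [0, 1]; clamp so that 1.0 falls in the last bin.
        hi = min(int(h * h_bins), h_bins - 1)
        si = min(int(s * s_bins), s_bins - 1)
        vi = min(int(v * v_bins), v_bins - 1)
        hist[(hi * s_bins + si) * v_bins + vi] += 1.0
        count += 1
    if count == 0:
        return hist
    # Normalize so that images of different sizes remain comparable.
    return [x / count for x in hist]


# Hypothetical image: three pure red pixels and one pure blue pixel.
pixels = [(255, 0, 0)] * 3 + [(0, 0, 255)]
histogram = hsv_histogram(pixels)
print(sum(histogram))
```

Normalizing by the pixel count makes the histogram independent of
image size, so it can be compared directly with the histograms stored
in the feature database.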
B. Texture
Texture is an essential feature of an image when querying
image databases, and it depends on human visual perception. The
two most commonly used texture features are the Tamura and
Gabor features. Tamura et al. propose six texture features
corresponding to human visual perception: coarseness, contrast,
directionality, line-likeness, regularity, and roughness [16].
They ran experiments to test the significance of these features
and found the first three to be very important. Gabor
filters are a well-known technique for texture analysis that
has been used in many earlier works. In this work we use the
approach where the HSV color space (hue, saturation, value)
is used. It has been proposed that Gabor filters can be used to
model the responses of the human visual system [15]-[17].
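To make one of the Tamura features concrete, the sketch below computes
contrast for a grayscale image using the usual formulation, the
standard deviation divided by the fourth root of the kurtosis. The
function name and the test images are hypothetical, and the exponent
1/4 is the commonly quoted choice rather than something stated in this
review.

```python
def tamura_contrast(image):
    """Tamura contrast of a 2D list of gray values: sigma / kurtosis ** 0.25."""
    pixels = [float(p) for row in image for p in row]
    n = len(pixels)
    mean = sum(pixels) / n
    variance = sum((p - mean) ** 2 for p in pixels) / n
    if variance == 0.0:
        # A perfectly flat image has no contrast.
        return 0.0
    fourth_moment = sum((p - mean) ** 4 for p in pixels) / n
    kurtosis = fourth_moment / variance ** 2
    return variance ** 0.5 / kurtosis ** 0.25


# Hypothetical 2x2 checkerboards with strong and weak contrast.
strong = [[0, 255], [255, 0]]
weak = [[100, 110], [110, 100]]

print(tamura_contrast(strong), tamura_contrast(weak))
```

The strongly contrasted checkerboard yields a much larger value than
the weakly contrasted one, which matches the perceptual meaning of the
feature.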
C. Shape
The shape extracted from an image is a powerful representation,
as it characterizes the geometry of the object. The
representation of a shape should be invariant to scale,
translation and rotation.
Shape features can be divided into two categories:
contour-based and region-based [19]. Region-based
representations include simple geometric attributes that can be
obtained by measuring properties of the points belonging to the
region, such as area and aspect ratio. Contour-based
(boundary-based) representations typically involve two major
steps. First, a 1D function is constructed from the 2D shape
boundary by parametrizing the contour. Then the constructed 1D
function is used to extract a feature vector describing the
shape of the object. Contour-based descriptors include Fourier
descriptors and CSS (Curvature Scale Space) descriptors [18],
[19].
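The two contour-based steps can be sketched as follows. The choice of
the centroid-distance signature as the 1D function, the number of
coefficients and all names are assumptions made for illustration; other
signatures (e.g. complex coordinates or curvature) are equally
possible.

```python
import cmath
import math


def centroid_distance_signature(contour):
    """Step 1: 1D function giving each boundary point's distance to the centroid."""
    cx = sum(x for x, _ in contour) / len(contour)
    cy = sum(y for _, y in contour) / len(contour)
    return [math.hypot(x - cx, y - cy) for x, y in contour]


def fourier_descriptors(contour, n_coeffs=4):
    """Step 2: magnitudes of the first DFT coefficients, divided by the DC term."""
    signature = centroid_distance_signature(contour)
    n = len(signature)

    def dft(k):
        return sum(signature[t] * cmath.exp(-2j * math.pi * k * t / n)
                   for t in range(n))

    dc = abs(dft(0))
    # Using the centroid removes translation, dividing by the DC term removes
    # scale, and keeping only magnitudes removes the choice of starting point.
    return [abs(dft(k)) / dc for k in range(1, n_coeffs + 1)]


# Hypothetical closed contour: a square sampled at 8 boundary points.
square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]
# The same shape scaled by 3, translated, and traversed from another start point.
moved = [(3 * x + 10, 3 * y - 5) for x, y in square]
moved = moved[2:] + moved[:2]

print(fourier_descriptors(square))
print(fourier_descriptors(moved))
```

Both calls print (numerically) the same descriptor vector, which
illustrates the invariance to scale and translation that a shape
representation should have.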
IV. TECHNIQUES
A. Conventional system
Content-based image retrieval is a method of retrieving images
that match the specifications of a query image. In CBIR
systems, the images stored in the database are labeled by
feature vectors, which are extracted from the images by means
of computer vision and digital image processing techniques. The
query to the database is itself specified by an image: the
query's feature vector is computed, and the closest
items in the database, according to a similarity metric or
distance defined in feature space, are returned as the answers
to the query. A CBIR system is thus a query-resolving system
over image collections that uses the information inherently
contained in the images. It has to be able to
extract quantitative features that allow it to index the image
collection and to compute a
distance between images. The user interacts with the system
through a querying interface, usually a web page, where the
query is defined and sent to the CBIR engine. The query is
an image provided by the user, who asks
the CBIR system for a list of the most similar images in the
database. To resolve the query, the CBIR engine computes the
features of the query image, which correspond to a point in the
metric space defined by the system. Each image in the database
has a representative in this metric space, so a distance to the
query image can be computed for each image using a similarity
(or dissimilarity) function. This produces a list ordered by
similarity (or dissimilarity) to the query image, which is
presented to the user as the response [20].
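The query resolution described above amounts to ranking the stored
feature vectors by their distance to the query's feature vector. The
sketch below assumes the Euclidean distance as the dissimilarity
function and a small in-memory dictionary as the feature database; the
file names and feature values are made up for illustration.

```python
import math


def euclidean(a, b):
    """Dissimilarity between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def retrieve(query_features, feature_db, top_k=3):
    """Return the top_k (image_id, distance) pairs closest to the query."""
    scored = [(image_id, euclidean(query_features, features))
              for image_id, features in feature_db.items()]
    scored.sort(key=lambda pair: pair[1])  # smallest distance first
    return scored[:top_k]


# Hypothetical feature database: one precomputed vector per stored image.
feature_db = {
    "beach.jpg": [0.9, 0.1, 0.0],
    "forest.jpg": [0.1, 0.8, 0.1],
    "sunset.jpg": [0.8, 0.2, 0.0],
}
query = [0.9, 0.1, 0.0]  # feature vector extracted from the query image

for image_id, distance in retrieve(query, feature_db, top_k=2):
    print(image_id, round(distance, 3))
```

A linear scan like this is adequate for small collections; larger
databases would normally rely on an index over the feature space, which
is outside the scope of this sketch.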
Initial CBIR systems focused only on visual features;
however, as these systems grew popular, user-friendly
interfaces became a necessity. The CBIR field therefore
started to include efficient designs that were easy
to understand and that tried to meet the needs of the user
performing the search. Systems that allowed descriptive
semantics in their query methods were required, as were
systems that accepted user feedback and systems that used
machine learning to estimate user satisfaction.
The features are further classified as global
features and local features. The commonly used features are
color, texture, and shape, all of which are application
independent [21]-[23].
B. Relevance feedback
The idea of relevance feedback is to shift the burden
of finding the right query formulation from the user to the
system. To achieve this, the user has to provide some
feedback to the system so that the system can answer the
original query according to the desired
output. The first step in retrieving an image from the database
is to extract feature vectors from the images. The
features can be shape, color, texture, etc., and they
are stored in a separate database for future use. When the user
supplies a query image, its features are extracted and
matched against the features of the database images.
If the distance between the two feature vectors is
small enough, the corresponding database image is considered
similar to the query.
The results are thus based on similarity matching rather than
on exact matching of images. The user then has the opportunity
to give feedback in the form of judgments on the
retrieval results. Each relevance judgment takes one of
three values: relevant,
non-relevant and don't care. Relevant means the image is
similar to what the user wants, non-relevant means the image
definitely does not match, and don't care means the user
says nothing about the image. The feedback loop stops as
soon as the user's feedback falls under the
"relevant" category; otherwise it continues until the user is
satisfied with the results.
One of the techniques of relevance feedback is the Bayesian
framework. A Bayesian network is a graphical representation of
random variables that provides an effective knowledge
representation, and formulating the relevant-image adoption
problem through a Bayesian belief network has advantages. The
network has three layers: the query layer, the feature index
layer and the relevant image layer. The query layer is the root
node, which represents the query example provided by the user.
The feature index layer is divided into two levels: one holds
the low-level feature representations, i.e. color, texture and
shape, and the other holds the components of the feature
vectors. The third layer
consists of the relevant images specified by the user [21].
C. Fuzzy system
The fuzzy system can be combined with several other
models to improve its performance. It can be combined with
textual descriptions as well as with relevance feedback.
Fig. 2. Relevance feedback system
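One round of the feedback loop in Fig. 2 can be sketched as a query
refinement step. The update rule used here is a Rocchio-style
reweighting, which is a substitute chosen for brevity and is not the
Bayesian framework discussed in this paper; the weights alpha, beta and
gamma, the label strings and the function name are all assumptions.
Images judged "don't care" are simply ignored.

```python
def refine_query(query, judged, alpha=1.0, beta=0.75, gamma=0.25):
    """Move the query towards relevant and away from non-relevant images.

    judged is a list of (feature_vector, label) pairs, where label is one of
    "relevant", "non-relevant" or "dont-care".
    """
    relevant = [f for f, label in judged if label == "relevant"]
    non_relevant = [f for f, label in judged if label == "non-relevant"]

    def mean(vectors, i):
        # Mean of component i, or 0.0 when no image carries that label.
        return sum(v[i] for v in vectors) / len(vectors) if vectors else 0.0

    return [alpha * q + beta * mean(relevant, i) - gamma * mean(non_relevant, i)
            for i, q in enumerate(query)]


# Hypothetical judgments given by the user on three retrieved images.
query = [0.5, 0.5]
judged = [
    ([1.0, 0.0], "relevant"),
    ([0.0, 1.0], "non-relevant"),
    ([0.3, 0.3], "dont-care"),
]

print(refine_query(query, judged))  # [1.25, 0.25]
```

In a complete system the refined vector would be used to rerun the
retrieval, and the loop would repeat until the user is satisfied with
the results.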
1) Image retrieval using texture features: Tamura
features are used to extract the texture feature of an image.
These low-level features are assigned fuzzy linguistic terms;
e.g., for coarseness: very fine, fine, medium coarse, coarse,
very coarse. The linguistic terms lie within the universe of
discourse [0, 1] [5]. Semantic and syntactic rules [5] are used
to obtain a membership function for every linguistic term, in
order from left to right. A function for the query is then
constructed and used for similarity matching
between the query and the images in the database. This
function can be defined by another fuzzy set [5].
The above system can handle queries with textual
descriptions. For example, a query of the form "very directional ∧
very blob-like ∧ very regular" [5] would retrieve images
whose linguistic terms correspond to the query's description.
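The assignment of linguistic terms and the evaluation of a conjunctive
query can be illustrated with a small sketch. It assumes evenly spaced
triangular membership functions over [0, 1] and the minimum operator
for the fuzzy AND; both are common textbook choices and are not
necessarily the functions or rules used in [5]. All names are
hypothetical.

```python
def triangular(x, left, peak, right):
    """Triangular membership function with support [left, right], maximum at peak."""
    if x == peak:
        return 1.0
    if x <= left or x >= right:
        return 0.0
    if x < peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


TERMS = ["very fine", "fine", "medium coarse", "coarse", "very coarse"]


def coarseness_memberships(x):
    """Degree to which a coarseness value x in [0, 1] matches each linguistic term."""
    step = 1.0 / (len(TERMS) - 1)  # peaks at 0, 0.25, 0.5, 0.75 and 1
    return {term: triangular(x, (i - 1) * step, i * step, (i + 1) * step)
            for i, term in enumerate(TERMS)}


def fuzzy_and(*degrees):
    """Conjunction of membership degrees, taken here as their minimum."""
    return min(degrees)


print(coarseness_memberships(0.625))
# Hypothetical degrees for "very directional", "very blob-like", "very regular".
print(fuzzy_and(0.8, 0.6, 0.9))  # 0.6
```

A coarseness of 0.625 belongs half to "medium coarse" and half to
"coarse", which shows how a single low-level value can match several
linguistic terms to different degrees.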
2) Image retrieval using color feature: The L*a*b*
color space can be used for feature extraction. The color space
is split into its three components L*, a* and b*. L* is given a
low weight, as it does not provide any unique color. As in the
texture model, each component is assigned a linguistic
variable; e.g., for a*: green, greenish, middle, reddish, red
[24]. A fuzzy inference system (FIS) consisting of a rule base
is used to obtain a fuzzy color histogram (FCH) [25]. The FCH
of the query is compared with the FCHs of all images in the
database using a similarity function called the min-max
ratio [24].
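The comparison of two histograms with a min-max ratio can be sketched
as below. The sketch assumes the common definition, the sum of the
bin-wise minima divided by the sum of the bin-wise maxima; the exact
form used in [24] may differ, and the example histograms are made up.

```python
def min_max_ratio(hist_a, hist_b):
    """Similarity in [0, 1]: sum of bin-wise minima over sum of bin-wise maxima."""
    if len(hist_a) != len(hist_b):
        raise ValueError("histograms must have the same number of bins")
    total_max = sum(max(a, b) for a, b in zip(hist_a, hist_b))
    if total_max == 0:
        # Two empty histograms are treated as identical.
        return 1.0
    total_min = sum(min(a, b) for a, b in zip(hist_a, hist_b))
    return total_min / total_max


# Hypothetical three-bin fuzzy color histograms.
query_fch = [0.5, 0.5, 0.0]
image_fch = [0.25, 0.5, 0.25]

print(min_max_ratio(query_fch, image_fch))
```

The ratio equals 1 for identical histograms and 0 for histograms with
no overlapping bins, so database images can be ranked by it directly.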
Such a system is more effective than a conventional
system. For a bus image, the mean similarity between
images is 87%, in contrast to 46% for a conventional system
[24].
A clustering algorithm can be used to classify the images
and create more appropriate linguistic variables.
V. CONCLUSION
We have presented a brief review of the techniques
used in content-based image retrieval. A conventional system
captures only low-level features. Relevance
feedback (RF) provides the CBIR system with the user
feedback needed to improve query performance. A
fuzzy system performs better than a conventional system
because it assigns linguistic variables to the low-level
features, which helps to reduce the semantic gap.
For future work, a fuzzy relevance feedback system using both
color and texture features can be studied.
REFERENCES
[1] Jia Li, James Z. Wang, Gio Wiederhold, "IRM: Integrated Region Matching
for Image Retrieval," Multimedia '00: Proceedings of the 8th ACM
International Conference on Multimedia, pp. 147-156.
[2] M. Kokare, B. N. Chatterji, P. K. Biswas, "A Survey on Current Content
Based Image Retrieval Methods," IETE Journal of Research,
vol. 48, pp. 261-271, 2002.
[3] P. Ponce-Cruz, F. D. Ramírez-Figueroa, Intelligent Control
Systems with LabVIEW, 2010, XII.
[4] L. Yan and Z. Ma, Intelligent Multimedia Databases and Information
Retrieval: Advancing Applications and Technologies, September 2011.
[5] Chih-Yi Chiu, Hsin-Chih Lin, Shi-Nine Yang, "A fuzzy logic CBIR
system," The 12th IEEE International Conference on Fuzzy Systems, 2003,
vol. 2, pp. 1171-1176.
[6] R. Fagin, "Combining fuzzy information from multiple systems," Journal
of Computer and System Sciences, vol. 58, no. 1, pp. 83-99, 1999.
[7] M. Ortega, Y. Rui, K. Chakrabarti, K. Porkaew, S. Mehrotra, and T. S.
Huang, "Supporting ranked Boolean similarity queries in MARS," IEEE
Transactions on Knowledge and Data Engineering, vol. 10, no. 6,
pp. 905-925, 1998.
[8] S. Medasani and R. Krishnapuram, "A fuzzy approach to complex
linguistic query based image retrieval," IEEE International Conference on
Fuzzy Systems, pp. 590-595, Seoul, Korea, Aug. 1999.
[9] John Eakins, Margaret Graham, University of Northumbria at Newcastle,
"Content-based Image Retrieval," Report 39, JISC Technology Applications
Programme, Joint Information Systems Committee, October 1999.
[10] X. Wan and C. C. Kuo, "Color distribution analysis and quantization for
image retrieval," in SPIE Storage and Retrieval for Image and Video
Databases IV, vol. SPIE 2670, pp. 9-16, 1996.
[11] M. W. Ying and Z. HongJiang, "Benchmarking of image feature for
content-based retrieval," IEEE, pp. 253-257, 1998.
[12] Z. Zhenhua, L. Wenhui and L. Bo, "An Improving Technique of Color
Histogram in Segmentation-based Image Retrieval," 2009 Fifth International
Conference on Information Assurance and Security, IEEE, pp. 381-384, 2009.
[13] E. Mathias, "Comparing the influence of color spaces and metrics in
content-based image retrieval," IEEE, pp. 371-378, 1998.
[14] V. Castelli and L. Bergman, Image Databases: Search and Retrieval of
Digital Imagery, Wiley-Interscience, USA, 2002.
[15] B. Manjunath, W. Ma, "Texture features for browsing and retrieval of
image data," IEEE Trans. on Pattern Analysis and Machine Intelligence,
vol. 18, pp. 837-842, 1996.
[16] H. Tamura, S. Mori, T. Yamawaki, "Textural features corresponding to
visual perception," IEEE Trans. on Systems, Man and Cybernetics, vol. 8,
pp. 460-472, 1978.
[17] B. Manjunath, P. Wu, S. Newsam, H. Shin, "A texture descriptor for
browsing and similarity retrieval," Journal of Signal Processing: Image
Communication, vol. 16, pp. 33-43, 2000.
[18] A. Del Bimbo, Visual Information Retrieval, 270 pages, Morgan
Kaufmann Publishers, San Francisco, California, 1999.
[19] M. Trimeche, "Shape Representations for Image Indexing and
Retrieval," Master of Science Thesis, Tampere University of Technology,
May 2000.
[20] A. W. M. Smeulders, M. Worring, S. Santini, A. Gupta, R. Jain,
"Content-Based Image Retrieval at the End of the Early Years," IEEE
Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 12,
pp. 1349-1380, December 2000.
[21] F. V. Jensen, S. L. Lauritzen, K. G. Olesen, "Bayesian
updating in causal probabilistic networks by local computations,"
Computational Statistics Quarterly, vol. 4, pp. 269-282, 1990.
[22] Q. Tian, N. Sebe, M. S. Lew, E. Loupias, and T. S. Huang, "Image
Retrieval Using Wavelet-Based Salient Points," J. Electronic Imaging,
special issue on Storage and Retrieval of Digital Media, vol. 10, no. 4,
pp. 849-935, 2001.
[23] J. Wang, H. Zha, and R. Cipolla, "Combining Interest Points and Edges
for Content-based Image Retrieval," in Proc. IEEE Int'l Conf. Image
Processing (ICIP'05), pp. 1256-1259, 2005.
[24] Aqeel M. Humadi, Hameed A. Younis, "Application of the Fuzzy Logic
in Content Based Image Retrieval using Color Feature," IJCSMC, vol. 3,
issue 2, pp. 170-180, February 2014.
[25] J. Han, K. K. Ma, "Fuzzy color histogram and its use in color
image retrieval," IEEE Transactions on Image Processing, vol. 11, no. 8,
pp. 944-952, 2002. doi:10.1109/TIP.2002.801585.