
FOUNDIT: SEARCHING FOR DECORATION DESIGNS IN DIGITAL
CATALOGUES
E. J. PAUWELS AND M.J. HUISKES
Centre for Mathematics and Computer Science (CWI),
Kruislaan 413, 1098SJ Amsterdam, The Netherlands
E-mail: {Eric.Pauwels,Mark.Huiskes}@cwi.nl
K. NOONAN, P. BERNARD AND P. VANDENBORRE
Sophis Systems NV,
Vlamingstraat 19, B-8560 Wevelgem, Belgium
E-mail: {karl,Paul.Bernard,piet}@sophis.be
P. PIANEZZA AND M. DE MADDALENA
Pianezza Paolo Srl., Italy.
Località Oro 6, 21030 Azzio (VA), Italy
E-mail: {paolo,marco}@pianezza.it
The FOUNDIT project aims to develop a system for content-based image
retrieval (CBIR) that allows users to give relevance feedback in a natural and
intuitively transparent manner. Although the aim is to develop a generic
system, the project focuses its efforts on large and challenging databases
provided by the decoration industry.
1. Introduction
Content-based image retrieval (CBIR) remains a challenging problem, especially
when the user’s subjective appreciation is involved (e.g. when browsing
databases of decorative designs such as clothes, textiles, wallpaper, etc.). In these
applications, the only way to elucidate the user’s preference is by continuously
soliciting their feedback. This feedback is then harnessed to estimate, for each
image in the database, the likelihood of its relevance with respect to the user’s
goals, whereupon the most promising candidates are displayed for further
inspection and feedback. The most straightforward way to model the fuzzy state
of knowledge about the user’s preferences is for the search engine to assign to
every image I in the database a relevance probability p(I) that reflects the current
estimate of relevance. As gradually more information about the user’s
preferences becomes available, the probability measure will change to reflect the
reduced state of uncertainty and images that are assigned a high relevance will be
more likely to be sampled for display. The goal of the FOUNDIT project [1] is
to build a CBIR search engine based on the above principles that can handle the
requirements typically encountered in decoration-related image and design
databases. The FOUNDIT system comprises the following three modules. The
graphically oriented interface (see Fig.1) allows the user to provide the system
with relevance feedback by selecting examples and counter-examples which are
collected in separate bins. This qualitative feedback is then transformed by the
relevance inference engine into a probabilistic relevance-measure for each image
in the database by coupling it to mathematical features. The inference engine
therefore relies on the availability of pre-computed features that characterize the
visual appearance of the images. This feature-database is generated off-line by
the feature extraction engine.
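The idea that images with a high relevance probability p(I) are more likely to be sampled for display can be illustrated as weighted sampling without replacement. The sketch below is only an illustration under our own assumptions: the helper name `sample_for_display`, the dictionary layout and the image identifiers are hypothetical, and the sampling scheme actually used by FOUNDIT is not specified in the text.

```python
import random


def sample_for_display(relevance, k, rng=None):
    """Draw k distinct image ids, favouring images with a high p(I).

    `relevance` maps an image id to a non-negative relevance estimate.
    Hypothetical helper: the sampling scheme is an assumption.
    """
    rng = rng or random.Random()
    remaining = dict(relevance)
    chosen = []
    for _ in range(min(k, len(remaining))):
        ids = list(remaining)
        weights = [remaining[i] for i in ids]
        if sum(weights) <= 0:
            # No informative weights left: fall back to a uniform draw.
            pick = rng.choice(ids)
        else:
            pick = rng.choices(ids, weights=weights, k=1)[0]
        chosen.append(pick)
        del remaining[pick]  # an image is shown at most once per screen
    return chosen


# Example with made-up identifiers and probabilities.
p = {"design_a": 0.7, "design_b": 0.2, "design_c": 0.1, "design_d": 0.0}
screen = sample_for_display(p, 3, random.Random(0))
```

Sampling, rather than always showing the top-ranked images, leaves some room for exploration while the estimate of p(I) is still uncertain.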
Figure 1. Screenshot of the FOUNDIT prototype interface. Browsing and searching by both
manually annotated categories (index on left) and relevance feedback are supported.
Feedback is supplied by selecting positive and negative examples which are collected on
the display bar at the bottom (positive examples on the left, negative on the right).
2. Feature Extraction Engine
The Feature Extraction Engine consists of a large collection of algorithms for
quantitative image characterization. The routines are not restricted to
computation of low-level features such as global color and texture measures, but
try to establish a link to the more semantically meaningful categories that are
typically used by humans when making aesthetic judgments on designs.
In recognition of their vital role in capturing the essence of a design, much
effort has been directed towards the detection of so-called salient design
elements. Two main strategies are followed to this end: (i) figure-ground
segregation, and (ii) grouping of primitives.
The figure-ground segregation is based on color-texture region extraction
and subsequent region classification based on region property variables such as
relative size, connectedness and compactness. Primitive grouping is directed at
finding objects by analyzing configurations of primitive image elements (e.g.
edges). In this manner we may for instance detect the occurrence and
arrangement of homogeneous strips.
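The region property variables mentioned above (relative size, connectedness, compactness) can be made concrete with a small sketch. The definitions used here are common textbook choices that we assume purely for illustration: relative size is the area fraction, connectedness is summarised by the number of 4-connected components, and compactness is 4πA/P² with the perimeter P counted in exposed pixel edges. The function name `region_properties` is hypothetical, and the exact variables used in FOUNDIT may differ.

```python
import math
from collections import deque

NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def region_properties(mask):
    """Compute simple descriptors for a binary region mask (rows of 0/1)."""
    h, w = len(mask), len(mask[0])
    area = sum(sum(row) for row in mask)
    perimeter = 0
    components = 0
    seen = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if not mask[y][x]:
                continue
            # Count pixel edges bordering the background or the image frame.
            for dy, dx in NEIGHBOURS:
                ny, nx = y + dy, x + dx
                if not (0 <= ny < h and 0 <= nx < w) or not mask[ny][nx]:
                    perimeter += 1
            # Flood-fill from each unvisited pixel to count components.
            if not seen[y][x]:
                components += 1
                seen[y][x] = True
                queue = deque([(y, x)])
                while queue:
                    cy, cx = queue.popleft()
                    for dy, dx in NEIGHBOURS:
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
    compactness = 4 * math.pi * area / perimeter ** 2 if perimeter else 0.0
    return {
        "relative_size": area / (h * w),
        "components": components,
        "compactness": compactness,
    }


# A 2x2 square centred in a 4x4 image.
square = [[0, 0, 0, 0],
          [0, 1, 1, 0],
          [0, 1, 1, 0],
          [0, 0, 0, 0]]
props = region_properties(square)
```

A region classifier could then, for instance, label small, compact, single-component regions as figure candidates and large, fragmented ones as ground; any thresholds for such a rule would have to be tuned on the data at hand.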
Based on the decomposition of a design into a ground and one or more
salient regions or objects, the feature computation process can be further
specialized. For foreground regions/salient objects we compute, among others,
features for the following: size, orientation, color, shape (region- and
contour-based); and, in the case of several objects: spatial organization, occurrence of
periodic patterns, motif variation (color, shape, orientation). For background
regions, or images consisting entirely of (color) texture, we characterize: color,
e.g. dominant color, color structure, color layout; texture, e.g. by regularity,
coarseness, direction, edge histograms and granulometries. The Feature
Extraction Engine supports the full range of MPEG-7 visual descriptors [2], and
further contains routines based on multi-resolution analysis and morphological
operators.
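As a simple illustration of a background color feature, the sketch below estimates a dominant color by coarse quantization of RGB values. It is a deliberately simplified stand-in and not the MPEG-7 Dominant Color descriptor; the function name `dominant_color`, the parameter `levels` and the pixel representation are assumptions made for this example only.

```python
from collections import Counter


def dominant_color(pixels, levels=4):
    """Return the centre of the most populated color bin and its share.

    `pixels` is a non-empty list of (r, g, b) tuples with values in 0..255.
    Each channel is quantized into `levels` equally wide bins.
    """
    def quantize(value):
        return value * levels // 256

    counts = Counter(
        (quantize(r), quantize(g), quantize(b)) for r, g, b in pixels
    )
    bin_index, count = counts.most_common(1)[0]
    width = 256 / levels
    centre = tuple(int((q + 0.5) * width) for q in bin_index)
    return centre, count / len(pixels)


# Toy image: 60% reddish pixels, 40% bluish pixels.
toy_pixels = [(250, 10, 10)] * 6 + [(10, 10, 250)] * 4
color, share = dominant_color(toy_pixels)
```

The returned share gives a rough indication of how strongly a single color dominates the region, which may itself be a useful feature for texture-like backgrounds.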
3. Relevance Inference Engine
At every stage of the search-history, the user inspects a (small) fraction of the
database (displayed on the interface, see Fig. 1) and provides the system with
feedback by making a number of positive and negative selections and
transferring them into the collection box as examples and counter-examples (see
above). This can be formalized by saying that for the images in the collection
box we have additional information (i.e. on top of the pre-computed feature
values) that is captured in a binary variable based on the interpretation supplied
by the user: we assign the value 1 if the image was considered to be an example,
and 0 if it was a counter-example.
Within the FOUNDIT framework we have chosen logistic regression as a
flexible means of translating the qualitative user feedback (in terms of
examples and counter-examples) into quantitative information useful for
retrieval. It allows us to express the correlation between the feature values and
the binary response variable in a precise parametric model. Furthermore, these
regression models yield a principled tool to estimate the efficacy of individual
features in gauging the overall relevance of images by taking into account the
goodness-of-fit coefficients. As a consequence, it becomes possible to
automatically and adaptively extract from the vast collection of pre-recorded
image-features, the small subset that correlates best with the particular search at
hand.
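The inference step described above can be sketched as follows: for each pre-computed feature, a one-variable logistic model is fitted to the binary feedback, and features are then ranked by how well the model fits. This is a minimal sketch under stated assumptions, not the FOUNDIT implementation. The fitting procedure (gradient ascent), the small L2 penalty, the use of the log-likelihood as goodness-of-fit measure, and all names (`fit_logistic`, `rank_features`, `relevance`, the feature names) are our own choices for illustration. Feature values are assumed to be scaled to a comparable range.

```python
import math


def _sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(x, y, steps=2000, lr=0.1, l2=0.01):
    """Fit p(y=1 | x) = sigmoid(a + b*x) by gradient ascent.

    The small L2 penalty on b (an assumption) keeps the estimate finite
    when the few examples happen to be perfectly separable.
    """
    a = b = 0.0
    n = len(x)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for xi, yi in zip(x, y):
            err = yi - _sigmoid(a + b * xi)
            grad_a += err
            grad_b += err * xi
        a += lr * grad_a / n
        b += lr * (grad_b / n - l2 * b)
    return a, b


def log_likelihood(x, y, a, b):
    """Log-likelihood of the feedback under the fitted model."""
    eps = 1e-12
    total = 0.0
    for xi, yi in zip(x, y):
        p = min(max(_sigmoid(a + b * xi), eps), 1.0 - eps)
        total += math.log(p) if yi else math.log(1.0 - p)
    return total


def rank_features(features, labels):
    """Return (log-likelihood, name, (a, b)) tuples, best fit first."""
    scored = []
    for name, values in features.items():
        a, b = fit_logistic(values, labels)
        scored.append((log_likelihood(values, labels, a, b), name, (a, b)))
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored


def relevance(x, a, b):
    """Relevance probability p(I) for an image with feature value x."""
    return _sigmoid(a + b * x)


# Made-up feedback: 1 = example, 0 = counter-example.
labels = [1, 1, 1, 0, 0, 0]
features = {
    "stripe_density": [0.9, 0.8, 0.7, 0.2, 0.1, 0.3],
    "brightness": [0.5, 0.4, 0.6, 0.5, 0.6, 0.4],
}
ranked = rank_features(features, labels)
best_ll, best_name, (a, b) = ranked[0]
```

In this toy example the feature that separates examples from counter-examples obtains the higher log-likelihood and is ranked first, and `relevance(x, a, b)` can then be evaluated for every image in the database using only that feature's stored value.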
4. Conclusion
To improve the image mining capabilities in large databases of decoration
designs, FOUNDIT is developing a CBIR search engine that allows the user to
give relevance feedback in a natural and intuitively transparent fashion, thus
making this technology more efficient for professional users, and opening it up
to the much wider audience of non-expert mainstream users. As future work,
we envisage the definition of an XML/MPEG-7 description scheme [3] for the
representation of design image interpretations.
Acknowledgments
FOUNDIT is partially supported by the European Commission under the
IST Programme of the Fifth Framework (Project nr. IST-2000-28427).
References
1. FOUNDIT Webpage: http://www.cwi.nl/~foundit.
2. B. Manjunath, P. Salembier, and T. Sikora (Eds.) (2002),
Introduction to MPEG-7 - multimedia content description interface, John
Wiley and Sons, Ltd.
3. P. Bernard, H. Derumeaux, M. Huiskes, E. Pauwels, P. Vandenborre, S.
Sette, L. Vanlangenhove: An MPEG7-compatible XML-Schema for
Semantic Meta-data for Decorative Designs in Textile Industry. Submitted
to AUTEX 2003.