Proximity graphs for image retrieval

advertisement
Proximity graphs for image retrieval
Djamel Abdelkader Zighed1 and Hakim Hacid1
ERIC Laboratory, University of Lyon 2
5 av. Pierre Mendès-France
69600 Bron
France.
abdelkader.zighed@univ-lyon2.fr,
hhacid@eric.univ-lyon2.fr
Summary. Retrieving hidden information in multimedia data is a difficult task
according to their complex structure and the subjectivity related to their interpretation. In this situation the use of an index is primordial. A multimedia index makes
it possible to group data according to similarity criteria. We propose in this paper
to bring an improvement to an existing content based image retrieval approach.
We propose an effective method for locally updating neighborhood graphs which
constitute our multimedia index. This method is based on an intelligent way for
locating points in a multidimensional space. Promising results are obtained after
experimentations on various databases.
1 Introduction
In image retrieval, the user is more interested in finding images that have meaning he
expects regardless to the vector of features. To access to the suitable images, the user
may express his/her request using either keywords (or natural language – similar to
Internet search) or by sample images. Today, the textual annotation is manual and
expensive process, this is why the vector approach is mostly used. The relevance
of the image retrieval process depends on the vector of features. Nevertheless, if
we assume that the features are relevant in the representation space, which it is
supposed to be Rp , images that are neighbors should have very similar meanings.
This is based on the general principle that says ”birds of a feather flock together”.
We propose a framework using low-level features, automatically extracted from
images, as navigation support for the browsing process. That way we can access
the semantic aspect of an image based on the neighbor images in the representation
space. Hence, the user may retrieve visually similar images. If the images in the
neighborhood of a sample image are labeled, then the query image may inherit from
the present labels. This process is called learning based instance. The most known
algorithm in this category is the k-nearest neighbors. However, k-nearest neighbors
algorithm has many drawback underlined in our previous paper [SCSZ04]. We prefer to use proximity graphs [Tou80] that have very interesting properties for this
purpose.
796
Djamel Abdelkader Zighed and Hakim Hacid
The advantages of the proximity graph approach were shown in [SCSZ04]. However, there were some limitations basically due to the complexity of the algorithm.
Indeed, for each new query we have to rebuild the whole proximity graphs, thus
the problem becomes intractable. In this paper, we propose a new algorithm able to
compute locally exact neighborhood of a new point. This provides a new advantage
to our approach.
The next section discusses the proximity graph concept in Section 2. Our contribution, dealing with the neighborhood graphs optimization and their application to
images indexing and retrieval is described in Section 3. We describe our experiments
and the evaluation of the method in Section 4. Concluding remarks and future work
are presented in Section 5.
2 Neighborhood graphs and motivation
Neighborhood graphs are very much used in various systems. Their popularity is due
to the fact that the neighborhood is determined by coherent functions which reflect,
from some point of view, the mechanism of the human intuition. Their use is varied
from information retrieval systems to geographical information systems. However,
several problems concerning the neighborhood graphs are under research and require
a lot of work to solve them. These problems are primarily related to their high cost
of construction and to their update difficulties. For this reasons, optimizations are
necessary for their construction and update.
In order to avoid some problems related to the use of the k-NN (symmetry problem, subjectivity related to the determination of k ), the use of another structuration
model based on neighborhood graph was proposed in [SCSZ04]. This proposition has
many advantages and this is the reason why we adopt the same approach (the use
of neighborhood graphs) for content-based image retrieval. We will in the followings
introduce the neighborhood graphs. Neighborhood graphs, or proximity graphs, are
geometrical structures which use the concept of neighborhood to determine the closest points to a given point. For that, they are based on proximity measures [Tou91].
Several possibilities were proposed for building neighborhood graphs. Among
them we can quote the Delaunay triangulation [PS85], the relative neighborhood
graphs [Tou80], the Gabriel graph [GS69], and the minimum spanning tree [PS85].
In this paper, we’ll consider only one of them, the relative neighborhood graph
RNG. The main motivation for this choice is its simplicity and its wide usage. For
illustration, we describe hereafter one example of neighborhood graphs, namely,
relative neighborhood graph (RNG). We will use the following notations throughout
this paper:
Let Ω be a set of points in a multidimensional space Rd . A graph G(Ω,ϕ) is
composed by the set of points Ω and a set of edges ϕ. Then, to any graph we can
associate a binary relation R upon Ω, in which two points (α, β) ∈ Ω 2 are in binary
relation if and only if the pair (α, β) ∈ ϕ. In other words, (α, β) are in binary relation
if and only if they are directly connected in the graph G. From this, the neighborhood
N (α) of a point α in the graph G can be considered as a sub-graph which contains
the point α and all the points which are directly connected to it.
In a relative neighborhood graph G rng (Ω,ϕ), two points (α, β) ∈ Ω 2 are neighbors if they satisfy the following relative neighborhood property: Let H (α, β) be the
Proximity graphs for image retrieval
797
hyper-sphere of radius δ (α, β) and centered on α, and let H (β, α) be the hypersphere of radius δ (β, α) and centered on β. δ (α, β) and δ (β, α) are the dissimilarity
measures between the two points α and β. δ (α, β) = δ (β, α). Then, α and β are
neighbors if and only if the lune A (α, β) formed by the intersection of the two
hyper-spheres H (α, β) and H (β, α) is empty [Tou80]. Formally:
A (α, β) = H (α, β) ∩ H (β, α) T hen (α, β) ∈ ϕ if f A (α, β) ∩ Ω = φ
Figure 1 illustrates the relative neighborhood graph.
Fig. 1. Relative neighborhood graph
The main problem related to neighborhood graphs is their high complexity. Their
usage in the context of database indexing, their exploitation needs some optimizations. Several algorithms for neighborhood graphs construction were proposed. The
algorithms which we quote hereafter relate to the construction of the relative neighborhood graph.
One of the common approaches to the various neighborhood graphs construction
algorithms is the use of the refinement techniques. In this approach, the graph is
built by steps. Each graph is built starting from the previous graph, containing
all connections, by eliminating some edges which do not satisfy the neighborhood
property of the graph to be built. Pruning (edges elimination) is generally done by
taking into account the construction function of the graph or through geometrical
properties.
The construction principle of the neighborhood graphs consists in checking for
each point, if the other points in the space are in its proximity. The cost of this
operation is of complexity O n3 (n is the number of points
in the space). Tou
ssaint [Tou91] proposed an algorithm of complexity O n2 . He deduces the RNG
starting from a Delaunay triangulation [PS85]. Using the Octant neighbors, Katajainen [Kat88] also proposed an algorithm
of the same complexity. Smith [Smi89]
proposed an algorithm of complexity O n23/12
which is significantly less than
O n3 .
In what concern us, as quoted in the introduction, this work can be viewed as
an improvement of the proposal in [SCSZ04]. The proposed approach in [SCSZ04] is
focused mainly on a locally updating technique of neighborhood graphs. This method
does not offer really the anticipated results. Indeed, in this method the graph is not
really updated but the neighbors of each point are considered to be those of it’s
798
Djamel Abdelkader Zighed and Hakim Hacid
nearest neighbor. Formally, if we consider α to be a query point and β its nearest
neighbor. If the set of the neighbors of β is {β1 , β2 , β3 }, then the neighbors of α are
{β, β1 , β2 , β3 }.
This approach, as we can see it, isn’t correct especially in the case of multidimensional spaces. So, using this method the graph will deteriorate. We propose in
the following an effective locally updating method, which is stable and non-sensitive
to the data dimension effects, and which returns very satisfactory results (in terms
of response time and quality answers).
3 Neighborhood graphs based retrieval approach
Several content-based information retrieval systems are based on the k-nearest neighbors principle [FH51] [VT00]. This principle is based on sorting items by taking into
account their distances, a predefined k nearest items are returned by the system
when a query is submitted. For example, the QBIC system, in its implementation
for the Hermitage museum1 [FBF94], returns the 12 nearest images to the query
submitted by the user.
The structuration model of the multimedia databases is (or can be seen) as a
graph based on similarity relations between the items (for example k-NN [Mit97]
or the relative neighborhood graph [SCSZ04]). The objective is to explore an image
database through the similarity between images. The structuration model is very
important because the performances of a content-based information retrieval system depend strongly on the representation structure (indexation structure), which
manages the data.
Image databases interrogation is generally made by submitting a query to the
system; this query is generally an image which the system pre-processes (segments,
equalizes, etc.) producing a vector of features representing a point in a multidimensional space. This point is inserted in the representation structure (indexation
structure) and its neighbors are then turned over as an answer to the query.
In our case, a naive approach when using neighborhood graphs would be the
rebuilding of the neighborhood graph containing the already existing data in the
database by adding the new query item. This approach is unfortunately not suitable, because it is very expensive especially when the number of items in the database
is important. Another approach would be to locally update the neighborhood graph,
i.e., to find a procedure such that only the potential items, which are in the neighborhood of the inserted point, will be affected by the updating.
The neighborhood graph locale update task passes by the location of the inserted
(or removed) point as well as the points which can be affected by the update. To
achieve this, we proceed in two main stages: initially, we look for an optimal surface
which can contain a maximum number of potentially closest points to the query
point. The second stage is done in the aim of filtering the items found beforehand in
order to recover the really closest points to the query by applying an adequate neighborhood property. This last stage causes the effective updating of the neighborhood
relations between the concerned points.
The main stage in this method is the ”search area” determination. This can be
considered as a question of determining an hyper sphere of center α (the query point)
1
http://www.hermitagemuseum.org/
Proximity graphs for image retrieval
799
which maximizes the chance of containing the neighbors of α while minimizing the
number of items that it contains.
We take advantage of the general neighborhood graphs structure in order to
establish the radius of the hyper sphere. We are focusing especially on the nearest
neighbor and the farther neighbor concepts.
So, the radius of the hyper-sphere respecting the above properties is the one
including all the neighbors of the first nearest neighbor of the query point. So,
considering that the hyper-sphere is centered in α, its radius will be the sum of
the distances between α and its nearest neighbor and the one between this nearest
neighbor and its further neighbor.
The content of the hyper sphere is processed in order to see whether there are
some neighbors (or all the neighbors). The second step constitutes a reinforcement
step and aims to eliminate the risk of losing neighbors or including bed ones. So, we
take all the true neighbors of the query point recovered beforehand (those returned
in the first step) as well as their neighbors and update the neighborhood relations
between these points.
That is, let consider α be the query point and β its nearest neighbor with a
distance δ1 . Let consider λ be the further neighbor of β with a distance δ2 . The
radius R of the hyper sphere can be expressed as:
SR = (δ1 + δ2 ) × (1 + ǫ)
ǫ is a relaxation parameter and it can be defined according to the state of the data
(their dispersion for example) or by the domain knowledge. We set experimentally
this parameter to 0.1.
The complexity of this method is very low and meets perfectly our starting aims
(locating the neighborhood of points in a time as short as possible). It is expressed
by:
O(2n + n′3 )
Where
• n: the number of items in the database.
• n’: the number of items in the hyper sphere (<< n).
This complexity includes the two stages described previously, namely, the search
of the radius of the hyper sphere and the seek of the true neighbors which are in this
hyper sphere corresponding to the term O(2n). The second term corresponds to the
necessary time for the effective update of the neighborhood relations between the real
neighbors which is very weak taking into account the number of candidates turned
over. This complexity constitutes the maximum complexity and can be optimized
in several ways. The most obvious way is to use a fast nearest neighbor algorithm.
4 Experiments and results
The interest of neighborhood graphs for content based information retrieval is shown
and discussed in [SCSZ04]. The comparison tests done with several different configurations of k-NN were interesting and showed the utility of these structures in this
800
Djamel Abdelkader Zighed and Hakim Hacid
field. In this section we are interested into two different tests: testing the validity of
the obtained results by the use of the suggested method and testing the response
times. We used an image database [NNM96]. This database contains 7200 images
representing 100 different objects taken with various views.
To perform our experimentations, we pre-process the database by applying some
image processing algorithms for extracting some characteristics. We use especially
color, texture and form features. Each image is represented as a vector of features
and the graph is built on the obtained vectors. The interrogation is also done in the
same way by using a vector of features. So, each image is represented as a point in
a multidimensional space. Our interest by performing these experiments is to show
that our method gives the same results as the ones obtained by building the graph
with all items and ensure we don’t include wrong or miss good neighbors of the
query item.
To check the validity of the obtained results, we need a reference graph. For that,
we take the whole image database (vectors of descriptors) and we build a relative
neighborhood graph. After building the reference graph, we can start the validity
checking tests. We take arbitrary an item from the data set and we build a new
relative neighborhood graph on the remaining items (the n-1 items). After that, we
insert the item taken beforehand from the dataset in the graph by using the proposed
method. The neighbors of the inserted point are compared with the neighbors of this
same item in the reference graph. This operation is repeated several times by taking
different items at each iteration.
Figure 2 summarizes the obtained results by taking into account different situations. The first experiment constitutes our reference. The experiments carrying
numbers 2,4,6 were carried out with an item in less and experiments 3,5,7 after the
insertion of the remaining item in the corresponding previous experiment. That is,
after the insertion of the remaining item, we always find the recall of the reference
graph, this means that the method finds the good neighbors of the inserted item
into each experiment. Figure 4 gives a visual illustration of the obtained result using
the neighborhood graphs indexing technique.
The validity of the obtained results being acquired, we are interested now in
the response times of the method, i.e. time that the method takes to insert a query
item in an existing structure. We use the same protocol as the one for the validity
checking, but in place of recovering the results items, we take into consideration only
the response times of each query. We used for these experiments a machine with an
INTEL Pentium 4 processor (2.80 GHz) and 512 Mo of memory. The response times
for 10 items, always arbitrary taken from the same dataset, are shown in the graphic
on Figure 3.
The response times (expressed in milliseconds) are interesting according to the
volume of used dataset. They are variable from an item to another and this is
due especially to the fact that various candidates items count for neighborhood
determination are used at each iteration. Generally speaking, the insertion of a new
item in the graph using our method requires only 50 milliseconds (the mean), and
requires at least 40 minutes using the standard algorithm in the same conditions.
Proximity graphs for image retrieval
Fig. 2. Recall measured for 03 locale
insertions
801
Fig. 3. Execution time variations of 10
locale insertions
Fig. 4. Obtained neighborhood using a standard algorithm with the whole images
5 Conclusion and future work
Content-based image retrieval is a complex task because of, primarily, the nature of
the image data and to the attached subjectivity on their interpretation. The use of
an adequate multimedia index structure is important. In this paper we considered
neighborhood graphs as indexing structure and we proposed a method for locally
updating them with an aim of bringing an improvement of an existing approach. The
experiments performed on different datasets show the effectiveness and the utility
of the proposed method.
As future work, we plan to integrate this approach and its application in more
functions related to images like automatic annotation and classification.
References
[FBF94] Christos Faloutsos, Ron Barber, Myron Flickner, Jim Hafner, Wayne
Niblack, Dragutin Petkovic, and William Equitz. Efficient and effective
querying by image content. J. Intell. Inf. Syst., 3(3/4):231–262, 1994.
802
Djamel Abdelkader Zighed and Hakim Hacid
[FH51] E. Fix and J. L. Hudges. Discriminatory analysis - non parametric discrimination: consistency properties. Technical Report 21-49-004, USAF School
of aviation medicine, Randolph Field, Texas, 1951.
[GS69] K. R. Gabriel and R. R. Sokal. A new statistical approach to geographic
variation analysis. Systematic zoology, 18:259–278, 1969.
[Kat88] J. Katajainen. The region approach for computing relative neighborhood
graphs in the lp metric. Computing, 40:147–161, 1988.
[Mit97] Tom M. Mitchell. Machine learning meets natural language. In EPIA, page
391, 1997.
[NNM96] Sameer A. Nene, S. K. Nayar, and H. Murase. Columbia object image
library (coil-100). Technical report, Technical Report CUCS-006-96, February 1996.
[PS85] F. Preparata and M. I. Shamos. Computationnal Geometry-Introduction.
Springer-Verlag, New-York, 1985.
[SCSZ04] M. Scuturici, J. Clech, V. M. Scuturici, and D. Zighed. Topological representation model for image databases query. Journal of Experimental and
Theoritical Artificial Intelligence (JETAI), pages 145–160, 2004.
[Smi89] W. D. Smith. Studies in computational geometry motivated by mesh generation. PhD thesis, Princeton University, 1989.
[Tou80] G. T. Toussaint. The relative neighborhood graphs in a finite planar set.
Pattern recognition, 12:261–268, 1980.
[Tou91] G. T. Toussaint. Some insolved problems on proximity graphs. D. W
Dearholt and F. Harrary, editors, proceeding of the first workshop on proximity graphs. Memoranda in computer and cognitive science MCCS-91-224.
Computing research laboratory. New Mexico state university Las Cruces,
1991.
[VT00] Remco C. Veltkamp and M. Tanase. Content-based image retrieval systems
: A survey. Technical Report UU-CS-2000-34, Department of Computing
Science, Utrecht University, 2000.
Download