Proximity graphs for image retrieval

Djamel Abdelkader Zighed and Hakim Hacid

ERIC Laboratory, University of Lyon 2, 5 av. Pierre Mendès-France, 69600 Bron, France. abdelkader.zighed@univ-lyon2.fr, hhacid@eric.univ-lyon2.fr

Summary. Retrieving hidden information in multimedia data is a difficult task because of the complex structure of the data and the subjectivity attached to their interpretation. In this situation the use of an index is essential. A multimedia index makes it possible to group data according to similarity criteria. In this paper we improve an existing content-based image retrieval approach: we propose an effective method for locally updating the neighborhood graphs which constitute our multimedia index. This method is based on an efficient way of locating points in a multidimensional space. Experiments on various databases give promising results.

1 Introduction

In image retrieval, the user is more interested in finding images that have the meaning he or she expects than in the underlying vector of features. To reach suitable images, the user may express a request either with keywords (or natural language, as in Internet search) or with sample images. Today, textual annotation is a manual and expensive process, which is why the vector approach is the most widely used. The relevance of the image retrieval process depends on the vector of features. Nevertheless, if we assume that the features are relevant in the representation space, which is supposed to be $\mathbb{R}^p$, images that are neighbors should have very similar meanings. This rests on the general principle that "birds of a feather flock together". We propose a framework that uses low-level features, automatically extracted from images, as navigation support for the browsing process. That way we can access the semantic aspect of an image through its neighbor images in the representation space. Hence, the user may retrieve visually similar images.
If the images in the neighborhood of a sample image are labeled, then the query image may inherit these labels. This process is called instance-based learning. The best-known algorithm in this category is k-nearest neighbors (k-NN). However, the k-NN algorithm has many drawbacks, underlined in our previous paper [SCSZ04]. We prefer to use proximity graphs [Tou80], which have very interesting properties for this purpose.

The advantages of the proximity graph approach were shown in [SCSZ04]. However, there were some limitations, basically due to the complexity of the algorithm. Indeed, for each new query the whole proximity graph has to be rebuilt, so the problem becomes intractable. In this paper, we propose a new algorithm able to compute locally the exact neighborhood of a new point. This brings a new advantage to our approach.

Section 2 discusses the proximity graph concept. Our contribution, dealing with the optimization of neighborhood graphs and their application to image indexing and retrieval, is described in Section 3. We describe our experiments and the evaluation of the method in Section 4. Concluding remarks and future work are presented in Section 5.

2 Neighborhood graphs and motivation

Neighborhood graphs are widely used in various systems. Their popularity is due to the fact that the neighborhood is determined by coherent functions which reflect, from some point of view, the mechanism of human intuition. Their uses range from information retrieval systems to geographical information systems. However, several problems concerning neighborhood graphs are still under research and require a lot of work. These problems are primarily related to their high construction cost and to the difficulty of updating them. For these reasons, optimizations are necessary for their construction and update.
In order to avoid some problems related to the use of k-NN (the symmetry problem, the subjectivity related to the choice of k), another structuring model based on neighborhood graphs was proposed in [SCSZ04]. This proposal has many advantages, which is why we adopt the same approach (the use of neighborhood graphs) for content-based image retrieval. We introduce neighborhood graphs in the following.

Neighborhood graphs, or proximity graphs, are geometrical structures which use the concept of neighborhood to determine the closest points to a given point. For that, they are based on proximity measures [Tou91]. Several possibilities were proposed for building neighborhood graphs. Among them we can cite the Delaunay triangulation [PS85], the relative neighborhood graph [Tou80], the Gabriel graph [GS69], and the minimum spanning tree [PS85]. In this paper, we consider only one of them, the relative neighborhood graph (RNG), which we describe hereafter for illustration. The main motivation for this choice is its simplicity and its wide usage.

We will use the following notations throughout this paper. Let Ω be a set of points in a multidimensional space $\mathbb{R}^d$. A graph G(Ω, ϕ) is composed of the set of points Ω and a set of edges ϕ. To any graph we can associate a binary relation R upon Ω, in which two points (α, β) ∈ Ω² are in binary relation if and only if the pair (α, β) ∈ ϕ. In other words, α and β are in binary relation if and only if they are directly connected in the graph G. From this, the neighborhood N(α) of a point α in the graph G can be considered as a sub-graph which contains the point α and all the points which are directly connected to it.
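As a small illustration of this notation, the following Python sketch stores the edge set ϕ as an adjacency map and returns the neighborhood N(α) of a point. The function names and the use of string labels for points are assumptions made for the example, and edges are treated as undirected, which is the case for the RNG considered here.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Map each point to the set of points directly connected to it."""
    adjacency = defaultdict(set)
    for a, b in edges:
        # Edges are assumed undirected: the relation holds both ways.
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


def neighborhood(adjacency, alpha):
    """N(alpha): alpha itself plus every point directly connected to it."""
    return {alpha} | adjacency.get(alpha, set())


# Hypothetical toy graph with four labeled points.
edges = [("a", "b"), ("b", "c"), ("c", "d")]
adjacency = build_adjacency(edges)
print(sorted(neighborhood(adjacency, "b")))  # prints ['a', 'b', 'c']
```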
In a relative neighborhood graph $G_{rng}(\Omega, \varphi)$, two points (α, β) ∈ Ω² are neighbors if they satisfy the following relative neighborhood property. Let H(α, β) be the hyper-sphere of radius δ(α, β) centered on α, and let H(β, α) be the hyper-sphere of radius δ(β, α) centered on β, where δ(α, β) and δ(β, α) are the dissimilarity measures between the two points α and β, with δ(α, β) = δ(β, α). Then α and β are neighbors if and only if the lune A(α, β) formed by the intersection of the two hyper-spheres H(α, β) and H(β, α) is empty [Tou80]. Formally:

$$A(\alpha, \beta) = H(\alpha, \beta) \cap H(\beta, \alpha)$$

$$(\alpha, \beta) \in \varphi \iff A(\alpha, \beta) \cap \Omega = \emptyset$$

Figure 1 illustrates the relative neighborhood graph.

Fig. 1. Relative neighborhood graph

The main problem related to neighborhood graphs is their high complexity. Their use for database indexing therefore requires some optimizations. Several algorithms for neighborhood graph construction were proposed. The algorithms which we cite hereafter relate to the construction of the relative neighborhood graph.

One of the common approaches in the various neighborhood graph construction algorithms is the use of refinement techniques. In this approach, the graph is built in steps. Each graph is built starting from the previous graph, which contains all connections, by eliminating the edges which do not satisfy the neighborhood property of the graph to be built. Pruning (edge elimination) is generally done by taking into account the construction function of the graph or through geometrical properties.

The construction principle of neighborhood graphs consists in checking, for each point, whether the other points in the space are in its proximity. The cost of this operation is $O(n^3)$ (n is the number of points in the space). Toussaint [Tou91] proposed an algorithm of complexity $O(n^2)$, which deduces the RNG from a Delaunay triangulation [PS85].
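The naive cubic construction mentioned above can be sketched in Python as follows. This is only an illustrative brute-force version, not one of the optimized algorithms cited: it assumes the Euclidean distance as the dissimilarity δ, and it treats the lune as open (a third point lying exactly on its boundary does not break the edge), which is a common convention rather than something stated in the text. The function name and the sample points are hypothetical.

```python
import math
from itertools import combinations


def relative_neighborhood_graph(points):
    """Brute-force RNG: test every pair against every third point, O(n^3)."""
    n = len(points)
    edges = set()
    for i, j in combinations(range(n), 2):
        d_ij = math.dist(points[i], points[j])
        # The lune of (i, j) is empty when no third point k is strictly
        # closer than d_ij to both i and j.
        lune_is_empty = all(
            max(math.dist(points[i], points[k]),
                math.dist(points[j], points[k])) >= d_ij
            for k in range(n)
            if k != i and k != j
        )
        if lune_is_empty:
            edges.add((i, j))
    return edges


# Hypothetical 2-D points used only for illustration.
points = [(0.0, 0.0), (2.0, 0.0), (1.0, 0.5), (5.0, 5.0)]
print(sorted(relative_neighborhood_graph(points)))
# prints [(0, 2), (1, 2), (1, 3)]
```

In this example the pair (0, 1) is not connected because point 2 falls inside their lune.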
Using octant neighbors, Katajainen [Kat88] also proposed an algorithm of the same complexity. Smith [Smi89] proposed an algorithm of complexity $O(n^{23/12})$, which is significantly lower than $O(n^3)$.

As far as our work is concerned, as stated in the introduction, it can be viewed as an improvement of the proposal in [SCSZ04]. The approach proposed in [SCSZ04] focuses mainly on a technique for locally updating neighborhood graphs. This method does not really deliver the expected results. Indeed, in this method the graph is not really updated: the neighbors of each point are considered to be those of its nearest neighbor. Formally, let α be a query point and β its nearest neighbor. If the set of the neighbors of β is {β1, β2, β3}, then the neighbors of α are {β, β1, β2, β3}. This approach is not correct, especially in multidimensional spaces, so the graph deteriorates when this method is used. We propose in the following an effective local update method, which is stable, is not sensitive to the effects of data dimension, and returns very satisfactory results (in terms of response time and quality of the answers).

3 Neighborhood graphs based retrieval approach

Several content-based information retrieval systems are based on the k-nearest neighbors principle [FH51] [VT00]. This principle consists in sorting items by their distances to the query; a predefined number k of nearest items is returned by the system when a query is submitted. For example, the QBIC system, in its implementation for the Hermitage museum (see footnote 1) [FBF94], returns the 12 nearest images to the query submitted by the user. The structuring model of a multimedia database is (or can be seen as) a graph based on similarity relations between the items (for example k-NN [Mit97] or the relative neighborhood graph [SCSZ04]). The objective is to explore an image database through the similarity between images.
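The k-nearest neighbors retrieval principle recalled above can be sketched in a few lines of Python. The Euclidean distance, the function name and the toy feature vectors are assumptions made for the illustration; a real system would use its own features and dissimilarity measure.

```python
import math


def k_nearest(query, items, k):
    """Return the indices of the k items closest to the query vector."""
    # Sort all item indices by their (assumed Euclidean) distance to the query.
    order = sorted(range(len(items)), key=lambda i: math.dist(query, items[i]))
    return order[:k]


# Hypothetical 2-D feature vectors standing in for image descriptors.
features = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (0.2, 0.1)]
print(k_nearest((0.0, 0.1), features, k=2))  # prints [0, 3]
```

Note that the answer always has exactly k items, whatever the local density of the data, which is one of the reasons a neighborhood graph is preferred in this paper.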
The structuring model is very important because the performance of a content-based information retrieval system depends strongly on the representation structure (index structure) which manages the data.

An image database is generally queried by submitting a query to the system; this query is generally an image which the system pre-processes (segmentation, equalization, etc.), producing a vector of features that represents a point in a multidimensional space. This point is inserted in the representation structure (index structure) and its neighbors are then returned as the answer to the query. In our case, a naive approach when using neighborhood graphs would be to rebuild the neighborhood graph over the data already in the database together with the new query item. This approach is unfortunately not suitable, because it is very expensive, especially when the number of items in the database is large. Another approach is to update the neighborhood graph locally, i.e., to find a procedure such that only the items which are potentially in the neighborhood of the inserted point are affected by the update.

Locally updating the neighborhood graph requires locating the inserted (or removed) point as well as the points which can be affected by the update. To achieve this, we proceed in two main stages. First, we look for an optimal region which can contain a maximum number of the points potentially closest to the query point. The second stage filters the items found beforehand in order to recover the points that are really closest to the query, by applying an adequate neighborhood property. This last stage performs the effective update of the neighborhood relations between the points concerned.

The main stage in this method is the determination of the "search area".
This can be considered as a question of determining a hyper-sphere of center α (the query point) which maximizes the chance of containing the neighbors of α while minimizing the number of items that it contains. We take advantage of the general structure of neighborhood graphs in order to establish the radius of the hyper-sphere, focusing especially on the nearest neighbor and farthest neighbor concepts. The radius of the hyper-sphere respecting the above properties is the one that includes all the neighbors of the first nearest neighbor of the query point. So, considering that the hyper-sphere is centered on α, its radius is the sum of the distance between α and its nearest neighbor and the distance between this nearest neighbor and its farthest neighbor. The content of the hyper-sphere is then processed in order to see whether it holds some (or all) of the neighbors.

Footnote 1: http://www.hermitagemuseum.org/

The second step is a reinforcement step which aims to eliminate the risk of losing neighbors or including bad ones. We take all the true neighbors of the query point recovered beforehand (those returned in the first step) as well as their neighbors, and update the neighborhood relations between these points.

That is, let α be the query point and β its nearest neighbor, at distance δ₁. Let λ be the farthest neighbor of β, at distance δ₂. The radius of the hyper-sphere, denoted SR, can be expressed as:

$$\mathit{SR} = (\delta_1 + \delta_2) \times (1 + \epsilon)$$

where ε is a relaxation parameter which can be defined according to the state of the data (their dispersion, for example) or from domain knowledge. We set this parameter experimentally to 0.1.

The complexity of this method is very low and meets our initial aim (locating the neighborhood of points in as short a time as possible). It is expressed by:

$$O(2n + n'^3)$$

where
• n: the number of items in the database.
• n': the number of items in the hyper-sphere (<< n).
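The computation of the search radius and the selection of the candidate items can be sketched as follows in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the Euclidean distance stands in for δ, the existing graph is given as a hypothetical adjacency dictionary, and the function names are invented for the example. The subsequent filtering of the candidates with the neighborhood property is not shown.

```python
import math


def search_radius(query, points, graph, eps=0.1):
    """Return the nearest item and the radius (d1 + d2) * (1 + eps)."""
    # d1: distance from the query to its nearest neighbor (one linear scan).
    nearest = min(points, key=lambda i: math.dist(query, points[i]))
    d1 = math.dist(query, points[nearest])
    # d2: distance from that nearest neighbor to its farthest graph neighbor.
    d2 = max(
        (math.dist(points[nearest], points[j]) for j in graph[nearest]),
        default=0.0,
    )
    return nearest, (d1 + d2) * (1 + eps)


def candidates(query, points, radius):
    """Items lying inside the hyper-sphere centered on the query."""
    return {i for i, p in points.items() if math.dist(query, p) <= radius}


# Hypothetical data: four points on a line, connected as a chain.
points = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (3.0, 0.0), 3: (10.0, 0.0)}
graph = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
query = (1.5, 0.0)

nearest, radius = search_radius(query, points, graph)
print(nearest, sorted(candidates(query, points, radius)))  # prints 1 [0, 1, 2]
```

Here the nearest item is 1 (δ₁ = 0.5), its farthest neighbor is 2 (δ₂ = 2), and the radius is about 2.75, so the distant item 3 is excluded from the candidates. Each of the two functions scans the whole set once, which matches the two linear passes counted in the complexity expression.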
This complexity includes the two stages described previously. The term $O(2n)$ corresponds to the search for the radius of the hyper-sphere and the search for the neighbors that lie in this hyper-sphere. The second term corresponds to the time needed for the effective update of the neighborhood relations between the real neighbors, which is very small given the number of candidates returned. This complexity is the maximum complexity and can be optimized in several ways. The most obvious way is to use a fast nearest neighbor algorithm.

4 Experiments and results

The interest of neighborhood graphs for content-based information retrieval is shown and discussed in [SCSZ04]. The comparison tests done with several different configurations of k-NN were interesting and showed the utility of these structures in this field. In this section we are interested in two different tests: testing the validity of the results obtained with the suggested method, and testing the response times.

We used an image database [NNM96]. This database contains 7200 images representing 100 different objects taken from various views. To perform our experiments, we pre-process the database by applying some image processing algorithms to extract some characteristics. We use in particular color, texture and shape features. Each image is represented as a vector of features and the graph is built on the vectors obtained. Queries are expressed in the same way, by a vector of features. So, each image is represented as a point in a multidimensional space.

Our aim in performing these experiments is to show that our method gives the same results as those obtained by building the graph with all items, and to make sure that we neither include wrong neighbors of the query item nor miss good ones. To check the validity of the obtained results, we need a reference graph.
For that, we take the whole image database (vectors of descriptors) and we build a relative neighborhood graph. After building the reference graph, we can start the validity checking tests. We arbitrarily take an item from the dataset and we build a new relative neighborhood graph on the remaining items (the n-1 items). After that, we insert the item taken beforehand from the dataset into the graph by using the proposed method. The neighbors of the inserted point are compared with the neighbors of this same item in the reference graph. This operation is repeated several times, taking a different item at each iteration.

Figure 2 summarizes the results obtained in different situations. The first experiment constitutes our reference. Experiments 2, 4 and 6 were carried out with one item removed, and experiments 3, 5 and 7 after the insertion of the removed item in the corresponding previous experiment. After the insertion of the removed item, we always recover the recall of the reference graph, which means that the method finds the correct neighbors of the inserted item in each experiment. Figure 4 gives a visual illustration of the result obtained using the neighborhood graph indexing technique.

The validity of the obtained results being established, we are now interested in the response times of the method, i.e., the time that the method takes to insert a query item in an existing structure. We use the same protocol as for the validity checking, but instead of collecting the result items, we take into consideration only the response time of each query. We used for these experiments a machine with an Intel Pentium 4 processor (2.80 GHz) and 512 MB of memory. The response times for 10 items, always taken arbitrarily from the same dataset, are shown in Figure 3. The response times (expressed in milliseconds) are satisfactory given the size of the dataset used.
They vary from one item to another, mainly because the number of candidate items used for neighborhood determination differs at each iteration. Generally speaking, the insertion of a new item in the graph using our method requires only 50 milliseconds on average, whereas the standard algorithm requires at least 40 minutes under the same conditions.

Fig. 2. Recall measured for 3 local insertions

Fig. 3. Execution time variations of 10 local insertions

Fig. 4. Neighborhood obtained using a standard algorithm with the whole set of images

5 Conclusion and future work

Content-based image retrieval is a complex task, primarily because of the nature of image data and the subjectivity attached to their interpretation. The use of an adequate multimedia index structure is important. In this paper we considered neighborhood graphs as an indexing structure and we proposed a method for locally updating them, with the aim of improving an existing approach. The experiments performed on different datasets show the effectiveness and the utility of the proposed method. As future work, we plan to integrate this approach and its application into more image-related functions, such as automatic annotation and classification.

References

[FBF94] Christos Faloutsos, Ron Barber, Myron Flickner, Jim Hafner, Wayne Niblack, Dragutin Petkovic, and William Equitz. Efficient and effective querying by image content. J. Intell. Inf. Syst., 3(3/4):231–262, 1994.
[FH51] E. Fix and J. L. Hodges. Discriminatory analysis - non parametric discrimination: consistency properties. Technical Report 21-49-004, USAF School of Aviation Medicine, Randolph Field, Texas, 1951.
[GS69] K. R. Gabriel and R. R. Sokal. A new statistical approach to geographic variation analysis. Systematic Zoology, 18:259–278, 1969.
[Kat88] J. Katajainen. The region approach for computing relative neighborhood graphs in the lp metric. Computing, 40:147–161, 1988.
[Mit97] Tom M. Mitchell. Machine learning meets natural language. In EPIA, page 391, 1997.
[NNM96] Sameer A. Nene, S. K. Nayar, and H. Murase. Columbia object image library (COIL-100). Technical Report CUCS-006-96, February 1996.
[PS85] F. Preparata and M. I. Shamos. Computational Geometry: An Introduction. Springer-Verlag, New York, 1985.
[SCSZ04] M. Scuturici, J. Clech, V. M. Scuturici, and D. Zighed. Topological representation model for image databases query. Journal of Experimental and Theoretical Artificial Intelligence (JETAI), pages 145–160, 2004.
[Smi89] W. D. Smith. Studies in computational geometry motivated by mesh generation. PhD thesis, Princeton University, 1989.
[Tou80] G. T. Toussaint. The relative neighborhood graphs in a finite planar set. Pattern Recognition, 12:261–268, 1980.
[Tou91] G. T. Toussaint. Some unsolved problems on proximity graphs. In D. W. Dearholt and F. Harary, editors, Proceedings of the first workshop on proximity graphs. Memoranda in computer and cognitive science MCCS-91-224. Computing Research Laboratory, New Mexico State University, Las Cruces, 1991.
[VT00] Remco C. Veltkamp and M. Tanase. Content-based image retrieval systems: A survey. Technical Report UU-CS-2000-34, Department of Computing Science, Utrecht University, 2000.