From: FLAIRS-02 Proceedings. Copyright © 2002, AAAI (www.aaai.org). All rights reserved.

High Expressive Spatio-temporal Relations

R. CHBEIR, Y. AMGHAR, A. FLORY
LISI – INSA de Lyon
20, Avenue A. Einstein, F-69621 VILLEURBANNE - FRANCE
Phone: (+33) 4 72 43 85 88 - Fax: (+33) 4 72 43 87 13
E-mail: {rchbeir, amghar, flory}@lisi.insa-lyon.fr

Abstract
The management of images remains a complex task that currently motivates a number of research efforts in various domains such as GIS, medicine, etc. In this work, we are interested in the problem of image retrieval, whose difficulty is mainly related to the complexity of image representation. In this paper, we address the problem of relation representation and propose a novel method for identifying and indexing relations between salient objects. Our method is based on defining several intersection sets for each feature, such as shape, time, etc. Relations are then computed using a binary intersection matrix between these sets. Several types of relations can be calculated with our method, in particular spatial and temporal relations. We show how our method provides a highly expressive and homogeneous representation of relations in comparison to traditional methods.

Introduction
During the last decade, a lot of work has been done in information technology in order to integrate image retrieval into standard data-processing environments. Image retrieval is involved in several domains [(Yoshitaka and Ichikawa 1999), (Rui, Huang, and Chang 1999), (Grosky 1997), (Smeulders, Gevers, and Kersten 1998)] such as Geographic Information Systems, medicine, surveillance, etc., where query criteria are based on different types of features. In particular, spatio-temporal relations are very important in image retrieval and need to be considered in several domains.
In medicine, for instance, spatial data is decisive in the surgical or radiation therapy of brain tumors, because the location and the temporal evolution of a tumor have profound implications for the therapeutic decision (Chbeir and Favetta 2000). Hence, it is crucial to provide a precise and powerful system to express the spatio-temporal relations required in image retrieval. In this paper, we address this issue and propose a novel method that can easily compute several types of relations with better expressiveness. The rest of this paper is organized as follows. In section 2, we present the related work. Section 3 defines our method for calculating relations. Finally, conclusions and future orientations are given in section 4.

Related Work
Relations between salient objects, shapes, points of interest, etc. have been widely used in image indexing, for example in the R-tree (Guttman 1984), R+-tree (Sellis, Roussopoulos, and Faloutsos 1987), R*-tree (Beckmann 1990), hB-tree (Lomet and Salzberg 1990), SS-tree (White and Jain 1996), TV-tree (Lin, Jagadish, and Faloutsos 1994), 2D-String (Chang, Shi, and Yan 1987), 2D-G String (Chang and Jungert 1991), 2D-C+ String (Huang and Jean 1994), θR-String (Gudivada 1995), σ-tree (Chang and Jungert 1997), etc. Temporal and spatial relations are the relations most used in image indexing. To calculate temporal relations, the Allen relations (Allen 1983) are often used. Allen proposes 13 temporal relations (Before, After, Meets, Met By, Overlaps, Overlapped By, Starts, Started By, Finishes, Finished By, Contains, During, and Equals), of which 6 pairs are converses of each other. On the other hand, three major types of spatial relations are generally proposed in image representation (Egenhofer, Frank, and Jackson 1989):
Metric relations: measure the distance between salient objects (Peuquet 1986).
For instance, the metric relation “far” between two objects A and B indicates that each pair of points Ai and Bj has a distance greater than a certain value δ.
Directional relations: describe the order between two salient objects according to a direction, or the localisation of a salient object inside images (El-kwae and Kabuka 1999). In the literature, fourteen directional relations are considered: north, south, east, west, north-east, north-west, south-east, south-west, left, right, up, down, front, and behind.
Topological relations: describe the intersection and the incidence between objects [(Egenhofer and Herring 1991), (Egenhofer 1997)]. Egenhofer and Herring (1991) identified six basic relations: Disjoint, Touch, Overlap, Cover, Contain, and Equal.
In spite of all the work proposed to represent complex visual situations, the existing methods of spatial relation computation have several shortcomings. For instance, Figure 1 shows two different spatial situations of three salient objects that are described by the same spatial relations in both cases, in particular the topological relations a1 Touch a2, a1 Touch a3, a2 Touch a3, and the directional relations a1 Above a3, a2 Above a3, a1 Left a2.

Figure 1: Two different spatial situations.

In this paper, we present our method, which is able to provide a high expressive power to represent several types of relations, in particular spatial and temporal ones.

Proposition
Our proposal represents an extension of the 9-Intersection model (Egenhofer and Herring 1991). It provides a general method for computing not only topological relations but also other types of relations, in particular temporal and spatial relations, with higher precision. The idea is to identify relations as a function of two values of a feature such as shape, position, time, etc. The shape feature gives spatial relations, the time feature gives temporal relations, and so on. To identify a relation between two values of the same feature, we propose the use of an intersection matrix between several sets defined for each feature.

Definition
Let us first consider a feature F. We define its intersection sets as follows:
The interior FΩ: contains all elements that cover the interior or the core of F. In particular, it contains the barycentre of F. The definition of this set has a great impact on the other sets. FΩ may be empty (∅).
The boundary ∂F: contains all elements that allow determining the frontier of F. ∂F is never empty (¬∅).
The exterior F⁻: is the complement of FΩ∪∂F. It contains at least the two elements ⊥ (the minimum value) and ∞ (the maximum value). F⁻ can be divided into several disjoint subsets; this decomposition depends on the number of feature dimensions.
For instance, if we consider a feature of one dimension i (such as the acquisition time of a frame in a video), two intersection subsets can be defined (Figure 2):
Fi← (or inferior): contains the elements of F⁻ that do not belong to any other intersection set and are inferior to the ∂F elements on dimension i.
Fi→ (or superior): contains the elements of F⁻ that do not belong to any other intersection set and are superior to the ∂F elements on dimension i.

Figure 2: Intersection sets of a one-dimensional feature.

If we consider a feature of 2 dimensions i and j (such as the shape of a salient object in a 2D space), we can define 4 intersection subsets:
Fi←∩Fj← (or inferior): contains the elements of F⁻ that do not belong to any other intersection set and are inferior to the FΩ and ∂F elements on dimensions i and j.
Fi←∩Fj→: contains the elements of F⁻ that do not belong to any other intersection set and are inferior to the FΩ and ∂F elements on dimension i, and superior to them on dimension j.
Fi→∩Fj←: contains the elements of F⁻ that do not belong to any other intersection set and are superior to the FΩ and ∂F elements on dimension i, and inferior to them on dimension j.
Fi→∩Fj→ (or superior): contains the elements of F⁻ that do not belong to any other intersection set and are superior to the FΩ and ∂F elements on dimensions i and j.
More generally, an n-dimensional feature determines 2^n exterior intersection subsets. In addition, we use a tolerance degree in the definition of the feature intersection sets in order to represent separations between sets and to provide a simple certainty parameter for the end-user. For this purpose, we use two tolerance thresholds:
Internal threshold εi: defines the distance between FΩ and ∂F.
External threshold εe: defines the distance between the subsets of F⁻.
To calculate the relation between two feature values, we establish an intersection matrix of their corresponding intersection sets. Matrix cells have binary values: 0 whenever the intersection between the sets is empty, and 1 otherwise.
For a one-dimensional feature (such as the acquisition time), the following intersection matrix is used to compute relations between two values A and B:

           | AΩ∩BΩ   ∂A∩BΩ   A←∩BΩ   A→∩BΩ |
R(A, B) =  | AΩ∩∂B   ∂A∩∂B   A←∩∂B   A→∩∂B |
           | AΩ∩B←   ∂A∩B←   A←∩B←   A→∩B← |
           | AΩ∩B→   ∂A∩B→   A←∩B→   A→∩B→ |

Figure 3: Intersection matrix of two values A and B on the basis of a one-dimensional feature.
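As an illustration, the one-dimensional decomposition and its binary intersection matrix can be sketched in Python. This is our own sketch under simplifying assumptions, not code from the paper: feature values are modelled as intervals on a discrete integer axis, and the `lo`/`hi` parameters stand in for the minimum (⊥) and maximum (∞) elements of the exterior.

```python
# Sketch of the 1D intersection-set model: a value is split into an interior,
# a boundary, and two exterior subsets (inferior/superior), separated by the
# tolerance thresholds eps_i (interior/boundary) and eps_e (boundary/exterior).

def intersection_sets(start, end, eps_i=1, eps_e=1, lo=0, hi=100):
    """Return [F_Omega, dF, F_inferior, F_superior] for an interval value."""
    interior = set(range(start + eps_i, end - eps_i + 1))  # F_Omega
    boundary = {start, end}                                # dF: the endpoints
    inferior = set(range(lo, start - eps_e))               # strictly below start - eps_e
    superior = set(range(end + eps_e + 1, hi + 1))         # strictly above end + eps_e
    return [interior, boundary, inferior, superior]

def relation(a, b):
    """4x4 binary matrix: cell (i, j) is 1 iff set i of A intersects set j of B."""
    sa, sb = intersection_sets(*a), intersection_sets(*b)
    return [[int(bool(x & y)) for y in sb] for x in sa]

# Disjoint intervals ("before") and meeting intervals yield distinct matrices,
# and a value compared with itself gives the identity matrix ("equal").
before = relation((10, 20), (50, 60))
meets = relation((10, 20), (20, 30))
equal = relation((10, 20), (10, 20))
```

Note that the exterior gap of width εe around each boundary belongs to no set, which is what lets the matrix separate a clean "meets" from a "before"; the same machinery, applied per dimension, yields the larger matrices below.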
For a two-dimensional feature (such as the shape) and two values A and B, we obtain a 6×6 intersection matrix. Each value X is described by six sets, Sets(X) = {XΩ, ∂X, X1←∩X2←, X1←∩X2→, X1→∩X2←, X1→∩X2→}, and cell (i, j) of the matrix holds the binary intersection of the i-th set of A with the j-th set of B (e.g. AΩ∩BΩ, ∂A∩B1←∩B2←, A1→∩A2←∩∂B, and so on).

Figure 4: Intersection matrix of two values A and B on the basis of a two-dimensional feature.

Temporal relations
To calculate temporal relations, we define the following intersection sets for each interval¹ T (Figure 5):
TΩ: represents the interval T without its initial instant ti and final instant tf. It is situated between ti+εi and tf−εi.
∂T: contains both the initial instant ti and the final instant tf.
εi: represents the internal tolerance threshold that separates TΩ from ∂T.
εe: represents the external tolerance threshold that separates ∂T from both T← and T→.
T←: is the set of values strictly anterior to ti−εe.
T→: is the set of values strictly posterior to tf+εe.

Figure 5: Graphical representation of a temporal interval T.

Using our method, 33 (32+1) temporal relations between two intervals are identified instead of 13 (Allen 1983). We show below 7 of these principal (non-symmetrical) temporal relations between two intervals ∆T1 and ∆T2, each characterised by its binary intersection matrix:
TR1: the interval ∆T1 starts and ends before ∆T2.
TR2: the interval ∆T1 starts before ∆T2 and ends almost before ∆T2.
TR3: the interval ∆T1 ends when ∆T2 starts.
TR4: the interval ∆T1 finishes almost after the beginning of ∆T2.
TR5: the interval ∆T1 starts before the beginning of ∆T2 and finishes during ∆T2.
TR6: the interval ∆T1 starts before the beginning of ∆T2 and finishes almost before the end of ∆T2.
TR7: the interval ∆T1 starts almost before ∆T2 and finishes during ∆T2.
¹ In this paper, we do not consider intersection sets between instants. Our method also identifies 5 temporal relations between instants instead of the 3 traditional ones.
This shows that our method provides a higher expressive power than traditional methods.

Spatial relations
In this paper, we consider only spatial relations calculated between polygonal shapes. The intersection sets of a polygon P are defined as follows (Figure 6):
PΩ: represents the surface of P. It contains all points situated in the internal part of P.
∂P: is the frontier of P.
Px←∩Py←: contains all points that belong neither to PΩ nor to ∂P and whose coordinates are inferior to the barycentre's.
Px→∩Py→: contains all points that belong neither to PΩ nor to ∂P and whose coordinates are superior to the barycentre's.
Px←∩Py→: contains all points that belong neither to PΩ nor to ∂P and whose abscissas are strictly inferior and ordinates strictly superior to the barycentre's.
Px→∩Py←: contains all points that belong neither to PΩ nor to ∂P and whose abscissas are strictly superior and ordinates strictly inferior to the barycentre's.
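The polygonal decomposition can be sketched in Python as well. This is our own simplification, not the paper's code: the "polygons" here are axis-aligned rectangles on an integer grid, and the exterior is split into four quadrant sets by strict comparison with the barycentre, so grid points lying exactly on the barycentre's axes remain unassigned, mirroring the strict inequalities in the definitions.

```python
# Sketch of the 2D intersection sets of a shape and the resulting binary
# spatial relation (6 sets per shape -> a 6x6 matrix). Rectangles stand in
# for general polygons purely for brevity.

def polygon_sets(x1, y1, x2, y2, lo=0, hi=40):
    """Return [P_Omega, dP, and the four exterior quadrant sets]."""
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2                       # barycentre
    interior = {(x, y) for x in range(x1 + 1, x2)
                for y in range(y1 + 1, y2)}                     # P_Omega
    boundary = {(x, y) for x in range(x1, x2 + 1)
                for y in range(y1, y2 + 1)} - interior          # dP: the perimeter
    exterior = {(x, y) for x in range(lo, hi)
                for y in range(lo, hi)} - interior - boundary
    return [interior, boundary,
            {p for p in exterior if p[0] < cx and p[1] < cy},   # Px< and Py<
            {p for p in exterior if p[0] < cx and p[1] > cy},   # Px< and Py>
            {p for p in exterior if p[0] > cx and p[1] < cy},   # Px> and Py<
            {p for p in exterior if p[0] > cx and p[1] > cy}]   # Px> and Py>

def spatial_relation(a, b):
    """6x6 binary matrix over the six sets of each shape."""
    sa, sb = polygon_sets(*a), polygon_sets(*b)
    return [[int(bool(x & y)) for y in sb] for x in sa]

# Touching rectangles share boundary points; disjoint ones do not, so a single
# matrix captures both topological and directional information.
touching = spatial_relation((5, 5, 15, 15), (15, 5, 25, 15))
disjoint = spatial_relation((5, 5, 15, 15), (20, 5, 30, 15))
```

In this sketch the boundary-boundary cell distinguishes Touch from Disjoint, while the quadrant cells encode direction, which is how two situations with identical topological and directional descriptions can still yield different matrices.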
Figure 6: Intersection sets of a polygonal shape.

Using our method, we are able to provide a high expressive power to spatial relations that can be applied to describe images and to formulate complex visual queries in several domains. For example, for Figure 1, which shows two different spatial situations between three salient objects a1, a2, and a3, our method expresses the spatial relations as shown in Figure 7. The relations R(a1, a2) and R'(a1, a2) are equal, but the relations R(a1, a3) and R'(a1, a3) are clearly distinguished. Similarly, we can express the relations between a2 and a3 in both situations (Figure 7).

Figure 7: Identified spatial relations using our method.

Moreover, our method allows combining both directional and topological relations into one binary relation, which is very important for indexing purposes: there are no longer separate directional and topological relations between two salient objects, but only one spatial relation. Hence, we can propose a 1D-String to index images instead of 2D-Strings (Chang, Shi, and Yan 1987).

Discussion
Using our method, we can provide other types of relations. For instance, we can calculate spatio-temporal relations in one representation. For that, we define a new feature with 3 dimensions: 1 dimension for the time representation and 2 dimensions for the 2D shape. Its intersection sets are defined exactly as for the time and shape features. We obtain 10 (2+2^3) intersection sets, which leads to a 10×10 intersection matrix for calculating the spatio-temporal relation between two features.

Conclusion
In this paper, we presented our method to identify relations in image representation.
We used spatial and temporal relations as a support to show how relations can be powerfully expressed within our method. This work aims to homogenize, reduce, and optimize the representation of relations. We are currently experimenting with its integration into our prototype MIMS (Chbeir, Amghar, and Flory 2001). However, in order to study its efficiency and limits, our method requires more intensive experiments in complex environments where a great number of feature dimensions and salient objects exist. Furthermore, we will examine it on other types of media, such as video.

References
Allen J.F., “Maintaining knowledge about temporal Intervals”, Communications of the ACM, 1983, Vol. 26, N° 11, P. 832-843.
Beckmann N., “The R*-tree: an efficient and robust access method for points and rectangles”, SIGMOD Record, 1990, Vol. 19, N° 2, P. 322-331.
Chang S.K., Jungert E., “Pictorial data management based upon the theory of symbolic projections”, Journal of Visual Languages and Computing, 1991, Vol. 2, N° 3, P. 195-215.
Chang S.K., Jungert E., “Human- and System-Directed Fusion of Multimedia and Multimodal Information using the Sigma-Tree Data Model”, Proceedings of the 2nd International Conference on Visual Information Systems, 1997, San Diego, P. 21-28.
Chang S.K., Shi Q.Y., Yan C.W., “Iconic Indexing by 2-D Strings”, IEEE Transactions on Pattern Analysis and Machine Intelligence, 1987, Vol. PAMI-9, N° 3, P. 413-428.
Chbeir R., Amghar Y., Flory A., “A Prototype for Medical Image Retrieval”, International Journal of Methods of Information in Medicine, Schattauer, Issue 3, 2001, P. 178-184.
Chbeir R. and Favetta F., “A Global Description of Medical Image with a High Precision”, IEEE International Symposium on Bio-Informatics and Biomedical Engineering (IEEE-BIBE'2000), Washington D.C., USA, November 8-10, 2000, P. 289-296.
Egenhofer M.
and Herring J., “Categorising Binary Topological Relationships between Regions, Lines, and Points in Geographic Databases, A Framework for the Definition of Topological Relationships and an Algebraic Approach to Spatial Reasoning within this Framework”, Technical Report 91-7, National Center for Geographic Information and Analysis, University of Maine, Orono, 1991.
Egenhofer M., “Query Processing in Spatial Query By Sketch”, Journal of Visual Languages and Computing, Vol. 8, N° 4, 1997, P. 403-424.
Egenhofer M., Frank A., and Jackson J., “A Topological Data Model for Spatial Databases”, Symposium on the Design and Implementation of Large Spatial Databases, Santa Barbara, CA, Lecture Notes in Computer Science, July 1989, Vol. 409, P. 271-286.
El-kwae M.A. and Kabuka M.R., “A Robust Framework for Content-Based Retrieval by Spatial Similarity in Image Databases”, ACM Transactions on Information Systems, Vol. 17, N° 2, April 1999, P. 174-198.
Grosky W.I., “Managing Multimedia Information in Database Systems”, Communications of the ACM, 1997, Vol. 40, N° 12, P. 72-80.
Gudivada V.N., “Spatial similarity measures for multimedia applications”, Proceedings of SPIE Storage and Retrieval for Still Image and Video Databases III, 1995, San Diego/La Jolla, CA, USA, Vol. 2420, P. 363-372.
Guttman A., “R-trees: a dynamic index structure for spatial searching”, SIGMOD Record, 1984, Vol. 14, N° 2, P. 47-57.
Huang P.W. and Jean Y.R., “Using 2D C+-Strings as spatial knowledge representation for image database management systems”, Pattern Recognition, 1994, Vol. 27, N° 9, P. 1249-1257.
Lin K.I., Jagadish H.V., Faloutsos C., “The TV-tree: An index structure for high dimensional data”, Journal of Very Large DataBases, 1994, Vol. 3, N° 4, P. 517-549.
Lomet D.B., Salzberg B., “The hB-tree: A multiattribute indexing method with good guaranteed performance”, ACM Transactions on Database Systems, 1990, Vol. 15, N° 4, P. 625-658.
Peuquet D.
J., “The use of spatial relationships to aid spatial database retrieval”, Proc. Second Int. Symp. on Spatial Data Handling, Seattle, 1986, P. 459-471.
Rui Y., Huang T.S., and Chang S.F., “Image Retrieval: Past, Present, and Future”, Journal of Visual Communication and Image Representation, 1999, Vol. 10, P. 1-23.
Sellis T., Roussopoulos N., Faloutsos C., “The R+-tree: a dynamic index for multidimensional objects”, Proceedings of the 13th International Conference on VLDB, 1987, Brighton, England, P. 507-518.
Smeulders A.W.M., Gevers T., Kersten M.L., “Crossing the Divide Between Computer Vision and Databases in Search of Image Databases”, Visual Database Systems Conf., Italy, 1998, P. 223-239.
White D.A., Jain R., “Similarity Indexing with the SS-tree”, Proceedings of the 12th International Conference on Data Engineering, 1996, New Orleans, Louisiana, P. 516-523.
Yoshitaka A. and Ichikawa T., “A Survey on Content-Based Retrieval for Multimedia Databases”, IEEE Transactions on Knowledge and Data Engineering, Vol. 11, N° 1, 1999, P. 81-93.