Novel Hybrid Method for Image Retrieval by Ontological Descriptions of Sub-Regions

OLEG STAROSTENKO1, ALBERTO CHÁVEZ-ARAGÓN1, MA. A. MEDINA1, ALFRED ZEHE2, GENNADIY BURLAK3
1 Research Center CENTIA, Computer Science Department, Universidad de las Américas-Puebla, Cholula, Puebla, C.P. 72820, MEXICO
2 Facultad de Cs. Físico-Matemáticas, Puebla, MEXICO
3 Autonomous State University of Morelos, MEXICO

Abstract: - This paper presents a new hybrid method of visual information retrieval, which combines low-level image analysis techniques, such as a colour model and a principal corner detection approach, with automatic indexing of the objects in an image by their textual descriptions. The colour and principal corner approaches have been selected for indexing the objects because they are invariant to rotation, scale, and illumination changes. The principal goal of the proposed method is to integrate low-level image analysis with user-oriented descriptions, which provides more complete retrieval and accelerates convergence to the expected result. To define the image semantics, the method uses ontological annotations of specific sub-regions obtained by the CORPAI algorithm. Visual information retrieval is performed by comparing the textual descriptions generated by the designed automatic facilities with input keywords selected by the user. To implement the concepts of ontological annotation, an image retrieval tool (the IRONS system) has been designed and evaluated, using the Resource Description Framework language to establish machine-readable semantics.

Key-Words: - web based system, image retrieval, object indexing, semantics, ontology

1 Introduction
Currently, the development of high-quality digital image processing services is an important part of providing access to knowledge and information within large communities.
The information contained in huge repositories such as the Internet or digital libraries grows exponentially. A great part of this information consists of images, which cannot be queried and retrieved efficiently by common methods of textual annotation unless those images have been classified beforehand. The most common way to classify images relies on textual queries (traditional search engines operate on keywords); however, maintaining and managing the queries for image retrieval is a very time-consuming activity. The typical approach to automatic indexing of images, as a principal part of the retrieval process, is based on the analysis of low-level image characteristics such as colour, texture or shape [1], [2]. Consequently, systems of this type do not provide the semantics associated with the content of each image. Traditional image retrieval systems therefore face many problems; for example, they cannot avoid retrieving nonsensical information, and so they do not support the search for what the user is really looking for.

There are several reports on recent research in the area of image information retrieval; among the well-known systems are QBIC, VisualSeek, Amore and SQUID. The Query by Image Content system (QBIC) was developed at the IBM Almaden Research Centre. It provides retrieval of images, graphics and video data from online collections. This system uses image features such as colour, texture, and shape to compute the similarity between images [3]. QBIC allows queries to be made by means of examples, sketches, and SQL predicates. VisualSeek is a web-based system in which the user requests an image by describing the spatial arrangement of colour regions [4]. The matching of images in VisualSeek depends on the arrangement of similar colour regions. VisualSeek was proposed by the Image and ATV Lab of Columbia University. The AMORE system (Advanced Multimedia Retrieval Engine) provides image retrieval on the web.
In the AMORE system (developed by NEC USA Inc), queries can be formed by keywords, by specifying a similar image, or by a combination of both [5]. The SQUID system, designed at the Centre for Vision, Speech, and Signal Processing at the University of Surrey, provides image retrieval by analysis of shape similarity. It allows a shape to be submitted as a query for the requested objects [6]. These systems perform well; however, they do not provide a specific way to represent the meaning of the objects in the images. This is why they often produce nonsensical results.

One way to solve these problems is to apply Artificial Intelligence methods whose original purpose is to facilitate knowledge sharing and reuse. Such methods promote the development of machine-understandable semantics for the representation and exchange of information between different software and human agents. This alternative is known as the ontology approach [7]. There are many definitions of ontology, but the most widely accepted is Gruber's [8], which establishes that an ontology is a formal, explicit specification of a shared conceptualisation. Ontological annotations can be expressed mathematically and, as a result, they are machine readable and understandable. In image retrieval applications, an ontology describes the semantics, establishes a common and shared understanding of a domain (in our case, the domain is an image), and facilitates the implementation of a user-oriented vocabulary of terms and their relationships with the objects in the image.

The proposed image retrieval method relies on two main modules: a module for sub-region extraction, based on traditional analysis of colour and of the principal corners of the objects within the image, and a tool for indexing the objects in the image using their meaning, expressed by textual annotations of the extracted sub-regions.
The relationship between these extracted sub-regions is defined by textual ontological annotations supporting the different semantic concepts.

The possible applications of our research on image retrieval facilities are: systems supporting digital image processing services; software that provides high-performance exchange of multimedia material for distributed collaborative and learning environments, distance education, and digital libraries; information systems for retrieval, processing and distribution of multimedia data within distributed environments; and image-based navigators.

2 Proposed Image Retrieval Method
The principal goal of the proposed method is to combine general descriptors, based on low-level image processing for extraction of sub-regions invariant to changes due to rotation and illumination, with the application of ontology concepts for definition of the semantic aspects of the image. The relationship between the image content and its descriptors consists in the specification of textual annotations for those image sub-regions that have the greatest semantic weight according to the particular user's criteria. Visual information retrieval is performed by comparing the textual annotations generated by the proposed automatic tool with input keywords selected by the user. If the degree of similarity is acceptable, the image is retrieved. The proposed method consists of the following four procedures.

1. In order to divide the image into regions (objects), the luminance of the pixels is analysed by applying the SUSAN principal corner detection method, which is based on estimation of the Univalue Segment Assimilating Nucleus (USAN) [9], [10]. Each point in the input image is used as the nucleus of a circular mask. The best digital approximation for calculating the mask value is Gaussian weighting, because it is smoother and more stable than a square similarity function.
Using the equation

$$C(r, r_0) = \exp\left(-\left(\frac{I(r) - I(r_0)}{t}\right)^{6}\right) \qquad (1)$$

the brightness function is calculated for each pixel under the mask. In the equation, I(r0) and I(r) represent the brightness at the nucleus position and the brightness of any other point within the mask, respectively; t is the brightness difference threshold, which defines the minimum contrast between edges or corners and the image background. The number of estimated corners depends directly on the value of t. The sixth power in the equation is a theoretical optimum, which gives a good balance between the stability and the sensitivity of the method. The value of the USAN area n is then calculated as the sum of the computed C(r, r0) functions. For the detection of corners, the USAN area property is used: the estimated areas are compared with the threshold g < nmax/2, where nmax is the maximum area of the circular mask. The threshold g, which can be modified by the user, defines the quality and quantity of the detected corners. The principal corners extracted by this algorithm are points that have certain features, such as high contrast and specific positions on the objects in the image.

2. In the next step, the original image is spatially sampled in order to reduce the amount of information being processed. The I1I2I3 colour model is used [11]. The image is divided into small windows, for example of 8×8 pixels, and the average brightness value of the pixels in each window is computed. This value is then assigned to a single pixel in the new, smaller image. A set of pixels with similar colour is interpreted as a particular region (seed image) and is represented by a feature vector. The main descriptor of each region is generated from a set of colour parameters and the number of detected principal corners.

3. In contrast to well-known prototypes, where the description is applied to the whole image, the proposed approach uses textual annotations of sub-regions.
This allows simple identification of separate objects, definition of their semantic characteristics, and their interpretation using the ontology. The subdivision of the image into sub-regions is performed by the convex regions pre-processing algorithm in images (CORPAI) proposed by the authors [12]. The detected principal corners are used as characteristic points for generation of the convex hull. If a pixel lies inside the convex hull, it belongs to the analysed object. In order to determine which pixels are inside a convex hull sub-region, CORPAI uses the vertical slabs algorithm, which draws vertical lines through all hull sub-regions and stores the x-coordinates in a sorted array. A binary search in the sorted array then determines whether a pixel is inside or outside the hull sub-region. The pseudo code of the algorithm for convex hull extraction is shown below.

Input: a set of points (principal corners) represented by points[ ] and the original image
Output: the image divided into sub-regions
ConvexHulls(points[])    // compute the convex hull
MakeVerticalSlabs( )     // make a data structure to perform point location queries
While(image[][]) {
    if( query_sub-region(image[][]))    // apply operator to the sub-region
        operator(image[][])
}

4. Finally, it is possible to establish the relationship between the objects contained in a specific sub-region of the image and their formal, explicit definition. In this way, the meaning of the image may be obtained in textual form as a set of annotations, one for each sub-region. The annotations of each sub-region are related to a particular ontology, which allows the image to be interpreted by analysing the textual descriptions of its objects. There are several languages and tools that support ontology management. One of them is the Resource Description Framework (RDF), which defines a syntactic convention and a simple data model for implementing machine-readable semantics [13].
Using RDF, it is possible to describe each web resource in terms of object-attribute-value relations, based on the metadata standard developed by the World Wide Web Consortium. RDF has been used in important applications such as Netscape 6 and in Dublin Core [14].

3 Problem Solution
The block diagram of the proposed system is shown in Fig. 1. The input of the system is an image or a keyword that describes the object being sought in digital repositories. The retrieved images are those that are most similar to the low-level features of the query and that have a high degree of matching with the ontological annotations defining the content of the image.

Fig. 1. Block diagram of the proposed system.

The four steps of the proposed method described in Section 2 may be implemented by the following two algorithms. First, the similarity between the query image and the images stored in the image database is computed, taking into account the textual descriptions linked to the different convex sub-regions of the images.

Proposed algorithm for indexing the images using colour and principal corners:
Input: a colour image Ic, represented by the luminance characteristics of its pixels
Output: the feature vectors
1. Ig ← ComputeLuminanceInformation(using Ic) // converts the original colour image into a grey level image
2. Principal corners ← SUSAN operator (Ig) // detection of principal corners of the objects in the image
3. Scs ← SpatialSampling(Ic) // reduction of the colour image to a small colour image by computing the average brightness for each window of 8×8 pixels
4. ColourDescriptor ← ComputeColourDescriptor I1I2I3 (Scs) // generation of the colour descriptors based on the I1I2I3 colour system model
5. FeaturesVector ← ComputeDescriptor (Principal Corners, ColourDescriptor) // the sub-region descriptor (feature vector for the sub-region) is obtained as a combination of the colour vector and the positions of the principal corners
The feature vector is stored in the database.
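The indexing steps above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the mask radius, the values of t and of the threshold fraction, and the final descriptor layout (mean I1, I2, I3 of the sampled image plus the number of corners) are assumptions chosen for illustration. The I1I2I3 components follow the usual definition I1 = (R+G+B)/3, I2 = (R-B)/2, I3 = (2G-R-B)/4, and spatial sampling is assumed to average each colour channel over the window.

```python
import math

def luminance(rgb):
    """Grey-level image from rows of (R, G, B) tuples, using I1 = (R + G + B) / 3."""
    return [[(r + g + b) / 3.0 for (r, g, b) in row] for row in rgb]

def usan_area(grey, y0, x0, radius=3, t=25.0):
    """USAN area n: sum of C(r, r0) = exp(-((I(r) - I(r0)) / t) ** 6) over a circular mask."""
    n = 0.0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            if dy * dy + dx * dx <= radius * radius:
                diff = grey[y0 + dy][x0 + dx] - grey[y0][x0]
                n += math.exp(-((diff / t) ** 6))
    return n

def principal_corners(grey, radius=3, t=25.0, g_fraction=0.45):
    """Pixels whose USAN area is below g = g_fraction * n_max (assumed value, below n_max / 2).
    Pixels closer than `radius` to the border are skipped so the mask is never truncated."""
    n_max = sum(1 for dy in range(-radius, radius + 1)
                for dx in range(-radius, radius + 1)
                if dy * dy + dx * dx <= radius * radius)
    g = g_fraction * n_max
    h, w = len(grey), len(grey[0])
    return [(y, x)
            for y in range(radius, h - radius)
            for x in range(radius, w - radius)
            if usan_area(grey, y, x, radius, t) < g]

def spatial_sampling(rgb, block=8):
    """Replace every block x block window by one pixel holding the per-channel average."""
    small = []
    for by in range(len(rgb) // block):
        row = []
        for bx in range(len(rgb[0]) // block):
            pixels = [rgb[by * block + dy][bx * block + dx]
                      for dy in range(block) for dx in range(block)]
            row.append(tuple(sum(p[c] for p in pixels) / len(pixels) for c in range(3)))
        small.append(row)
    return small

def i1i2i3(pixel):
    """Map one (R, G, B) pixel to the I1I2I3 colour model."""
    r, g, b = pixel
    return ((r + g + b) / 3.0, (r - b) / 2.0, (2.0 * g - r - b) / 4.0)

def index_image(rgb, block=8):
    """Assumed descriptor layout: [mean I1, mean I2, mean I3, number of principal corners]."""
    corners = principal_corners(luminance(rgb))
    small = [i1i2i3(p) for row in spatial_sampling(rgb, block) for p in row]
    colour = [sum(p[c] for p in small) / len(small) for c in range(3)]
    return colour + [len(corners)]

# Demo: a white 6x6 square on a black 16x16 background.
image = [[(255, 255, 255) if 5 <= y <= 10 and 5 <= x <= 10 else (0, 0, 0)
          for x in range(16)] for y in range(16)]
print(index_image(image))
```

In this sketch only the four corner pixels of the white square have a USAN area below the threshold; edge and interior pixels keep a larger area and are rejected, which is the behaviour the USAN area property is meant to capture.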
The user may then change the sequence of the retrieval process and add their own annotations to the selected objects in the image using the simple interface shown in Fig. 2. This permits the generation of user-oriented descriptions of the objects, with the principal objective of improving the quality of the expected result.

Fig. 2. User interface for object description

For example, the object selected in close-up in Fig. 2 may be interpreted as a mobile robot, but another user may describe the same object as a vehicle, toy, car, machine, mobile engine, etc. In this case, the textual descriptions of the object to be retrieved are quite different. The user must therefore be part of the image retrieval process, providing specific feedback. The designed user feedback facility is shown in Fig. 2. The system is flexible enough to associate as many meanings with an object as are necessary for efficient interpretation and retrieval.

Algorithm for linking annotations with sub-regions:
Input: a colour image Ic and a set of points Sp (the principal corners shown in Fig. 2 in the active convex hull) that defines a convex region
Output: the relationship between the sub-regions and their descriptions in the ontology
1. Subregion ← CORPAI(Ic, Sp) // applying the CORPAI algorithm to the sub-regions defined by Sp
2. IcNEW ← TransformationFromSubregionToImage (Subregion) // transformation of the irregular convex sub-region of the original image into a new normalised one
3. FeaturesVector of IcNEW ← IndexingImage Algorithm (IcNEW) // application of the image indexing algorithm described earlier (using colour and principal corners) to the selected sub-region
4. Td ← GetTextualDescriptionFromUser() // description of the sub-regions by their textual annotations; the feature vector and textual annotations are stored in the ontology name space
5. SaveRelationInOntology(Ic, FeaturesVector of IcNEW, Td) // definition of semantics

The second algorithm establishes the relationship between the obtained sub-regions and their meaning defined by the ontology. The meaning of the whole image may now be obtained by composing the meanings of the sub-regions that it contains. This is the principal advantage of the proposed method, and it improves the image retrieval process. The algorithm that computes the similarity among images uses the Euclidean distance dE to compare the feature vectors representing the sub-regions or images, according to the equation

$$d_E(\mathbf{x}, \mathbf{y}) = \sqrt{\sum_{i} (x_i - y_i)^2} \qquad (2)$$

where x and y denote two feature vectors. Two kinds of vectors are used in the proposed method: vectors representing the visual features and vectors expressed by ontological annotations. The images that have maximum similarity for both types of vectors are returned to the user as the result.

Multiple experiments have been carried out using RDF to provide the machine-readable semantics, where the frames describe the objects in images and the slots contain their main features used for retrieval. The ontology is described by a directed graph; each node has a feature vector that represents the concept associated with that node. Concept inclusion is represented by the IS-A relationship, which goes from generic concepts to specific ones. Fig. 3 illustrates the ontological graph for the transport domain, where each rectangle represents a specific concept associated with the notion of transport: tool, machine, equipment, etc. For example, "automobile" is a specific concept from the transport domain and inherits all attributes of this domain. In order to evaluate the efficiency of the proposed method, an image retrieval system has been designed and implemented.

Fig. 3. Ontological graph for the transport domain.
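The comparison by Euclidean distance in equation (2) and the IS-A structure of the ontology can be sketched together in a few lines of Python. This is only an illustration under stated assumptions: the concept names, the database entries and the rule for combining the two kinds of vectors (first filter by keyword through the IS-A links, then rank by visual distance) are hypothetical and are not taken from the IRONS implementation. For convenient lookup, the IS-A links are stored here from the specific concept to the generic one.

```python
import math

# Hypothetical IS-A links, stored as specific concept -> generic concept.
IS_A = {"automobile": "transport", "aeroplane": "transport", "jet": "aeroplane"}

def ancestors(concept):
    """Return the concept followed by every more generic concept above it."""
    chain = [concept]
    while chain[-1] in IS_A:
        chain.append(IS_A[chain[-1]])
    return chain

def euclidean(u, v):
    """Euclidean distance between two feature vectors of equal length."""
    if len(u) != len(v):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def retrieve(query_keyword, query_vector, database, top_k=3):
    """Keep images annotated with the keyword or with a more specific concept,
    then rank the remaining images by visual distance (assumed combination rule)."""
    hits = []
    for name, (keywords, vector) in database.items():
        if any(query_keyword in ancestors(k) for k in keywords):
            hits.append((euclidean(query_vector, vector), name))
    return [name for _, name in sorted(hits)[:top_k]]

# Hypothetical database: image name -> (annotations, visual feature vector).
DATABASE = {
    "img_car": (["automobile"], [98.0, 12.0, 1.0, 4]),
    "img_plane": (["jet"], [120.0, 5.0, -3.0, 6]),
    "img_tree": (["tree"], [100.0, 10.0, 0.0, 4]),
}

print(retrieve("transport", [100.0, 10.0, 0.0, 4], DATABASE))
```

With this rule, an image whose visual vector is identical to the query but whose annotation lies outside the requested domain (the tree in the demo data) is not returned, which mirrors the idea that ontological annotations help to avoid nonsensical results.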
For ease of identification, the proposed system has been named IRONS (Image Retrieval by Ontological Descriptions of Sub-regions). The user requests the required image by a textual annotation or by a pre-processed sub-region (seed image). The system retrieves the images most similar to the requested one by matching their low-level features and textual descriptions. The user interface for image retrieval is presented in Fig. 4. The images with the highest similarity of the feature vector are retrieved.

Fig. 4. IRONS system for image retrieval by CORPAI and ontological annotations

The feature vector of each sub-region or image consists of a list of keywords (textual ontological annotations in the Ontology namespace of the system) linked to the described object or whole image, and it includes descriptions of the low-level characteristics of the sub-region or image (object position and colours). For example, the image "Aeroplane" is described by the vector "plane, DC9, Douglas, jet, P, C", where P is a set of coordinates giving the position of the object within the image and C is the colour vector of the sub-region or image.

The proposed method has been evaluated and the designed system tested by comparing the results of image retrieval by IRONS with those of well-known systems, in particular VisualSeek and Amore. They have been analysed using a database of 300 images taken randomly from the Internet.

Table 1. Comparison of Image Retrieval Systems
System | Query types | Semantic representation
VisualSeek | Query by image and by text | Annotations about whole image
QBIC | Keywords and images of similar colour and texture | Based on image features (colour and texture)
Amore | Similar images and keywords | It does not have a semantic representation
SQUID | Query by sketches and contours | It does not have a semantic representation
IRONS | User-oriented queries, similar sub-region colour, principal corners and keywords | Semantic representation by textual descriptions of sub-regions. The relations between sub-regions are provided by ontological annotations

The performance of the system is better when the image is processed in sub-regions, but excessive subdivision does not produce good results. Satisfactory retrieval of the expected images using ontological descriptions is faster because fewer iterations are needed. One disadvantage of the proposed method is that it requires a large amount of high-speed memory for data storage and retrieval. Another disadvantage is the introduction of errors during the spatial sampling used to reduce the amount of processed data and to generate the feature vectors in image indexing. The accuracy of the method may be increased by correct selection of the parameters of the low-level processing procedures and by making the Ontology namespace of the system as complete as possible. The comparison of the designed IRONS system with other well-known systems is presented in Table 1, where the interaction with the user and the basic characteristics for image retrieval are analysed.

4 Conclusion
The most important contribution of this research is the development and implementation of an image retrieval method that solves the following problems. By applying a fast and efficient approach to principal corner detection, used for generation of convex hulls, and by developing the CORPAI algorithm, it is now possible to divide an image automatically into sub-regions and to index objects independently. The manipulation of ontological annotations allows simple and fast estimation of the meaning of sub-regions and of the whole image. To implement the ontological concepts, the Resource Description Framework language has been used for the development of machine-readable semantics. The IRONS system, as a set of image retrieval facilities, has been designed and evaluated in order to analyse its utility, efficiency, and advantages in comparison with well-known systems.
To reduce the number of retrieval iterations, a user-oriented ontological name space has been proposed, in which the user may specify the required information exactly by textual descriptions. The proposed image retrieval method is robust to partial occlusion and to small changes in the position of the objects, and it introduces a novel way to associate several meanings with the same image or with its sub-regions. By applying ontological annotations, the retrieval of nonsensical images is avoided, which improves the quality of the image retrieval process. From the experimental results described above, we conclude that the method performs well and that it could be considered an alternative approach to the development of visual information retrieval facilities.

Acknowledgement: This research is sponsored by the Mexican National Council of Science and Technology, CONACyT (project 35804-A).

References:
[1] T. Gevers & A. Smeulders, PicToSeek: Combining colour and shape invariant features for image retrieval, IEEE Trans. on Image Processing, Vol. 9, No. 1, 2000.
[2] O. Starostenko & J. Chávez, Motion estimation algorithms of image processing services for wide community, Proc. of Knowledge/Based Intelligent Information Engineering Systems Conference KES'2001, Japan, 2001, 758-763.
[3] QBIC (TM). IBM's Query by image content, http://wwwqbic.almaden.ibm.com/.
[4] J. Smith, Visual Seek: fully automated content-based image query system, Proc. of ACM Multimedia, USA, 1996, 87-98.
[5] The Amore. Advance multimedia oriented retrieval engine, http://www.ccrl.com/amore/.
[6] F. Mokhtarian, Shape queries using image data bases, www.ee.surrey.ac.uk/Research/VSSP/imagedb/demo.html.
[7] D. Fensel, Ontologies: a silver bullet for knowledge management and electronic commerce, USA, CA: Springer, 2001.
[8] T.R. Gruber, A translation approach to portable ontology specifications, Knowledge Acquisition, Vol. 5, 1993.
[9] S. Smith & J. Brady, A new approach to low-level image processing, Journal of Computer Vision, Vol. 23, No. 1, 1997.
[10] O. Starostenko & J. Neme, Novel advanced complex pattern recognition and motion characteristics estimation algorithms, Proc. VI Iber-American Symposium on Pattern recognition, Brazil, 2001, 7-13.
[11] M. Lew, Principles of visual information retrieval, Advances in pattern recognition, USA, NJ: Springer-Verlag, 2001.
[12] J.A. Chávez-Aragón, O. Starostenko & M. Medina, Convex regions preprocessing algorithm in images, Proc. of III International Symposium in Intelligent Technologies, Mexico, 2002, 41-45.
[13] D. Fensel, The semantic web and its languages, IEEE Computer Society, Vol. 15, No. 6, 2000.
[14] D. Beckett, The design and implementation of the redland RDF application framework, Proc. of the 10th International World Wide Web Conference, WWW, 2001, 120-125.
[15] Consortium for the Computer Interchange of Museum Information, Dublin core, Guide to Best Practice: CIMI, 1999.