Object-Oriented Model for GIS Compressed Images

Boris Rachev, Mariana Stoeva
Technical University of Varna, Department of Computer Science
1, Studentska str., 9010, Varna, Bulgaria

Abstract

In this paper we analyse the existing approaches to image data modelling and propose an Object-Oriented Model for GIS Compressed Images (OOMCI). Image databases (IDB) are a very important element of information systems, and of GIS and multimedia applications in particular. IDB usually require large memory resources and are implemented in a computer network environment. We therefore discuss a proposal for the structure of Compressed Image Databases (CIDB) with a new, effective system of spatial queries. The main problem of CIDB design is the structure of the compressed image model (CIM). It must be extensible and must represent the structure and content of the image, its objects and the relationships between them. An appropriate CIM should be a very useful tool for the search process in such a system. At the same time, there is no standard model for the semantic wealth of an image that could serve as a basis for a new CIM. Nevertheless, we want a CIM that provides content-based image access in an environment of compressed images. The proposed object-oriented model for compressed GIS images is based on the logical representation of the compressed image, which is an abstraction of the real world. We use a suitable method for image compression and the Hierarchical Point-to-Point Model (HPPM) of the Real World to create a new global IDB structure. OOMCI is a universal model and can be used in medical and electronic IDB with a large diversity of images, but GIS is its main area of practical application.

Keywords: Data Bases, Image & Video Databases, DBMS, Object-Oriented Model, Real World Model, RGW, GIS.

1. Introduction

Image databases are an essential part of information systems and multimedia applications, which drives their continuous development.
Images are much richer in information than text and can be interpreted differently depending on the application domain, which makes image data management very complex. Different applications raise varied and specific problems, so the basic characteristics that must be identified and extracted from the images differ widely. Image databases require large storage resources and usually networked access to information. For this reason, client-server technologies are mainly used. The continuous development of IDB increases their capabilities and improves their basic characteristics: speed, flexibility and the storage needed for the data. This requires IDB design to integrate ideas and techniques from various areas of computer science, such as computer graphics, image processing, image recognition, artificial intelligence, database and knowledge base methodology and, of course, GIS. This combination of ideas results in new representations and data models, exact and efficient algorithms for query processing, and suitable independent system architectures.

One solution that decreases the necessary storage is to store images in compressed form, that is, to create compressed image databases. CIDB storage adds new problems to IDB design. Image compression is widely applied and is a powerful tool for computer image processing. Compression methods with and without information loss have been created, belonging to different generations, achieving high compression ratios and based on transformations that imitate models of the human visual system. Such are the well-known methods of compression by contour intensity, by segmentation, and by identification of stable and unstable objects. They use the same mathematical techniques as those used for extracting information from images. These methods are of great interest to CIDB authors because they view the image from aspects that support its logical representation in image data models.
The pictorial data model is one of the main problems in the design and development of image database systems (IDBS). The data model has to be extensible, to have expressive power, and to be able to represent the structure and content of an image, the objects it contains and their relationships. The design of an appropriate data model determines the image search abilities of an IDBS. The model becomes more complicated because of the complexity of image interpretation, which depends on the application area, the lack of a standard model for representing the semantic wealth of an image, and the wish to store images in compressed form while basing access on image content. Object-oriented models for data representation are the most widely applied, regardless of the IDBMS type, relational or object-oriented. These models allow hierarchical information to be represented directly in its most natural form, namely data stored at different levels of abstraction [8, 10].

2. Definition of the Problem

Experience in creating and using models that interpret images of real world (RW) objects represented in the formal computer world (CW) shows that a new model, above all of the object-oriented type, for the representation and retrieval of compressed images is both necessary and possible. Such a model has to support direct image search at different levels, including spatial search. It also has to be applicable to a wide variety of image collections. For this purpose, multiple logical representations of an image have to be used. The model sought has to be usable in the following application domains: GIS, medical image databases, image catalogues, etc.

3. Solution of the Problem

3.1. Background: Image description

3.1.1. Classification of the Image Data and its contents

In general, image data can be classified as: non-spatial, alphanumeric data that are attribute based; spatial data, consisting of the spatial properties of image objects; graphical data, consisting of the description of expressive image characteristics, which are closely connected with spatial data. The image data can be treated as the physical image representation and their meaning as the logical representation. The representation includes tools and approaches for describing the image, the characteristics of image objects and their relationships [11].

Physical representation

The physical representation is commonly in raster or vector form. The raster form includes the image header and the image matrix. The vector form, which is intended mainly for pictorial images, includes a mathematical description of the image objects. Compressed images are represented in their encoded form, which depends on the compression algorithm used.

Logical representation

The logical representation includes the general image description, the description of image objects and the description of their relationships. A detailed review of the tools and approaches to logical representation is given in [11].

3.1.2. General image description

Images are described by general descriptive attributes and by attributes extracted from the image content that characterize it as a whole. General descriptive attributes are meta attributes and semantic attributes. Meta attributes refer to the process of image creation. Semantic attributes contain subjective information about the image; a specialist in the IDB application domain supplies their values. The attributes extracted from the image are colour and structural attributes. Colour attributes may be histograms of the intensity of the contours' colours, average basic colours, overall average colour, etc.
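The colour attributes listed above can be illustrated with a minimal Python sketch. It is only an assumption about how such attributes might be computed, not the extraction procedure used by the authors: the function name `colour_attributes`, the pixel format (a list of RGB triples with values 0..255) and the number of histogram bins are all hypothetical choices made for the illustration.

```python
def colour_attributes(pixels, bins=4):
    """Compute simple colour attributes for a list of (r, g, b) pixels.

    Returns the average value of each basic colour, the share of each
    channel in the total intensity, and a coarse intensity histogram.
    All three are illustrative choices, not definitions from the paper.
    """
    if not pixels:
        raise ValueError("at least one pixel is required")

    n = len(pixels)
    # Sum every channel separately: index 0 = red, 1 = green, 2 = blue.
    sums = [sum(p[c] for p in pixels) for c in range(3)]

    average = tuple(s / n for s in sums)
    total = sum(sums)
    shares = tuple(s / total for s in sums) if total else (0.0, 0.0, 0.0)

    # Intensity histogram: mean of the three channels, mapped to `bins` bins.
    histogram = [0] * bins
    for r, g, b in pixels:
        intensity = (r + g + b) // 3          # 0..255
        histogram[intensity * bins // 256] += 1

    return {"average": average, "shares": shares, "histogram": histogram}


# Hypothetical four-pixel image: two red, one blue and one green pixel.
sample = [(255, 0, 0), (0, 0, 255), (255, 0, 0), (0, 255, 0)]
attributes = colour_attributes(sample)
```

Attributes of this kind describe the image as a whole, so they can be stored next to the compressed image record and compared without decoding the image.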
Structural attributes are extracted either by structural methods, which identify the structure primitives, or by statistical methods, which use the spatial distribution of pixel intensity. After image processing, the objects are separated and associated with the corresponding names. In general, this process is called segmentation. The objects of interest are determined by different segmentation techniques, such as thresholding, texture, special images, contour and contour-segment methods, object identification, mouse drawings, etc.

3.1.3. Object description

Object description uses semantic, logical, colour, texture and shape attributes. Semantic attributes describe the characteristics of image objects subjectively. Logical attributes are obtained as metric characteristics of image objects, such as height, width, diameter, perimeter, area, angles, etc. The colour attributes of objects are the same as those of the image. The texture attributes of objects are coarseness, contrast, directionality, regularity and roughness. Shape attributes are obtained by boundary-based geometrical methods (lists of corner points and chain codes, minimum boundary rectangle [7, 9]), by geometrical or structural region-based methods on spatial domains (primitives and 2-D strings [9]), and by region-based methods using domain transformations.

3.1.4. Objects relationships representation

Spatial relations are the most commonly used. They are represented in the logical structures for spatial data representation and are described by specific alphanumeric strings. The chosen structure for spatial data representation is decisive for efficient spatial query processing. An object-oriented image data structure makes the IDBMS more flexible, intelligent and faster, and allows the knowledge embedded in images, especially spatial knowledge, to be captured by the data structure as much as possible [9]. The structures are point structures and structures for extended spatial domains.
Point structures are intended to represent data points in a multidimensional space. Such are B-trees, binary trees, point quadtrees and region quadtrees, the K-d tree (k-dimensional binary tree) and the K-D-B-tree, which consists of region and point pages. Data structures for extended spatial areas are continuously developing, aiming at efficient storage use and easy information retrieval. There are data structures using minimum boundary rectangles (MBR) and orthogonal relations, corner stitching, cell and grid models, the 4D-tree, the R-tree (a multidimensional generalization of the B-tree), the R+-tree and the K-D-B-tree (proposed to overcome the problem of overlapping MBR of the R-tree and the dead space of the R+-tree). All structures have their advantages and drawbacks. Knowledge of them may be used as a guide for the design of new data structures and for comparison with qualitatively new structures.

The spatial relations among the objects in the logical data structure are most commonly described by strings of the following types: (1) the 2-D string, describing the orthogonal relations among the objects, which are represented by their MBR and centroids; (2) the R-string, describing the order of the objects' MBR centroids along a radial bounding line that begins in the image centroid and makes a full turn counter-clockwise. String grammars, indefinite grammars and predicate logic are used for abstract information representation, as is a spatially oriented, completely connected graph in which the weight of each edge is the slope of the line connecting the MBR centroids of the corresponding objects. Matrices are used to represent the following relations: topological relations of the types intersect, touch, not intersect and not touch; vector relations, giving the relative position of the objects with respect to the four cardinal points; metric relations, giving the distances between the objects (near, far, too near, too far).

3.2. CIDB Data Models

The idea of storing images in compressed form is very attractive to researchers because of the large storage that IDB require, but at the same time it sets new design problems [4]. High-ratio compression algorithms, based on transformations that imitate models of the human visual system, have appeared in the last ten years. These are the methods of compression by contour intensity, by segmentation, and by identification of stable and unstable objects [2, 6]. In compression, as in image analysis, the way the data for the individual pixels are represented is very important. In filtering, as a stage of the analysis, one operates directly on the image intensity values. In more complex compression tasks it is better to operate on higher-level forms of data representation that use primitives. What exactly these primitives should be is still an open research question. Most commonly they are vectors describing a particular class of image information. A primitive is a semantically significant image characteristic [6]. Such a primitive is proposed for processing and compression by the intensity of the image contours. In segmentation-based compression methods, segmentation is applied to homogeneous areas surrounded by contours, which are encoded. Methods based on object identification involve a preliminary description of the objects that may appear. These methods are of specific interest to CIDB designers because they view the image in aspects that support its logical representation in image data models. Depending on the application of the image data and on the search level, one compression method or another should be preferred, according to whether it retains the image semantics or not [15].

The specific character of the application of an image collection predetermines the type of the models in the existing CIDB. They support direct search of the image content at different levels.
At the lower image search level, the global, general image characteristics are used, specified by the semantic, meta, colour and texture attributes of the image, when no context information and no area specification are required. At the next search level, the typical features of the image objects are used, specified by their semantic, logical, colour, texture and shape attributes. Spatial search on the basis of topological, vector, metric and spatial relationships is at the higher level. A basic challenge for researchers is the creation of data model structures that support this higher-level search.

The most popular IDBMS is IIDS (Intelligent Image Database System), where images are stored compressed as a whole. The IIDS model supports spatial data, their flexible retrieval, visual representation and the traditional operations on IDB [9]. In this model the spatial data structure is represented by 2-D strings, which give an efficient tool for iconic indexing in the DBMS and for spatial reasoning. The 2-D string is especially popular because it is very efficient in describing symbolic pictures and describes the orthogonal relations between the objects as determined by their MBRs.

3.3. Object-Oriented Model for Compressed Images – General description

The proposed model was built after analysing the existing tools and approaches for image data modelling, the image compression algorithms and the existing data models. The object-oriented model for image data representation was preferred. It naturally captures the inherent structure of image data, since an object may be created at different levels of spatial division and also at different levels of hierarchic description and view of the real world, for example by the HPPM model [13] and its application [14]. The object-oriented model increases the flexibility of data structuring and allows a great quantity of semantic information to be represented [1].
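Since the proposed model reuses string descriptions of spatial relations, the 2-D string and the R-string described in Sections 3.1.4 and 3.2 can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the indexing algorithm of IIDS or of the authors: objects are reduced to their MBR centroids, only the operators "<" and "=" are used, the y axis points upwards, and the radial sweep starts at the positive x direction. The function names and the coordinates are hypothetical.

```python
import math


def two_d_string(centroids):
    """Return the (u, v) pair of a simplified 2-D string.

    `centroids` maps an object name to its (x, y) MBR centroid.
    u orders the objects along x, v orders them along y; '<' separates
    objects with different coordinates, '=' objects with equal ones.
    """
    def project(axis):
        ordered = sorted(centroids.items(), key=lambda kv: (kv[1][axis], kv[0]))
        if not ordered:
            return ""
        parts = [ordered[0][0]]
        for (_, prev), (name, cur) in zip(ordered, ordered[1:]):
            parts.append("=" if cur[axis] == prev[axis] else "<")
            parts.append(name)
        return " ".join(parts)

    return project(0), project(1)


def r_string(centroids, image_centroid):
    """Order the objects by the counter-clockwise angle of their centroid,
    measured around the image centroid from the positive x direction."""
    cx, cy = image_centroid

    def angle(name):
        x, y = centroids[name]
        return math.atan2(y - cy, x - cx) % (2 * math.pi)

    return sorted(centroids, key=lambda name: (angle(name), name))


# Hypothetical centroids for four objects in a 10 x 10 image.
example = {"H": (2, 8), "P": (5, 6), "R": (2, 3), "G": (8, 1)}
u, v = two_d_string(example)
order = r_string(example, image_centroid=(5, 5))
```

For these coordinates the sketch gives u = "H = R < P < G", v = "G < R < P < H" and the radial order P, H, R, G. Two images can then be compared on their strings alone, which is what makes such descriptions usable as a spatial index.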
[Figure 1: OOMCI Structure. The figure relates the physical image representation (image compression codes at the 1st level, object compression codes at the 2nd level) to the logical image representation: the digitized image with image colour, meta, texture and semantic attributes (1st level); the segmented image, obtained by segmentation, with object colour, texture, shape, semantic and logical attributes (2nd level); and the symbolized image with spatial object attributes (3rd level). Legend: compression, content-based attributes, symbol attributes.]

The model supports spatial data and allows the relationships between alphanumeric and spatial data to be determined. The data are represented by primitive entities of different types, named objects (areas as points, lines and segments). The model takes into account the fact that any spatial entity with a definite shape can be represented as an object. An object is an entity that combines processing properties and data; classes are predefined data descriptions, and the instances of a given class form the data extension [8]. One-to-one connections appear at different levels. The model supports the traditional operations on CIDB and flexible image retrieval.

The conceptual OOMCI representation of images uses concepts that describe the image structure (Figure 1). Three forms of the image are used: numeric, segmented and symbolized. The segmented image is obtained from the numeric image by segmentation. It contains the objects identified in the image. The symbolized image, which describes the relations between the objects, is obtained from the segmented image by describing the spatial relationships between the objects according to the chosen data storage structure. It is represented by specific strings, matrices or double-linked lists. The image is represented in two aspects, physical and logical.
The physical representation is determined at two levels: at the lower search level, by the description of the code records that contain the image as a whole in compressed form; at the next search level, by the description of the code records that contain the objects in compressed form. The description of the compressed records depends on the compression algorithm. At the lower level this may be any of the existing compression methods, and the choice is determined by the compression ratio it achieves. At the next level, objects have to be stored in compressed form by an algorithm that imitates a model of the human visual system and so corresponds to the logical representation of the objects.

The logical representation of a given image is obtained from the general image description, the description of the objects and the description of the relationships among the objects. This form of information representation allows image indexing and retrieval at different search levels and by characteristics of different types.

The general description is obtained by extracting the general image characteristics from the numeric image. It includes a description of the general characteristics of the image, represented by meta and semantic attributes, and a description of the characteristics based on the information content, represented by colour and texture attributes. Most present IDB use this method of image description. The general image description supports direct search at the lower level, by attributes describing the image as a whole.

Another way of describing an image is by its visual content, namely by object description. The object description is obtained by extracting the objects and their characteristics from the segmented image. It includes a description of the general characteristics of the objects, represented by logical and semantic attributes, and a description of the characteristics based on object content, represented by colour, texture and shape attributes.
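The general image description and the object description can be pictured as two nested record types. The following Python sketch is an illustration only and not the authors' implementation: the class names, field names and search functions are assumptions, the compressed code records are left as empty placeholder strings, and the attribute values are borrowed from the example in Figure 2.

```python
from dataclasses import dataclass, field


@dataclass
class ObjectDescription:
    """2nd level: one object of the segmented image."""
    name: str
    compression_codes: str                  # object code record (placeholder)
    logical: dict                           # metric attributes, e.g. area
    semantic: dict                          # subjective attributes
    colour: dict = field(default_factory=dict)
    texture: dict = field(default_factory=dict)
    shape: dict = field(default_factory=dict)


@dataclass
class ImageDescription:
    """1st level: the image as a whole."""
    compression_codes: str                  # image code record (placeholder)
    meta: dict
    semantic: dict
    colour: dict = field(default_factory=dict)
    texture: dict = field(default_factory=dict)
    objects: list = field(default_factory=list)


def search_images(images, **semantic):
    """Lower-level search: match semantic attributes of whole images."""
    return [img for img in images
            if all(img.semantic.get(k) == v for k, v in semantic.items())]


def search_objects(image, min_area_m2=0, **semantic):
    """Next-level search: match logical and semantic object attributes."""
    return [obj.name for obj in image.objects
            if obj.logical.get("area_m2", 0) >= min_area_m2
            and all(obj.semantic.get(k) == v for k, v in semantic.items())]


district = ImageDescription(
    compression_codes="",
    meta={"name": "Varna/l/33", "date": "01/01/99"},
    semantic={"region": "Varna"},
    texture={"contrast": 0.84},
    objects=[
        ObjectDescription("H", "", {"area_m2": 350},
                          {"type": "hospital", "owner": "Petrow"}),
        ObjectDescription("P", "", {"area_m2": 25},
                          {"type": "pond", "owner": "Petrow"}),
        ObjectDescription("R", "", {"area_m2": 115},
                          {"type": "road", "owner": "municipality"}),
        ObjectDescription("G", "", {"area_m2": 500},
                          {"type": "garden", "owner": "municipality"}),
    ],
)

found_images = search_images([district], region="Varna")
large_municipal = search_objects(district, min_area_m2=100, owner="municipality")
```

Here `search_images` works only on the general description, while `search_objects` needs the segmented image; neither requires the compressed records to be decoded.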
The object description supports next-level search and assumes some prior knowledge of the values of these object characteristics. The object code record, which represents the object in compressed form, may be used as a shape attribute when the compression algorithm is appropriately chosen. Some present IDB use exactly these shape attributes.

The object description may be extended by a description of the relationships among the objects. This description is obtained from the symbol description of the relationships among the objects. Spatial query processing requires a description of the relationships among the objects. Spatial relations are represented in different data structures, and their description depends on the chosen structure. The most commonly used approach is description by strings or double-linked lists. Matrices are used for the description of topological, metric and vector relations. The object attributes that support spatial search are extracted from the symbol representation by appropriate algorithms.

[Figure 2a: OOMCI hierarchical (1st and 2nd levels) modelling of the GIS compressed image. Content of the figure:

1st level, IMAGE
Image compression codes: 0/0(EOB) 1010 0/1 0000 0/2 0101 0/3 1011
Colour: histogram, RGB model 46%, 28%, 25%
Meta: name Varna/l/33; date 01/01/99; source reg11, page 32
Texture: contrast 0.84
Semantic: region Varna; coordinates 43° 8'6"

2nd level, SEGMENTED IMAGE with objects H, P, R and G, each stored with its own object compression codes (0100101…..)
Object H: colour average purple; texture contrast 1.98; shape contour 2064242 464; semantic type hospital, owner Petrow; logical area 350 m2
Object P: colour average blue; texture contrast 0.76; shape contour 2107654 3; semantic type pond, owner Petrow; logical area 25 m2
Object R: colour average grey; texture contrast 0.35; shape contour 2710531 3; semantic type road, owner municipality; logical area 115 m2
Object G: colour average green; texture contrast 1.25; shape contour 1753; semantic type garden, owner municipality; logical area 500 m2]

[Figure 2b: OOMCI hierarchical (3rd level) modelling of the GIS compressed image. The SYMBOLIZED IMAGE shows the MBRs of the objects H, P, G and R in the X-Y plane with the centroid coordinates (53,130), (85,127), (123,125) and (45,120), their orthogonal relations and relevant positions, and the resulting spatial indexes: the R-string (R, H, P, G) and the 2-D string ((H1 P<H2<R1<R3<R2<G2<G1; P H2 R2) G2<R3<G1<H1 R1).]

3.4. Object-Oriented Model for Compressed Images - Example

An example of the OOMCI description of an image, a draft of a built-up area district, is shown in Fig. 2.

4. Implementation of the OOMCI on the HPPM images

The proposed OOMCI may be used over images created in accordance with the hierarchic architecture of their data obtained from the HPPM model. In this case its general mathematical description is as follows:

$w = OOG_w(P_{ij}, R_{ij})$,

where $OOG$ is an object-oriented variant of the HPPM model, $j = 1, \dots, N_i$, $i = 1, \dots, K$, and $R_{ij}$ are the spatial or simple relationships of the points $P_{ij}$. Each set of points $P_{ij}$ on each level $i$ must be used as a basis for the image representation and its implementation by OOMCI. Here $j$ is the index of the image points situated on level $i$. The first level of the OOMCI model over the HPPM hierarchic representation of a realistic image of a part of the real world is illustrated in Figure 3. This is the level of a hierarchic set of images.
Each of them may be represented at the next level of the object-oriented model in accordance with the schemes in Figures 1 and 2.

[Figure 3: 1st level of the OOMCI implementation example for the HPPM hierarchical series of images. The figure shows IMAGE 1 and IMAGE 2, each with its image compressed codes and its colour, semantic, texture and meta attributes (1st level), linked by spatial HPPM "one-to-many" relationships between the HPPM points and their OOMCI representations at successive levels (i, i+1).]

5. Conclusions

The proposed model generalizes the experience of the existing image data models and allows storage in compressed form. It is used for image representation and retrieval and it is:
applicable to a great number of collections;
flexible, so that it may be adapted to the specifics of the application;
supportive of direct search not only by alphanumeric attributes, but also by characteristics extracted from the image at different search levels: general image characteristics, object characteristics and spatial characteristics;
open to different types of functions on the physical and logical image representation.

The development of CIDB is directed towards the search for algorithms for automatic extraction of data characteristics from images, the composition of structures for spatial data representation and retrieval, and the improvement of the approaches to describing these structures.

6. Acknowledgments

This work is supported by the INCO Copernicus Project URBAN 960252 and includes some proposals that further develop its results.

7. References

1. J. A. Orenstein, F. A. Manola, "PROBE Spatial Data Modeling and Query Processing in an Image Database Application", IEEE Transactions on Software Engineering, vol. 14, no. 5, May 1988.
2. Sl. Jordanowa, "Algorithms for compression", PhD Thesis, TUV, Varna, 1999.
3. N. Roussopoulos, C. Faloutsos, T. Sellis, "An Efficient Pictorial Database System for PSQL", IEEE Transactions on Software Engineering, vol. 14, no. 5, May 1988.
4. Y. Cheng, S. S. Iyengar, R. L. Kashyap, "A New Method of Image Compression Using Irreducible Covers of Maximal Rectangles", IEEE Transactions on Software Engineering, vol. 14, no. 5, May 1988.
5. A. Unnikrishnan, P. Shankar, Y. V. Venkatesh, "Threaded Linear Hierarchical Quadtree for Computation of Geometric Properties of Binary Images", IEEE Transactions on Software Engineering, vol. 14, no. 5, May 1988.
6. Kh. Sayood, "Introduction to Data Compression", Morgan Kaufmann Publishers, Inc., San Francisco, California, 1996.
7. R. Kasturi, J. Alemany, "Information Extraction from Images of Paper-Based Maps", IEEE Transactions on Software Engineering, vol. 14, no. 5, May 1988.
8. L. Mohan, R. L. Kashyap, "An Object-Oriented Knowledge Representation for Spatial Information", IEEE Transactions on Software Engineering, vol. 14, no. 5, May 1988.
9. S. K. Chang, C. W. Yan, D. C. Dimitroff, T. Arndt, "An Intelligent Image Database System", IEEE Transactions on Software Engineering, vol. 14, no. 5, May 1988.
10. Peter L. Stanchev, "Object-Oriented Image Model", Technology of Object-Oriented Languages and Systems TOOLS Eastern Europe'99, Proceedings, Blagoewgrad, pp. 98-109, 1999.
11. A. Pentland, R. W. Picard, S. Sclaroff, "PhotoBook: Content-Based Manipulation of Image Databases", http://vismod.www.media.mit.edu/~tpminka/photobook/.
12. V. E. Ogle, M. Stonebraker, "Chabot: Retrieval from a Relational Database of Images", http://www.enet.it/hpq/texture/index.htm.
13. B. Rachev, "A new Real World Point to Point Model for GIS", 4th EC-GIS Workshop, Budapest, 1998, EC&JRC, Proceedings, pp. 98-106, 1999.
14. B. Rachev, V. Todorov, A. Sirekov, E. Racheva, N. Nikolov, D. Velkova, "ECOURBAN - An Ecological GIS for the City of Bourgas, Bulgaria", 5th EC GIS Workshop GIS OF TOMORROW, Stresa, Italy, EC&JRC, Proceedings, pp. 200-209, 2000, http://www.ec-gis.org/Work-shops/5ec-gis/.
15. John R. Smith, "Integrated Spatial and Feature Image Systems: Retrieval, Analysis and Compression", http://disney.ctr.columbia.edu/jrsthesis/thesissmall.html.