MULTISCALE LANDSCAPE REPRESENTATION DERIVED FROM REMOTE SENSING IMAGES USING SPATIAL SUBPIXEL MODELS AND COMBINATORIAL MAPS G. Kaiser⋆ and T. Bauer Institute of Surveying, Remote Sensing and Land Information BOKU - University of Natural Resources and Applied Life Sciences, Vienna, Austria Peter-Jordan-Straße 82, 1190 Vienna, Austria georg.kaiser@boku.ac.at, t.bauer@boku.ac.at KEY WORDS: landscapes, multiscale, subpixel, graphs, combinatorial maps ABSTRACT: Landscapes can be modelled as sets of homogeneous objects (landscape elements) forming areas of characteristic patterns (landscape units). The common way to represent landscapes is by vector GIS data structures which do not provide a mechanism for storing the spatial relationships between the objects in explicit form. Also, they do not account for a multi-scale representation. Graphs can be used for describing the topology of a set of objects and the dual graph approach allows for a representation of its embedding into the Euclidean plane. The combinatorial map concept is eligible for the simultaneous representation of dual graphs in one single data structure by orientation and splitting of edges into so-called half-edges or darts. Combinatorial maps also allow for a dual graph representation of a landscape at different scales by creating a map for each hierarchy level and linking them together. The task of obtaining a landscape representation from remote sensing images can be accomplished by deriving an adequate combinatorial map representation from the image data. This is done by segmentation of the image. Adverse effects by mixed pixels on the segmentation result can be reduced by spatial subpixel analysis. This method is based on the geometric description of object boundaries that intersect pixels and thus lead to mixed pixels. Its applicability depends on the relationship between the size of the remotely sensed objects and the pixel size of the sensor. Since the distribution of grey values in the neighbourhood of any mixed pixel not only depends on the parameters of the geometric model of land cover boundaries but also on the spatial response of the sensor, knowledge about the sensors point spread function can be used to enhance performance of spatial subpixel analysis. The result of applying spatial subpixel analysis to the remote sensing image is a vector representation of edges in the image which can be transformed into the combinatorial map structure. Then segmentation can be performed by manipulation of the permutations defining the combinatorial map. This leads to a new combinatorial map which can be regarded as a representation of the landscape at a certain scale. This step can be applied repeatedly for obtaining a hierarchy of combinatorial maps each of which defining the landscape at a different abstraction level. In this work we show how spatial subpixel models and the representation of remote sensing images by combinatorial maps can be combined to retrieve a multi-scale description of landscapes with subpixel accuracy from satellite images. 1. INTRODUCTION 1.1 Representation of landscapes The automatic extraction of landscape structure from remote sensing data is an important prerequisite for analysing and understanding landscapes and their processes (Forman, R., Godron, M. 1986, Schneider, W. 2002). Based on the idea of a hierarchical structure of landscapes there is a strong need for a multiscale representation. Landscapes can be represented in a computer system by different data structures. First, a regular grid can be used, each cell of which usually representing a homogeneous region of quadratic extent. Then a set of adjacent and similar cells can be combined to represent a real-world object. Similarity in this case means that the adjacent cells are likely to be part of the same object and has to be defined more precisely by the application, taking into account the scale of interest. The set of adjacent cells then represents a real-world landscape element which is considered as being homogeneous at that particular scale. Second, a vector representation can be used, which delineates the landscape elements by polygons. This format has the advantage of allowing a more realistic representation of object boundaries since they do not correspond to regular grid cells in general. Both, the grid and the vector format, are eligible for representing the geometry (location, spatial extent, . . . ) of the real-world objects composing a landscape, but they do not store their spatial relationships (i.e. their topology) in explicit form. For that reason graphs have been introduced into landscape ecological problem solving. Urban and Keitt (Urban, D., Keitt, T., 2000) give examples of how graphs can be used for the representation and analysis of landscapes. Following this approach the nodes of the graph are equivalent to patches located in the landscape matrix and the edges stand for the fact that there is some relationship between pairs of patches. This graph is known as region adjacency graph (RAG, Figure 1c). In the simplest case an edge just means that the two nodes (it is incident to) are adjacent to each other, but it may also carry information about interaction (mass fluxes, dispersal of individuals or species, . . . ) between the patches. RAGs can be used to represent the partitioning of a 2-D plane, but they cannot correctly encode multiple boundaries and inclusions (Haxhimusa, Y., Ion A., Kropatsch, W., Brun, L., 2005). For the definition of both properties of the objects composing a landscape, their topology and their embedding into the Euclidean plane, the dual graph approach can be used (Saib, M., Haxhimusa, Y., Glantz, R., 2002). Dual graphs encode a partitioning of a plane by a RAG and its dual structure, the boundary graph (Figure 8), but dual graph representations lack an explicit encoding of the orientation of planes, existing in combinatorial maps (Haxhimusa, Y., Ion A., Kropatsch, W., Brun, L., 2005). A combinatorial map can be obtained from the segmentation of a plane by orientation and splitting of edges (of the boundary graph) into so-called half-edges or darts. This is outlined in more detail in chapter 1.3. Figure 1 shows the different data structures for the representation of landscapes in a computer system. It should be noted that Figure 1c shows the region adjacency graph only, and not its dual, the boundary graph. node picture/raster element (pixel) In spatial subpixel analysis, the parameters of the model valid for a certain pixel and its neighbourhood are estimated from the pixel values in this neighbourhood (e.g. in a 3x3 cell of pixels, Figure 4). If the object and its model parameters have been found (Figure 3), the corresponding part of the image can be resampled to a higher spatial resolution, containing again mixed pixels but at a finer scale. The resampling of the image to a finer scale can be regarded as simulating the imaging process with a higher spatial resolution (Figure 5 left). This is, for example, valuable for image co-registration or visual interpretation. Alternatively, the pure spectral signatures of the dominating land cover can be assigned to the pixels in the resampled image. This can be done prior to segmentation and/or classification to reduce the percentage of mixed pixels (Figure 5 right). label vertex region a) raster arc b) vector dart node In the method presented here we do not resample the original image, but use the parameters obtained from spatial subpixel analysis for creating a vector representation of the edges in the image (Figure 2 right). In the case of pixels for which no spatial subpixel model could be found, these edges correspond to the inter-pixel boundaries (Braquelaire, J.P., Brun, L., 1998) according to the regular grid of the image (Figure 6). a edge vertex vertex face c) graph s edge f d) combinatorial map Figure 1. Representation of landscapes Obtaining a landscape representation from a remote sensing image can be accomplished by segmentation of the image. Spatial subpixel analysis, as described in chapter 1.2, can be used to reduce adverse effects by mixed pixels and to retrieve the object boundaries with subpixel accuracy. This paper is structured as follows. In chapters 1.2 and 1.3 we give a brief overview of spatial subpixel analysis and combinatorial maps, respectively. In chapter 2 we present a method that combines both spatial subpixel analysis and the representation of landscapes based on combinatorial maps, to derive a multi-level structural representation of a landscape from a remote sensing image by repeated segmentation, leading to an irregular image pyramid encoded by a combinatorial pyramid. In chapter 2.4 we discuss some limitations of the method and chapter 3 contains the conclusion. Figure 2. Real image and model image v1 v2 Figure 3. Spatial subpixel model ’Linear Border’ (4 parameters) 1.2 Spatial subpixel analysis Spatial subpixel analysis (Schneider, 1993) of remotely sensed images aims at detecting the spatial structure of land cover at subpixel level and of spectral signatures undisturbed by mixed pixel effects. For this purpose, the remotely sensed real-world objects are described locally (for a certain pixel and its neighbourhood) with a geometrical-physical model. This model consists of a set of parameters describing the object’s geometry and pure spectral signatures. Especially in agricultural areas, the borders between adjacent fields of different land cover often resemble simple geometric shapes, like straight lines, T-shaped figures or corners. This geometric shape determines - together with the object’s reflectance - the distribution of grey values in the image. For example, the model for two adjacent homogeneous fields separated by a straight borderline (Figures 2 and 3) is uniquely described by two geometric parameters, namely the origin and orientation of the boundary, and two spectral parameters (per band) defining the pure spectral signatures of the fields. More models (e.g. for corners and roads) have been implemented (Kaiser, G., Schneider, W., 2006). p1 p2 p3 p4 p5 p6 p7 p8 p9 Figure 4. Cell of pixels, C A prerequisite for successful functioning of spatial subpixel analysis is the mathematical formulation of the process of pixel value generation in the case of a given object (terrain) radiance distribution. The simplest assumption is that every pixel value is obtained by integrating the object radiance over a square corresponding to the area of the detector at the position of the pixel. This assumption, however, is an oversimplification. The optical system of the sensor produces a blurred version of the radiance Definition (Brun, L., Kropatsch, W., 1999): A combinatorial map G is the triplet G = (D, σ, α), where D is a set called the set of darts (or half-edges) and σ and α are two permutations defined on D. α is an involution (see below). The combinatorial map can be retrieved from the planar graph by splitting the edges where their duals cross (Figures 7, from (Brun, L., Kropatsch, W., 1999), 8 and 10). With respect to the representation of a landscape as shown in Figure 1, the graph containing the circular nodes in Figure 7 represents the boundary graph, the one containing the square nodes corresponds to the region adjacency graph. Figure 5. 2 different resampling methods ‚half edges‘ (darts) s f a plane graph split at dual edges combinatorial map Figure 6. Inter-pixel boundaries Figure 7. Combinatorial map deduced from planar graph distribution on the terrain surface, which is then integrated over the area of the detector. In addition, the pixel value is further affected by sensor motion and electronic amplification influences. These effects are described by the point spread function (PSF) of the sensor (Schowengerdt, 1997). The influence of the point spread function on spatial subpixel analysis and how it can be considered in this method is described in detail in (Kaiser, G., Schneider, W., 2006). In this context, a permutation is a mapping that associates to each element of D a unique predecessor and a unique successor in D. This mapping is one to one and onto (injective and surjective, hence bijective). Involution means that the successor (predecessor) of the successor (predecessor) of any element of D is the element itself. Since it is a composition of two permutations, the mapping defined by the combination of α and σ to φ is also a permutation: Spatial subpixel analysis can be seen as an optimisation problem. Let C be a cell containing n pixels in the source image (Figure 4) with pixel values p1 , p2 , . . . , pn , spatial (geometric) parameters r1 , r2 , . . . , rk and spectral parameters v1 , v2 , . . . , vl . Similar to the spectral subpixel analysis it is assumed that each pixel within cell C is the result of a lineare mixture of the pure signatures v1 , v2 , . . . , vl . For given model parameters, the estimated pixel values q1 , q2 , . . . , qn can be described as φ=σ◦α qi = ρi (f1i ) vi + ρi (f2i ) v2 + . . . + ρi (fli ) vl (1) with i = 1, . . . , n. The fji are the polygons of homogeneous regions with pure signatures. Function ρi (f1i ) is the integral of PSF over the polygon f1i (Steinwendner, 2003). In an ideal optical system it would represent the area of the polygon fji . Finally the optimisation problem can be written as a minimization of the sum of the quadratic difference between the observed pixel values pi and the estimated pixel values qi with respect to the model parameters r1 , r2 , . . . , rk and v1 , v2 , . . . , vl : e(r1 , . . . , rk , v1 , · · · , vl ) = n X (3) φ encodes the faces (Figures 1 and 7) of the planar map which are defined by the edges surrounding it. The ordered sequence of darts around a vertex induces an ordered sequence of darts that is obtained when turning around a face. Each edge bounds two neighboured faces, but the splitting into two oriented half-edges allows for an unique assignment of one dart to one face. The opposite dart is uniquely assigned to the neighbouring face on the other side of the edge. Thus, the faces can be encoded by one dart. By following successor chains the orbits (cycles) of the permutations can be retrieved: The orbits of σ define the nodes (vertices) of the map. Cycles of α combine two darts (half-edges) that stem from the same edge. The faces of the map are encoded by φ and represent the regions. Another benefit of combinatorial maps is that they implicitely encode the dual structure of the planar graph. 2. METHOD 2 [pi − qi ] → min (2) i=1 This non-linear least squares problem can be solved by using the Levenberg-Marquardt algorithm for example. 1.3 Combinatorial maps A combinatorial map may be regarded as a particular encoding of a planar graph. Since the combinatorial map formalism is used to represent the dual graphs encoding the topology and the geometry of the image segments, the described method can be regarded as a graph-based segmentation method. Steinwendner (Steinwendner, 2003) introduced graph-based subpixel segmentation, showing the analogy of image segmentation and graph partitioning, but used the data type graph of the LEDA library (Mehlhorn, K., Näher, S., 2000) instead of combinatorial maps. The advantages of combinatorial maps (correct encoding of multiple boundaries and inclusions, explicit encoding of the orientation of planes and encoding of both dual structures in one single data structure) have been mentioned above. 2.1 Subpixel analysis prior to segmentation The first step of the method is spatial subpixel analysis as described in 1.2. This results in a vector representation of boundaries at subpixel scale for pixels for which a valid geometrical model could be determined, and of inter-pixel boundaries for pixels which could not be decomposed into subpixel areas. Additionally, the orientation of the detected boundary may support the decision which object the subpixel areas belong to (Steinwendner, 2003). Figure 8 contains the boundary graph and its dual, the region adjacency graph, of a 3 by 3 cell after spatial subpixel analysis and shows how spatial subpixel analysis eliminates the mixed pixels (see also Figure 2) and retrieves the object boundaries at subpixel accuracy. a stack of sub combinatorial maps of the original combinatorial map. The faces (segments) of the sub combinatorial maps are linked together from one level to the next, forming a stack of successively reduced combinatorial maps called a combinatorial pyramid. 2.3 Algorithm The main steps of the overall algorithm are as follows (Figure 9): 1. Perform spatial subpixel analysis for the input image taking into account the spatial response of the particular sensor (Figures 2 and 3). 2. Based on the results of spatial subpixel analysis, create a vector representation of the edges and its dual (Figure 8). 3. Derive an adequate combinatorial map from the vector representation (Figure 10). 4. Based on the application-dependent rules, perform a segmentation on the current combinatorial map by defining an adequate removal kernel, and remove the corresponding darts from the combinatorial map. (Figure 11). 5. Repeat step 4 for each scale and create an appropriate combinatorial map representation for the current level (Figure 12), leading to a combinatorial pyramid. 1 2 geometric description of objects Input image (raster) vector representation of edges 3 Figure 8. RAG (yellow) and its dual (boundary graph, red) 5 4 combinatorial map combinatorial pyramid 2.2 Segmentation based on combinatorial maps Segmentation is a method to subdivide an image into disjoint segments as large as possible, with the constraint that a segment’s elements satisfy some similarity (homogeneity) criterion. In order to perform segmentation of an image represented by a combinatorial map, the adequate operation has to be defined for the combinatorial map and is described for a region merge algorithm here (Kaiser, 2003). The regions of the image are represented by the faces (permutation φ) of the combinatorial map. Merging of two adjacent regions can be achieved by removing the boundary between the two regions, i.e. by removing the edge e that delineates the two regions. The resulting combinatorial map can be defined by a sub combinatorial map deduced from the original one by removing the darts d and α(d) (representing e) from its set of darts. The segmentation of the whole image is defined by the removal kernel: Every edge that delineates two adjacent regions which are evaluated to be similar enough to merge is a member of the removal kernel applied to the combinatorial map. The removal kernel consists of all edges of a certain level that will be removed to build the regions in the next level. Thus, the main goal of any segmentation algorithm to be applied to a combinatorial map is the generation of a removal kernel that leads to a good segmentation result. Successively reducing the combinatorial map creates Figure 9. The steps of the algorithm Figure 10. Combinatorial map deduced from vector/graph representation 3. CONCLUSION In this work we present a method of creating a multi-level structural representation of landscapes based on combinatorial maps and combinatorial pyramids. We show how spatial subpixel analysis can be used for reducing adverse effects of mixed pixels on segmentation and for detecting the boundaries of objects with subpixel accuracy. Combinatorial maps permit unambiguous encoding of the partition of a plane in one single data structure, containing all topological and geometrical information. Combinatorial pyramids, created by successive segmentation, are suitable for a multi-level representation of landscapes. ACKNOWLEDGEMENTS Figure 11. Removal kernel (grey) This work is financially supported by the Austrian Science Foundation (Fonds zur Förderung der wissenschaftlichen Forschung, FWF) , grant No. P17647-N04. REFERENCES Braquelaire, J.P., Brun, L., 1998. Image Segmentation with Topological Maps and Inter-Pixel Representation. Journal of Visual Communication and Image representation 9(1). Brun, L., Kropatsch, W., 1999. Dual contraction of combinatorial maps. Technical Report PRIP-TR-54, Pattern Recognition and Image Processing Group (PRIP), Technical University Vienna. Forman, R., Godron, M., 1986. Landscape Ecology. New York: John Wiley and Sons Ltd. Figure 12. New map (2 regions) as a result of segmentation based on removal kernel 2.4 Limitations • Spatial subpixel analysis (step 1 of the described algorithm) is based on the geometric description of the boundaries of real-world objects. Aside from being dependent on the relationship between the size of the remotely sensed objects and the pixel size of the sensor it is therefore most suitable for scenes containing anthropogenic features. Natural objects, in contrast, most often exhibit much more complex shapes (Lang, S., Blaschke, T., 2003). For that reason the benefit of performing spatial subpixel analysis prior to segmentation depends on the type of sensor and on the content of the scene. • Repeated segmentation of the image builds a combinatorial pyramid, containing combinatorial maps representing the scene at that level and scale. However, a particular level may or may not represent a ’functional level’ of the landscape. Is therefore necessary to identify ’ecologically meaningful’ levels. • Spatial subpixel analysis operates on a cell C, which may be the image of only a part of an object’s boundary. Because of noise in the image the computed geometric parameters of adjacent cells may result in vector boundaries not being ’smooth’. This could be ovecome by developing appropriate smoothing algorithms. Haxhimusa, Y., Ion A., Kropatsch, W., Brun, L., 2005. Hierarchical image partitioning using combinatorial maps. Technical report, Pattern Recognition and Image Processing Group (PRIP), Technical University Vienna. Kaiser, G., 2003. Metasegmentation of Remote Sensing Images based on Combinatorial Maps. Diploma thesis, University of Natural Resources and Applied Life Sciences, Vienna. Kaiser, G., Schneider, W., 2006. Estimation of sensor point spread functions based on spatial subpixel analysis. ISPRS Proceedings, Commission VII Mid-term symposium 2006, Enschede. Lang, S., Blaschke, T., 2003. Hierarchical object representation - comparative multi-scale mapping of anthropogenic and natural features. In ISPRS Archives, Volume XXXIV. Munich. Mehlhorn, K., Näher, S., 2000. LEDA: A platform for combinatorial and geometric computing. Cambridge University Press. Saib, M., Haxhimusa, Y., Glantz, R., 2002. Building irregular graph pyramids using dual graph contraction. Technical Report PRIP-TR-69, Pattern Recognition and Image Processing Group (PRIP), Technical University Vienna. Schneider, W., 1993. Land use mapping with subpixel accuracy from Landsat TM image data. 25th International Symposium: Remote Sensing and Global Environmental Change II, 155–161. Schneider, W., 2002. Knowledge Based Methods of Analysis of Remote Sensed Images for Landscape Ecology. In Pillmann, W., Tochtermann, K. (Ed.), EnviroInfo, Volume 25-27, pp. 197– 202. Vienna: IGU/ISEP, International Society for Environmental Proctection. Schowengerdt, R. A., 1997. Remote Sensing. Models and Methods for Image Processing. Second Edition. Academic Press. Steinwendner, J., 2003. Graph-Based Subpixel Segmentation of Optical Satellite Imagery. Ph. D. thesis, University of Natural Resources and Applied Life Sciences, Vienna. Urban, D., Keitt, T., 2000. Landscape Connectivity, a Graphtheoretic Perspective. Technical report, Levine Science Research Center, Duke University.