Vector Data
Vijay Gandhi
University of Minnesota

Synonyms
Geometric modeling, Vector graphics

Definition
Spatial data is data related to a location. Some examples include the population of a city, the type of soil in a region, and data from remote sensing satellites. In the first example, the city can be considered a location and the population is the data, or feature. In the second example, the region is a collection of locations and the type of soil is the feature. Note that a location may have one or more features. For example, it may be useful to represent both the population and the average age group of a city.

It is necessary to convert spatial data into a form which a computer can understand. There are two major ways to model spatial data: as vector data or as raster data. Both models must store locations, features, and the association between the two. Vector data involves storing data as geometric objects. For example, a road can be represented as a combination of lines. In this case, the lines are the objects.

A given real-world situation can be represented with either a raster or a vector model. The choice between the two depends on how we conceptualize the feature. Certain properties of vector data make it the better choice in some situations. Vector data is more useful for data which can be represented as discrete objects. It is also often easier to handle on a computer and takes less storage space. Here we focus on the properties of vector data, with examples, and compare it with raster data.

Historical Background
Vector graphics has probably been the earliest field to make use of vector data. Vector graphics became popular with the CRT display monitors of the 1950s. To create an image, these monitors traced an electron beam across the surface of the screen. The path traced by the beam was a line; in mathematical terms, a vector. These monitors produced high-resolution images.
Since then, vector data has been associated with high image quality.

Scientific Fundamentals

Representation of the Vector data:
Vector data can be thought of as objects described using mathematical notation. It is represented as a collection of simple geometric objects such as points, lines, polygons, arcs, and circles. For example, a city may be represented by a point, a road may be represented by a collection of lines, and a state may be represented as a polygon.

Consider the aerial image of a geographical area shown in Figure 1.

Figure 1: Aerial image of a geographical area

The area shown in Figure 1 consists of a few entities, namely a river (R1), a building (B1), and three patches of land (A1, A2, and A3). To represent Figure 1 as vector data, each entity would be represented by an object. For example, the building B1 could be represented by a point, the river R1 by a line string, and the patch of land A1 by a polygon. The graphical representation of these objects is given in Figure 2.

Figure 2: Graphical representation of the Vector data for B1, R1 and A1 shown in Figure 1

Mathematically, these objects could be expressed as:

Building B1: Point(6, 4)
River R1: LineString((1,1), (3,3), (3,4), (8,9))
Area A1: Polygon((2,4), (2,8), (3,9), (4,9), (5,8), (5,7))

In the backend, each object ID and its corresponding points are stored.

The same map could be represented in a raster data model as shown in Figure 3.

Figure 3: Graphical representation of the Raster data model for B1, R1 and A1 shown in Figure 1

The image is divided into a grid, and at each pixel the value of the object that the pixel belongs to is stored. Hence the pixel (6, 4) belongs to object B1, the pixel (8, 9) belongs to R1, and so on. In the backend, the pixel-value pairs are stored, so all the pixels (in this case 9 x 9 = 81) would be stored.
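The two representations above can be compared with a minimal Python sketch. The dictionary layout and the variable names are assumptions made for illustration, not a prescribed storage format. Only the two pixels named in the text are filled in, because the full contents of Figure 3 are not reproduced here.

```python
# Vector model: each object is stored once, as an ID plus its ordered coordinates.
vector_objects = {
    "B1": ("Point", [(6, 4)]),
    "R1": ("LineString", [(1, 1), (3, 3), (3, 4), (8, 9)]),
    "A1": ("Polygon", [(2, 4), (2, 8), (3, 9), (4, 9), (5, 8), (5, 7)]),
}

# Raster model: one stored value for every pixel of the 9 x 9 grid.
GRID_SIZE = 9
raster = {
    (x, y): None
    for x in range(1, GRID_SIZE + 1)
    for y in range(1, GRID_SIZE + 1)
}
raster[(6, 4)] = "B1"  # pixel named in the text
raster[(8, 9)] = "R1"  # pixel named in the text

# Count what each model has to store.
vector_pairs = sum(len(points) for _, points in vector_objects.values())
raster_cells = len(raster)

print("coordinate pairs in the vector model:", vector_pairs)
print("pixel-value pairs in the raster model:", raster_cells)
```

For these three objects the vector model holds 11 coordinate pairs, while the raster model holds a value for each of the 81 pixels whether or not an object is present there.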
Vector data and Storage:
Since geographical data generally involves millions of pixels, the storage requirement is one of the main considerations in choosing a data model. Generally, a vector model requires less memory than a raster model, because an object can be described by a few vectors rather than by every pixel it covers. The memory requirement for vector data depends on the complexity of the objects. Simple polygons require less storage space.

To compare the storage space required by the two models, consider Figure 1. Representing it as raster data takes 81 pixels. The vector model stores coordinates rather than pixels: the three objects B1, R1, and A1 listed earlier need only 11 coordinate pairs (1 + 4 + 6).

The scale of the data is an important factor that affects the storage requirement for raster data. Since the map is divided into a grid, the scale of the data depends on the size of a cell in the grid. For example, each cell could represent a 1km x 1km area or a 10km x 10km area. The storage space required grows with the level of detail: covering the same area with 1km x 1km cells takes 100 times as many cells as covering it with 10km x 10km cells. For vector data, scale does not affect the storage requirement.

Vector data and Data Source:
The data source sometimes determines the type of data model to be used. For instance, the data obtained from remote sensing satellites is typically in raster format. Since converting this data to vector data is time consuming, a raster data model could be used in such cases. Also, since most of the images obtained are digitized, the raster format is more natural for them than the vector format; an extra step is required to convert such data into vector format.

Vector data and Data Quality:
Vector data is generally regarded as representing data with higher quality. It is preferred by cartographers, who would like to see straight lines in their maps instead of the "jagged" lines which usually occur in raster data models due to digitization.
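The link between cell size, storage, and level of detail in a raster model can be made concrete with a little arithmetic. The sketch below is illustrative only: the 100km x 100km region and the function name `cell_count` are assumptions, not values from the figures.

```python
import math


def cell_count(width_km, height_km, cell_km):
    """Number of square grid cells needed to cover a rectangular region."""
    columns = math.ceil(width_km / cell_km)
    rows = math.ceil(height_km / cell_km)
    return columns * rows


# Hypothetical 100km x 100km region stored at two cell sizes.
coarse_cells = cell_count(100, 100, 10)  # 10km x 10km cells
fine_cells = cell_count(100, 100, 1)     # 1km x 1km cells

print("cells at 10km resolution:", coarse_cells)
print("cells at 1km resolution:", fine_cells)
print("growth factor:", fine_cells // coarse_cells)
```

Making the cell ten times smaller along each axis multiplies the number of stored cells by one hundred, which is the price a raster model pays for finer detail.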
The quality of a raster data model depends on the level of detail of the data being stored. As described earlier, a raster data model with 1km x 1km pixels will have far higher quality than one with 10km x 10km pixels. In the case of vector data, the quality does not depend on the scale.

Vector data and Data Structure:
Vector data requires more complex data structures to be represented in the computer. For example, representing a polygon requires a count of the number of vertices, the coordinates of each vertex, and the position of each vertex relative to its adjacent edges.

A raster data model could be modeled with a matrix data structure. Each cell could represent a location, and different attributes could be associated with each cell. For the above example, the data structure may look as shown in Table 1:

Pixel   Object ID   Attribute 1   Attribute 2
1,1     None
2,1     None
3,1     A1
4,1     A1
…       …
9,9     None

Table 1: Raster data structure for Figure 1

Vector data could be implemented with a table and a linked list. In the table, the object ID, the object type, and the corresponding attribute values could be stored, together with a link to the object's ordered list of points. For the above example, the vector data structure may look as shown in Table 2:

Object ID   Object Type   Attribute 1   Attribute 2   List of Points
B1          Point                                     (6,4)
R1          Line String                               (1,1)(3,3)(3,4)(8,9)
A1          Polygon                                   (2,4)(2,8)(3,9)(4,9)(5,8)(5,7)

Table 2: Vector data structure for Figure 1

Vector data and Object types:
Vector data is useful for objects which can be represented mathematically in terms of simpler entities, e.g., a polygon, which may be represented by a series of lines. However, it may not be suitable for objects which are complex in nature. For example, polygons with islands and polygons with disjoint regions might be difficult to represent using vector data.
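The table-plus-point-list layout of Table 2 could be sketched in Python roughly as follows. The class name `VectorObject` and its field names are hypothetical, and a plain Python list stands in for the linked list of points. The attribute values are left empty, as they are in Table 2.

```python
from dataclasses import dataclass, field


@dataclass
class VectorObject:
    """One row of Table 2: ID, type, attributes, and an ordered point list."""
    object_id: str
    object_type: str
    points: list  # ordered (x, y) pairs; stands in for the linked list
    attributes: dict = field(default_factory=dict)

    @property
    def vertex_count(self):
        # The count of vertices is derived from the point list.
        return len(self.points)


rows = [
    VectorObject("B1", "Point", [(6, 4)]),
    VectorObject("R1", "Line String", [(1, 1), (3, 3), (3, 4), (8, 9)]),
    VectorObject("A1", "Polygon",
                 [(2, 4), (2, 8), (3, 9), (4, 9), (5, 8), (5, 7)]),
]

# The table itself: look up a row by object ID.
table = {row.object_id: row for row in rows}

# Attribute columns exist for every row but hold no values yet.
for row in rows:
    row.attributes["Attribute 1"] = None
    row.attributes["Attribute 2"] = None

print(table["A1"].object_type, table["A1"].vertex_count)
```

Keeping the points in order matters: the same six coordinates visited in a different order would describe a different polygon.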
Vector data and Scaling:
Because of certain properties of vector data, such as the topology described in the next section, objects represented by vector data can be scaled without any loss of quality. Since the objects themselves are represented by vectors, scaling the vectors reproduces the actual object at a different scale efficiently. This makes vector data very useful in applications such as maps, where it is often necessary to zoom in and out to different levels. Also, as noted earlier, a raster data model takes more space if data is required at a detailed level.

Vector data and Topology:
One of the most important features of spatial data is topology. Topology can be defined as a spatial relationship between two objects. An example would be a road connecting two cities. In a vector-based data model, such information is inherent in the representation itself: a node could represent a city, and an arc connecting two nodes could represent the road. Thus, by following the arc it is easy to find the two cities. In a raster data model, each pixel would have to be scanned to find the arc and then the cities. Hence, vector data is more useful when operations related to topology are important.

Vector data and Computation:
Computation on vector data can be expensive because of the object representation. Processing vector data involves solving complex geometrical problems, e.g., finding the intersection of one polygon with another or finding the distance between objects. This cost is more evident when the data set is large. An example can be seen in Figure 1. To find the river closest to the patch A1, we would have to perform a geometrical calculation between the polygon formed by A1 and the other objects. This geometric calculation could be complex and computationally expensive.
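To give a feel for the geometric work involved, the following Python sketch computes the distance between the polygon A1 and the river R1 using the coordinates given earlier. It is one straightforward approach, not necessarily what a GIS package does. It assumes the two objects do not cross, in which case the closest approach between two segments always involves an endpoint of one of them. All function names are illustrative.

```python
import math


def point_segment_distance(p, a, b):
    """Distance from point p to the line segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:  # degenerate segment: a and b coincide
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the projection to its ends.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def segment_distance(a, b, c, d):
    """Distance between segments a-b and c-d (assumed not to cross)."""
    return min(
        point_segment_distance(a, c, d),
        point_segment_distance(b, c, d),
        point_segment_distance(c, a, b),
        point_segment_distance(d, a, b),
    )


def edges(points, closed):
    """Consecutive point pairs; a closed ring also joins last to first."""
    pairs = list(zip(points, points[1:]))
    if closed:
        pairs.append((points[-1], points[0]))
    return pairs


def polygon_linestring_distance(polygon, line):
    """Smallest distance over every (polygon edge, line segment) pair."""
    return min(
        segment_distance(a, b, c, d)
        for a, b in edges(polygon, closed=True)
        for c, d in edges(line, closed=False)
    )


A1 = [(2, 4), (2, 8), (3, 9), (4, 9), (5, 8), (5, 7)]
R1 = [(1, 1), (3, 3), (3, 4), (8, 9)]

distance = polygon_linestring_distance(A1, R1)
print(round(distance, 4))  # about 0.7071 for these coordinates
```

Even this small case compares 6 polygon edges with 3 river segments, or 18 pairs. The number of pairs grows with the product of the edge counts, and the calculation has to be repeated for every candidate river.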
Since most visual displays work on digitized data, vector data also requires an additional step of conversion to a digitized format before it can be displayed.

Vector data and Attributes:
In spatial applications, it is often necessary to associate an attribute with a geographical object. Some examples are the population of a city, the traffic on a road, and the elevation at a given point on Earth. Here population, traffic, and elevation are attributes related to an object. Since entities such as cities and roads are represented as objects, it is easy to associate them with these attributes. This makes it easy to store such associations in a database.

Taking the example shown in Figure 1, suppose that we need to associate an attribute value with the region A1. In the raster data model, all 16 pixels of A1 would have to be associated with the value. In the vector data model, only the object A1 needs to be associated with the value.

Vector data and Applications:
It is easy to write applications for vector data because the data is already represented as objects. If raster data were used, end users would have to deal with low-level details, or would have to be provided with an interface which converts the data from the raster data model to objects and vice versa. It is also easy to write operations on these objects. For example, to find the area of the region A1, we can simply use the geometric properties of the polygon.

Vector data and Data Modeling:
Since vector data is represented as objects, it can be easy to convert a logical model into a physical model. This makes it easy to design the database model. Vector data is easy to store in an object-oriented database system. The basic building blocks of vector data, such as Point, could be stored as objects.

Vector data and Spatial Networks:
Vector data is well suited to spatial networks.
Spatial networks involve calculations such as finding the shortest path between two points and finding nearest neighbors. Since these operations are mainly graph-related, they can be performed easily if the objects are in the form of nodes and arcs between the nodes. Vector data satisfies this requirement. Hence, it is preferred for modeling spatial networks.

Key Applications
Generally, the vector data format can be used in any application which has spatial data. Some of the key applications of vector data are listed below.

Computer Graphics and Animation
Vector data is used in computer graphics to represent images. Vector representations of images are generally considered to be of better quality than the traditional bitmap format. The quality of vector data does not degrade with scaling. This makes vector data a better choice for images such as logos, which are resized frequently. Images in vector data format are also preferred because they take less storage space. Most 3D animation uses vector data because it needs less storage space and offers good quality.

Geographic Information Systems
Geographic Information Systems are computer systems used for creating and managing geospatial data for applications such as land management. Most of these applications require associating attributes with spatial data, for example, relating a population to a city. Since it is easy to associate attributes with spatial objects in a vector data format, Geographic Information Systems prefer vector data in such cases.

Transportation Networks
In transportation networks, it is important to maintain spatially relative information between two objects. For example, the direction of a turn is spatially relative between two streets. Such information is easy to represent using vector data.

Cartography
Maps can be drawn accurately using vector data. Many cartographers prefer vector data because it can produce images without any distortion.
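For the spatial-network and transportation uses described above, storing roads as arcs between nodes means that standard graph algorithms can be applied directly. The sketch below uses Dijkstra's algorithm, a common choice that is not named in the original text. The four-city road network and its arc lengths are invented purely for illustration.

```python
import heapq


def shortest_path(arcs, source, target):
    """Dijkstra's algorithm over undirected arcs given as (node, node, length)."""
    adjacency = {}
    for u, v, length in arcs:
        adjacency.setdefault(u, []).append((v, length))
        adjacency.setdefault(v, []).append((u, length))

    best = {source: 0.0}    # shortest known distance to each node
    previous = {}           # predecessor on the best known route
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            break
        if dist > best.get(node, float("inf")):
            continue        # stale queue entry, a shorter route was found
        for neighbour, length in adjacency.get(node, []):
            candidate = dist + length
            if candidate < best.get(neighbour, float("inf")):
                best[neighbour] = candidate
                previous[neighbour] = node
                heapq.heappush(queue, (candidate, neighbour))

    if target not in best:
        return float("inf"), []

    # Walk the predecessors back from the target to rebuild the route.
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    path.reverse()
    return best[target], path


# Hypothetical road network: cities are nodes, roads are arcs with lengths.
roads = [
    ("A", "B", 4), ("A", "C", 1), ("C", "B", 2),
    ("B", "D", 5), ("C", "D", 8),
]

length, route = shortest_path(roads, "A", "D")
print(length, route)
```

With a raster representation the same query would first require recovering the nodes and arcs from individual pixels, which is why the vector form is the more natural fit here.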
Conclusion:
Spatial data can be represented using either a vector data model or a raster data model. A few properties of vector data make it a better choice than the raster data model. Vector data is more suited to discrete entities which require maintaining topological information. The object-oriented representation of data makes vector data easy to model in database systems and applications.

References:
1. Kersting, O. and J. Dollner. Interactive 3D visualization of vector data in GIS. In Proceedings of the 10th ACM international symposium on Advances in geographical information systems, pages 107-112, McLean, Virginia, 2002.
2. Khedker, U. and D. M. Dhamdhere. A generalized theory of bit vector data flow analysis. In ACM Transactions on Programming Languages and Systems (TOPLAS), Vol. 16, Issue 5, pages 1472-1511, 1994.
3. Haralick, R. M. A Spatial Data Structure for Geographic Information Systems. In H. Freeman and G. G. Pieroni, eds., Map Data Processing, Academic Press, New York, 1980.
4. Wu, Y. Raster, Vector, and Automated Raster-to-Vector Conversion. In A. R. Kenney and O. Y. Rieger, eds., Moving Theory into Practice: Digital Imaging for Libraries and Archives, Research Libraries Group, 2000.
5. Shekhar, S. and S. Chawla. Spatial Databases: A Tour.

Keywords:
Vector: A vector is a quantity which can be specified by a magnitude and a direction. It can also be considered a one-dimensional array in which the elements are arranged in order, thereby preserving the direction. For example, a polygon could be represented by a series of points in either clockwise or anti-clockwise order.
Raster: A digital image of a picture in which a value is stored at each pixel.
Data Structure: The organization of data in a computer, represented in a logical form. For example, to represent a table of records, the data structure used could be an array, with each record represented by an element of the array.
Data Modeling: One of the steps in designing a schema for a database.
Spatial Networks: A network of spatial entities. A road network is a good example of a spatial network. The entities involved in this example include the roads, the traffic on each road, and the connectivity between roads.