Delivering Location Intelligence with Spatial Data

White Paper

Published: August 2007
Updated: July 2008

Summary: The growing ability of businesses and consumers to quickly absorb large volumes of data, together with the increased availability of digital maps and spatially enabled applications, has created an unprecedented opportunity to incorporate geographic factors into decision-making processes and analysis. The new spatial support in Microsoft SQL Server™ 2008 can help you to make better decisions through visual analysis of location data that can be stored and manipulated in a SQL Server database...

Contents

Introduction
Comprehensive Spatial Support
    Spatial Models
    SQL Server 2008 Spatial Data Types
    Spatial Data Type Methods
High Performance Spatial Data Capabilities
Built-in Spatial Views
Location-Aware Application Extensibility
    Importing Spatial Data
    Using Spatial Data
Conclusion

Introduction

Today’s information workers and consumers deal with massive amounts of information of different kinds, from traditional tables of business data in spreadsheets and databases, to online media-based data such as video, photographs, and music. The recent trend towards mashup solutions, in which information and content from multiple sources are combined to create versatile online applications, is indicative of the way that computer users rely on highly integrated solutions to make sense of the vast amount of information that is available to them.

At the same time, advances in technology have led to the proliferation of geographical services and devices, including online mapping solutions such as Microsoft® Virtual Earth™, and inexpensive global positioning system (GPS) solutions. Technology that was once the preserve of geographic information system (GIS) specialists is now widely available to everyone.

These two factors bring new expectations and opportunities for software applications. The ubiquity of geographical services, and the increasing sophistication with which users consume data, mean that spatial information is just another component to be incorporated into a solution and used as a basis for making better decisions and providing higher-value services.

Spatial data can be used in many ways, as the following list of examples demonstrates:

- A retailer Web site can display the locations of all stores as pins on a map, and find the nearest store to a given ZIP Code.
- A sales manager can define geographic sales regions, and use them to match customers to sales representatives and perform analysis of sales performance.
- An architect can create plans for a new building, and overlay those plans onto a map of the proposed site.
- A driver can find the distance between two locations, and plan a route.
- A real estate agent can quickly identify properties that match a client’s requirements, such as houses over 20,000 square feet in size that are on the shore of Lake Washington.
- A mobile application can find all gas stations within 10 miles of a given location.

These examples represent only a few of the possibilities created by the integration of spatial data into software applications.

SQL Server 2008 provides support for geographical data through the inclusion of new spatial data types, which you can use to store and manipulate location-based information. The spatial support in SQL Server 2008 can help users to make better decisions through analysis of location data in scenarios such as:

- Consumer-focused location-based information
- Customer-base management and development
- Environmental-related data impact, analysis, and planning
- Financial and economic analysis in communities
- Government-based planning and development analysis
- Market segmentation and analysis
- Scientific research study design and analysis
- Real-estate development and analysis

This white paper provides a high-level introduction to the comprehensive spatial data support in SQL Server 2008, and describes its high-performance spatial capabilities and location-aware application extensibility.

Comprehensive Spatial Support

SQL Server 2008 provides comprehensive spatial support through new data types. To understand how you can use these data types to store location-based data, you first need to understand a little of how spatial data, and in particular geospatial data, works.

Spatial Models

Spatial data is used to represent points, lines, and areas on a surface. Most commonly, these elements relate to actual physical locations on Earth, so they can be described as geospatial data. Most of us are familiar with this concept through the use of globes and maps, which generally show multiple geographic features and their relative locations.

Geodetic Spatial Models

The problem with describing a location on a planetary surface is that planets are not flat. Earth is a very complex object that can be reasonably approximated by an oblate spheroid, a (slightly) flattened sphere. An accurate representation of the Earth is usually manifested as a globe, in which locations on the surface of the planet are described in terms of their latitude and longitude, which are measured in degrees from the equator and the prime meridian, respectively. This approach to modeling geographic locations is called a geodetic model, and provides an accurate way to define locations and objects on a globe, as shown in Figure 1. There are a number of different geodetic models in use throughout the world, including the Airy 1830 ellipsoid used in the United Kingdom’s Ordnance Survey geographic system, and the WGS84 ellipsoid used by the world’s GPS solutions.

Figure 1: A geodetic model

Planar Spatial Models

While a geodetic model provides the most accurate way to represent geographic features, working with an ellipsoid and taking planetary curvature into account when calculating distances was extremely difficult before the advent of computing, when people had to work with flat maps. Historically, it has been much easier to work with two-dimensional surfaces, or planes, so it is common to find location-based data represented in various flat (planar) models. To work with geospatial data on a flat two-dimensional surface, a projection is created to flatten the geographical objects on the spheroid. As with geodetic models, there are many mathematical models used to project the geographical features of Earth onto a flat surface, including the Mercator projection, the Peters projection, and the Lambert Conformal Conic projection. Figure 2 shows a planar model of the Earth based on the Mercator projection.

Figure 2: A planar model

Regardless of which projection is used, converting geographical data from a spheroid to a flat surface always results in some distortion of the shape, size, or position (or all three) of the geographic features in the resulting map. This is why, in the projection shown in Figure 2, Greenland is shown as being almost the same size as the United States of America, even though in reality its land mass is much smaller. Generally, the larger the surface area being projected, the more distortion occurs, with the features at the furthest edges of the map exhibiting more distortion than those at the center. For this reason, planar models work best for small geographical areas such as individual countries, states, and towns, or for non-projected spatial surfaces such as interior floor plans.

SQL Server 2008 Spatial Data Types

SQL Server 2008 provides the geography data type for geodetic spatial data, and the geometry data type for planar spatial data. Both are implemented as Microsoft .NET Framework Common Language Runtime (CLR) types, and can be used to store different kinds of geographical elements such as points, lines, and polygons. Both data types provide properties and methods that you can use to perform spatial operations such as calculating distances between locations and finding geographical features that intersect one another (such as a river that flows through a town).

The geography Data Type

The geography data type provides a storage structure for spatial data that is defined by latitude and longitude coordinates.
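As a minimal sketch of how latitude and longitude coordinates are held in a geography value, the following batch builds two points and measures the distance between them. The city coordinates are approximate, illustrative values rather than data from this paper, and SRID 4326 (WGS84) is an assumption. For contrast, the same coordinates are also loaded into geometry values, where the result is only a planar measurement in degree units and has no direct physical meaning.

```sql
-- Approximate coordinates, chosen only for illustration.
-- geography::Point takes latitude first, then longitude, then the SRID.
DECLARE @seattle geography = geography::Point(47.61, -122.33, 4326);
DECLARE @london  geography = geography::Point(51.51, -0.13, 4326);

-- For SRID 4326, STDistance on geography values returns meters
-- measured over the ellipsoid.
SELECT @seattle.STDistance(@london) / 1000.0 AS GeodeticDistanceKm;

-- geometry::Point takes x, then y, then the SRID (0 = no spatial reference).
DECLARE @seattlePlanar geometry = geometry::Point(-122.33, 47.61, 0);
DECLARE @londonPlanar  geometry = geometry::Point(-0.13, 51.51, 0);

-- On geometry values the same method returns a straight-line distance
-- in the units of the plane, which here are degrees.
SELECT @seattlePlanar.STDistance(@londonPlanar) AS PlanarDistanceInDegrees;
```

The two results are not comparable, which is the point of the sketch: the choice of data type determines whether the curvature of the Earth is taken into account.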
Typical uses of this kind of data include defining roads, buildings, or geographical features as vector data that can be overlaid onto a raster-based map that takes the curvature of the Earth into account, and calculating true great-circle distances and trajectories for air transport, where the distortion inherent in a planar model would cause unacceptable levels of inaccuracy.

The geometry Data Type

The geometry data type provides a storage structure for spatial data that is defined by coordinates on an arbitrary plane. This kind of data is commonly used in regional mapping systems, such as the state plane system defined by the United States government, or for maps and interior floor plans where the curvature of the Earth does not need to be taken into account.

The geometry data type provides properties and methods that are aligned with the Open Geospatial Consortium (OGC) Simple Features Specification for SQL and enable you to perform operations on geometric data that produce industry-standard behavior.

Spatial Data Type Methods

Both spatial data types in SQL Server 2008 provide a comprehensive set of instance and static methods that you can use to perform queries and operations on spatial data. For example, the following code sample creates two tables for a city mapping application; one contains geometry values for the districts in the city, and the other contains geometry values for the streets in the city. A query then retrieves the city streets and the districts that they intersect.

    -- Districts are stored as polygons on an arbitrary plane (SRID 0).
    CREATE TABLE Districts
    (
        DistrictId   int IDENTITY (1,1),
        DistrictName nvarchar(20),
        DistrictGeo  geometry
    );
    GO

    -- Streets are stored as line strings on the same plane.
    CREATE TABLE Streets
    (
        StreetId   int IDENTITY (1,1),
        StreetName nvarchar(20),
        StreetGeo  geometry
    );
    GO

    -- STGeomFromText builds a geometry value from Well-Known Text and an SRID.
    INSERT INTO Districts (DistrictName, DistrictGeo)
    VALUES ('Downtown',
        geometry::STGeomFromText('POLYGON ((0 0, 150 0, 150 150, 0 150, 0 0))', 0));

    INSERT INTO Districts (DistrictName, DistrictGeo)
    VALUES ('Green Park',
        geometry::STGeomFromText('POLYGON ((300 0, 150 0, 150 150, 300 150, 300 0))', 0));

    INSERT INTO Districts (DistrictName, DistrictGeo)
    VALUES ('Harborside',
        geometry::STGeomFromText('POLYGON ((150 0, 300 0, 300 300, 150 300, 150 0))', 0));

    INSERT INTO Streets (StreetName, StreetGeo)
    VALUES ('First Avenue',
        geometry::STGeomFromText('LINESTRING (100 100, 20 180, 180 180)', 0));

    INSERT INTO Streets (StreetName, StreetGeo)
    VALUES ('Mercator Street',
        geometry::STGeomFromText('LINESTRING (300 300, 300 150, 50 50)', 0));
    GO

    -- Return each street together with every district that it intersects.
    SELECT s.StreetName, d.DistrictName
    FROM Districts AS d
        JOIN Streets AS s
            ON s.StreetGeo.STIntersects(d.DistrictGeo) = 1
    ORDER BY s.StreetName, d.DistrictName;

The results from this query are shown in the following table.

Query Results

    StreetName        DistrictName
    ---------------   ------------
    First Avenue      Downtown
    First Avenue      Harborside
    Mercator Street   Downtown
    Mercator Street   Green Park
    Mercator Street   Harborside

High Performance Spatial Data Capabilities

The spatial data types in SQL Server 2008 are implemented as CLR system types. SQL Server 2008 increases the maximum size for CLR types in the database from the 8,000-byte limit that was imposed in SQL Server 2005 to 2 GB, which makes it possible to store extremely complex spatial data elements, such as polygons that are defined by a large number of points.
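To get a feel for how the storage size of a spatial value grows with the number of points that define it, a sketch such as the following can be run in a SQL Server 2008 database. The buffered point is only a convenient way to generate a polygon with many vertices; it is an illustrative assumption and not part of the sample above.

```sql
-- A simple polygon with five points (the closing point repeats the first).
DECLARE @square geometry =
    geometry::STGeomFromText('POLYGON ((0 0, 150 0, 150 150, 0 150, 0 0))', 0);

-- Buffering a point yields a polygon that approximates a circle,
-- so it is defined by many more points than the square.
DECLARE @circle geometry = geometry::Point(0, 0, 0).STBuffer(100);

-- STNumPoints counts the points in the instance; DATALENGTH reports
-- the number of bytes used to store the value.
SELECT 'square' AS Shape,
       @square.STNumPoints() AS PointCount,
       DATALENGTH(@square)   AS StorageBytes
UNION ALL
SELECT 'buffered point',
       @circle.STNumPoints(),
       DATALENGTH(@circle);
```

Detailed coastlines or administrative boundaries can contain far more points than either shape here, which is where the larger size limit becomes relevant.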

By storing spatial data in relational tables, SQL Server 2008 makes it possible to combine spatial data with any other kind of business data. This removes the need to maintain a separate, dedicated spatial data store and enables high-performance queries that do not need to combine data from multiple external sources.

Performance of queries against spatial data is further enhanced by the inclusion of spatial index support in SQL Server 2008. You can index spatial data with an adaptive multi-level grid index that is integrated into the SQL Server database engine. Spatial indexes consist of a grid-based hierarchy in which each level of the index subdivides the grid sector that is defined in the level above. A conceptual model of a spatial index is shown in Figure 3.

Figure 3: A spatial index

The SQL Server query optimizer makes cost-based decisions on which indexes to use for a given query. Because spatial indexes are an integral part of the database engine, SQL Server can make cost-based decisions about whether or not to use a particular spatial index, just as it does for any other index.

Built-in Spatial Views

Use the new spatial results tab to easily view spatial query results directly from within SQL Server Management Studio. This tab offers simple projection and zoom/pan capabilities for quick investigation.

Figure 4: A spatial views tab within Management Studio

Location-Aware Application Extensibility

The geography and geometry data types are supported in the various SQL Server 2008 editions, which scale from single-user desktop applications to enterprise-level data stores and enable you to build geospatial solutions of any scale. This broad support brings spatial data capabilities to all kinds of applications without the need for expensive proprietary geospatial solutions.
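The multi-level grid index described under High Performance Spatial Data Capabilities can be sketched as follows. The Parcels table, its columns, the bounding box, and the grid densities are all hypothetical values chosen for illustration. The sketch assumes a table with a clustered primary key, which a spatial index requires, and supplies a bounding box because an index on a geometry column needs explicit limits for its top-level grid.

```sql
-- Hypothetical table; a spatial index requires a clustered primary key.
CREATE TABLE Parcels
(
    ParcelId   int IDENTITY (1,1) PRIMARY KEY CLUSTERED,
    ParcelName nvarchar(20),
    ParcelGeo  geometry
);
GO

-- Four grid levels, each subdividing the cells of the level above.
-- The bounding box and densities are illustrative, not recommendations.
CREATE SPATIAL INDEX SIdx_Parcels_ParcelGeo
ON Parcels (ParcelGeo)
USING GEOMETRY_GRID
WITH (
    BOUNDING_BOX = (xmin = 0, ymin = 0, xmax = 300, ymax = 300),
    GRIDS = (LEVEL_1 = MEDIUM, LEVEL_2 = MEDIUM, LEVEL_3 = MEDIUM, LEVEL_4 = MEDIUM),
    CELLS_PER_OBJECT = 16
);
GO

-- A spatial predicate of this form is a candidate for the index;
-- the optimizer decides on cost whether to use it.
DECLARE @area geometry =
    geometry::STGeomFromText('POLYGON ((0 0, 150 0, 150 150, 0 150, 0 0))', 0);

SELECT ParcelName
FROM Parcels
WHERE ParcelGeo.STIntersects(@area) = 1;
```

Suitable grid densities and bounding boxes depend on the size and distribution of the data, so the values above should be treated as placeholders.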

Importing Spatial Data

The geography and geometry data types include methods for importing and exporting data in the Well-Known Text (WKT) and Well-Known Binary (WKB) formats for geographic data that are defined by the OGC, as well as the commonly used Geography Markup Language (GML) format, which makes it easy to import geographic data from any source that supports these standards. Geographical data is readily available from a number of government and commercial sources, and can be exported relatively easily from many existing GIS applications and GPS systems.

Microsoft maintains close relationships with a number of third-party GIS vendors and geospatial data solution providers, which helps to ensure strong compatibility between SQL Server 2008 and a wide range of industry-proven tools and utilities for importing, exporting, and manipulating spatial data.

Using Spatial Data

As already demonstrated in this white paper, the geography and geometry data types provide methods that you can use to perform spatial operations on your data. Because these data types are implemented as .NET CLR types, you can easily create client applications that consume spatial data from SQL Server through Microsoft data programmability technologies and use client-side managed code to call methods on instances of the spatial types. This enables you to build powerful applications to work with your spatial data and integrate it with other location-aware applications and services such as Virtual Earth.

For example, Figure 5 shows an application in which spatial data from SQL Server 2008 is integrated with Virtual Earth. The application shows the census blocks in a ZIP Code region with the number of restaurants computed. The number of restaurants in each block, relative to the size of the block, yields a density value, which appears in the display as a region shaded from white (low density) to red (highest density).
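The density calculation described above can be sketched in Transact-SQL. The table variables, block outlines, and restaurant locations below are invented for illustration and are not the data behind the Virtual Earth application. SRID 4326 is assumed, so STArea returns square meters, and each polygon ring is listed counterclockwise, as the geography type expects for an exterior ring.

```sql
-- Hypothetical census blocks and restaurants, for illustration only.
DECLARE @Blocks TABLE (BlockId int PRIMARY KEY, Shape geography);
DECLARE @Restaurants TABLE (RestaurantId int PRIMARY KEY, Location geography);

-- WKT for geography lists longitude first, then latitude.
INSERT INTO @Blocks (BlockId, Shape)
VALUES
    (1, geography::STGeomFromText(
        'POLYGON ((-122.36 47.65, -122.35 47.65, -122.35 47.66, -122.36 47.66, -122.36 47.65))', 4326)),
    (2, geography::STGeomFromText(
        'POLYGON ((-122.35 47.65, -122.34 47.65, -122.34 47.66, -122.35 47.66, -122.35 47.65))', 4326));

-- geography::Point takes latitude first, then longitude.
INSERT INTO @Restaurants (RestaurantId, Location)
VALUES
    (1, geography::Point(47.655, -122.355, 4326)),
    (2, geography::Point(47.657, -122.352, 4326)),
    (3, geography::Point(47.654, -122.345, 4326));

-- Count the restaurants that fall in each block, then divide by the
-- block's area (converted from square meters to square kilometers).
SELECT b.BlockId,
       c.RestaurantCount,
       c.RestaurantCount / (b.Shape.STArea() / 1000000.0) AS RestaurantsPerSqKm
FROM @Blocks AS b
    CROSS APPLY (SELECT COUNT(*) AS RestaurantCount
                 FROM @Restaurants AS r
                 WHERE r.Location.STIntersects(b.Shape) = 1) AS c
ORDER BY b.BlockId;
```

A client application could then map the RestaurantsPerSqKm value of each block to a position on a white-to-red shading scale before drawing the block outline on the map.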

Figure 5: Spatial data integrated with Virtual Earth

Conclusion

As the integration of geospatial information into applications becomes more prevalent, application developers will increasingly require database systems that can store and manipulate spatial data. With the introduction of the geography and geometry data types, SQL Server 2008 provides a comprehensive, high-performance, and extensible data storage solution for spatial data, and enables organizations of any scale to integrate geospatial features into their applications and services.

For more information:

Microsoft SQL Server 2008: http://www.microsoft.com/sql

The information contained in this document represents the current view of Microsoft Corporation on the issues discussed as of the date of publication. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information presented after the date of publication.

This white paper is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, IN THIS DOCUMENT.

Complying with all applicable copyright laws is the responsibility of the user.
Without limiting the rights under copyright, no part of this document may be reproduced, stored in, or introduced into a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopying, recording, or otherwise), or for any purpose, without the express written permission of Microsoft Corporation. Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual property rights covering subject matter in this document. Except as expressly provided in any written license agreement from Microsoft, the furnishing of this document does not give you any license to these patents, trademarks, copyrights, or other intellectual property. © 2008 Microsoft Corporation. All rights reserved. Microsoft, PowerShell, SharePoint, SQL Server, Visual Basic, Visual C#, Visual Studio, Windows, Windows Server, and the Server Identity Logo are trademarks of the Microsoft group of companies. All other trademarks are property of their respective owners.