Geographic Information Systems: an introduction
Week III: Data models cont’d
The Nature of Geographic Data

Last week
• Representing geographic data
  – E.g. continuous fields, discrete objects
• Data models
  – E.g. raster, vector

Data Model Levels
[Diagram: levels run from human-oriented to computer-oriented, with increasing abstraction]
• Reality
• Conceptual Model
• Logical Model
• Physical Model

Logical Data Models & Applications
• CAD: engineering design
• Graphical: simple mapping
• Image: image processing and analysis
• Raster: spatial analysis / modeling
• TIN: surface / terrain analysis / modeling
• Geo-relational: geoprocessing geometric features
• Object: features with behavior

Raster and Vector Models
• Raster: implementation of the field conceptual model
  – Array of cells used to represent objects
  – Useful as background maps and for spatial analysis
• Vector: implementation of the discrete object conceptual model
  – Point, line and polygon representations
  – Widely used in cartography and network analysis

Rasters and Vectors
• How to represent phenomena conceived as fields or discrete objects?
• Raster
  – Divide the world into square cells
  – Register the corners to the Earth
  – Represent discrete objects as collections of one or more cells
  – Represent fields by assigning attribute values to cells
  – More commonly used to represent fields than discrete objects

[Figure: Raster representation. Each color represents a different value of a nominal-scale field denoting land cover class. Legend: mixed conifer, Douglas fir, oak savannah, grassland.]
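As a rough illustration of the raster idea above (a field stored as attribute values in square cells), the following Python sketch keeps a small land cover raster as a nested list and totals the area of each class. The class codes, the grid values, and the 30 m cell size are all hypothetical and are not taken from the figure.

```python
from collections import Counter

# Hypothetical class codes for the four legend entries.
LAND_COVER = {
    1: "Mixed conifer",
    2: "Douglas fir",
    3: "Oak savannah",
    4: "Grassland",
}

# Hypothetical 4 x 5 raster: each cell stores one nominal class code.
GRID = [
    [1, 1, 2, 2, 2],
    [1, 1, 2, 2, 4],
    [3, 1, 2, 4, 4],
    [3, 3, 4, 4, 4],
]

CELL_SIZE_M = 30.0  # assumed cell edge length in metres


def class_areas(grid, cell_size):
    """Return the total area (square metres) covered by each class."""
    counts = Counter(code for row in grid for code in row)
    cell_area = cell_size * cell_size
    return {LAND_COVER[code]: n * cell_area for code, n in counts.items()}


areas = class_areas(GRID, CELL_SIZE_M)
for name, area in sorted(areas.items()):
    print(f"{name}: {area:.0f} m^2")
```

Because the field is nominal, the only sensible summaries are counts and areas per class; averaging the codes would be meaningless.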
Characteristics of Rasters
• Pixel size
  – The size of the cell or picture element, defining the level of spatial detail
  – All variation within pixels is lost
• Assignment scheme
  – The value of a cell may be an average over the cell, a total within the cell, or the commonest value in the cell
  – It may also be the value found at the cell’s central point

Vector Data
• Used to represent points, lines, and areas
• All are represented using coordinates
  – One per point
  – Areas as polygons
    • Straight lines between points, connecting back to the start
    • Point locations recorded as coordinates
  – Lines as polylines
    • Straight lines between points

Vector Data
[Figure: vector data elements: points, lines, polygons, nodes, annotation]

Vector - Land Records
[Figure: a surveyed feature with measured dimensions, showing the GIS feature, survey links, survey points, and computation]

Raster vs Vector
• Volume of data
  – Raster becomes more voluminous as cell size decreases
  – "Raster is vaster"
• Source of data
  – Remote sensing and elevation data come in raster form
  – Vector is favored for administrative data
  – Vector is more appropriate when storing large amounts of attribute information
• Software
  – Some GIS are better suited to raster, some to vector

Data Model Levels
[Diagram: levels run from human-oriented to computer-oriented, with increasing abstraction]
• Reality
• Conceptual Model
• Logical Model
• Physical Model

Data Models and Spatial Analysis
• Raster v. vector?
• Picture of a large burn
• Picture of an urban landscape
• Picture of burning houses
• Map of burning houses

Geo-relational Model
Formats of feature data models in ArcGIS
• Georelational models
  – Coverages
  – Shapefiles
• Database Management System (DBMS)
  – Geodatabase

Coverages
Traditional model for complex (multiple feature types) geoprocessing environments
[Figure: coverage feature classes, grouped as primary; composite (routes, sections, regions); and secondary (annotation, tics, links)]

Shapefiles
Simple, open, “flat-file” based format for single data features

Geodatabase
Object-based data model where all features are contained in a database; features can exhibit “behavior”

Compression techniques
• No compression
• Run length encoding (lossless)

Raster formats
ADRG - National Imagery and Mapping Agency (NIMA)'s ARC Digitized Raster Graphics
BIL - Band Interleaved by Line (image format linked with satellite derived imagery)
CADRG - National Imagery and Mapping Agency (NIMA)'s Compressed ARC Digitised Raster Graphics (nominal compression of 55:1 over ADRG)
CIB - National Imagery and Mapping Agency (NIMA)'s Controlled Image Base (type of Raster Product Format)
Digital raster graphic (DRG) - digital scan of a paper USGS topographic map
ECW - Enhanced Compressed Wavelet (from ERMapper). A compressed wavelet format, often lossy.
ESRI grid - binary and ASCII raster formats used by ESRI
GeoTIFF - TIFF variant enriched with GIS relevant metadata
IMG - ERDAS IMAGINE image file format
MrSID - Multi-Resolution Seamless Image Database (by Lizardtech). A compressed wavelet format, often lossy.

Vector formats
Geography Markup Language (GML) - XML based open standard (by OpenGIS) for GIS data exchange
DXF - Contour elevation plots in AutoCAD DXF format
Shapefile - ESRI's open, hybrid vector data format using SHP, SHX and DBF files
Simple Features - Open Geospatial Consortium specification for vector data
MapInfo TAB format - MapInfo's vector data format using TAB, DAT, ID and MAP files
National Transfer Format (NTF) - mostly used by the UK Ordnance Survey
TIGER - Topologically Integrated Geographic Encoding and Referencing
Vector Product Format - National Imagery and Mapping Agency (NIMA)'s format of vectored data for large geographic databases
GeoMedia - Intergraph's Microsoft Access based format for spatial vector storage
ISFC - Intergraph's Microstation based CAD solution attaching vector elements to a relational Microsoft Access database
Personal Geodatabase - ESRI's closed, integrated vector data storage strategy using Microsoft's Access MDB format
Coverage - ESRI's closed, hybrid vector data storage strategy. Legacy ArcGIS Workstation / ArcInfo format with reduced support in the ArcGIS Desktop lineup

Grid formats (for elevation)
USGS DEM - The USGS' Digital Elevation Model
DTED - National Imagery and Mapping Agency (NIMA)'s Digital Terrain Elevation Data
GTOPO30 - Large complete Earth elevation model at 30 arc seconds
SDTS - Spatial Data Transfer Standard, the USGS' successor to DEM

Other formats
Binary Terrain - The Virtual Terrain Project's Binary Terrain format
Well-known text (WKT) - text encoding for geometries and coordinate reference systems (ESRI stores the projection description in a *.prj file)
Well-known binary (WKB) - binary encoding of geometries

Geo-relational Model
Formats of feature data models in ArcGIS
• Georelational models
  – Coverages
  – Shapefiles
• Object model
  – Geodatabase

Coverages
Traditional model for complex geoprocessing environments
• Topologically correct
• Data processing and spatial analysis
• Complex data structure
• Difficult interchange (E00)
• Multiple feature types
[Figure: coverage feature classes, grouped as primary; composite (routes, sections, regions); and secondary (annotation, tics, links)]

Shapefiles
Simple, open, “flat-file” based format for single data features
• Single feature type within one file
• No topology
• Easy to transport
• Open

Geodatabase
Object-based data model where all features are contained in a database; features can exhibit “behavior”
• Multiple features stored as tables
• Data processing topology
• Object oriented instead of restricted to points, lines, polygons
• Resides in a traditional database

Topology
• A field within mathematics
• The study of the general abstract nature of continuity or "closeness" on spaces
• System administrators describe computer network layouts as topologies
  – Bus topology
  – Ring topology
  – Star topology
• Leonhard Euler
  – "The solution of a problem relating to the geometry of position", 1736
  – Euler was aware that he was dealing with a different type of geometry where distance was not relevant
• Consider the problem of building a fusion reactor which confines a plasma by a magnetic field

Topology
• Science and mathematics of geometric relationships
  – Simple features + topological rules
  – Connectivity
  – Adjacency
  – Shared nodes / edges
• Topology uses
  – Data validation
  – Spatial analysis (e.g. network tracing, polygon adjacency)

Topology and GIS
• Topology in GIS is generally defined as the spatial relationships between adjacent or neighboring features
• Two types of topology:
  – Polygon topology
  – Line topology
• Three advantages of incorporating topology in GIS databases: data management, data correction and spatial analysis

Data correction and storage

Topology and spatial analysis
• Adjacency
• Connectivity (network models)
• Containment

Polygon Topology Model

Polygon Topology

Contiguity

Definitions
• A method of analysis is spatial if the results depend on the locations of the objects being analyzed
  – Move the objects and the results change
  – Results are not invariant under relocation
• Spatial analysis requires both attributes and locations of objects
  – A GIS has been designed to store both

Spatial Autocorrelation
• Spatial autocorrelation is determined both by similarities in position and by similarities in attributes
  – Sampling interval
  – Self-similarity

Tobler

Spatial autocorrelation measures
• n: number of objects in the sample
• i, j: any two of the objects
• z_i: the value of the attribute of interest for object i
• c_ij: the similarity of i’s and j’s attributes
• w_ij: the similarity of i’s and j’s locations
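
One way to combine the symbols n, z, c and w into a single number is Moran's I, sketched below in Python. The slides do not name a specific statistic, so the choice of Moran's I is an assumption. Here c for a pair of objects is taken as the product of their deviations from the mean, and w is a hypothetical neighbour matrix for four objects in a row (1 for immediate neighbours, 0 otherwise).

```python
def morans_i(z, w):
    """Moran's I for attribute values z and a location-similarity matrix w."""
    n = len(z)
    mean = sum(z) / n
    dev = [value - mean for value in z]

    # Attribute similarity c_ij is taken as the product of deviations,
    # weighted by the location similarity w_ij and summed over all pairs.
    cross = sum(
        w[i][j] * dev[i] * dev[j]
        for i in range(n)
        for j in range(n)
        if i != j
    )
    total_weight = sum(w[i][j] for i in range(n) for j in range(n) if i != j)
    variance_sum = sum(d * d for d in dev)
    return (n / total_weight) * (cross / variance_sum)


# Hypothetical example: four objects in a row, w_ij = 1 for immediate neighbours.
CHAIN_WEIGHTS = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

smooth = morans_i([1, 2, 3, 4], CHAIN_WEIGHTS)       # neighbours are similar
alternating = morans_i([1, 4, 1, 4], CHAIN_WEIGHTS)  # neighbours are dissimilar
print(smooth, alternating)
```

Values above zero indicate that nearby objects tend to have similar attribute values; values below zero indicate that neighbours tend to differ. Swapping the attribute values between locations changes the result, which is what makes the measure spatial in the sense of the Definitions slide.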