GI Systems and Science
January 30, 2012

Points to Cover
  - Recap of what we covered so far
  - The concept of a database
  - Database Management System (DBMS)
  - Database data models
      - Relational database model
      - Object-oriented approach
  - Relationship between spatial and attribute data in GIS
  - Data management operations in ArcGIS

Spatial Data Modeling
  Stages of data modeling in GIS, with the available options:
  - Identifying the spatial features from the real world that are of interest in the context of the research question
  - Separating the real world into layers of features
  - Choosing how to represent the features
      - Points
      - Lines
      - Polygons
      - Networks
      - Surfaces
  - Choosing an appropriate spatial data model
      - Raster model
      - Vector model
  - Selecting an appropriate spatial data structure to store the model within the computer
      - Basic raster data structure
      - Compacted raster data structure
      - Basic vector data structure
      - Point dictionary structure
      - Topological vector data structure

Data Types in GIS
  - Spatial data
  - Attribute data

Concept of Database
  - A GIS can be described as a spatially-enabled database
      - Conventional databases store attribute data
      - A GIS stores both attribute and spatial data
  - A database is a set of structured data that are related to each other in some way
      - Organized filing cabinet
      - Phone book
      - On-line database of academic publications

Concept of Database (continued)
  - Many human activities produce data that are stored and managed in a database environment
  - Our society generates tremendous amounts of data on a daily basis
  - Data have become a valuable commodity
  - To realize their value, the data need to be managed and shared efficiently

Concept of Database (continued)
  - A database is one approach to turning data into information
  - A database is more than just a container for storing data
      - It organizes data into a more meaningful and usable form
      - It has analytical capabilities

Concept of Database (continued)
  - The database approach provides the following benefits
      - Ease of accessing data
      - Prevention of unnecessary duplication of data
      - Data are stored independently of the applications for which they are used
      - Secure, controlled access to data
      - Standards facilitate data exchange
      - Data in the database can be shared by different users
  - Manual databases versus computer databases
      - Which are more effective and efficient?

Database Management System
  - A DBMS is a computer program that controls the storage, retrieval and modification of data in a database (Dale and McLaughlin, 1988)
  - Comprises tools that perform these functions
  - Allows users to deal with the data without knowing much about the database itself
  Figure 4.4 (Source: Heywood et al., 2011)

Relational Database Data Model
  - A DBMS manages data that are organized using a database data model
      - A set of rules about how the objects and the relationships between them should be represented
  - A number of different data models are used for handling attribute data in GIS
  - The relational data model is the one most widely used in GIS
  - The relational data model is based on concepts proposed by E. F. Codd (1970)

Relational Database Data Model (continued)
  - Data are organized and stored in a series of two-dimensional tables, each of which contains records for one type of entity
  - Each entity has a unique identifier value assigned to it
  - Unique identifiers make it possible to link (relate) data in two or more different tables
  - This structure makes it possible to apply queries to one or more tables

Relational Database Data Model (continued)
  - Types of relationships possible between entities in a relational database
  Figure 4.7 (Source: Heywood et al., 2011)

Relational Database Data Model (continued)
  Figure 4.8 (Source: Heywood et al., 2011)

Relational Database Data Model (continued)
  - Querying a relational database
      - Queries are built on expressions based on relational algebra, which in turn is based on Boolean logic
      - SQL (Structured Query Language), the standard query language, was developed to facilitate the querying of relational databases
      - Advantages: completeness, simplicity, pseudo-English language style
      - Disadvantages: it was not developed to handle geographical concepts such as ‘near to’, ‘far from’ or ‘connected to’

Object-Oriented Database Approach
  - A more realistic approach to representing spatial entities in the database environment
  - Encapsulation: Object = State + Behaviour
      - State: the set of values of the attributes describing a spatial entity
      - Behaviour: the methods of operating on it
  - Composite object
  - Hierarchy of objects
      - Subclass
      - Superclass
  Figure 4.17 (Source: Heywood et al., 2011)

Object-Oriented Database Approach (continued)
  - Arranging objects into hierarchies allows differential assignment of behaviours (inheritance)
      - Behaviour of subclass objects = ‘own’ behaviour + ‘superclass’ behaviour
  - Benefits of the OO approach for GIS
      - No differentiation between spatial and attribute data
      - Works better for graphic operations
  - Disadvantages of the OO approach
      - Represents the world as a series of rigidly bounded objects
      - Still under development

Data management in ArcGIS
  - The primary data storage mechanism in ArcGIS is the geodatabase
      - A collection of geographic datasets of various types held in a common ‘container’ such as a database file or a database application
  - Based on an object-relational model
      - Relations (tables) function as objects
      - Behavior is supplied through the geodatabase application logic, implemented as a series of system tables
  - A key geodatabase strategy is to leverage the database management system (DBMS)
      - Extends the application of SQL to feature geometry

Data management in ArcGIS (continued)
  - Geodatabases comprise
      - Three primary dataset types
          - Feature classes
          - Raster datasets
          - Tables
      - Database schema: metatables containing information about object behavior and relationships, maintaining data integrity
          - Topologies
          - Networks
          - Subtypes
  Source: ArcGIS 10 Help files

Data management in ArcGIS (continued)
  - Types of geodatabases
      - File geodatabases
          - Stored as folders in a file system
          - Each dataset is held as a file that can scale up to 1 TB in size
      - Personal geodatabases
          - Datasets are stored within a Microsoft Access data file
          - Limited in size to 2 GB
      - ArcSDE geodatabases
          - Stored in a relational database using Oracle, Microsoft SQL Server, IBM DB2, IBM Informix, or PostgreSQL
          - Multiuser geodatabases that are unlimited in size

Data management in ArcGIS (continued)
  Figure 4.9 (Source: Heywood et al., 2011)

Relationship between data in GIS
  - Raster datasets
      - Simple raster datasets: no separate attribute data table
      - Raster datasets with attribute tables
          - Within a geodatabase, the raster attribute table is saved within the raster dataset and hidden from the user
  Source: ArcGIS 10 Help files

Relationship between data in GIS (continued)
  - Vector datasets

Data Management Operations
  - The supported attribute column types in the geodatabase
  Source: ArcGIS 10 Help files

Data Management Operations (continued)
  - Queries
      - Attribute query
          - Used to find features based on a particular attribute
      - Locational query
          - Used to find features with locations that meet certain conditions
          - Works with four types of relationships: near, adjacent to, intersect, and inside
      - The result of a query is a set of selected features

Data Management Operations (continued)
  - Joins
      - Associating two or more tables based on a common field (key)
      - One-to-one relationship
      - One-to-many relationship
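
Example (illustrative only)
  The attribute query and the join described above can be sketched with plain SQL. This is a minimal sketch using Python's built-in sqlite3 module rather than ArcGIS itself. The table names (zones, parcels), the column names and the values are hypothetical and chosen only for illustration. One zone relates to many parcels through the common key field zone_id, which gives a one-to-many relationship.

```python
import sqlite3

# In-memory relational database; nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Two hypothetical attribute tables that share the key field zone_id.
conn.executescript("""
CREATE TABLE zones (
    zone_id   INTEGER PRIMARY KEY,   -- unique identifier of each zone
    zone_name TEXT
);
CREATE TABLE parcels (
    parcel_id INTEGER PRIMARY KEY,   -- unique identifier of each parcel
    zone_id   INTEGER REFERENCES zones(zone_id),
    area_ha   REAL
);
INSERT INTO zones VALUES (1, 'Residential'), (2, 'Commercial');
INSERT INTO parcels VALUES (101, 1, 0.8), (102, 1, 2.5), (103, 2, 4.0);
""")

# Attribute query: select records based on the value of one attribute.
large_parcels = conn.execute(
    "SELECT parcel_id FROM parcels WHERE area_ha > 2.0 ORDER BY parcel_id"
).fetchall()

# Join: associate the two tables through the common field (key).
# Each zone can match many parcels (one-to-many).
parcels_with_zone = conn.execute(
    """
    SELECT p.parcel_id, z.zone_name
    FROM parcels AS p
    JOIN zones AS z ON p.zone_id = z.zone_id
    ORDER BY p.parcel_id
    """
).fetchall()

print(large_parcels)
print(parcels_with_zone)
```

  Note that both queries work on attribute values only. A locational query (near, adjacent to, intersect, inside) needs spatial operators that standard SQL does not provide, which is the limitation noted on the SQL slide.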