Intro to GIS

What is GIS?
• A computer system for
  - collecting,
  - storing,
  - manipulating,
  - analyzing,
  - displaying, and
  - querying
  geographically related information.
• Google Earth is an online GIS system

History of GIS
• 1963-1977, innovation: Canadian land inventory system, Harvard Laboratory for Computer Graphics and Spatial Analysis, US Census Bureau
• 1981-1999: ArcInfo, GPS, MapInfo, TIGER, NSDI, MapQuest
• 2000-present, exploitation: >$7 billion industry, >1 million users

Demand for GIS professionals
• In the US in 2005, 500k people used GIS as part of their job, growing at 15% each year
• Job market demand: 75k/year
• 50k US students/year take a GIS class
• 4000 certified graduates/year
Source: ESRI

An example of GIS: a composite of layers

In general, a GIS has 3 components
• Computer system
  - Hardware: computer, plotter, printer, digitizer
  - Software and appropriate procedures
• Spatially referenced or geographic data
• People to carry out various management and analysis tasks

Geographic Data
• Geospatial data tells you where a feature is, and attribute data tells you what it is.
• Metadata describes both geospatial and attribute data.
• In GIS, geographic data is also called GIS data or spatial data.

1. Geospatial data

Traditional method
• The traditional way to represent geographic data is the paper map
  - Geology map
  - Topographic map
  - City street map (we still use it a lot)
  - ...

GIS: a simplified view of the real world
• Discrete features
  - Points
  - Lines
  - Areas
  - Networks: a series of interconnecting lines (road network, river network, sewage network)
• Continuous features
  - Surfaces (elevation surface, temperature surface)

Points
• A point is a zero-dimensional object and has only the property of location (x, y)
• Points can be used to model features such as a well, building, power pole, sample location, etc.
• Other names for a point are vertex and node
(Figure: point)

Lines
• A line is a one-dimensional object that has the property of length
• Lines can be used to represent roads, streams, faults, dikes, marker beds, boundaries, contacts, etc.
• Lines are also called edges, links, chains, or arcs
• In an ArcInfo coverage, an arc starts with a node, has zero or more vertices, and ends with a node
(Figure: line)

Areas (Polygons)
• A polygon is a two-dimensional object with the properties of area and perimeter
• A polygon can represent a city, geologic formation, dike, lake, river, etc.
• Other names for a polygon are face and zone
(Figure: area)

Topology
• A collection of numeric data that clearly describes adjacency, containment (coincidence), and connectivity between map features, and that can be stored and manipulated by a computer
• A set of rules on how objects relate to each other
• A major difference between file formats
• Higher-level objects have special topology rules

Two basic data models to represent these features

Raster spatial data model
• Defines space as an array of equally sized cells arranged in rows and columns
• Each cell contains an attribute value and location coordinates
• Individual cells are the building blocks for creating images of point, line, area, network, and surface features
• Continuous raster: numeric values range smoothly from one location to another, for example, DEM, temperature, remote sensing images, etc.
• Discrete raster: relatively few possible values, which tend to repeat in adjacent cells, for example, land use, soil types, etc.
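A minimal sketch of the raster model described above, using plain Python lists. The grids, the origin, the cell size, and the helper names `cell_center` and `distinct_values` are all made up for illustration; real GIS software stores rasters in dedicated formats. It shows how a cell's location follows from its row and column, and how a continuous raster differs from a discrete one.

```python
# Hypothetical 3 x 4 rasters; every value here is invented for illustration.
continuous = [  # e.g. elevation in metres: values change smoothly
    [101.2, 101.9, 102.7, 103.4],
    [100.8, 101.5, 102.2, 103.0],
    [100.1, 100.9, 101.6, 102.3],
]
discrete = [  # e.g. land-use codes: few values, repeated in adjacent cells
    [1, 1, 2, 2],
    [1, 1, 2, 3],
    [1, 2, 2, 3],
]


def cell_center(row, col, x_min, y_max, cell_size):
    """Return the (x, y) ground coordinates of a cell's centre.

    Assumes row 0 is the top row of the grid, so y decreases as the
    row index increases. (x_min, y_max) is the upper-left corner.
    """
    x = x_min + (col + 0.5) * cell_size
    y = y_max - (row + 0.5) * cell_size
    return x, y


def distinct_values(raster):
    """Return the sorted list of distinct cell values in a raster."""
    return sorted({value for row in raster for value in row})


# Usage: the upper-left cell of a grid with 30 m cells.
print(cell_center(0, 0, x_min=500000.0, y_max=4000000.0, cell_size=30.0))
print(distinct_values(discrete))         # only a handful of class codes
print(len(distinct_values(continuous)))  # nearly every cell is different
```

Because the cells are equally sized, the grid only needs one corner coordinate and the cell size; each cell's location is implied by its row and column.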
Vector spatial data model
• Uses x, y coordinates to represent point, line, area, network, and surface features
• A point is a single coordinate pair; lines and polygons are ordered lists of vertices; attributes are associated with each feature
• Features are usually discrete

Digital Spatial Data
(Figure: the real world represented as raster and as vector data. Source: Defense Mapping School, National Imagery and Mapping Agency)

Raster and Vector Data Models
(Figure: a real-world scene with trees, a house, and a river, shown as a 10 x 10 raster grid of coded cells and as a vector drawing on X and Y axes running from 100 to 600. Source: Defense Mapping School, National Imagery and Mapping Agency)

(Figure: example of a discrete raster and example of a continuous raster. Xie et al. 2005)

(Figure: raster, real world, and vector compared. Heywood et al. 2006)

(Figure: effects of changing resolution. Heywood et al. 2006)

Vector – Advantages and Disadvantages
• Advantages
  - Good representation of reality
  - Compact data structure
  - Topology can be described in a network
  - Accurate graphics
• Disadvantages
  - Complex data structures
  - Simulation may be difficult
  - Some spatial analysis is difficult or impossible to perform

Raster – Advantages and Disadvantages
• Advantages
  - Simple data structure
  - Easy overlay
  - Various kinds of spatial analysis
  - Uniform size and shape
  - Cheaper technology
• Disadvantages
  - Large amount of data
  - Less “pretty”
  - Projection transformation is difficult
  - Different scales between layers can be a nightmare
  - May lose information due to generalization

Grid Properties
• Each grid cell holds one value, even if it is empty.
• A cell can hold an index standing for an attribute.
• Cell resolution is given as its size on the ground.
• Points and lines move to the center of the cell.
• Minimum line width is one cell.
• Rasters are easy to read and write, and easy to draw on the screen.

2. Attribute data
• Attribute data describes the “what” of spatial data and is a list or table of data arranged as rows and columns
• Rows are records (map features)
  - Each row represents a map feature, which has a unique label ID or object ID
• Columns are fields (characteristics)
• The intersection of a column and a row shows the value of an attribute, such as color, ownership, magnitude, classification, …
(Figure: examples)

3. Metadata
• Meta is defined as a change or transformation. Data is described as the factual information used as a basis for reasoning. Put these two definitions together and metadata would literally mean "factual information used as a basis for reasoning which describes a change or transformation."
• In GIS, metadata is data about the data. It consists of information that describes spatial data and is used to provide documentation for data products.
• Metadata is the who, what, when, where, why, and how of every facet of the spatial data.
• According to the Federal Geographic Data Committee (FGDC), metadata is data about the content, quality, condition, and other characteristics of data.

Why use and create metadata
• To help organize and maintain an organization's spatial data
  - Employees may come and go, but metadata can catalogue the changes and updates made to each spatial data set and how each employee implemented them
• To provide information to other organizations and clearinghouses to facilitate data sharing and transfer
  - It makes sense to share existing data sets rather than produce new ones if they are already available
• To document the history of a spatial data set
  - Metadata documents what changes have been made to each data set, such as changes in geographic projection, adding or deleting attributes, editing line intersections, or changing file formats. All of these could have an effect on data quality.

Metadata should include data about
• Date the data was collected
• Date the coverage was generated
• Bounding coordinates
• Processing steps
• Software used
• RMSE, etc.
• Where the original data came from
• Who did the processing
• Projection
• Coordinate system
• Datum
• Units
• Spatial scale
• Attribute definitions
• Who to contact for more information
See an example of non-standard metadata

Federal Geographic Data Committee’s (FGDC) Content Standard for Digital Geospatial Metadata (CSDGM)
• The FGDC is developing the National Spatial Data Infrastructure (NSDI) in cooperation with organizations from State, local and tribal governments, the academic community, and the private sector.
• The NSDI encompasses policies, standards, and procedures for organizations to cooperatively produce and share geographic data.
• The objective of the CSDGM is to provide a common set of terminology and definitions for the documentation of digital geospatial data.

CSDGM (FGDC-STD-001-1998)
Metadata =
  Identification_Information
  Data_Quality_Information
  Spatial_Data_Organization_Information
  Spatial_Reference_Information
  Entity_and_Attribute_Information
  Distribution_Information
  Metadata_Reference_Information
Connect to http://www.fgdc.gov/metadata/csdgm/

4. Geodatabase
• Before the geodatabase, the many GIS files in one GIS project (spatial data and nonspatial data) were stored separately, so a large GIS project could have hundreds of files.
• With a geodatabase, all GIS files (spatial data and nonspatial data) in a project can be stored in one place, using a relational database management system (RDBMS).

Types of geodatabases
• Personal
• Enterprise (multiuser)

Personal Geodatabase
• The personal geodatabase is stored as a file named filename.mdb that can be browsed and edited in ArcGIS and can also be opened with Microsoft Access.
• It can be read by multiple people at the same time, but edited by only one person at a time.
• Maximum size is 2 GB.

Multiuser Geodatabase
• A multiuser (ArcSDE or enterprise) geodatabase is stored in IBM DB2, Informix, Oracle, or Microsoft SQL Server.
• It can be edited through ArcSDE by many users at the same time and is suitable for large workgroups and enterprise GIS implementations.
• No size limit.
• Supports raster data.
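A rough sketch of the idea behind a geodatabase: spatial and nonspatial data kept together in one relational database and linked by object ID. It uses Python's built-in sqlite3 module purely as a stand-in; this is not how ArcGIS stores a personal or ArcSDE geodatabase, and the table names, columns, and values (`wells`, `well_attributes`, and so on) are hypothetical.

```python
import sqlite3

# An in-memory SQLite database stands in for a single geodatabase.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE wells (             -- spatial data: one point per feature
    object_id INTEGER PRIMARY KEY,
    x REAL NOT NULL,
    y REAL NOT NULL
);
CREATE TABLE well_attributes (   -- nonspatial data: one record per feature
    object_id INTEGER PRIMARY KEY REFERENCES wells(object_id),
    owner TEXT,
    depth_m REAL
);
""")

# Made-up features and attribute records sharing the same object IDs.
conn.executemany(
    "INSERT INTO wells VALUES (?, ?, ?)",
    [(1, 120.0, 340.0), (2, 410.0, 95.0)],
)
conn.executemany(
    "INSERT INTO well_attributes VALUES (?, ?, ?)",
    [(1, "City", 35.5), (2, "Private", 12.0)],
)

# Join location ("where") and attributes ("what") on the object ID.
rows = conn.execute("""
    SELECT w.object_id, w.x, w.y, a.owner, a.depth_m
    FROM wells AS w
    JOIN well_attributes AS a ON a.object_id = w.object_id
    ORDER BY w.object_id
""").fetchall()

for row in rows:
    print(row)
```

Keeping both tables in one database means a single query can answer where a feature is and what it is, instead of matching up separate files by hand.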