Intro to advanced GIS and
a review of basic GIS
Outline




About the class setting
Materials to be covered and scheduled
Quick review of GIS basics
First lab (Lab 1)
What was covered in Intro GIS





Geospatial Tech
GIS
GIS data
GIS data type
GIS data format


GIS: a simplified view of Earth
Two types of coordinate systems


Geographic coordinate system
Projected coordinate system





Conic,
cylindrical,
Azimuthal
Distortions (shape, size, distance, direction)
Two important things


Define
Project
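"Define" and "Project" are easy to confuse. The sketch below illustrates the difference under stated assumptions: a layer is modeled as a plain dict, the Earth is treated as a sphere, and spherical Mercator stands in for whatever projection a real project would use. The names define_crs and project_to_mercator are hypothetical, not GIS software functions.

```python
import math

EARTH_RADIUS_M = 6371000.0  # assumed spherical Earth radius in metres

def define_crs(layer, crs_name):
    """Define: record which coordinate system the coordinates are already in.
    The coordinates themselves are not changed."""
    return {"crs": crs_name, "coords": list(layer["coords"])}

def project_to_mercator(layer):
    """Project: compute new coordinates (spherical Mercator, metres)
    from longitude/latitude given in degrees."""
    if layer["crs"] != "geographic":
        raise ValueError("define a geographic coordinate system first")
    projected = []
    for lon, lat in layer["coords"]:
        x = EARTH_RADIUS_M * math.radians(lon)
        y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat) / 2))
        projected.append((x, y))
    return {"crs": "mercator_sphere", "coords": projected}

# Made-up layer whose coordinate system is not yet recorded
raw = {"crs": None, "coords": [(0.0, 0.0), (90.0, 45.0)]}
defined = define_crs(raw, "geographic")   # same numbers, now labeled
projected = project_to_mercator(defined)  # new numbers in metres
```

Defining only attaches the label; projecting produces new coordinate values.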
Geographic Coordinate System
Unprojected
Projected Coordinate System
What is GIS?
• A computer system for
- collecting,
- storing,
- manipulating,
- analyzing,
- displaying, and
- querying
geographically related
information.
In general, a GIS has 3
components

Computer system

Hardware




Computer, plotter, printer, digitizer
Software and appropriate procedures
Spatially referenced or geographic
data
People to carry out various
management and analysis tasks
Geographic Data

Geospatial data tells
you where it is and
attribute data tells
you what it is.
Metadata describes
both geospatial and
attribute data.
In GIS, geographic data are also called GIS data or spatial data
1. Geospatial data
Traditional method

Geographic data were traditionally represented on
paper-based maps




Geology map
Topographic map
City street map (we still use it a lot)
...
Characteristics of spatial data

“mappable” characteristics:





Location (coordinate system, covered in a
later lecture)
Size is measured as the length, area, or
perimeter of the feature
Shape is the geometric form of the feature
(point, line, area)
Discrete or continuous
Spatial relationships
Discrete and continuous

Discrete data are distinct features that
have definite boundaries and identities


A district, houses, towns, agricultural fields,
rivers, highways, …
Continuous data have no defined borders
or distinct values; instead, they show a
transition from one value to another

Temperature, precipitation, elevation, ...
GIS: a simplified view of the real
world

Discrete features



Points
Lines
Areas
Networks

A series of interconnecting
lines



Continuous features

Road network
River network
Sewage network
Surfaces


Elevation surface
Temperature surface
Problems caused by the simplified
features may still exist, but let's live with them

Dynamic nature (not static)
- Forests grow
- River channels change
- Cities expand or decline
Identification of discrete and continuous features
- Should a road be a line or an area?
- Scale
Some features may not fit any feature type: fuzzy
boundaries
- Transition area between woodland and grassland
Let's not worry about these problems now; just keep them in mind.
Topology needed

A collection of numeric data which clearly describes
adjacency, containment (coincidence), and
connectivity between map features and which can
be stored and manipulated by a computer.

A set of rules on how objects relate to each other

Major difference in file formats

Higher level objects have special topology rules
Two basic data models to
represent these features

Raster spatial data model



Define space as an array of equally sized cells arranged in rows and
columns. Each cell contains an attribute value and location
coordinates
Individual cells as building blocks for creating images of point, line,
area, network and surface
Continuous raster
Numeric values range smoothly from one location to another, for
example, DEM, temperature, remote sensing images, etc.
Discrete raster
Relatively few possible values, which repeat in adjacent cells, for
example, land use, soil types, etc.
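The raster model described above can be sketched as a small grid plus the information needed to locate each cell. This is a minimal illustration: the cell size, the upper-left corner coordinates, and the land-use codes are all made-up values, and the function names are hypothetical.

```python
# A discrete raster: rows and columns of equally sized cells (row 0 is the top row).
# The values are made-up land-use codes.
grid = [
    [1, 1, 2],
    [1, 3, 2],
    [3, 3, 2],
]
CELL_SIZE = 30.0                    # assumed cell size in map units
X_MIN, Y_MAX = 500000.0, 4100000.0  # assumed coordinates of the upper-left corner

def cell_center(row, col):
    """Return the map coordinates of the center of a cell."""
    x = X_MIN + (col + 0.5) * CELL_SIZE
    y = Y_MAX - (row + 0.5) * CELL_SIZE
    return x, y

def cell_at(x, y):
    """Return the (row, col) of the cell that contains map location (x, y)."""
    col = int((x - X_MIN) // CELL_SIZE)
    row = int((Y_MAX - y) // CELL_SIZE)
    if not (0 <= row < len(grid) and 0 <= col < len(grid[0])):
        raise IndexError("location is outside the raster")
    return row, col

def value_at(x, y):
    """Return the attribute value stored in the cell at (x, y)."""
    row, col = cell_at(x, y)
    return grid[row][col]
```

Only the corner and the cell size are stored; every cell location follows from its row and column, which is part of why the raster structure is simple.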
Vector spatial data model

Uses x, y coordinates to represent point, line, area, network,
surface


A point is a single coordinate pair; lines and polygons are ordered lists of
vertices; attributes are associated with each feature
Usually represents discrete features
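As a rough illustration of the vector model, the sketch below stores a point, a line, and a polygon as coordinate lists and derives size measures (length, area, perimeter) from the vertices. The coordinates and attribute values are invented for the example, and the area uses the shoelace formula, which assumes a simple polygon on a plane.

```python
import math

# Made-up features: geometry as coordinates, attributes attached to each feature
point = (300.0, 200.0)
line = [(100.0, 100.0), (400.0, 100.0), (400.0, 500.0)]
polygon = [(0.0, 0.0), (40.0, 0.0), (40.0, 30.0), (0.0, 30.0)]  # ring, first vertex not repeated

features = [
    {"id": 1, "geometry": point, "attributes": {"type": "house"}},
    {"id": 2, "geometry": line, "attributes": {"type": "river"}},
    {"id": 3, "geometry": polygon, "attributes": {"type": "trees"}},
]

def line_length(vertices):
    """Sum of the straight-line distances between consecutive vertices."""
    return sum(math.dist(a, b) for a, b in zip(vertices, vertices[1:]))

def polygon_perimeter(ring):
    """Length of the ring after closing it back to the first vertex."""
    return line_length(ring + ring[:1])

def polygon_area(ring):
    """Planar area of a simple polygon (shoelace formula)."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0
```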
DIGITAL SPATIAL DATA
• RASTER
• VECTOR
• Real World
Source: Defense Mapping School
National Imagery and Mapping Agency
Raster and Vector Data Models
[Figure: a real-world scene (trees, a house, a river) shown as a raster representation on a 10 x 10 grid of coded cells and as a vector representation on X and Y axes running from 100 to 600]
Source: Defense Mapping School
National Imagery and Mapping Agency
Example: Discrete raster
Example: continuous raster
Xie et al. 2005
Raster
Real world
Vector
Heywood et al. 2006
Effects of changing resolution
Heywood et al. 2006
Vector – Advantages and
Disadvantages

Advantages





Good representation of reality
Compact data structure
Topology can be described in a network
Accurate graphics
Disadvantages



Complex data structures
Simulation may be difficult
Some spatial analysis is difficult or impossible to
perform
Raster – Advantages and
Disadvantages

Advantages






Simple data structure
Easy overlay
Various kinds of spatial analysis
Uniform size and shape
Cheaper technology
Disadvantages





Large amount of data
Less “pretty”
Projection transformation is difficult
Different scales between layers can be a nightmare
May lose information due to generalization
GIS data formats (file formats)

Vector data
- Shapefiles
- Coverages
- TIN, Triangulated Irregular Network (e.g. elevation can be stored as
TIN)
Raster data
- Grid (e.g. elevation can be stored as
Grid)
- Image (e.g. elevation can be stored as an
image; all remote sensing images)
Shapefiles

Nontopological
Advantage: no overhead to process
topology
Disadvantages: polygons are double
digitized; no topological data checking
At least 3 files: .shp, .shx, .dbf
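Because a shapefile is really a set of files, a quick check for the three required parts can catch incomplete copies. The sketch below is one possible check; the function name and the file name roads.shp are hypothetical, and the temporary folder only exists to make the example self-contained.

```python
import tempfile
from pathlib import Path

REQUIRED_EXTENSIONS = (".shp", ".shx", ".dbf")

def missing_shapefile_parts(shp_path):
    """Return the required extensions that have no file next to shp_path."""
    base = Path(shp_path)
    return [ext for ext in REQUIRED_EXTENSIONS
            if not base.with_suffix(ext).exists()]

# Demonstration: a folder that holds roads.shp and roads.dbf but no roads.shx
with tempfile.TemporaryDirectory() as folder:
    for name in ("roads.shp", "roads.dbf"):
        (Path(folder) / name).touch()
    missing = missing_shapefile_parts(Path(folder) / "roads.shp")
```

Here missing lists the absent index file, so the shapefile would be reported as incomplete.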
Coverages




Original ArcInfo Format
Directory With Several Files
Database Files are stored in the Info
Directory
Uses Arc Node Topology



Containment (coincident)
Connectivity
Adjacency
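To see how arc-node topology can answer adjacency and connectivity questions without looking at coordinates, consider the simplified sketch below. The arc table is made up (two polygons, A and B, sharing one arc), the field layout is an assumption rather than the actual coverage file structure, and containment is left out.

```python
from collections import defaultdict

OUTSIDE = "outside"  # assumed label for the area outside all polygons

# arc id -> (from_node, to_node, left_polygon, right_polygon); values are invented
arcs = {
    1: ("n1", "n2", OUTSIDE, "A"),
    2: ("n2", "n1", OUTSIDE, "B"),
    3: ("n1", "n2", "A", "B"),
}

def adjacent_pairs(arc_table):
    """Adjacency: two polygons are adjacent if they lie on either side of an arc."""
    pairs = set()
    for _from_node, _to_node, left, right in arc_table.values():
        if left != right and OUTSIDE not in (left, right):
            pairs.add(tuple(sorted((left, right))))
    return pairs

def arcs_at_nodes(arc_table):
    """Connectivity: arcs are connected if they meet at the same node."""
    by_node = defaultdict(list)
    for arc_id, (from_node, to_node, _left, _right) in arc_table.items():
        by_node[from_node].append(arc_id)
        if to_node != from_node:
            by_node[to_node].append(arc_id)
    return dict(by_node)
```

Because the shared boundary is stored once as arc 3, it does not need to be digitized twice, in contrast to the nontopological shapefile.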
Evolution of Vector Data Model

ESRI, Inc.



Arc/Info: coverages
ArcView: shapefiles
ArcGIS: geodatabase
Geodatabase components: vector data and tables


Primary (basic) components
- feature classes,
- feature datasets,
- nonspatial tables.
complex components
building on the basic
components:
- topology,
- relationship classes,
- geometric networks
Geodatabase components: raster data



Raster data are only referenced in a personal geodatabase
Raster data are physically stored in a multiuser geodatabase
Raster datasets and raster catalogs


A raster dataset is created from one or more individual rasters. When
creating a raster dataset from multiple rasters, the data is mosaicked,
or aggregated, into a single, seamless dataset in which areas of overlap
have been removed. The input rasters must be contiguous (adjacent)
and have the same properties, including the same coordinate system,
cell size, and data format. For each raster dataset (.img, grid, JPEG,
MrSID, TIFF), ArcGIS creates an ERDAS IMAGINE file (.img).
A raster catalog is defined as a table in the geodatabase which you can
view like any other table in ArcCatalog. Each raster in the catalog is
represented by a row in the table. It contains a collection of rasters
that can be noncontiguous, stored in different formats, and have other
different properties. In order to view all the rasters in the catalog, they
must have the same coordinate system and a common geographic
extent
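The requirement that input rasters share the same properties before being mosaicked into a raster dataset can be sketched as a simple comparison. This is only an illustration: the property names and tile descriptions are invented, the contiguity requirement is not checked, and no GIS software is called.

```python
MOSAIC_PROPERTIES = ("coordinate_system", "cell_size", "data_format")

def mismatched_properties(rasters):
    """Return the property names that differ among the input rasters."""
    first = rasters[0]
    return [name for name in MOSAIC_PROPERTIES
            if any(r[name] != first[name] for r in rasters[1:])]

# Made-up tile descriptions
tiles = [
    {"name": "tile_a", "coordinate_system": "UTM zone 14N", "cell_size": 30.0, "data_format": "TIFF"},
    {"name": "tile_b", "coordinate_system": "UTM zone 14N", "cell_size": 30.0, "data_format": "TIFF"},
    {"name": "tile_c", "coordinate_system": "UTM zone 14N", "cell_size": 10.0, "data_format": "TIFF"},
]

ok_to_mosaic = mismatched_properties(tiles[:2]) == []  # first two tiles agree
problems = mismatched_properties(tiles)                # third tile has a different cell size
```

Tiles that fail this kind of check could still be collected in a raster catalog, which allows different properties.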
2. Attribute data

Attribute data describe the “what” of
spatial data and are stored as a list or table
arranged in rows and columns

Rows are records (map features)



Each row represents a map feature, which has
a unique label ID or object ID
Columns are fields (characteristics)
Intersection of a column and a row shows
the values of attributes, such as color,
ownership, magnitude, classification,…
examples
Relational database

A relational database is a collection of tables, also called
relations, which can be connected to each other by keys.


A primary key represents one or more attributes whose values
can uniquely identify a record in a table. Its counterpart in
another table for the purpose of linkage is called a foreign key
Advantages


Each table in the database can be prepared, maintained, and
edited separately from other tables
Efficient data management and processing, since linking tables for
query and/or analysis is often temporary
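A small in-memory database shows how a primary key and a foreign key link two separately maintained tables only for the duration of a query. The table names, field names, and values are invented for this sketch; SQLite is used simply because it ships with Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE soil_type (
    soil_code   TEXT PRIMARY KEY,   -- primary key: uniquely identifies a record
    description TEXT
);
CREATE TABLE parcel (
    parcel_id INTEGER PRIMARY KEY,
    owner     TEXT,
    soil_code TEXT REFERENCES soil_type(soil_code)  -- foreign key
);
INSERT INTO soil_type VALUES ('S1', 'sandy loam'), ('S2', 'clay');
INSERT INTO parcel VALUES (101, 'Smith', 'S1'), (102, 'Lee', 'S2'), (103, 'Diaz', 'S1');
""")

# The link exists only in this query; both tables stay separate in the database.
rows = conn.execute("""
    SELECT parcel.parcel_id, parcel.owner, soil_type.description
    FROM parcel
    JOIN soil_type ON parcel.soil_code = soil_type.soil_code
    ORDER BY parcel.parcel_id
""").fetchall()
conn.close()
```

The soil description is stored once in soil_type and can be edited there without touching the parcel table.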
Join and relate tables


Once tables are separated as
relational tables, then two operations
can be used to link those tables
during query and analysis
- Join brings together two tables
based on a common key.
- Relate connects two tables
(based on keys) but keeps the
tables separate.
Keys do not have to have the same
name but must be of the same data
type
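The difference between the two operations can be sketched with plain Python lists of records. This is an illustration, not how any GIS package implements them; the tables, key names, and helper functions are hypothetical. Note that the key fields have different names (zone and zone_code, parcel_id and pid) but the same data type.

```python
from collections import defaultdict

# Made-up tables
parcels = [{"parcel_id": 101, "zone": "R1"}, {"parcel_id": 102, "zone": "C2"}]
zones = [{"zone_code": "R1", "label": "residential"},
         {"zone_code": "C2", "label": "commercial"}]
owners = [{"pid": 101, "owner": "Smith"}, {"pid": 101, "owner": "Lee"},
          {"pid": 102, "owner": "Diaz"}]

def join(left, right, left_key, right_key):
    """Join: return one combined table with the matching fields appended."""
    lookup = {row[right_key]: row for row in right}
    joined = []
    for row in left:
        merged = dict(row)  # copy, so the source table is untouched
        match = lookup.get(row[left_key], {})
        merged.update({k: v for k, v in match.items() if k != right_key})
        joined.append(merged)
    return joined

def relate(right, right_key):
    """Relate: keep the tables separate and build an index for lookups."""
    index = defaultdict(list)
    for row in right:
        index[row[right_key]].append(row)
    return index

joined = join(parcels, zones, "zone", "zone_code")
related = relate(owners, "pid")
owners_of_101 = [row["owner"] for row in related[101]]
```

A join fits cases where each feature matches at most one record, while a relate can return several related records for one feature, as with the two owners of parcel 101.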
Join
relate
The joined table
A join is preserved only within the map document. The tables
remain separate on disk, and the join can be removed at any time.
Related tables
A relate is preserved only within the map document. The tables
remain separate on disk, and the relate can be removed at any time.
3. Metadata



Meta is defined as a change or transformation. Data is
described as the factual information used as a basis for
reasoning. Put these two definitions together and
metadata would literally mean "factual information used as
a basis for reasoning which describes a change or
transformation."
In GIS, Metadata is data about the data. It consists of
information that describes spatial data and is used to
provide documentation for data products. Metadata is the
who, what, when, where, why, and how about every
facet of the spatial data.
According to the Federal Geographic Data Committee
(FGDC), metadata is data about the content, quality,
condition, and other characteristics of data.
Why use and create metadata



To help organize and maintain an organization's spatial
data
- Employees may come and go but metadata can
catalogue the changes and updates made to each spatial
data set and how each employee implemented them
To provide information to other organizations and
clearinghouses to facilitate data sharing and transfer
- It makes sense to share existing data sets rather
than producing new ones if they are already available
To document the history of a spatial data set
- Metadata documents what changes have been made
to each data set, such as changes in geographic projection,
adding or deleting attributes, editing line intersections, or
changing file formats. All of these could have an effect on
data quality.
Metadata Should Include Data
about













Date the data were collected.
Date the coverage was generated.
Bounding coordinates.
Processing steps.
Software used.
RMSE, etc.
Where the original data came from.
Who did the processing.
Projection.
Coordinate system.
Datum.
Units.
Spatial scale.
Attribute definitions.
Who to contact for more information.
See an example of non-standard metadata.
Federal Geographic Data Committee’s
(FGDC) Content Standard for Digital
Geospatial Metadata (CSDGM)


The FGDC is developing the National Spatial Data
Infrastructure (NSDI) in cooperation with organizations
from State, local and tribal governments, the academic
community, and the private sector. The NSDI
encompasses policies, standards, and procedures for
organizations to cooperatively produce and
share geographic data.
The objectives of the CSDGM are to provide a common
set of terminology and definitions for the documentation
of digital geospatial data.
CSDGM (FGDC-STD-001-1998)

Metadata =







Identification_Information
Data_Quality_Information
Spatial_Data_Organization_Information
Spatial_Reference_Information
Entity_and_Attribute_Information
Distribution_Information
Metadata_Reference_Information
Connect to http://www.fgdc.gov/metadata/csdgm/
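As a small illustration of the seven main sections listed above, the sketch below reports which sections a metadata record lacks. The record is a made-up dict and the function name is hypothetical. This does not check compliance with the standard, which has its own rules about which sections and elements are required; a utility such as mp is meant for that.

```python
CSDGM_SECTIONS = (
    "Identification_Information",
    "Data_Quality_Information",
    "Spatial_Data_Organization_Information",
    "Spatial_Reference_Information",
    "Entity_and_Attribute_Information",
    "Distribution_Information",
    "Metadata_Reference_Information",
)

def missing_sections(record):
    """Return the main section names that do not appear in the record."""
    return [name for name in CSDGM_SECTIONS if name not in record]

# Made-up, incomplete metadata record
record = {
    "Identification_Information": {"Title": "County roads"},
    "Spatial_Reference_Information": {"Datum": "NAD83"},
    "Metadata_Reference_Information": {"Metadata_Date": "20060101"},
}
absent = missing_sections(record)
```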
Metadata tools

Metadata editors:
- tkme / USGS
- ArcCatalog / ESRI
- SMMS / Intergraph
- FGDCMETA / Illinois State Geological Survey
- xtme / USGS
Metadata utilities (check compliance and export to text,
HTML, XML, or SGML):
- mp (Metadata Parser) / USGS
- MP batch / Intergraph
- ArcCatalog powered by mp / ESRI
Metadata servers:
- Isite / FGDC
- GeoConnect Geodata Management Server / Intergraph
- ArcIMS Metadata Server / ESRI
4. Geodatabase


Before the geodatabase, the many GIS files in a project
(spatial data and nonspatial data) were stored
separately, so a large GIS project could have
hundreds of files.
With a geodatabase, all GIS files (spatial data
and nonspatial data) in a project can be stored
in one geodatabase, using a relational
database management system (RDBMS)
Types of geodatabases


personal
enterprise
Personal Geodatabase

The personal geodatabase
is stored as a file named
filename.mdb that can be
browsed and edited in
ArcGIS and can also
be opened with Microsoft
Access. It can be read by
multiple people at the same
time, but edited by only
one person at a time.
The maximum size is 2 GB.
Multiuser Geodatabase


Multiuser (ArcSDE or enterprise) geodatabases
are stored in IBM DB2, Informix, Oracle, or
Microsoft SQL Server.
A multiuser geodatabase can be edited through ArcSDE
by many users at the same time and is suitable for large
workgroups and enterprise GIS
implementations. It has no size limit and supports raster
data.
A 3-tier ArcSDE client/server architecture runs both
ArcSDE and the Oracle RDBMS on the
same server. This minimizes network traffic
and client load but increases the server load
compared to a 2-tier system, in which the clients
connect directly to the RDBMS
Personal and Multiuser
Geodatabase Comparison
source: www.esri.com
5. Geometric transformation
Projections and coordinate systems project the 3D Earth onto a 2D plane, so that the
Earth can be represented in different GIS data models (2D digital format) in a
GIS.
Geometric transformation is the process of using a set of control points
and transformation equations to register a (2D) digitized map, a satellite
image, or an air photograph onto a (2D) projected coordinate system.
In GIS, geometric transformation includes map-to-map,
image-to-map, and image-to-image transformation.
The root mean square (RMS) error is a quantitative measure of
location accuracy that can be used to judge the quality of a geometric
transformation.
Image-to-map (or image-to-image) transformation needs an additional step, resampling,
to fill the cells of the new image with values from the original image.
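The RMS error can be illustrated with a short calculation. The sketch assumes a six-parameter affine transformation as the transformation equations (the slides do not say which equations are used), and the coefficients and control points are made-up values. For each control point the transformed location is compared with its known map location, and the RMS error is the square root of the mean squared distance between the two.

```python
import math

def affine(coeffs, x, y):
    """Apply X = a*x + b*y + c, Y = d*x + e*y + f."""
    a, b, c, d, e, f = coeffs
    return a * x + b * y + c, d * x + e * y + f

def rms_error(coeffs, control_points):
    """Root mean square distance between transformed and known locations."""
    squared = 0.0
    for (x, y), (x_true, y_true) in control_points:
        x_est, y_est = affine(coeffs, x, y)
        squared += (x_est - x_true) ** 2 + (y_est - y_true) ** 2
    return math.sqrt(squared / len(control_points))

# Assumed coefficients: scale by 2, flip the y-axis, shift the origin
coeffs = (2.0, 0.0, 100.0, 0.0, -2.0, 900.0)

# (digitized or image coordinates, known map coordinates); invented values
control_points = [
    ((0.0, 0.0), (100.0, 900.0)),
    ((50.0, 0.0), (200.0, 900.0)),
    ((0.0, 50.0), (100.0, 800.0)),
    ((50.0, 50.0), (203.0, 796.0)),  # this point is off by 3 in x and 4 in y
]

error = rms_error(coeffs, control_points)
```

In practice the coefficients would be estimated from the control points, and a control point with a large residual is a candidate for re-digitizing.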
6. Data accuracy and quality

Raster data quality



Geolocation accuracy
Estimation accuracy
Vector data quality


Location errors
Topological errors
7. Vector data analysis
Vector data analysis uses the spatial features of point, line, and
polygon as inputs.
The accuracy of analysis results depends on the accuracy of spatial
features in terms of location and shape.
Topology can also be a factor for some vector data analyses such as
buffering and overlay.
Overlay
- Intersect: AND
- Union: OR
- Symmetrical difference: XOR
- Identity: AND OR
Pattern analysis
- Point pattern: nearest neighbor, Ripley’s K-function
- Moran’s I
- G-statistic
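The Boolean connectors behind the overlay methods (AND, OR, XOR) can be illustrated with Python sets. In this simplified sketch each layer is reduced to the set of unit squares it covers; real overlay works on feature geometries and also combines attributes, which is not shown. The layer extents and the helper name are made up.

```python
def unit_squares(x0, y0, x1, y1):
    """Hypothetical helper: the unit squares covered by a rectangle."""
    return {(x, y) for x in range(x0, x1) for y in range(y0, y1)}

input_layer = unit_squares(0, 0, 4, 4)    # 16 squares
overlay_layer = unit_squares(2, 2, 6, 6)  # 16 squares, 4 of them shared

intersect = input_layer & overlay_layer             # AND: area common to both
union = input_layer | overlay_layer                 # OR: area in either layer
symmetrical_difference = input_layer ^ overlay_layer  # XOR: area in one layer only
identity = (input_layer & overlay_layer) | input_layer  # AND, then OR with the input
```

The identity result covers exactly the extent of the input layer, while the other three methods change the output extent.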
8. Raster data analysis
Raster data analysis is based on cells and rasters.
Raster data analysis can be performed at the level of individual
cells, or groups of cells, or cells within an entire raster.
Some raster data operations use a single raster, while others
use two or more rasters.
Raster data analysis is also related to the type of cell value
(numeric or categorical values) in the input raster(s).
Local, focal, zonal
Allocation and direction
Clip and mosaic
Aggregate and regiongroup
Map algebra
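The three levels of raster analysis (individual cells, groups of neighboring cells, and cells grouped by zone) can be sketched with nested lists. The elevation values and zone codes are invented, the focal operation assumes a 3 x 3 neighborhood clipped at the raster edge, and the function names are hypothetical rather than taken from any GIS package.

```python
# Made-up input rasters of the same size
elevation = [
    [10, 12, 14],
    [11, 13, 15],
    [12, 14, 16],
]
zones = [
    [1, 1, 2],
    [1, 2, 2],
    [3, 3, 2],
]

def local_add(raster, constant):
    """Local operation: each output cell depends only on the same input cell."""
    return [[value + constant for value in row] for row in raster]

def focal_mean(raster, row, col):
    """Focal operation: mean of the 3 x 3 neighborhood around one cell."""
    rows = range(max(0, row - 1), min(len(raster), row + 2))
    cols = range(max(0, col - 1), min(len(raster[0]), col + 2))
    values = [raster[r][c] for r in rows for c in cols]
    return sum(values) / len(values)

def zonal_mean(values, zone_raster):
    """Zonal operation: mean of the value raster within each zone."""
    totals = {}
    for value_row, zone_row in zip(values, zone_raster):
        for value, zone in zip(value_row, zone_row):
            total, count = totals.get(zone, (0, 0))
            totals[zone] = (total + value, count + 1)
    return {zone: total / count for zone, (total, count) in totals.items()}
```

The zonal example uses two rasters (values and zones), while the local and focal examples use a single raster.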
9. Lab 1

Getting Started With the Geodatabase
Copy the result map of your last step to your homework
Copy your
exam questions
and results to
your homework