Lab 4. unsupervised classification

Ministry of Education and Science of Russian Federation
Federal State Autonomous Educational Institution of Higher Education
Tomsk Polytechnic University
REMOTE SENSING DATA PROCESSING
LABORATORY OPERATIONAL MANUAL
Author - Olga S. Tokareva
Department of Computer Engineering
IC TPU
Tomsk – 2015
ANNOTATION
The manual contains instructions for performing the laboratory works of the module
«Remote Sensing Data Processing» of Professional Training delivered in English
(«Remote Sensing Basics. Data obtaining», «ERDAS Imagine basics. Image
preparation methods», «Map rectification», «Unsupervised Classification»,
«Supervised Classification»).
The laboratory operational manual includes brief theoretical material and
lists the basic terms and phrases in English.
CONTENTS
INTRODUCTION
Lab 1. Remote Sensing Basics. Data obtaining
Lab 2. ERDAS Imagine basics. Image preparation methods
Lab 3. Image rectification
Lab 4. Unsupervised classification
Lab 5. Supervised classification
REFERENCES
INTRODUCTION
Remote sensing is the measurement or acquisition of data about an object or
scene by a satellite or other instrument above or far from the object. Aerial
photography, satellite imagery, and radar are all forms of remote sensing. An image is
a picture or representation of an object or scene on paper or a display screen.
Remotely sensed images are digital representations of the Earth [1, 2].
The laboratory operational manual introduces the main methods of digital
image processing using Erdas Imagine software.
In many cases, images of one area that are collected from different sources
must be used together. The tools for rectifying image data are used to transform
disparate images to the same coordinate system. Rectification is described in Lab 3
of the manual.
Multispectral classification is used to extract information from an image
by sorting pixels into a finite number of individual classes, or
categories of data, based on their data file values. The most commonly used
classification tools are considered in Lab 4 and Lab 5.
LAB 1. REMOTE SENSING BASICS. DATA OBTAINING
Objective
This exercise provides training in obtaining Landsat images using the
earthexplorer.usgs.gov Internet service.
Requisites:
Internet connection
http://earthexplorer.usgs.gov/
Introduction
EarthExplorer (http://earthexplorer.usgs.gov) provides online search, browse
display, metadata export, and data download for earth science data from the
archives of the U.S. Geological Survey (USGS). EarthExplorer provides an
enhanced user interface using state-of-the-art JavaScript libraries, Hypertext
Preprocessor (PHP), and the advanced Oracle spatial engine.
The Archive spans from 1937 aerial photographs to millions of satellite
images of the Earth’s surface, starting with the original Earth orbits in the 1960’s
and first Landsat satellite in 1972, to our current hourly additions of satellite
images. Click on any product category to review detailed information on the
products, plus how to order or download selected images at no charge.
The USGS EarthExplorer (EE) tool provides users the ability to query,
search, and order satellite images, aerial photographs, and cartographic products
from several sources. In addition to data from the Landsat missions and a variety
of other data providers, EE now provides access to MODIS land data products
from the NASA Terra and Aqua missions, and ASTER level-1B data products over
the U.S. and Territories from the NASA ASTER mission. Registered users of EE
have access to more features than guest users.
Procedure:
Follow the instructions in the figures below to query a Landsat satellite image.
Use Search Criteria:
Path= 155
Row=19
Date: July 2014
Data Set: L7 ETM+ SLC-on (1999-2003)
Cloud Cover: less than 90 %
Day/Night: Day
Data type Level 1: ETM+ L1T
Figures (not reproduced here): General Search Criteria; Data Set; Additional
Criteria; Choose and download image.
Terms and definitions:
Band – a set of data file values for a specific portion of the electromagnetic
spectrum of reflected light or emitted heat (red, green, blue, near-infrared, infrared,
thermal, etc.), or some other user-defined information created by combining or
enhancing the original bands, or creating new bands from other sources.
Sometimes called channel.
Cell size – the area that one pixel represents, measured in map units. For
example, one cell in the image may represent an area 30’ × 30’ on the ground.
Sometimes called pixel size.
Data – 1. in the context of remote sensing, a computer file containing
numbers that represent a remotely sensed image, and can be processed to display
that image.
2. a collection of numbers, strings, or facts that requires some processing before it
is meaningful.
Detector – the device in a sensor system that records electromagnetic
radiation.
Digital Number – (DN) variation in pixel intensity due to composition of
what it represents. For example, the DN of water is different from that of land. DN
is expressed as a value, typically from 0 to 255.
Electromagnetic radiation – (EMR) the energy transmitted through space
in the form of electric and magnetic waves.
Electromagnetic spectrum – the range of electromagnetic radiation
extending from cosmic waves to radio waves, characterized by frequency or
wavelength.
ETM+– Enhanced Thematic Mapper Plus.
Monochrome image – an image produced from one band or layer, or
contained in one color gun of the display device.
LAB 2. ERDAS IMAGINE BASICS. IMAGE PREPARATION METHODS
Objective
The exercise is training in LANDSAT image preparation using Erdas
Imagine.
Requisites:
- Input images
- Output image – multispectral image.
Task:
- Explore settings and parameters of the Preference Editor dialog.
- Create a file containing the multispectral data.
- Improve the spatial resolution of the satellite images using data from
the panchromatic channel.
Perform Layer Stack and Spatial Enhancement using the data obtained in
Lab 1.
Procedure:
Open Session/Preferences and set the following preferences:
Data Directory – set the path to a directory that is convenient for you;
Output Directory – set the path to a directory that is convenient for you;
Beep After Job Finished – turn on the beep signal;
Session Log Printing – choose Never;
Tool Palette Size – Small (default);
Combine Tools and Commands – on (default);
Clear Display – off;
Fit to Frame – on;
Background Transparent – on.
Merge the 6 bands into one file. You can do it with the
Interpreter/Utilities/Layer Stack menu.
After this operation, merge the high-resolution file with the multispectral file
using Interpreter/Spatial Enhancement/Resolution Merge with the following settings:
Method – Principal Component;
Resampling Techniques – Nearest Neighbor;
Output Options – Ignore Zero in Stats;
Data Type – Unsigned 8 bit.
Terms and definitions:
Radiometric correction – the correction of variations in data that are not
caused by the object or scene being scanned, such as scanner malfunction and
atmospheric interference.
Radiometric enhancement – an enhancement technique that deals with the
individual values of pixels in an image.
Radiometric resolution – the dynamic range, or number of possible data
file values, in each band. This is referred to by the number of bits into which the
recorded energy is divided.
Resolution – a level of precision in data. For specific types of resolution, see
radiometric resolution and temporal resolution.
Resolution merging – the process of sharpening a lower-resolution
multiband image by merging it with a higher-resolution monochrome image.
Scene – the image captured by a satellite.
Temporal resolution – the frequency with which a sensor obtains imagery
of a particular area.
LAB 3. IMAGE RECTIFICATION
Objective
The exercise is training in image rectification using Geometric Correction
Tool of ERDAS Imagine.
Requisites:
- Input image – forest map
(ftp://ftp.vt.tpu.ru/study/Tokareva/public/Earth%20remote%20sensing/RasterData/To/to.jpg).
- Reference image – Landsat image
(ftp://ftp.vt.tpu.ru/study/Tokareva/public/Earth%20remote%20sensing/RasterData/To/p155r019_7t20000626_6ch.img).
- Output image – geocoded forest map.
Procedure:
In rectifying the forest map, use the basic steps:
- display files;
- start Geometric Correction Tool;
- record GCPs;
- compute a transformation matrix;
- resample the image;
- verify the rectification process.
Follow the instructions in
ftp://ftp.vt.tpu.ru/study/Tokareva/public/Earth%20remote%20sensing/Prof_English/Erdas_TourGuide.pdf
(Chapter 6 «Polynomial Rectification») to perform the
laboratory work step by step [3].
Video: http://www.youtube.com/watch?v=eg0H_-mQO24
Video: http://www.youtube.com/watch?v=uYBriSx-d3U
The source and reference coordinates of the GCPs can be entered in the
following ways:
- They may be known a priori and entered at the keyboard.
- Use the mouse to select a pixel from an image in the Viewer. With
both the source and reference Viewers open, enter source coordinates and
reference coordinates for image-to-image registration. The Multipoint Geometric
Correction tool contains both the source and reference Viewers within the
tool.
- Use an existing Ground Control Coordinates file (.gcc file extension).
This file contains the X and Y coordinates along with the GCP point ID, saved as
an external file.
- Use a digitizing tablet to register an image to a hardcopy map.
Polynomial equations are used to convert source file coordinates to rectified
map coordinates. Depending upon the distortion in the imagery, the number of
GCPs used, and their locations relative to one another, complex polynomial
equations may be required to express the needed transformation. The degree of
complexity of the polynomial is expressed as the order of the polynomial. The
order is simply the highest exponent used in the polynomial.
A transformation matrix is computed from the GCPs. The matrix consists of
coefficients that are used in polynomial equations to convert the coordinates. The
size of the matrix depends upon the order of transformation. The goal in
calculating the coefficients of the transformation matrix is to derive the polynomial
equations for which there is the least possible amount of error when they are used
to transform the reference coordinates of the GCPs into the source coordinates. It is
not always possible to derive coefficients that produce no error [4].
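As an illustration of the paragraphs above, the following Python sketch fits a first-order polynomial to a few GCPs by least squares using NumPy. The GCP coordinates and the function names fit_first_order and apply_first_order are hypothetical and are not part of ERDAS Imagine. The sketch maps source coordinates to reference coordinates and should be read as a minimal example of the idea, not as the software's own implementation.

```python
import numpy as np

def fit_first_order(source_xy, reference_xy):
    """Least-squares first-order polynomial from source (x, y) to reference (x, y)."""
    src = np.asarray(source_xy, dtype=float)
    ref = np.asarray(reference_xy, dtype=float)
    # One row [1, x, y] per GCP; a higher order would add columns such as x*y, x**2.
    design = np.column_stack([np.ones(len(src)), src[:, 0], src[:, 1]])
    # Solve design @ coeffs ~ ref for both output coordinates at once.
    coeffs, *_ = np.linalg.lstsq(design, ref, rcond=None)
    return coeffs  # shape (3, 2): column 0 gives x', column 1 gives y'

def apply_first_order(coeffs, xy):
    """Transform points with the fitted coefficients."""
    pts = np.asarray(xy, dtype=float)
    design = np.column_stack([np.ones(len(pts)), pts[:, 0], pts[:, 1]])
    return design @ coeffs

# Hypothetical GCPs: pixel coordinates and matching map coordinates (30 m cells).
source = [(0, 0), (100, 0), (0, 100), (100, 100)]
reference = [(500000, 6300000), (503000, 6300000),
             (500000, 6297000), (503000, 6297000)]

coeffs = fit_first_order(source, reference)
predicted = apply_first_order(coeffs, source)
print(coeffs)
```

With more GCPs than coefficients the fit is overdetermined, so the predicted coordinates generally differ slightly from the reference coordinates; that difference is what the RMS error measures.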
Terms and definitions:
Ground Control Points (GCPs) are specific pixels in an image for which
the output map coordinates (or other output coordinates) are known. GCPs consist
of two X,Y pairs of coordinates:
- source coordinates – usually data file coordinates in the image being
rectified;
- reference coordinates – the coordinates of the map or reference
image to which the source image is being registered.
Check point – additional ground points used to independently verify the
degree of accuracy of a triangulation.
Bilinear interpolation – a resampling method that uses the data file values
of four pixels in a 2 × 2 window to calculate an output data file value by computing
a weighted average of the input data file values with a bilinear function.
GCP prediction – the process of picking a GCP in either coordinate system
and automatically locating that point in the other coordinate system based on the
current transformation parameters.
Geocoded data – an image(s) that has been rectified to a particular map
projection and cell size and has had radiometric corrections applied.
Geographical coordinates – a coordinate system for explaining the surface
of the Earth. Geographical coordinates are defined by latitude and by longitude
(Lat/Lon), with respect to an origin located at the intersection of the equator and
the prime (Greenwich) meridian.
Geometric correction – the correction of errors of skew, rotation, and
perspective in raw, remotely sensed data.
Georeferencing – the process of assigning map coordinates to image data
and resampling the pixels of the image to conform to the map projection grid.
RMS (Root Mean Square) error — the distance between the input (source)
location of a GCP and the retransformed location for the same GCP. RMS error is
calculated with a distance equation [4].
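The RMS error defined above can be sketched in Python as the Euclidean distance between each input GCP location and its retransformed location, with a total error taken as the root of the mean squared distance. The coordinates and the function name gcp_rms_errors are hypothetical, chosen only for illustration.

```python
import math

def gcp_rms_errors(input_xy, retransformed_xy):
    """Return the per-GCP RMS errors and the total RMS error."""
    per_point = [
        math.hypot(xr - xi, yr - yi)  # distance between the two locations
        for (xi, yi), (xr, yr) in zip(input_xy, retransformed_xy)
    ]
    total = math.sqrt(sum(e * e for e in per_point) / len(per_point))
    return per_point, total

# Hypothetical source coordinates of three GCPs, in pixels.
input_points = [(10.0, 10.0), (50.0, 20.0), (30.0, 60.0)]
retransformed_points = [(13.0, 14.0), (50.0, 20.0), (30.0, 61.0)]

per_point, total = gcp_rms_errors(input_points, retransformed_points)
print(per_point, total)
```

A GCP with a much larger error than the others is a candidate for re-measurement before the image is resampled.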
LAB 4. UNSUPERVISED CLASSIFICATION
Classification is the process of sorting pixels into a finite number of individual
classes, or categories of data, based on their data file values. If a pixel satisfies a
certain set of criteria, then the pixel is assigned to the class that corresponds to those
criteria [1, 2].
Unsupervised classification is dependent upon the data itself for the
definition of classes. This method is usually used when less is known about the
data before classification. It is then the analyst’s responsibility, after classification,
to attach meaning to the resulting classes [4].
Erdas Imagine uses the ISODATA algorithm to perform an unsupervised
classification.
Objective
To create a thematic raster layer using the Landsat image and ISODATA
algorithm to perform unsupervised classification with ERDAS Imagine software.
Requisites:
- Landsat image
(ftp://ftp.vt.tpu.ru/study/Tokareva/public/Earth%20remote%20sensing/RasterData/To/p155r019_7t20000626_6ch.img).
- the geocoded forest map obtained in Lab 3.
Procedure:
To perform an unsupervised classification using the ISODATA algorithm,
use the basic steps:
1. Select Unsupervised Classification from the Data Preparation menu.
2. Choose processing Options to specify how the process is performed:
- Maximum Iterations – 24;
- Convergence Threshold – 0.95.
3. Start the classification process.
4. After classification is performed, display the thematic raster to recode the
classes.
5. Evaluate and test the accuracy of the classification.
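To show what the Maximum Iterations and Convergence Threshold options control, the following Python sketch implements a simplified iterative clustering loop with NumPy. It is only an approximation of the idea behind ISODATA: the function name isodata_like, the initialization of class means, and the sample pixel values are assumptions, and the implementation in ERDAS Imagine may differ in its details.

```python
import numpy as np

def isodata_like(pixels, n_classes, max_iterations=24, convergence_threshold=0.95):
    """Simplified iterative clustering (no cluster splitting or merging)."""
    pixels = np.asarray(pixels, dtype=float)
    # Assumption: spread initial class means evenly between band minima and maxima.
    lo, hi = pixels.min(axis=0), pixels.max(axis=0)
    steps = np.linspace(0.0, 1.0, n_classes)[:, None]
    means = lo + steps * (hi - lo)
    labels = np.full(len(pixels), -1)
    for iteration in range(1, max_iterations + 1):
        # Assign every pixel to the class with the nearest mean (spectral distance).
        distances = np.linalg.norm(pixels[:, None, :] - means[None, :, :], axis=2)
        new_labels = distances.argmin(axis=1)
        unchanged = np.mean(new_labels == labels)
        labels = new_labels
        # Recompute the mean of every class that received pixels.
        for k in range(n_classes):
            if np.any(labels == k):
                means[k] = pixels[labels == k].mean(axis=0)
        # Stop once the share of pixels that kept their class reaches the threshold.
        if unchanged >= convergence_threshold:
            break
    return labels, means, iteration

# Hypothetical two-band pixel values forming two obvious groups.
sample = [[10, 12], [11, 10], [12, 11], [200, 210], [205, 198], [198, 202]]
labels, means, iterations_used = isodata_like(sample, n_classes=2)
print(labels, iterations_used)
```

The loop ends either when the maximum number of iterations is reached or when at least 95 % of the pixels keep their class between two consecutive iterations, whichever happens first.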
Follow the steps in
ftp://ftp.vt.tpu.ru/study/Tokareva/public/Earth%20remote%20sensing/Prof_English/Erdas_TourGuide.pdf
(Chapter 5 «Classification») to perform the
laboratory work step by step [3].
Terms and definitions:
Accuracy assessment – the comparison of a classification to geographical
data that is assumed to be true. Usually, the assumed-true data are derived from
ground truthing.
Area of interest (AOI) – a point, line, or polygon that is selected as a
training sample or as the image area to be used in an operation. AOIs can be stored
in separate .aoi files.
Attribute – the tabular information associated with a raster or vector layer.
Class – a set of pixels in a GIS file that represents areas that share some
condition. Classes are usually formed through classification of a continuous raster
layer.
Class value – a data file value of a thematic file that identifies a pixel as
belonging to a particular class.
Classification accuracy table – a list of known values of reference pixels,
supported by some ground truth or other a priori knowledge of the true class, and a
list of the classified values of the same pixels, from a classified file to be tested.
Classification scheme – (or classification system) a set of target classes.
The purpose of such a scheme is to provide a framework for organizing and
categorizing the information that can be extracted from the data.
Clustering – unsupervised training; the process of generating signatures
based on the natural groupings of pixels in image data when they are plotted in
spectral space.
Clusters – the natural groupings of pixels when plotted in spectral space.
Confidence level – the percentage of pixels that are believed to be
misclassified.
Distance image file – a one-band, 16-bit file that can be created in the
classification process, in which each data file value represents the result of the
distance equation used in the program. Distance image files generally have a
chi-square distribution.
Ground truth – data that are taken from the actual area being studied.
Ground truthing – the acquisition of knowledge about the study area from
field work, analysis of aerial photography, personal experience, etc. Ground truth
data are considered to be the most accurate (true) data available about the area of
study.
Iterative Self-Organizing Data Analysis Technique (ISODATA
clustering) – a method of clustering that uses spectral distance as in the sequential
method, but iteratively classifies the pixels, redefines the criteria for each class,
and classifies again, so that the spectral distance patterns in the data gradually
emerge.
Landsat – a series of Earth-orbiting satellites that gather MSS and TM
imagery, operated by EOSAT.
Mean vector – an ordered set of means for a set of variables (bands). For a
data file, the mean vector is the set of means for all bands in the file [4].
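As a small illustration of the mean vector defined above, the following Python sketch computes one mean per band for a list of pixels. The function name mean_vector and the pixel values are hypothetical.

```python
def mean_vector(pixels):
    """Return the mean of each band for a list of pixels (one value per band)."""
    n = len(pixels)
    # zip(*pixels) regroups the values band by band.
    return [sum(band_values) / n for band_values in zip(*pixels)]

# Three hypothetical pixels with three bands each.
cluster = [[10, 20, 30], [20, 40, 60], [30, 60, 90]]
print(mean_vector(cluster))
```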
LAB 5. SUPERVISED CLASSIFICATION
Supervised classification is usually appropriate when you want to identify
relatively few classes, when you have selected training sites that can be verified
with ground truth data, or when you can identify distinct, homogeneous regions
that represent each class.
The maximum likelihood decision rule assigns each pixel with feature vector x
to the class c whose members are most likely to have given
rise to x. It assumes that the training data statistics for each class in
each band are normally distributed, that is, Gaussian [1, 2].
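The decision rule described above can be sketched in Python with NumPy as follows. The sketch assumes equal prior probabilities for all classes and Gaussian class statistics; the class names, training values, and the function names train_signatures and classify are hypothetical and do not reproduce the ERDAS Imagine implementation.

```python
import numpy as np

def train_signatures(samples_by_class):
    """Estimate a mean vector and covariance matrix for each class."""
    signatures = {}
    for name, samples in samples_by_class.items():
        data = np.asarray(samples, dtype=float)
        signatures[name] = (data.mean(axis=0), np.cov(data, rowvar=False))
    return signatures

def classify(pixel, signatures):
    """Return the class with the highest Gaussian log-likelihood for the pixel."""
    x = np.asarray(pixel, dtype=float)
    best_name, best_score = None, -np.inf
    for name, (mean, cov) in signatures.items():
        diff = x - mean
        # Log of the multivariate normal density without the constant term.
        score = -0.5 * (np.log(np.linalg.det(cov)) + diff @ np.linalg.inv(cov) @ diff)
        if score > best_score:
            best_name, best_score = name, score
    return best_name

# Hypothetical two-band training samples for two classes.
training = {
    "water": [[20, 10], [22, 12], [21, 9], [19, 11]],
    "forest": [[40, 90], [42, 95], [38, 88], [41, 92]],
}
signatures = train_signatures(training)
print(classify([21, 11], signatures), classify([40, 91], signatures))
```

Each class needs enough training pixels for its covariance matrix to be invertible; otherwise the log-likelihood cannot be evaluated.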
Objectives
To create a thematic raster layer using the Landsat image and the Maximum
Likelihood Classifier to perform supervised classification with ERDAS Imagine
software.
Requisites:
- Landsat image
(ftp://ftp.vt.tpu.ru/study/Tokareva/public/Earth%20remote%20sensing/RasterData/To/p155r019_7t20000626_6ch.img).
- the geocoded forest map.
Procedure:
To perform a supervised classification using the Maximum Likelihood
Classifier, use the basic steps:
1. Mark training samples and generate signatures.
2. Evaluate signatures using the contingency matrix, histograms, signature
separability, and statistics.
3. Start the classification process.
4. After classification is performed, display the thematic raster to evaluate
the accuracy of the classification.
Follow the steps in
ftp://ftp.vt.tpu.ru/study/Tokareva/public/Earth%20remote%20sensing/Prof_English/Erdas_TourGuide.pdf
(Chapter 17 «Advanced Classification») to perform
the laboratory work step by step [3].
Video: http://www.youtube.com/watch?v=ttjHpjJ7dV0
Terms and definitions:
Bayesian – a variation of the maximum likelihood classifier, based on the
Bayes Law of probability. The Bayesian classifier allows the application of a priori
weighting factors, representing the probabilities that pixels are assigned to each
class.
Covariance – measures the tendencies of data file values for the same pixel,
but in different bands, to vary with each other in relation to the means of their
respective bands. These bands must be linear. Covariance is defined as the average
product of the differences between the data file values in each band and the mean
of each band.
Covariance matrix – a square matrix that contains all of the variances and
covariances within the bands in a data file.
Decision rule – an equation or algorithm that is used to classify image data
after signatures have been created. The decision rule is used to process the data file
values based upon the signature statistics.
Divergence – a statistical measure of distance between two or more
signatures. Divergence can be calculated for any combination of bands used in the
classification; bands that diminish the results of the classification can be ruled out.
Error matrix – in classification accuracy assessment, a square matrix
showing the number of reference pixels that have the same values as the actual
classified points.
Feature space – an abstract space that is defined by spectral units (such as
an amount of electromagnetic radiation).
Feature space image – a graph of the data file values of one band of data
against the values of another band (often called a scatterplot).
Histogram – a graph of data distribution, or a chart of the number of pixels
that have each possible data file value. For a single band of data, the horizontal
axis of a histogram graph is the range of all possible data file values. The vertical
axis is a measure of pixels that have each data value.
Maximum likelihood – a classification decision rule based on the
probability that a pixel belongs to a particular class. The basic equation assumes
that these probabilities are equal for all classes, and that the input bands have
normal distributions.
Separability — a statistical measure of distance between two signatures.
Signature — a set of statistics that defines a training sample or cluster. The
signature is used in a classification process. Each signature corresponds to a GIS
class that is created from the signatures with a classification decision rule.
Supervised training — any method of generating signatures for
classification, in which the analyst is directly involved in the pattern recognition
process. Usually, supervised training requires the analyst to select training samples
from the data that represent patterns to be classified.
Thematic data — raster data that are qualitative and categorical.
Thematic layers often contain classes of related information, such as land
cover, soil type, slope, etc. In ERDAS IMAGINE, thematic data are stored in
image files.
Training — the process of defining the criteria by which patterns in image
data are recognized for the purpose of classification.
Training field — the geographical area represented by the pixels in a
training sample. Usually, it is previously identified with the use of ground truth
data or aerial photography. Also called training site.
Training sample — a set of pixels selected to represent a potential class.
Also called sample [4].
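The error matrix defined in the terms above can be illustrated with a short Python sketch that counts reference pixels against their classified values and derives the overall accuracy. The class names, the pixel lists, and the layout (rows for classified values, columns for reference values) are assumptions made for this example.

```python
def error_matrix(reference, classified, classes):
    """Square matrix: rows are classified values, columns are reference values."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for ref, cls in zip(reference, classified):
        matrix[index[cls]][index[ref]] += 1
    return matrix

def overall_accuracy(matrix):
    """Share of reference pixels whose classified value matches the reference."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

# Hypothetical reference (ground truth) and classified values for six pixels.
classes = ["water", "forest", "urban"]
reference = ["water", "water", "forest", "forest", "forest", "urban"]
classified = ["water", "forest", "forest", "forest", "urban", "urban"]

matrix = error_matrix(reference, classified, classes)
accuracy = overall_accuracy(matrix)
print(matrix, accuracy)
```

The diagonal holds the correctly classified pixels, so off-diagonal counts show which classes are confused with each other.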
REFERENCES
1. Joseph G. Fundamentals of Remote Sensing. – Hyderabad: Universities
Press, 2011. – 490 p.
2. Lillesand T.M., Kiefer R.W., Chipman J.W. Remote Sensing and Image
Interpretation. – New Delhi: Wiley India Pvt. Ltd., 2011. – 758 p.
3. ERDAS Tour Guides //
http://www.gis.usu.edu/manuals/labbook/erdas/manuals/TourGuide.pdf
4. ERDAS Field Guide //
http://www.gis.usu.edu/manuals/labbook/erdas/manuals/FieldGuide.pdf