Unsupervised Classification - Integrated Geospatial Education

Unsupervised Classification of Forest Stands with Landsat
Data and Comparison to Higher Resolution Orthoimagery
STUDENT HANDOUT
Introduction.
Foresters and land managers spend many hours delineating forest stands from orthoimagery using GIS. The
use of free and up-to-date Landsat imagery could provide a reliable method to extract forest stand data and
create stand maps in a fraction of the time and cost. In this exercise, you will apply unsupervised
classification to extract forest vegetation types from a portion of a fall 2014 Landsat 8 scene. You will
evaluate the resulting classification against higher-resolution orthoimagery and assess whether 30-meter resolution imagery can yield reliable forest stand maps.
Objectives.
• To execute unsupervised classification
• To evaluate the resulting classification as a reliable method to generate stand maps
• To modify and edit classes to yield minimum sizes and homogeneous areas
• To create a vector polygon layer of those homogeneous areas
Developed in 2015 by the Integrated Geospatial Education and Technology Training (iGETT) project, with
funding from the National Science Foundation (DUE-1205069) to the National Council for Geographic
Education. Opinions expressed are those of the author and are not endorsed by NSF. Available for educational
use only. See igettremotesensing.org for additional remote sensing exercises and other teaching materials.
Background.
In the past, practitioners delineated land cover types on stereo aerial photo pairs to create forest stand maps.
Forest stands were classified by height and stand density (crown closure). Currently, heads-up digitizing of
high resolution orthoimagery is very common. And while heads-up digitizing is an easy method for stand
typing and for creating digital stand maps and databases, imagery is expensive to contract and must be
updated regularly to reflect changes in the landscape. Landsat imagery is now free and, with luck, multiple
scenes of a given area and time of year are available.
The image on the first page is a 1-meter resolution NAIP orthoimage of the Violette Stream area in
Northern Maine. You will use it later to evaluate the land cover classification created from satellite data.
In this exercise you will use ArcGIS Spatial Analyst tools to classify land cover types, with emphasis on
forest types, from Landsat 8 imagery of northern Aroostook County, Maine, captured on September 19,
2014. The study area encompasses the west side of Van Buren, Maine, and the east side of township T17
R3 WELS. Landsat's 30-meter resolution is coarse compared to higher-resolution orthoimagery.
However, Landsat imagery has the advantage of being free and up to date. Further, Landsat data has been
archived since 1972. This makes the data extremely useful for studying land cover change as well as
for automating land-use classification via computer software.
Landsat 8 has 11 bands; ten are discrete spectral bands. Bands 1 through 7 and 9 have 30-meter resolution.
For this exercise, we will use the blue through middle-infrared bands, 2 through 7. Thermal bands 10 and
11, although captured at 100 m resolution, are resampled to 30 m. See the table below.
Landsat 8 Operational Land Imager (OLI) and Thermal Infrared Sensor (TIRS)
Launched February 11, 2013

Band                                   Wavelength (micrometers)   Resolution (meters)
Band 1 - Coastal aerosol               0.43 - 0.45                30
Band 2 - Blue                          0.45 - 0.51                30
Band 3 - Green                         0.53 - 0.59                30
Band 4 - Red                           0.64 - 0.67                30
Band 5 - Near Infrared (NIR)           0.85 - 0.88                30
Band 6 - SWIR 1                        1.57 - 1.65                30
Band 7 - SWIR 2                        2.11 - 2.29                30
Band 8 - Panchromatic                  0.50 - 0.68                15
Band 9 - Cirrus                        1.36 - 1.38                30
Band 10 - Thermal Infrared (TIRS) 1    10.60 - 11.19              100 * (30)
Band 11 - Thermal Infrared (TIRS) 2    11.50 - 12.51              100 * (30)
* Captured at 100 m and resampled to 30 m.
Source: USGS at http://landsat.usgs.gov/band_designations_landsat_satellites.php
Your goal is to complete an unsupervised classification of the Landsat data and modify the resulting data to
create a forest stand map. You will then analyze the automated results and compare them to a high
resolution orthoimage of the area to see if the resulting classification is a valid method of mapping forest
stands.
Steps in this Exercise (Unsupervised Classification Workflow).
• ArcGIS and the Spatial Analyst Extension are required
  - This exercise was prepared using ArcMap 10.2.2
• Data Preparation
  - Download Landsat 8 full scene, all bands
  - Download orthoimagery from MEGIS
  - Clip Landsat scene to study area
• Explore Landsat imagery
• Unsupervised Classification
  - Clustering using Iso Cluster tool
  - Examine signature file (Dendrogram tool)
  - Edit signature file
  - Apply Maximum Likelihood Classification
• Evaluate the resulting classification
• Post Process
  - Filter to remove isolated pixels
  - Smooth class boundaries (Boundary Clean tool)
  - Generalize (remove isolated regions)
• Convert to vector polygon layer
Evaluation.
Your final products will be your answers to the questions, your final classification layer, and the forest stand
vector map.
References.
Crossley, P. 2015, March 5. How to Decide which Landsat Scenes to Choose. YouTube. Retrieved 30
April 2015: https://www.youtube.com/watch?v=AhQxSlhcN2w
ESRI. 2013. Iso Cluster Unsupervised Classification (Spatial Analyst). ArcGIS Desktop Help.
ESRI. 2013. Maximum Likelihood Classification (Spatial Analyst). ArcGIS Desktop Help.
Hobbins, D. 2015, April 29. iGETT Concept Module Object Recognition on Aerial Imagery. YouTube.
Retrieved 30 April 2015: https://www.youtube.com/watch?v=kmIPOabAGOg
Lillesand, T.M., R.W. Kiefer, J.W. Chipman. 2008. Remote Sensing and Image Interpretation, 6th ed.
John Wiley and Sons. New York. 756 p.
Paine, D.P. and J.D. Kiser. 2012. Aerial Photography and Image Interpretation. 3rd ed. John Wiley and
Sons. New York. 648 p.
DATA PREPARATION.
EarthExplorer
• Create a Landsat working folder on your hard drive.
• Download the necessary Landsat file from EarthExplorer or another Landsat download tool.
http://earthexplorer.usgs.gov/ (Remember you need an account before you can download.)
The dataset you are using is from Landsat Archive L8 OLI/TIRS, Path 11 Row 27 captured on 19
September 2014. The file name is LC80110272014262LGN00.
If you need instructions, see Phil Crossley’s Concept Module on choosing and downloading
Landsat imagery. Scroll to minute 5:10 where he explains EarthExplorer.
https://www.youtube.com/watch?v=AhQxSlhcN2w
Or visit Resources for Instruction on the iGETT webpage (igettremotesensing.org) for written
guidelines on downloading using GloVis.
Maine Office of GIS
• Open a web browser and navigate to the Maine Office of GIS website Catalog page at
www.maine.gov/megis/catalog/
• Scroll down to the Imagery, Base Maps and Land Cover section.
• Find the NAIP 2013 images section.
• Click on the MEGIS Viewer icon.
• In the viewer, find the Zoom To Town input box. From the drop-down list, select T17R3 WELS or
Van Buren. The map will zoom there and appear as below.
• From the menu bar at the top click on the menu.
• Set the Select Images dialog to Draw Point and NAIP 2013, as below.
• Move the mouse over the image. Click on the area above the pointed road on the east boundary of
T17R3. See the image below.
• From the Select Images menu, click GO.
The resulting layer appears.
• Right click on the URL and select Save Link As. This will vary based on your browser and
software. Save the layer in your working folder.
• Close the browser.
EXERCISE 1. Add, Examine, and Modify Data.
• Open ArcMap. In the Customize menu, click on Extensions and be sure that the Spatial Analyst is
checked on.
• Connect to the Landsat working folder on your hard drive.
• Using the Catalog in ArcMap, examine the folder contents.
• Create a new folder in Landsat called Scratch. It will be used later.
• Access the ArcGIS Desktop Help dialog from the Help menu. Type Iso Cluster into the Search tab.
• Select the topic titled “Iso Cluster Unsupervised Classification (Spatial Analyst).”
• Read the Summary and Usage sections.
Note the following points under the Usage section.
• The minimum valid value for the number of classes is two. There is no maximum number of
clusters. In general, more clusters require more iterations.
• To provide the sufficient statistics necessary to generate a signature file for a future classification,
each cluster should contain enough cells to accurately represent the cluster. The value entered for
the minimum class size should be approximately 10 times larger than the number of layers in
the input raster bands.
• The value entered for the sample interval indicates one cell out of every n-by-n block of cells is
used in the cluster calculations.
• You shouldn't merge or remove classes or change any of the statistics of the ASCII signature file.
• Generally, the more cells contained in the extent of the intersection of the input bands, the
larger the values for minimum class size and sample interval should be specified. Values
entered for the sample interval should be small enough that the smallest desirable categories
existing in the input data will be appropriately sampled.
• The class ID values on the output signature file start at one and sequentially increase to the number
of input classes. The assignment of the class numbers is arbitrary.
• The output signature file's name must have a .gsg extension.
• Better results will be obtained if all input bands have the same data ranges. If the bands have
vastly different data ranges, the data ranges can be transformed to the same range using Map
Algebra to perform the equation.
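Two of the usage notes above are simple arithmetic, and a short Python sketch may help make them concrete. The function names and the 300 x 400 cell raster size are hypothetical, and the sampled-cell estimate assumes that partial blocks along the raster edges still contribute one cell each; the tool itself may treat edges differently.

```python
import math


def suggested_min_class_size(band_count):
    """Rule of thumb from the Usage notes: about 10 cells per input band."""
    return 10 * band_count


def sampled_cell_count(rows, cols, interval):
    """Estimate how many cells are sampled at one cell per n-by-n block.

    Assumes partial blocks along the raster edges still contribute one cell.
    """
    return math.ceil(rows / interval) * math.ceil(cols / interval)


print(suggested_min_class_size(6))       # six input bands (2 through 7)
print(sampled_cell_count(300, 400, 10))  # hypothetical 300 x 400 cell clip
```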
DATA
• Go to the Catalog in ArcMap and expand the Landsat
folder.
• Expand the folder named LC80110272014262LGN00 that
contains the data.
• Drag and drop band 4 into the Data View.
Examine the image; a small version of the full scene is at right.
Note the clouds. A good portion of this image contains clouds.
CLIP
• Use the Clip tool (Data Management) to clip bands 2 through 7 to the following extent. In
the instructions that follow, the names of the clipped layers are L8_B2, etc.
• Save the map document as VS_StudyArea in the Landsat folder.
The clipped layers will appear as at right.
• Remove the band 4 layer and the StudyArea feature class.
• From the L8image database, add the clipped bands 2 through 7 and examine each as you add
them.
• Save the map document again. Do this frequently.
• Examine the bands and answer the following questions. Answers can be found in the key at
the end of the document.
Question 1. By eye, which band has the greatest (brightest) reflectance in the plowed field on the right of the
image?
Question 2. What is the reflectance value for a bright cell in that field in band 2 and in band 7? (Hint: Use
the Identify tool.)
Question 3. By eye, which band has the greatest reflectance in the square clearcut in the left center of the
image?
Question 4. Which band has the lowest (darkest) reflectance for the clearcut?
• From your working folder, add the violette_stream_sw.tif image to the map and examine it. If you
get the Coordinate System warning, close it.
This 2013 NAIP 1-meter resolution layer clearly shows that forests dominate the area. In the default RGB
image, conifers are the dark green areas. The lighter green areas are broadleaved forest. The smooth-textured,
light green areas are clearcuts, the roads are whitish, and the plowed field on the east side is light
tan in color. A labeled key to the forest types appears in Exercise 2.
Data Values
Remember the statement from the help menu that speaks to getting the best results from your classification.
Here it is again. “Better results will be obtained if all input bands have the same data ranges. If the
bands have vastly different data ranges, the data ranges can be transformed to the same range using
Map Algebra to perform the equation.”
• Examine the data range for the band 2 image.
Note the value range is from a low of 7591 to a high of 8948.
• Turn off all the layers above band 5 in the Table of Contents (TOC).
• Examine the data range for the band 5 image.
Why would band 5 have such a high end value of 22887?
• Examine the band 5 image.
Why are the values so high in the band 5 image? Answer: Band 5 is near IR. Vegetation can reflect
3-4 times as much in near IR as in green (band 3). Most broadleaved trees have very high
reflectance in near IR.
• Check on the NAIP violette_stream layers.
• Switch between the NAIP layer and band 5. Study the area. Find areas that are clearly hardwood,
clearly softwood, and mixedwood.
• Right click on band 5 and open the Properties dialog.
• Click on the Symbology tab. See the image below.
Note the value range and that this data is stretched via the Percent Clip method.
• Click on the Histograms button. CAREFUL! If you click in the histogram it may change the values
displayed.
This displays the distribution of the pixels by their value. Note the standard deviation.
• Close the two dialogs.
Transform Data Ranges with Map Algebra
Return to the Table of Contents (TOC) and record the low and high data value for each band. These values
are needed for the transformation. Use the table below.
Data Values By Band
Band    High (max) Value    Low (min) Value
2       8948                7591
3
4
5
6
7       9959                5416
You will use these values in the transformation equation to equalize the band ranges. The formula follows.
New Layer = (Old_Layer - OLDMIN) x (NEWMAX - NEWMIN) / (OLDMAX - OLDMIN)
Note the value ranges: bands 5 and 6 have value ranges that far exceed those of the other bands. We will use
the value range for band 7 (9959 - 5416 = 4543) for all of the bands to equalize them.
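As a quick check of the arithmetic, the transformation can be sketched in plain Python for a single cell value. This is only an illustration of the formula as written in this handout, not the Raster Calculator expression itself; the function name rescale is hypothetical. Because the formula does not add NEWMIN back, the output runs from 0 up to the band 7 range of 4543.

```python
def rescale(value, old_min, old_max, new_min, new_max):
    """Apply the handout's transformation to one cell value.

    Follows the formula as written, so the result runs from 0 up to
    (new_max - new_min); no new_min offset is added back.
    """
    if old_max == old_min:
        raise ValueError("old_max and old_min must differ")
    return (value - old_min) * (new_max - new_min) / (old_max - old_min)


# Band 2 limits and band 7 limits taken from the Data Values By Band table.
B2_MIN, B2_MAX = 7591, 8948
B7_MIN, B7_MAX = 5416, 9959

lowest = rescale(B2_MIN, B2_MIN, B2_MAX, B7_MIN, B7_MAX)
highest = rescale(B2_MAX, B2_MIN, B2_MAX, B7_MIN, B7_MAX)
print(lowest, highest)
```

The lowest band 2 value maps to the bottom of the new range and the highest to the top, which is the behavior to look for in each transformed layer.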
• Type “raster calculator” into the Search tool. (No quotes)
• Double click on the Raster Calculator (Spatial Analyst).
• Enter two left parentheses, i.e. ((.
• Double click on the Band 2 file in the left window.
• Complete the equation as noted above and below.
• Set the output to your C drive and call it L8_B2.
• Click OK.
• Repeat this step for each layer using the Band 7 values for the NewMax (9959) and NewMin
(5416) and the original band values for the OldMax and OldMin.
• When you have completed the transformations, remove the original clipped layers from the map
document.
• Save the map document!
EXERCISE 2. Unsupervised Classification with ISO Cluster.
ISO Cluster
• From the Customize menu, select Toolbars and activate the Image Classification toolbar.
• On the Image Classification toolbar, click Classification \ Iso Cluster Unsupervised Classification.
The Iso Cluster Unsupervised Classification tool opens.
• In the tool dialog box, add the clipped bands 2 through 7 for the Input Raster Bands. See the image
below.
• Specify 10 for the Number of classes.
• Set the Output classified raster to your working folder and call it ISO102010.
• Set BOTH the Minimum class size and Sample interval to 10.
• Set the Output signature file as VS102010.
• Click OK to run the tool.
The output classified raster will be automatically added to ArcMap when the tool finishes. See the
output below.
Warning: Your results may differ from what you see below. The colors are random. You may
even have more classes; up to ten is possible. The important thing is for you to identify the land cover type
assigned to each class.
This area has several forest types that occur in northern forest regions, especially from the Lake States
east to the Canadian Maritimes. Compare your results with the violette_stream_sw NAIP layer.
• Use the NAIP layer to identify those classes that are roads, hardwood, softwood, mixedwood,
plowed fields, and other classes. How many classes can you confirm?
Normally we would ground truth the classification. Here you will use the following image and
information to “ground-truth” your ISO classification.
• Zoom in and out as needed to find areas and examine them. Compare the class with the NAIP layer.
Pan around and examine many areas. Find the best description for each classification.
CLASS 1: in my output the black areas are clearly softwood with some shrub and riparian (stream) areas.
CLASS 2: green areas appear to be predominantly softwood. There are also some mixedwood stands.
These are candidates for merger.
CLASS 3 and 4: these also appear to be predominantly softwood areas, although class 3 has some forest
shrub.
CLASS 5: white in my output. Those areas appear to be a mix of hardwood and softwood but vary.
CLASS 6: “pink” in the above image. It is a mix of recent clearcuts, roads, bare areas (including the plowed
field), and forest shrub or riparian (stream) areas.
CLASS 7 and 8: appear to be mostly hardwood in my output. These seem to be good candidates for merger.
To better distinguish areas, you can change a class to a bright color that will be easier to see and
use. Below is an image with labels for the forest types to help in your interpretation.
Editing the Signature File
Next we will use a tool to examine the signatures of each class.
• Find the Search tool and type in Dendrogram.
• Click on the Dendrogram tool in the list.
Read the help. What is the purpose of the tool?
• In the dialog, input the vs102010.gsg file. See below.
• Set the Output to the Scratch folder and name it Dendro_vs102010.
• Accept the other defaults and click OK.
A table will be added to the bottom of the Table of Contents.
• Right click on the table name and open it.
It contains many rows of data. The upper portion contains pairs of classes by class number and the distance
between their value ranges.
• Examine the pairs that have small distances, at the top of the list. Those have similar signatures and need to
be examined. They are likely candidates to be merged or omitted.
Notice that classes 3 and 4 top the list. Their respective signatures are close. Did you rate them as
being similar? They are good candidates for merger.
Note that classes 1 and 2 are next on the list. They have similarities but I feel they are different.
Class 2 may be more mixed than class 1. What did you find?
Classes 7 and 8 are next. They both appear to be hardwood to me. What do you think?
Remember, you may have different classes. That is OK. The main thing is to find those similar
classes and merge them, if they are close.
• Close the table.
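The idea of ranking class pairs by how close their signatures are can be illustrated with a small Python sketch. This is a simplification, not the Dendrogram tool's own calculation: it uses only the Euclidean distance between class means, and the two-band class means below are invented for illustration rather than taken from any signature file.

```python
import math
from itertools import combinations


def rank_class_pairs(class_means):
    """Return (distance, class_a, class_b) tuples, closest pairs first."""
    pairs = []
    for a, b in combinations(sorted(class_means), 2):
        distance = math.dist(class_means[a], class_means[b])
        pairs.append((distance, a, b))
    return sorted(pairs)


# Hypothetical two-band class means, invented purely for illustration.
class_means = {
    1: (10.0, 20.0),
    2: (13.0, 24.0),
    3: (40.0, 60.0),
    4: (40.0, 61.0),
}
ranked = rank_class_pairs(class_means)
print(ranked[0])  # the closest pair is the first merge candidate to inspect
```

Pairs near the top of such a ranking are the ones worth checking against the NAIP layer before deciding whether to merge them.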
Based on close examination and my results, I propose we do the following.
Make the following classifications by merging classes.
Softwood = classes 1, 2, 3, and 4
Hardwood = classes 7 and 8
Mixedwood = class 5
Clearcut and other = class 6
• Use the Help tool and search box and type Edit Signatures. What do the introduction and Usage sections say
about the tool?
Note the line about needing a remap ASCII file.
• Now type How Edit Signatures works. This shows the format for the ASCII or text file.
Create a Text Document
• From the Windows Start button, select All Programs / Accessories / WordPad. (Any text editor
will work!)
Each line of the ASCII file has two numbers separated by a colon: the left number is the class to change, and
the right number is the class to keep or merge with. So let’s format as follows.
• Type the following into WordPad. (Note: Classes 5 and 6 are left alone.)
2:1
3:1
4:1
8:7
• SAVE it as a text document named vs102010.txt in the Scratch subfolder.
• Close WordPad.
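If you prefer to generate the remap text rather than type it, the same four lines can be produced with a few lines of Python. This is a sketch only: the function name build_remap_text is hypothetical, and it follows the old:new layout shown in this handout. The resulting text could then be saved as vs102010.txt in the Scratch folder.

```python
def build_remap_text(merges):
    """Return remap lines in 'old:new' form, one per class to be merged.

    `merges` maps each class to change onto the class it should join.
    """
    lines = [f"{old}:{new}" for old, new in sorted(merges.items())]
    return "\n".join(lines) + "\n"


# Merges proposed in this exercise: 2, 3, 4 into 1 (softwood), 8 into 7 (hardwood).
merges = {2: 1, 3: 1, 4: 1, 8: 7}
remap_text = build_remap_text(merges)
print(remap_text, end="")
```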
Return to ArcMap.
• In ArcMap, use the Search dialog and type Edit Signatures.
• Click on the tool in the list.
This opens the dialog.
• Turn on the help window on the dialog.
• Click in the input box for the Input signature remap file. Read the help and note it says you can use
an asc or txt file. We will use the text document you just created.
• As above, drop the reformatted bands 2 through 7 into the Input Raster Bands box.
• Bring the signature file and text file into the next two inputs.
• Browse to the Scratch folder and name the output vs102010ed.gsg.
• Use the default of 10 for the sample interval.
• Click OK.
When this process is complete, there will be a processing message but nothing else. Your new signature
file will be used in the next step to redefine the classification.
Maximum Likelihood Classification
• From the Classification menu, select the Maximum Likelihood Classification tool.
• Complete the dialog as below.
• When it looks as below, click OK.
The result is below. Is your output only four classes?
Set the labels to read the class name instead of numbers (i.e. 1 = Softwood, 5 = Mixedwood, etc.)
You will now process this classification to clean it up and end up with blocks of no less than 10 acres.
EXERCISE 3. Post Processing.
Post processing will result in filtering to remove isolated pixels, smoothing class boundaries (Boundary
Clean tool) and generalizing the data (remove isolated regions). The result will be a cleaner map with
blocks of 10 acres and greater.
Clean Up the Map
Before you proceed, let’s remove the layers that we no longer need.
• Remove all layers except the MLClass and violette_stream_sw layers.
• SAVE the map document!
Majority Filter
• In the ArcMap Customize menu, click Extensions. Be sure the Spatial Analyst is checked on.
• From the Search window, type in Majority Filter. Read the Help menu. It reads, “Replaces cells
in a raster based on the majority of their contiguous neighboring cells.” We are trying to
remove isolated pixels and create more uniform groupings of pixels.
• In the Search window, click on the Majority Filter name to open the dialog. Alternately you can
access the tool from the toolbox.
• Drop the MLClass layer into the Input raster box.
• Use the browse button to set the Output Raster to your Scratch folder and call it Maj_Filter.
• Accept the defaults of Four and Majority for the other options.
• Click OK!
• Examine the results. See the image to the right. Compare those with your MLClass results.
Note how isolated pixels were removed!
• SAVE the map document! Do so after each step.
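To see why a majority filter removes isolated pixels, consider the simplified Python sketch below. It is not the ArcGIS implementation: it looks only at the four orthogonal neighbors, replaces a cell when at least three of them share one value, and ignores the tool's contiguity rule and its handling of raster edges. The function name and the 3 x 3 grid are hypothetical.

```python
from collections import Counter


def majority_filter4(grid):
    """Replace a cell when at least 3 of its 4 orthogonal neighbors agree."""
    rows, cols = len(grid), len(grid[0])
    result = [row[:] for row in grid]  # read from grid, write to a copy
    for r in range(rows):
        for c in range(cols):
            neighbors = [
                grid[nr][nc]
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                if 0 <= nr < rows and 0 <= nc < cols
            ]
            value, count = Counter(neighbors).most_common(1)[0]
            if count >= 3:
                result[r][c] = value
    return result


# A single hardwood (7) cell surrounded by softwood (1) cells.
classes = [
    [1, 1, 1],
    [1, 7, 1],
    [1, 1, 1],
]
filtered = majority_filter4(classes)
print(filtered)
```

The lone center cell takes on the value of its neighbors, while the input grid itself is left unchanged.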
Boundary Clean
Next we will use the Boundary Clean tool.
• From the Search window, type Boundary Clean. You want the Boundary Clean tool from the
Spatial Analyst tools.
• Click on the tool in the Results window to open the dialog.
• Read the Help menu: “Smoothes the boundary between zones by expanding and shrinking it.”
• Drop the Maj_Filter layer into the input box.
• Set the Output to the Scratch folder and name the output layer Bndry_Clean.
• Click on the Help menu and read the help for the Sorting Technique. It reads as follows.
“ASCEND — Sorts zones in ascending order by size. Zones with smaller total areas have a higher priority
to expand into zones with larger total areas.”


• Set the Sorting technique to Ascend.
• Check OFF the Run expansion. Note the Help menu.
• When the dialog looks like that above, click OK.
• Examine the results. See the image above.
The boundaries between the patches should be improved.
• SAVE the map document.
Build Contiguous Areas: Region Group.
Forest managers usually set minimum size blocks in which to work. In Maine, ten acres is not
unreasonable. Therefore, we will try to create blocks of at least ten acres. For this we will use several tools.
First, as our data is metric, what is ten acres in square meters and how many 30m cells are needed to make
10 acres?
Ten acres is equivalent to approximately 40468.56 sq. meters. A 30m cell is 900 sq. meters. So, 40468.56
divided by 900 sq. meters is about 44.97 cells. So, 45 cells is roughly equivalent to 10 acres.
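The same conversion can be written as a short Python sketch, which makes it easy to try other block sizes or cell sizes. The function name min_cells is hypothetical; the acre-to-square-meter factor is the standard international acre.

```python
import math

SQ_M_PER_ACRE = 4046.8564224  # international acre in square meters


def min_cells(acres, cell_size_m):
    """Return the smallest whole number of square cells covering `acres`."""
    area_sq_m = acres * SQ_M_PER_ACRE
    cell_area = cell_size_m ** 2
    return math.ceil(area_sq_m / cell_area)


print(min_cells(10, 30))  # ten acres of 30 m Landsat cells
```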







• Use the Search window to find the Region Group tool.
• Click it to open the dialog.
• Examine the help menu. What does it say?
Basically, a unique number is assigned to each region.
• Drop the Bndry_Clean layer into the input box.
• Use the Browse button to set the Output Raster to your Scratch folder and name it Reg_grp.
• Accept the defaults of FOUR neighbors and WITHIN for zone grouping method.
• Check OFF the Add link field option. See below.
• When the dialog looks like that above, click OK.
The output looks a bit different!
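The idea behind region grouping, giving each connected patch of same-class cells its own number, can be sketched in plain Python. This is an illustrative simplification under stated assumptions, not the Region Group tool's implementation: it uses four-neighbor connectivity within each class, numbers regions in scan order, and also returns a cell count per region. The function name and the small grid are hypothetical.

```python
from collections import deque


def region_group4(grid):
    """Label 4-connected regions of equal value; return labels and sizes."""
    rows, cols = len(grid), len(grid[0])
    labels = [[0] * cols for _ in range(rows)]
    sizes = {}
    next_label = 1
    for r in range(rows):
        for c in range(cols):
            if labels[r][c]:
                continue  # already part of a labeled region
            labels[r][c] = next_label
            queue = deque([(r, c)])
            count = 0
            while queue:
                cr, cc = queue.popleft()
                count += 1
                for nr, nc in ((cr - 1, cc), (cr + 1, cc), (cr, cc - 1), (cr, cc + 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and not labels[nr][nc]
                            and grid[nr][nc] == grid[r][c]):
                        labels[nr][nc] = next_label
                        queue.append((nr, nc))
            sizes[next_label] = count
            next_label += 1
    return labels, sizes


classes = [
    [1, 1, 7],
    [1, 7, 7],
    [5, 1, 7],
]
labels, sizes = region_group4(classes)
print(labels)
print(sizes)
```

Note that the two separate patches of class 1 receive different region numbers. With a cell count per region, patches smaller than the 45-cell (10-acre) target are easy to pick out.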
Set Null Tool
• Search for and open the Set Null tool.
• Drop Reg_grp into the Input conditional raster box.
• Click on the SQL button. See the dialog below.
• Build a query that looks like the dialog below.
• Click OK.
• Drop Reg_grp into the Input false raster input box.
• Set the Output raster to your Scratch folder and name it Null45.
• Click OK.
• Turn off the Reg_grp layer; the Null45 output looks similar to that at right.
• Search for and open the Nibble tool.
• Set the Input raster to Bndry_Clean.
• Set the Input raster mask to Null45.
• Set the Output to your Scratch folder and name it Class.
• Click OK.
See the output image that follows!
• In the TOC, set the class numbers to names.
• Remove all layers except the Class layer and the NAIP ortho layers (violette_stream_sw).




• Examine the final product.
• Zoom to the extent of the Class layer.
• Load the violette_stream_sw.jp2 orthoimage.
• Compare the Class layer with the violette_stream_sw layer. Answer the following questions.
Question 5. Are there areas where the classification is incorrect? If so, where specifically?
Question 6. Are areas classed as hardwood mostly correct? Examine class 7 (HW). How does that class
compare to the NAIP image?
Converting Raster to Vector (polygon) OPTIONAL
If the classification is satisfactory, then it may be useful to convert the layer to a vector product. This way it
can be used or processed with other vector layers.
• Use the Search menu to find and open the Raster to Polygon conversion tool.
• Fill it in as listed above.
• Use the help to see what the Simplify option does!
• Click OK when it appears as above.
The Data View shows the new polygon map.
• SAVE!
Congratulations, you just created a stand map (polygon shapefile) from Landsat satellite data!
ANSWER KEY:
1. Bands 2, 3, 4 and 7 all appear bright. Band 3 appears to have the highest pixel value.
2. I clicked in the middle of the plowed field with the IDENTIFY tool and got values of 2149 (band 2) and
2788 (band 7), respectively.
3. Band 5 is clearly the brightest.
4. I clicked near the center of the clearcut/dense young stand and got a value of 375 for band 7.
5. Yes, many sites. Class 6 is such a mix of types that roads and clearcuts have the same class. That needs
work. Also, many of the softwood sites by the rivers include forest shrub and riparian strips of vegetation.
6. The hardwood is mostly good. There will always be some areas that contain mixed pixels but for the
most part it is good. The evaluation also must consider how the information will be used.