Lecture 10
Prepared by R. Lathrop 11/99
Updated 3/06
Readings:
ERDAS Field Guide 5th Ed. Ch 6:234-260
• Remote sensing science concepts
– Rationale and technique for post-classification smoothing
– Errors of omission vs. commission
– Accuracy assessment
• Sampling methods
• Measures
• Fuzzy accuracy assessment
• Math Concepts
– Calculating accuracy measures: overall accuracy, producer’s accuracy, user’s accuracy, and kappa coefficient
• Skills
– Interpreting the contingency matrix and accuracy assessment measures
• Most classifications have a problem with “salt and pepper”, i.e., single or small groups of misclassified pixels, because they are “point” operations that act on each pixel independently of its neighbors
• “Salt and pepper” may be real. The decision on whether to filter/eliminate depends on the choice of the minimum mapping unit: does it equal a single pixel or an aggregation?
• Majority filtering: replaces the central pixel with the majority class in a specified neighborhood (e.g., 3 x 3 window); con: alters edges
• Eliminate: clumps “like” pixels and replaces clumps under a size threshold with the majority class in the local neighborhood; pro: doesn’t alter edges
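Majority filtering can be sketched in a few lines of Python. This is a minimal illustration with numpy, not the ERDAS IMAGINE implementation; edge pixels are simply left unchanged here, which sidesteps (rather than solves) the edge-alteration issue noted above:

```python
import numpy as np

def majority_filter(classified, size=3):
    """Replace each interior pixel with the modal (majority) class of
    its size x size neighborhood. Edge rows/columns are left unchanged
    in this sketch."""
    out = classified.copy()
    r = size // 2
    rows, cols = classified.shape
    for i in range(r, rows - r):
        for j in range(r, cols - r):
            window = classified[i - r:i + r + 1, j - r:j + r + 1]
            vals, counts = np.unique(window, return_counts=True)
            out[i, j] = vals[np.argmax(counts)]  # majority class wins
    return out
```

Applied to the 3 x 3 window in the ERDAS example below (6 6 2 / 6 2 6 / 8 2 6), the central pixel becomes class 6, the majority class in the window.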
[Figure: Majority filtering example from the ERDAS IMAGINE Field Guide, 5th ed. Input and output class grids; a 3x3 window (6 6 2 / 6 2 6 / 8 2 6) replaces the central pixel with the majority class in the window, class 6. Note that class edges are altered in the output.]
[Figure: Eliminate example from the ERDAS IMAGINE Field Guide, 5th ed. Input and output class grids; the small clump below the size threshold is “eliminated” (replaced with the majority class of its local neighborhood) while class edges are preserved.]
• Always assess the accuracy of the final thematic map! How good is it?
• Various techniques assess the “accuracy” of the classified output by comparing the “true” identity of land cover derived from reference data (observed) vs. the classified result (predicted) for a random sample of pixels
• The accuracy assessment is the means of communicating map quality to the user of the map and should be included in the metadata documentation
• R.S. classification accuracy is usually assessed and communicated through a contingency table, sometimes referred to as a confusion matrix
• Contingency table: m x m matrix, where m = # of land cover classes
– Columns: usually represent the reference data
– Rows: usually represent the remotely sensed classification results (i.e., thematic or information classes)
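Building such a matrix from paired samples is straightforward. A minimal Python sketch following the convention above (rows = classified results, columns = reference data):

```python
import numpy as np

def contingency_matrix(predicted, reference, classes):
    """Build an m x m contingency (confusion) matrix: rows are the
    classified (predicted) labels, columns are the reference labels."""
    idx = {c: i for i, c in enumerate(classes)}
    m = len(classes)
    cm = np.zeros((m, m), dtype=int)
    for p, r in zip(predicted, reference):
        cm[idx[p], idx[r]] += 1  # row = classified, column = reference
    return cm
```

Correctly classified samples accumulate on the major diagonal; off-diagonal cells are the omission/commission errors discussed below.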
Accuracy Assessment Contingency Matrix

[Table: example 8 x 8 contingency matrix for land cover classes 1.10–2.50, with classified data in rows vs. reference data in columns, plus row and column totals; total sample n = 533.]
• Sampling approaches (to reduce analyst bias):
– simple random sampling: every pixel has an equal chance of selection
– stratified random sampling: the # of points is stratified to the distribution of thematic layer classes (larger classes get more points)
– equalized random sampling: each class gets an equal number of random points
• Sample size: at least 30 samples per land cover class
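The allocation schemes differ only in how sample points are apportioned among classes. A Python sketch of stratified random sampling with proportional allocation, floored at the ~30-samples-per-class rule of thumb (the input format here is hypothetical, not from the lecture):

```python
import random
from collections import defaultdict

def stratified_random_sample(labeled_pixels, total_points, min_per_class=30):
    """Stratified random sampling sketch: labeled_pixels is a list of
    ((row, col), class_label) pairs. Points are allocated to each class
    in proportion to its area, but never fewer than min_per_class."""
    by_class = defaultdict(list)
    for coord, label in labeled_pixels:
        by_class[label].append(coord)
    n_total = len(labeled_pixels)
    sample = {}
    for label, coords in by_class.items():
        # proportional allocation, floored at min_per_class
        n = max(min_per_class, round(total_points * len(coords) / n_total))
        n = min(n, len(coords))  # cannot sample more pixels than exist
        sample[label] = random.sample(coords, n)
    return sample
```

Setting the floor to 0 gives plain proportional stratification; replacing the allocation line with a constant gives equalized random sampling.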
• How accurate should the classified map be?
• The general rule of thumb is 85% accuracy
• It really depends on how much “risk” you are willing to accept if the map is wrong
• Are you more interested in the overall accuracy of the final map, or in quantifying the ability to accurately identify and map individual classes?
• Which is more acceptable: overestimation or underestimation?
• USGS–NPS National Vegetation Classification standard
• Horizontal positional locations meet National Map Accuracy Standards
• Thematic accuracy > 80% per class
• Minimum Mapping Unit of 0.5 ha
• http://biology.usgs.gov/npsveg/aa/indexdoc.html
A whole set of field reference points can be developed using some sort of random allocation, but due to travel/access constraints, only a subset of points is actually visited, resulting in a distribution that is not truly random.
• What constitutes reference data?
- higher spatial resolution imagery (with visual interpretation)
- “ground truth”: GPSed field plots
- existing GIS maps
• Reference data can be polygons or points
• Problem with “mixed” pixels: one possibility is to sample only homogeneous regions (e.g., 3x3 windows), but this introduces a subtle bias
• If smoothing was undertaken, then accuracy should be assessed on that basis, i.e., at the scale of the MMU
• If a filter was used, it should be stated in the metadata
• Ideally, the % of the overall map that so qualifies should be quantified, e.g., 75% of the map is composed of homogeneous regions greater than 3x3 in size; thus 75% of the map is assessed and 25% is not
• Error of Omission: pixels in class 1 erroneously assigned to class 2; from the class 1 perspective these pixels should have been classified as class 1 but were omitted
• Error of Commission: pixels in class 2 erroneously assigned to class 1; from the class 1 perspective these pixels should not have been classified as class 1 but were included
[Figure: overlapping spectral histograms (# of pixels vs. digital number, 0–255) for Class 1 and Class 2, illustrating, from the Class 2 perspective, omission error (Class 2 pixels erroneously assigned to Class 1) and commission error (Class 1 pixels erroneously assigned to Class 2).]
• Overall accuracy: divide the total correct (sum of the major diagonal) by the total number of sampled pixels; can be misleading, so individual categories should be judged as well
• Producer’s accuracy: measure of omission error; the total number correct in a category divided by the total # in that category as derived from the reference data; a measure of underestimation
• User’s accuracy: measure of commission error; the total number correct in a category divided by the total # that were classified into that category; a measure of overestimation
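All three measures fall directly out of the contingency matrix. A minimal Python sketch, following the lecture’s row/column convention (rows = classified, columns = reference); the 2 x 2 seagrass matrix from the case study later in this lecture makes a handy check:

```python
import numpy as np

def accuracy_measures(cm):
    """cm[i, j] = # of samples classified as class i (row) whose
    reference label is class j (column). Returns overall accuracy,
    per-class producer's accuracy, and per-class user's accuracy."""
    diag = np.diag(cm)
    overall = diag.sum() / cm.sum()
    producers = diag / cm.sum(axis=0)  # correct / reference (column) total
    users = diag / cm.sum(axis=1)      # correct / classified (row) total
    return overall, producers, users
```

For the seagrass matrix [[67, 10], [32, 136]], this reproduces the tabulated values: overall 83%, user’s 87% and 81%, producer’s 68% and 93%.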
Accuracy Assessment Contingency Matrix
(classified data in rows, reference data in columns; classes 1.10–2.50)

[Table: 8 x 8 contingency matrix, n = 1411. First row (class 1.10): 308 23 12 1 0 1 3 0, row total 348. Diagonal (number correct): 308, 279, 372, 26, 10, 93, 176, 48. Row totals: 348, 295, 379, 27, 18, 99, 194, 51. Column totals: 315, 305, 408, 29, 13, 97, 189, 55.]
Code   Land Cover Description                   Number Correct   Producer's Accuracy   User's Accuracy   Kappa
1.10   Developed                                308              ---                   ---               ---
1.20   Cultivated/Grassland                     279              ---                   ---               ---
1.40   Forest/Scrub/Shrub                       372              ---                   ---               ---
1.60   Barren                                   26               ---                   ---               ---
2.00   Unconsolidated Shore                     10               ---                   ---               ---
2.10   Estuarine Emergent Wetland               93               ---                   ---               ---
2.40   Palustrine Wetland: Emergent/Forested    176              ---                   ---               ---
2.50   Water                                    48               ---                   ---               ---
Totals                                          1312

Overall Classification Accuracy = ---
Code   Land Cover Description                   Number Correct   Producer's Accuracy   User's Accuracy
1.10   Developed                                308              308/315               308/348
1.20   Cultivated/Grassland                     279              279/305               279/295
1.40   Forest/Scrub/Shrub                       372              372/408               372/379
1.60   Barren                                   26               26/29                 26/27
2.00   Unconsolidated Shore                     10               10/13                 10/18
2.10   Estuarine Emergent Wetland               93               93/97                 93/99
2.40   Palustrine Wetland: Emergent/Forested    176              176/189               176/194
2.50   Water                                    48               48/55                 48/51
Totals                                          1312

Overall Classification Accuracy = 1312/1411
Code   Land Cover Description                   Number Correct   Producer's Accuracy   User's Accuracy
1.10   Developed                                308              97.8                  88.5
1.20   Cultivated/Grassland                     279              91.5                  94.6
1.40   Forest/Scrub/Shrub                       372              91.2                  98.2
1.60   Barren                                   26               89.7                  96.3
2.00   Unconsolidated Shore                     10               76.9                  55.6
2.10   Estuarine Emergent Wetland               93               95.9                  93.9
2.40   Palustrine Wetland: Emergent/Forested    176              93.1                  90.7
2.50   Water                                    48               87.3                  94.1
Totals                                          1312

Overall Classification Accuracy = 93.0%
• Kappa coefficient: measures the difference between the observed agreement of two maps and the agreement contributed by chance alone
• A Kappa coefficient of 90% may be interpreted as a classification 90% better than would be expected by random assignment of classes
• What’s a good Kappa? General range:
K < 0.4: poor; 0.4 <= K <= 0.75: good; K > 0.75: excellent
• Allows for statistical comparisons between matrices (Z statistic); useful for comparing different classification approaches to objectively decide which gives the best results
• Alternative statistic: Tau coefficient
K_hat = (N * SUM x_ii - SUM (x_i+ * x_+i)) / (N^2 - SUM (x_i+ * x_+i))

where SUM = sum over all k classes (i = 1 ... k)
x_ii = diagonal elements of the contingency matrix
x_i+ = marginal row total (row i)
x_+i = marginal column total (column i)
N = total # of observations

Takes into account the off-diagonal elements of the contingency matrix (errors of omission and commission)
SUM x_ii = 308 + 279 + 372 + 26 + 10 + 93 + 176 + 48 = 1312

SUM (x_i+ * x_+i) = (348*315) + (295*305) + (379*408) + (27*29) + (18*13) + (99*97) + (194*189) + (51*55) = 404,318

K_hat = (1411 * 1312 - 404,318) / (1411^2 - 404,318)
      = (1,851,232 - 404,318) / (1,990,921 - 404,318)
      = 1,446,914 / 1,586,603 = 0.912
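The K_hat formula is easy to compute directly from any contingency matrix. A minimal Python sketch (checked here against the 2 x 2 matrix from the seagrass bottom-sampling comparison later in this lecture, which yields the reported Kappa of 43%):

```python
import numpy as np

def kappa_hat(cm):
    """Kappa coefficient from a contingency matrix:
    (N * sum(x_ii) - sum(x_i+ * x_+i)) / (N^2 - sum(x_i+ * x_+i))."""
    n = cm.sum()
    diag_sum = np.diag(cm).sum()
    # chance agreement term: sum of row-total * column-total products
    chance = (cm.sum(axis=1) * cm.sum(axis=0)).sum()
    return (n * diag_sum - chance) / (n**2 - chance)
```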
Code   Land Cover Description                   Number Correct   Producer's Accuracy   User's Accuracy   Kappa
1.10   Developed                                308              97.8                  88.5              .8520
1.20   Cultivated/Grassland                     279              91.5                  94.6              .9308
1.40   Forest/Scrub/Shrub                       372              91.2                  98.2              .9740
1.60   Barren                                   26               89.7                  96.3              .9622
2.00   Unconsolidated Shore                     10               76.9                  55.6              ***
2.10   Estuarine Emergent Wetland               93               95.9                  93.9              .9349
2.40   Palustrine Wetland: Emergent/Forested    176              93.1                  90.7              .8929
2.50   Water                                    48               87.3                  94.1              .9388
Totals                                          1312                                                     .9120

Overall Classification Accuracy = 93.0%
*** Sample size for this land cover class too small (< 25) for a valid Kappa measure
Case Study
Multi-scale segmentation approach to mapping seagrass habitats using airborne digital camera imaging
Richard G. Lathrop¹, Scott Haag¹,², and Paul Montesano¹
¹Center for Remote Sensing & Spatial Analysis
Rutgers University
New Brunswick, NJ 08901-8551
²Jacques Cousteau National Estuarine Research Reserve
130 Great Bay Blvd
Tuckerton NJ 08087
Method> Field Surveys
All transect endpoints and individual check points were first mapped onscreen in the GIS.
Endpoints were then loaded into a GPS (± 3 meters) for navigation on the water.
A total of 245 points were collected.
Method> Field Surveys
For each field reference point, the following data were collected:
• GPS location (UTM)
• Time
• Date
• SAV species presence/dominance: Zostera marina, Ruppia maritima, or macroalgae
• Depth (meters)
• % cover (10% intervals), determined by visual estimation
• Blade height of the 5 tallest seagrass blades
• Shoot density (# of shoots per 1/9 m² quadrat, extracted and counted on the boat)
• Distribution (patchy/uniform)
• Substrate (mud/sand)
• Additional comments
Results> Accuracy Assessment
The resulting maps were compared with the 245 field reference points.
All 245 reference points were used to support the interpretation in some fashion, and so cannot be considered truly independent validation.
The overall accuracy was 83% and the Kappa statistic was 56.5%, which can be considered a moderate degree of agreement between the two data sets.

                        Reference:        Reference:         User's
GIS Map                 Seagrass Absent   Seagrass Present   Accuracy
Seagrass Absent         67                10                 87%
Seagrass Present        32                136                81%
Producer's Accuracy     68%               93%                83%
Results> Accuracy Assessment
The resulting maps were also compared with an independent set of 41 bottom sampling points collected as part of a seagrass–sediment study conducted during the summer of 2003 (Smith and Friedman, 2004).
The overall accuracy was 70.7% and the Kappa statistic was 43%, which can be considered a moderate degree of agreement between the two data sets.

                        Reference:        Reference:         User's
GIS Map                 Seagrass Absent   Seagrass Present   Accuracy
Seagrass Absent         14                9                  61%
Seagrass Present        3                 15                 83%
Producer's Accuracy     82%               62%                71%
SAV Accuracy Assessment Issues
• Matching spatial scale of field reference data with scale of mapping
• Ensuring comparison of “apples to apples”
• Spatial accuracy of “ground truth” point locations
• Temporal coincidence of “ground truth” and image acquisition
•“Real world” is messy; natural vegetation communities are a continuum of states, often with one grading into the next
•R.S. classified maps generally break up land cover/vegetation into discrete either/or classes
•How to quantify this messy world? R.S. classified maps still have some error while still having great utility
•Fuzzy Accuracy Assessment: doesn’t quantify errors as binary correct or incorrect but attempts to evaluate the severity of the error
•Fuzzy rating: severity of error or conversely the similarity between map classes is defined from a user standpoint
•Fuzzy rating can be developed quantitatively based on the deviation from a defined class based on a % difference (i.e., within +/- so many %)
•Fuzzy set matrix: fuzzy rating between each map class and every other class is developed into a fuzzy set matrix
For more info, see: Gopal & Woodcock, 1994. PERS:181-188
Level   Description
5       Absolutely right: exact match
4       Good: minor differences; species dominance or composition is very similar
3       Acceptable error: mapped class does not match; types have structural or ecological similarity or similar species
2       Understandable but wrong: general similarity in structure, but species/ecological conditions are not similar
1       Absolutely wrong: no conditions or structural similarity

http://biology.usgs.gov/npsveg/fiis/aa_results.pdf
http://www.fs.fed.us/emc/rig/includes/appendix3j.pdf
•Each user could redefine the fuzzy set matrix on an application-by-application basis to determine what percentage of each map class is acceptable and the magnitude of the errors within each map class
•Traditional map accuracy measures can be calculated at different levels of error:
Exact (MAX): only level 5
Acceptable (RIGHT): levels 5, 4, and 3
•Example from USFS:
Label   #Sites   MAX (5 only)   RIGHT (3, 4, 5)
CON     88       71 (81%)       82 (93%)
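The MAX and RIGHT accuracies can be computed from per-site fuzzy ratings with a few lines of Python. In this sketch, the exact split of the 82 acceptable CON sites into level-4 and level-3 ratings is illustrative, not from the USFS report; only the 71 exact and 82 acceptable totals are given there:

```python
def fuzzy_accuracy(ratings, exact_level=5, acceptable_levels=(3, 4, 5)):
    """Given one fuzzy rating (1-5) per reference site for its mapped
    class, return the MAX accuracy (exact matches only) and the RIGHT
    accuracy (levels 3-5 counted as acceptable)."""
    n = len(ratings)
    max_acc = sum(r == exact_level for r in ratings) / n
    right_acc = sum(r in acceptable_levels for r in ratings) / n
    return max_acc, right_acc
```

For 88 CON sites with 71 level-5 ratings and 82 total at levels 3–5, this reproduces the tabulated 81% MAX and 93% RIGHT.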
Confusion Matrix based on Level 3,4,5 as Correct
Label Sites CON MIX HDW SHB HEB NFO Total
CON 88 X 0 1 5 0 0 6
MIX 14 2 X 1 1 0 0 4
HDW 6 1 1 X 0 0 0 2
SHB 8 1 0 0 X 0 0 1
HEB 1 0 0 0 1 X 0 1
NFO 4 3 0 0 3 0 X 6
Total 121 7 1 2 10 0 0 20
http://www.fs.fed.us/emc/rig/includes/appendix3j.pdf
•Ability to evaluate the magnitude or seriousness of errors
•Difference Table: error within each map class based on its magnitude with error magnitude calculated by measuring the difference between the fuzzy rating of each ground reference point and the highest rank assigned to all other possible map classes
• All points that are exact matches have Difference values >= 0; all mismatches are negative. Values of -1 to 4 generally correspond to correct map labels. Values of -2 to -4 correspond to map errors, with -4 representing a more serious error than -2

Label   Sites   Mismatches: -4 -3 -2 -1   Matches: 0 1 2 3 4
CON     88                   4  2  0 11            3 0 12 23 33
Higher positive values indicate that pure conditions are well mapped, while lower negative values show pure conditions to be poorly mapped. Mixed or transitional conditions, where a greater number of class types are likely to be considered acceptable, will fall more in the middle.
http://www.fs.fed.us/emc/rig/includes/appendix3j.pdf
•Ambiguity Table: tallies map classes that characterize a reference site as well as the actual map label
• Useful in identifying subtle confusion between map classes and may be useful in identifying additional map classes to be considered
•Example from USFS
Label Sites CON MIX HDW SHB HEB NFO Total
CON 88 X 11 6 15 0 0 32
15 out of 88 reference sites mapped as conifer could have been equally well labeled as shrub
http://www.fs.fed.us/emc/rig/includes/appendix3j.pdf
• Method of statistically adjusting for over- or underestimation
• Randomly allocate “test areas”, determine area from map and reference data
• Ratio estimation uses the ratio of Reference/Map area to adjust the mapped area estimate
• Uses the estimate of the variance to develop confidence levels for land cover type area
Shiver & Borders, 1996. Sampling Techniques for Forest Resource Inventory. Wiley, New York, pp. 166-169
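A minimal Python sketch of the ratio-estimation idea (point estimate only; the variance and confidence-interval formulas given in Shiver & Borders are omitted, and the input lists are hypothetical per-test-area measurements):

```python
def ratio_adjusted_area(mapped_test_areas, reference_test_areas, total_mapped_area):
    """Ratio estimation sketch: scale the total mapped area by the
    overall ratio of reference area to mapped area observed over the
    randomly allocated test areas."""
    ratio = sum(reference_test_areas) / sum(mapped_test_areas)
    return ratio * total_mapped_area
```

If the reference data show 10% more of a class than the map does within the test areas, the mapped total is scaled up by the same 10%.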
Example: NJ 2000 Land Use Update
Comparison of urban/transitional land use as determined by photo-interpretation of 1 m B&W photography vs. 10 m SPOT PAN

[Figure: “Comparison of Land Use between Reference Imagery & SPOT: Urban & Transitional” – per-tile scatter plot of 1 m B&W area vs. SPOT area (0–400 acres on each axis) with a 1-to-1 line. Tiles above the 1-to-1 line are underestimates; tiles below it are overestimates.]
Land Use Change Category   Mapped Estimate (Acres)   Statistically Adjusted Estimate with 95% CI (Acres)
Urban                      73,191                    77,941 +/- 17,922
Transitional/Barren        20,861                    16,082 +/- 7,053
Total Urban & Barren       94,052                    89,876 +/- 16,528
Urban/Suburban
Mixed pixels: varying proportions of developed surface, lawn and trees

[Figure: 30 m TM pixel grid on IKONOS image; false color composite image with R: Forest, G: Lawn, B: IS (impervious surface); panels for Impervious Surface Estimation, Grass Estimation, and Woody Estimation.]
• For homogeneous 90 m x 90 m test areas: interpreted DOQ; DOQ pixels scaled to match TM
• For selected sub-areas: IKONOS multi-spectral image; classified map of 3 key indicator land use classes (impervious surface, lawn, and forest); IKONOS pixels scaled to match TM
[Figures: Egg Harbor City and Hammonton study areas – IKONOS image, Landsat LMM, and Landsat SOM-LVQ outputs (impervious, grass, woody), with per-plot bar charts (“Interior LMM – Impervious Surface”, “Interior LMM – Lawn”, “Interior LMM – Urban Tree”) comparing LMM estimates against reference values for each test plot.]
Root Mean Square Error: 90 m x 90 m test plots

Hammonton         Impervious   Grass      Tree
IKONOS            ± 7.4%       ± 10.8%    ± 12.0%
LMM               ± 8.2%       ± 13.6%    ± 10.3%
SOM_LVQ           ± 7.1%       ± 20.7%    ± 11.0%

Egg Harbor City   Impervious   Lawn       Urban Tree
IKONOS            ± 5.6%       ± 5.8%     ± 19.6%
LMM               ± 7.7%       ± 6.1%     ± 6.0%
SOM_LVQ           ± 6.8%       ± 12.5%    ± 5.0%
SOM-LVQ vs. IKONOS: study sub-area comparison (3x3 TM pixel zonal %)

             Hammonton        Egg Harbor City
Impervious   RMSE = ± 13.5%   RMSE = ± 15.0%
Grass        RMSE = ± 17.6%   RMSE = ± 14.4%
Trees        RMSE = ± 17.6%   RMSE = ± 21.6%
[Figure: “NJDEP – Landsat_SOM IS Area” – scatter plot comparing NJDEP and Landsat_SOM impervious surface area estimates.]
• Impervious surface estimation compares favorably to DOQ and IKONOS
– ±10 to 15% for impervious surface
– ±12 to 22% for grass and tree cover.
• Shows strong linear relationship with IKONOS in impervious surface and grass estimation
• Greater variability in forest fraction due to variability in canopy shadowing and understory background
1 Majority filter and Eliminate: remove “salt & pepper” and/or eliminate small clumps of pixels
2 Sampling methods for reference points
3 Contingency matrix and accuracy assessment measures: overall accuracy, producer’s accuracy, user’s accuracy, and kappa coefficient
4 Fuzzy accuracy assessment: fuzzy rating and fuzzy set matrix; ratio estimators

Homework: Accuracy Assessment
Reading: Textbook Ch. 13, Field Guide Ch. 6