The propagation of the uncertainty of land use maps to modeled landscape dynamics
Shoufan Fang1, George Z. Gertner1, Guangxing Wang1, and Alan B. Anderson2
1 Department of Natural Resources and Environmental Sciences, University of Illinois
at Urbana-Champaign, W503 Turner Hall, 1102 S. Goodwin Avenue, Urbana, IL
61801, USA. E-mail: Gertner@uiuc.edu
2 US Army Corps of Engineers, Construction Engineering Research Laboratory, 9000
Research Park, Champaign, IL 61824, USA.
__________________________________________________________________________________
Abstract
Land use maps are widely used in modeling land use change, urban sprawl, and
other landscape-related studies. The misclassification rates of a land use map are
usually provided as a measure of its quality. However, this important information
is rarely considered in land use based studies, especially in modeling landscape
dynamics. Ignoring the uncertainty of land use maps may cause models
to provide unreliable predictions. This study investigates the impact
of the accuracy of land use maps on modeling urban sprawl. The
regional confusion matrix was localized using a topographical map. Based
on the regional and local confusion matrices, several uncertainty levels were
adopted. The results showed that the error rates in the localized confusion matrix
differed considerably from the regional ones, reflecting the character of the study area. The
predictions made at different uncertainty levels were quite different. The
uncertainty sources are also analyzed in this study.
__________________________________________________________________________________
Introduction
Spatial modeling of ecosystem dynamics has grown rapidly in landscape-based studies
(Fang et al., 2005a). Previous spatial modeling has usually used land use/cover maps as base
maps and has implicitly assumed that the maps are error free or that their errors are negligible.
However, this assumption is very doubtful because of classification errors from remote sensors
and data processing, which are usually listed in a confusion matrix (Lunetta et al., 1991).
Fang et al. (2002) showed that classification error was a major source of uncertainty and that
its influence was significant. Therefore, misclassification in land use maps has to be considered
in spatial modeling in order to increase the reliability of spatially modeled ecosystem dynamics.
In urban and suburban areas, large trees can hide the development beneath them. As a
result, urbanized land can be misclassified as forest when satellite imagery is used to
make land use maps. If a land use map covers a large area, its confusion matrix may not
represent conditions in a small study area with this character, so the confusion matrix
needs to be modified for the small area. Suitable historical information is needed for this
modification: once a land use map has been published, it is impossible to
collect ground observations for that purpose because ground conditions
have changed since the satellite imagery was taken.
The objectives of this study are to utilize a topographical map as historical information to
modify a confusion matrix of a land use map for a study area in order to obtain more realistic
error rates, and to investigate the influence of misclassification of a land use map in the
prediction of the probability of urban sprawl.
Study Area, Materials, and Methodology
This study was conducted in the east of the city of Peoria, Illinois, USA. The study area
measured 5.7 × 6.9 km. It contained mainly a lake, hills, commercial-industrial and residential
districts, forests, agricultural fields, and wetland. One of the major characteristics of this area is
that many houses (residential land use) have been built on a hill and are covered by
tall trees.
The 1993 land use map (resolution: 30 × 30 m), its accuracy, and the feature maps for
predicting urban sprawl in the study area were provided by the research team of the Land Use
Evolution and Impact Assessment Model (LEAM). The confusion matrix of the land use map
was converted from the map accuracy (Table 1).
Table 1 The regional confusion matrix obtained from EPA accuracy table. Rows are the classified classes.

Classified \ Class    10       20       30       40       50       80       90       Total
10                    0.910    0.002    0.006    0.050    0.000    0.000    0.032    1.000
20                    0.004    0.617    0.005    0.068    0.004    0.291    0.012    1.000
30                    0.000    0.005    0.110    0.573    0.029    0.111    0.170    0.999
40                    0.001    0.001    0.001    0.822    0.003    0.051    0.120    1.000
50                    0.045    0.012    0.008    0.211    0.087    0.362    0.275    1.000
80                    0.000    0.003    0.001    0.017    0.001    0.967    0.010    1.000
90                    0.000    0.002    0.006    0.212    0.008    0.106    0.667    1.000
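The text does not describe how the accuracy table was converted into the confusion matrix. One common possibility, assumed here purely for illustration, is to divide each row of a count table by its row total, which yields rates conditional on the classified class (rows then sum to 1, as in Table 1). The function name `row_normalize` and the three-class counts below are hypothetical.

```python
def row_normalize(counts):
    """Turn pixel counts (rows = classified class) into conditional rates.

    Each output row holds p(class in column | classified class in row),
    so every row sums to 1. This is an assumed conversion, not
    necessarily the one used for Table 1.
    """
    rates = []
    for row in counts:
        total = sum(row)
        if total == 0:
            raise ValueError("a classified class has no accuracy samples")
        rates.append([value / total for value in row])
    return rates


# Hypothetical counts for three classes, for illustration only.
counts = [
    [91, 5, 4],
    [10, 80, 10],
    [0, 30, 70],
]
rates = row_normalize(counts)
```

Reading a row of `rates` then answers the question used throughout the paper: given that a pixel was classified as one class, how often does it belong to each class?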
The topographical map (scale: 1:24000) made by the United States Geological Survey
(USGS) in 1996 was used for modifying the confusion matrix. The topographical map
was scanned to make a graphic file. The resolution of the scanned topographical map
was 3.9 and 4.2 meters along the north-south and east-west directions, respectively. A block
(matrix) containing 56 pixels (8 rows by 7 columns) on the scanned topographical map was
used to match one pixel on the 1993 land use map. Six distinct points were selected to
estimate the coefficients of a pair of models to overlap the two maps using least squares
adjustment (Wolf and Ghilani, 1997):
\[
\begin{cases}
x_{tp} = A_{1,0} + A_{1,1} \cdot x_{LU} + A_{1,2} \cdot y_{LU} \\
y_{tp} = A_{2,0} + A_{2,1} \cdot x_{LU} + A_{2,2} \cdot y_{LU}
\end{cases}
\tag{1}
\]
where x and y were respectively the coordinates of the maps, subscripts tp and LU
represent topographical and land use maps, respectively, and A’s were the unknown
coefficients.
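A minimal sketch of how one of the two models in Eq. (1) could be fitted by ordinary least squares is given below. It solves the 3 × 3 normal equations directly; the control points and the "true" coefficients are invented for the example and are not the six points used in the study. The helper names `solve_3x3` and `fit_affine` are hypothetical.

```python
def solve_3x3(matrix, rhs):
    """Solve matrix * a = rhs by Gauss-Jordan elimination with partial pivoting."""
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_affine(lu_points, tp_values):
    """Least squares fit of tp = A0 + A1 * x_LU + A2 * y_LU."""
    design = [[1.0, x, y] for x, y in lu_points]
    normal = [[sum(row[i] * row[j] for row in design) for j in range(3)]
              for i in range(3)]
    rhs = [sum(row[i] * t for row, t in zip(design, tp_values))
           for i in range(3)]
    return solve_3x3(normal, rhs)


# Six hypothetical control points on the land use map (x_LU, y_LU).
lu_points = [(0, 0), (10, 0), (0, 10), (10, 10), (5, 2), (3, 8)]
# Matching x coordinates on the topographical map, generated from
# made-up coefficients so the fit can be checked.
x_tp = [100.0 + 1.05 * x + 0.12 * y for x, y in lu_points]

coefficients = fit_affine(lu_points, x_tp)  # approximately [100.0, 1.05, 0.12]
```

The same call with the y coordinates of the topographical map gives the second model. With real map coordinates, which are large numbers, centering the coordinates before forming the normal equations would improve numerical stability.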
About 0.5% of the pixels on the 1993 land use map were randomly selected to modify
the original confusion matrix. A majority rule was used to determine the category of each
sampled pixel on the topographical map, and the classification from the topographical map
was treated as the “truth” in modifying the original confusion matrix.
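The sampling step can be sketched as follows, under the assumption that the majority rule means taking the most frequent category within the 8 × 7 block of scanned pixels that matches a land use pixel, and that localized rates are tallied as proportions of the sampled pixels. The category labels, the tie handling (the first category encountered wins), and the function names are assumptions made for this example.

```python
from collections import Counter


def majority_category(block):
    """Return the most frequent category among the pixels of a block."""
    counts = Counter(pixel for row in block for pixel in row)
    return counts.most_common(1)[0][0]


def tally_rates(pairs):
    """Estimate p(truth | classified) from (classified, truth) pairs."""
    by_class = {}
    for classified, truth in pairs:
        by_class.setdefault(classified, Counter())[truth] += 1
    return {
        classified: {truth: n / sum(counts.values())
                     for truth, n in counts.items()}
        for classified, counts in by_class.items()
    }


# A hypothetical 8-row by 7-column block: 35 forest and 21 urban pixels.
block = [["forest"] * 7 for _ in range(5)] + [["urban"] * 7 for _ in range(3)]
truth = majority_category(block)

# Hypothetical sample: four pixels classified as forest on the land use map.
pairs = [("forest", "forest")] * 3 + [("forest", "urban")]
rates = tally_rates(pairs)
```

Each entry of `rates` is a localized counterpart of one row of the regional confusion matrix.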
Table 2 Levels of uncertainty adopted in prediction of the probability of urban sprawl.

Level    Description
ER0      Treat the land use map as the “truth”
ER1      Original error rates on neighboring cells
ER2      Modified error rates on neighboring cells
ER3      Original error rates on all cells
ER4      Modified error rates on all cells
The uncertainty in the 1993 land use map was considered at five levels in probability prediction,
based on the regional and modified confusion matrices (Table 2). The probability of urban sprawl
was predicted from the 1993 land use map under each assumed uncertainty level. The
probability model had the following form (for details see Fang et al., 2005a):
\[
\log\left(\frac{P(U\_R)'}{1 - P(U\_R)'}\right) = \sum_{i=1}^{13} A_i X_i + \sum_{j=1}^{55} B_j U_j
\tag{2}
\]
where P(U_R)’ was the probability of land use being converted from available (Undeveloped) to
Residential, A and B were coefficients, X were the mapped features, and U were the cross
products of the mapped features.
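To make the form of Eq. (2) concrete, the sketch below evaluates the linear predictor and inverts the logit to obtain a probability. It assumes that "Log" denotes the natural logarithm; the coefficient and feature values are invented, and `predict_probability` is a hypothetical name.

```python
import math


def predict_probability(a, x, b, u):
    """Invert the logit of Eq. (2): p = 1 / (1 + exp(-(sum A*X + sum B*U)))."""
    if len(a) != len(x) or len(b) != len(u):
        raise ValueError("coefficients and features must have equal lengths")
    logit = sum(ai * xi for ai, xi in zip(a, x))
    logit += sum(bj * uj for bj, uj in zip(b, u))
    return 1.0 / (1.0 + math.exp(-logit))


# Made-up values: two mapped features and one cross product.
features = [0.4, 1.2]
cross_products = [features[0] * features[1]]
p = predict_probability([0.5, -0.3], features, [0.8], cross_products)
```

A linear predictor of zero gives a probability of exactly 0.5, and larger positive values push the probability toward 1.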
Map uncertainty was introduced into prediction in two ways. For uncertainty levels ER1 and
ER2, the error rates of neighboring pixels were applied to compute the effect of the
immediate neighboring pixels (N_E):
\[
N\_E = \frac{1}{8} \sum_{k=1}^{8} I(C_k = R) \cdot p(R \mid C_k)
\tag{3}
\]
where I(.) is an indicator function, Ck is the category of the kth neighboring pixel, R is
Category Residential, and p(R| Ck ) is the rate of R when the pixel was classified as Ck . For
uncertainty levels ER3 and ER4, Eq. (3) was first used to introduce errors from neighbors, and
the predicted probability from Eq. (2) was then adjusted to obtain the final probability:
\[
P(U\_R) = P(U\_R)' \cdot p(av \mid C)
\tag{4}
\]
where P(U_R)’ was the original predicted probability using Eq. (2) and p(av|C) was the rate
of the categories that were available for development when the pixel being estimated was
classified as C. For further information and details about the uncertainty methods described
here, see Fang et al. (2005b).
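The two adjustments can be sketched in a few lines. The code follows Eqs. (3) and (4) as printed: the neighbor effect averages, over the eight immediate neighbors, the rate p(R | C_k) for those neighbors classified as Residential, and the final probability scales the model output by p(av | C). The class labels and the rates used here are hypothetical, as are the function names.

```python
def neighbor_effect(neighbor_classes, p_res_given_class, residential="R"):
    """Eq. (3) as printed: (1/8) * sum of I(C_k = R) * p(R | C_k)."""
    if len(neighbor_classes) != 8:
        raise ValueError("exactly eight immediate neighbors are expected")
    total = sum(p_res_given_class[c]
                for c in neighbor_classes if c == residential)
    return total / 8


def adjust_probability(p_model, p_available_given_class):
    """Eq. (4): scale the model probability by p(av | C)."""
    return p_model * p_available_given_class


# Hypothetical neighborhood: four Residential and four Forest pixels.
neighbors = ["R", "R", "R", "R", "F", "F", "F", "F"]
rates = {"R": 0.9, "F": 0.2}  # made-up p(R | classified class)
n_e = neighbor_effect(neighbors, rates)

# Made-up model probability and availability rate for the target pixel.
p_final = adjust_probability(0.4, 0.5)
```

With error-free maps (level ER0), p(R | R) equals 1 and p(av | C) is either 0 or 1, so the neighbor effect reduces to the fraction of Residential neighbors and Eq. (4) leaves the prediction for available pixels unchanged.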
Results
The two fitted models for overlapping the topographical and land use maps fit the control
points very well (see Table 3). Both R-squares were at least 0.9999, and the largest residual
(about 16 m) was roughly half of the 30 m land use pixel width.
Table 3 Estimated coefficients and the quality measures of the fitted matching models.

Matching    A0 (Intercept)        A1 (xLU)              A2 (yLU)              Model’s F-value       R-square
model       (p-value)             (p-value)             (p-value)             (p-value)
xtp         -1206598 (<0.0001)    1.04650 (<0.0001)     0.11640 (<0.0016)     15498.3 (<0.0001)     0.9999
ytp         -857779 (<0.0001)     -0.11314 (<0.0001)    0.93771 (<0.0001)     94304.8 (<0.0001)     1.0000
Residual (Unit: meter)

Model    Largest    Median    Mean      Smallest
xtp      -15.956    13.265    11.721    -3.747
ytp      4.451      -3.145    3.223     1.167
More than 80% of the sample fell in the categories of Forest and Urbanization on the
topographical map, and each of the other categories had only a small number of observations.
For this reason, only the rates of Forest and Urbanization were modified for the study area.
When a pixel was classified as Forest on the land use map, the rates at which the “truth” was
Forest and Urbanization were 0.770 and 0.197, respectively. For pixels classified as Urbanization,
the rates at which the “truth” was Urbanization and Forest were 0.877 and 0.096, respectively.
These error rates differed clearly from those in the regional confusion matrix. The error rates of
the other categories in the modified confusion matrix were taken from the original confusion matrix.
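As a small worked check on the localized rates reported above, the sketch below computes how much probability is left in each modified row for all categories other than Forest and Urbanization. How that remainder was distributed among the other categories is not stated in the text; lumping it into a single remainder is an assumption of this example, and `remaining_share` is a hypothetical name.

```python
# Localized rates reported in the text, keyed by classified category.
localized = {
    "Forest": {"Forest": 0.770, "Urbanization": 0.197},
    "Urbanization": {"Urbanization": 0.877, "Forest": 0.096},
}


def remaining_share(row):
    """Probability left for all categories not listed in the row."""
    return 1.0 - sum(row.values())


forest_other = remaining_share(localized["Forest"])        # about 0.033
urban_other = remaining_share(localized["Urbanization"])   # about 0.027
```

About 3% of each row therefore remains for the other categories, which is consistent with Forest and Urbanization dominating the sample.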
Figure 1 Predicted urban sprawl probabilities when different levels of error were used in
prediction. Panels A to E correspond to uncertainty levels ER0 to ER4, respectively.
Figure 1 shows the probability maps predicted under the different uncertainty levels. When
the original confusion matrix was applied only to neighboring pixels, the predicted probability
of urban sprawl was almost the same as that predicted without considering misclassification
(Figures 1.A and B). When the modified confusion matrix was applied only to neighboring
pixels, the predicted probability map differed noticeably (Figures 1.A and C), with
much higher probability predicted on the hill along the east side of the lake. The much higher
error rate of Urbanization-in-Forest in the modified confusion matrix seemed to be the major
cause of the increased probability. When a confusion matrix was applied to all pixels, the major
difference in the predicted probability maps was that a nonzero probability was predicted almost
everywhere except at water pixels (Figures 1.D and E). When the modified confusion matrix
was applied, the probabilities of development at forest pixels were lower than those
predicted with the original confusion matrix. This also reflected the higher rate of
Urbanization-in-Forest.
Discussion and Conclusion
The modified confusion matrix reflects the ground characteristics better than the original one. It
may also reflect differences between the definitions used in the land use map and those in the
historical information used to localize the confusion matrix. It is very difficult to eliminate
processing uncertainty when historical data are used to modify a confusion matrix. Temporal and
scale uncertainty, together with the differences in definitions between land use maps and the
historical information used to localize confusion matrices, are the most important uncertainty sources.
When additional processes (such as scanning) are necessary to treat the historical
information before sampling, those treatments also introduce uncertainty. To reduce the
uncertainty of localized confusion matrices, as many of the uncertainty sources
mentioned above as possible should be eliminated.
Incorporating the uncertainty of land use maps into land use prediction may change the
outcome of spatial modeling. The impact of the uncertainty of land use maps depends on
which factors are uncertain and how large their uncertainty is. A spatially specified confusion
matrix map (Gertner et al. 2002) is necessary to improve spatial modeling that considers the
uncertainty of land use maps.
Acknowledgments
The authors would like to thank the U.S. Army Corps of Engineers, Construction
Engineering Research Laboratory for support and the team of the Land Use Evolution and
Impact Assessment Model (LEAM) for providing the maps and the LEAM model.
References
Fang, S., G. Z. Gertner, Z. Sun, and A. A. Anderson (2005a) The impact of interactions in
spatial simulation of the dynamics of urban sprawl. Landscape and Urban Planning
73(4):294-306.
Fang, S., G. Z. Gertner, G. Wang, and A. A. Anderson (2005b) The impact of
misclassification in land use maps in the prediction of landscape dynamics. Landscape
Ecology (Accepted).
Fang, S., S. Wente, G. Z. Gertner, G. Wang, and A. B. Anderson (2002) Uncertainty
analysis of predicted disturbance from off-road vehicular traffic in complex landscapes at Fort
Hood. Environmental Management 30(2): 199-208.
Gertner, G. Z., S. Fang, G. Wang, and S. Shinkareva (2002) Image-aided spatial accuracy
assessment of land cover classification. In the International Union of Forestry Research
Organization Conference entitled, Symposium on Statistics and Information Technology in
Forestry, 09/2002, Blacksburg, Virginia.
Lunetta, R. S., R. G. Congalton, L. K. Fenstermaker, J. R. Jensen, K. C. McGwire, and
L. R. Tinney (1991) Remote sensing and geographic information system data integration:
Error sources and research issues. Photogrammetric Engineering & Remote Sensing
57(6):677-687.
Wolf, P. R. and C. D. Ghilani (1997) Adjustment computations: statistics and least squares
in surveying and GIS. John Wiley, New York.