SIMULATING REMOTELY SENSED IMAGERY FOR CLASSIFICATION EVALUATION

Desheng Liu
Department of Geography and Department of Statistics, The Ohio State University
1036 Derby Hall, 154 N Oval Mall, Columbus, OH, 43210, USA (liu.738@osu.edu)
KEY WORDS: Simulation, Remote Sensing, Image Classification, Accuracy Assessment, Simulated Annealing
ABSTRACT:
The remote sensing community has long been active in developing, evaluating, and comparing different classification algorithms
using a variety of remotely sensed imagery. As an integral component of image classification, accuracy assessment is usually
conducted to evaluate the agreement between the classified map and the corresponding reference data. However, current accuracy
assessment practices are limited by the difficulties in obtaining high quality reference data and the lack of spatial representation of
classification uncertainties. To overcome these limitations, we developed a simulation approach to obtaining the desired reference
data for better evaluation of classification algorithms. The simulation approach involves three components: 1) a real image scene, 2)
a reference map, and 3) a simulated image scene. The real image scene is assumed to be a random realization of a spectral probability model governed by an unknown underlying process, which is defined as the ground truth of the classification of the image scene.
The reference map represents a reasonable estimate of the unknown process that generates the real image scene. The simulated
image scene is generated as a random realization of the spectral probability model governed by the estimated process represented in
the reference map. Specifically, an initial simulated image is first generated by independently sampling from probability distributions estimated from the real image scene and the reference map. Then, the initial simulated image is iteratively perturbed using simulated annealing to create a final simulated image with spectral, spatial, and textural properties similar to those of the real image scene. The simulation approach was applied to a Landsat TM image scene and promising results were achieved.
1. INTRODUCTION
Image classification is one of the most fundamental applications
of remotely sensed data (Jensen, 2004). Thematic maps
generated from the classification of remotely sensed imagery
are widely used in many environmental, ecological, and socio-economic studies. Driven by the need for better classification
results, the remote sensing community has long been active in
developing, evaluating, and comparing different classification
algorithms (Erbek et al., 2004; Flygare, 1997; Liu et al., 2006; Lu et al., 2004). The past decades have witnessed the
developments of a large number of computer-based
classification algorithms using a variety of remotely sensed
imagery across different fields (Landgrebe, 2003; Lu and Weng,
2007; Tso and Mather, 2001). With the increased availability
and enhanced capability of remotely sensed data, the efforts in
advancing classification algorithms are expected to continue.
As an integral component of image classification, accuracy
assessment (Congalton and Green, 1999; Foody, 2002) is
usually conducted to evaluate the agreement between the
classified map and the corresponding reference data (or ground
truth). For map users, the accuracy assessment serves as an indicator of the accuracy of the classified map. For classification algorithm developers, the accuracy assessment is an essential procedure for evaluating the performance of classification algorithms. Current practices in accuracy
assessment are primarily based on various global accuracy
indices (e.g. overall accuracy, user’s accuracy, producer’s
accuracy, and kappa coefficient) derived from a confusion or
error matrix, which is built upon the reference data for a small
subset of the classified imagery. The reference data are usually
obtained from field survey, visual interpretation of high spatial
resolution imagery, or other existing maps.
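For readers less familiar with these indices, the short Python sketch below (not part of the original paper) shows how they are typically computed from a confusion matrix, assuming rows correspond to classified labels and columns to reference labels; the function name is illustrative only.

```python
import numpy as np

def accuracy_indices(confusion):
    """Global accuracy indices from a square confusion matrix whose rows are
    classified (map) classes and whose columns are reference classes."""
    confusion = np.asarray(confusion, dtype=float)
    total = confusion.sum()
    diag = np.diag(confusion)
    overall = diag.sum() / total                       # overall accuracy
    users = diag / confusion.sum(axis=1)               # user's accuracy per map class
    producers = diag / confusion.sum(axis=0)           # producer's accuracy per reference class
    expected = (confusion.sum(axis=1) * confusion.sum(axis=0)).sum() / total ** 2
    kappa = (overall - expected) / (1.0 - expected)    # kappa coefficient
    return overall, users, producers, kappa
```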
However, there are two major limitations of the current
accuracy assessment methods. The first limitation is related to
the quality of reference data. One important requirement for the
accuracy indices to be valid estimators of the true map accuracy
is that the reference data are representative of the whole image and accurately reflect the ground truth (Congalton and Green, 1999; Foody, 2002). Representative reference data demand a careful probability sampling design. In field surveys, a rigorous probability sampling design is often hard to implement due to physical constraints or high costs. The interpretation of high spatial resolution imagery can be biased by the tendency to select homogeneous regions and to avoid boundaries and
complex regions. Existing maps often contain classification
errors and are outdated compared to the classified image.
Moreover, all the above three means of obtaining reference data
are subject to registration errors between the reference data and the
classified imagery. Consequently, accuracy assessment based
on non-representative, inaccurate, and mis-registered reference
data is not reliable for either map users or classification
algorithm developers (Congalton and Green, 1999; Foody,
2002). The second limitation lies in the lack of spatial
representation of classification uncertainties in the current
accuracy assessment methods. It is well-known that
classification errors are not randomly distributed across the
imagery but have strong spatial patterns. Knowledge of the
spatial patterns of classification errors can inform map users on
the classification uncertainty and possible error propagation.
For classification algorithm developers, a good understanding
of the error patterns across the imagery helps to better evaluate the spatial aspects of classification algorithms and provides
insights for further improvements. This is particularly true when
contextual and object-based classification algorithms are
assessed. Unfortunately, the accuracy indices derived from the confusion matrix give only a global view of classification results but provide no information on the spatial distribution of classification errors.
The two limitations discussed above call for more
comprehensive and accurate reference data with which a more
reliable and complete accuracy assessment can be performed.
For map users, this remains an open question as it depends on
the real situation of the specific mapping area. For classification
algorithm developers, however, the reference data are not
necessarily limited to the real situation given that the main
objective is to evaluate classification algorithms. Therefore, we
propose to use image simulation as a new approach to obtaining
the desired reference data for better evaluation of classification
algorithms. The purpose of this paper is to develop a new
approach to simulating remotely sensed imagery for fully
evaluating classification algorithms. The rest of the paper is
organized as follows. We first introduce our conceptual model
for simulating remotely sensed imagery and then detail the procedures for image simulation. We then present results for a Landsat TM image. Finally, we discuss issues
related to image simulation and conclude with a summary.
2. METHODS

2.1 The Conceptual Model
As illustrated in Figure 1, the conceptual model used for image
simulation consists of three components: 1) a real image scene,
2) a reference map, and 3) a simulated image scene. The real
image scene can be any remotely sensed imagery of interest. It
provides basic information on the spectral, spatial, and textural
aspects of a real remotely sensed image scene. The real image
scene is assumed to be a random realization of spectral
reflectance governed by an unknown underlying process, which
is defined as the ground truth for the classification of the image
scene. The reference map represents a reasonable estimate of
the unknown process that generates the real image scene. As an
estimate, the reference map may not accurately represent the
ground truth for the classification of the real image scene.
Hence, accuracy assessment based on the reference map may
not be reliable. However, the reference map serves as a
connection between the real image scene and the simulated
image scene. The simulated image scene is generated as a
random realization of the spectral reflectance, which has spectral, spatial, and textural properties similar to those of the real image scene,
governed by the estimated process represented in the reference
map. Therefore, the reference map is the true representation of
the classification of the simulated image scene.
Figure 1. The conceptual model of image simulation (real image scene, reference map, and simulated image scene)

Based on the conceptual model, classification algorithm developers can apply any classification algorithm to the simulated image scene and evaluate its performance by accuracy assessment based on the reference map. This simulation approach to accuracy assessment has three advantages over the traditional approaches using limited reference data of the real image scene. Firstly, the reference map is free from classification and registration errors with respect to the simulated image scene. Secondly, the reference map covers the entire sample space, so it allows classification developers to experiment with different sampling designs. Finally, the reference map allows the spatial representation of classification errors by evaluating every classified pixel.

2.2 Reference Map Generation

One approach to generating a reference map is through supervised classification of the real image scene based on real reference data collected in a traditional way. The classified image can be further post-processed (e.g. removing “salt-and-pepper” effects) to introduce a certain degree of cartographic generalization and achieve a map-like quality. Nevertheless, the final reference map may still contain a certain level of classification error due to imperfections in the reference data and the classifier, but it should approximately reflect the general pattern of the underlying process that generates the real image scene. Since the main goal at this stage is not to test classification algorithms but to obtain a baseline reference map for simulating images, the final reference map is used as the starting point for the following image simulation.

2.3 Image Simulation

The main goal of the image simulation is to generate a simulated image scene such that 1) its underlying process is governed by the reference map, and 2) its spectral, spatial, and textural properties are similar to those of the real image scene. To this end, we use an optimization algorithm called simulated annealing. The basic idea of simulated annealing is to perturb an initial (or seed) image according to a suitable annealing schedule in order to create a desired (optimal) image which minimizes a set of user-defined objective functions (Geman and Geman, 1984; Goovaerts, 1997; Burnicki et al., 2007). In what follows, we describe the generation of the initial image, define the objective functions, and specify the perturbation mechanism.
2.3.1 Initial Image Generation: Based on the real image scene and the reference map, the probability distribution of spectral reflectance is estimated for each class on the reference map. Gaussian models can be used for simple distributions, while Gaussian mixture models may be needed for more complex distributions. Denote the initial image scene by $\{X^{(0)}(u_1), \ldots, X^{(0)}(u_N)\}$, where $u_i$ is the spatial location of pixel $i$ $(i = 1, \ldots, N)$. The initial image scene is then generated as a random realization of the estimated probability distribution models. Specifically, for each pixel, a random sample is drawn from the probability distribution conditional on its class type on the reference map. In this way, the spectral properties of the real image scene are captured in the simulated image. However, the initial image scene is purely a collection of independent, identically distributed (i.i.d.) samples within each class. The following perturbation makes it a more realistic image scene for classification analysis.
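To make the sampling step concrete, the following Python sketch (illustrative names, not the original implementation) estimates a multivariate Gaussian per class from the real image and draws an i.i.d. initial image conditioned on the reference map; it assumes the reference map is a 2-D integer label array co-registered with a (rows, cols, bands) image array.

```python
import numpy as np

def estimate_class_models(real_image, reference_map):
    """Estimate a multivariate Gaussian (mean, covariance) per class from the
    real image pixels that fall in that class of the reference map."""
    models = {}
    for c in np.unique(reference_map):
        pixels = real_image[reference_map == c]            # shape (n_c, n_bands)
        models[c] = (pixels.mean(axis=0), np.cov(pixels, rowvar=False))
    return models

def generate_initial_image(reference_map, models, seed=None):
    """Draw each pixel independently from the Gaussian of its class (the i.i.d. seed image)."""
    rng = np.random.default_rng(seed)
    rows, cols = reference_map.shape
    n_bands = len(next(iter(models.values()))[0])
    sim = np.empty((rows, cols, n_bands))
    for c, (mean, cov) in models.items():
        mask = reference_map == c
        sim[mask] = rng.multivariate_normal(mean, cov, size=mask.sum())
    return sim
```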
2.3.2 Objective Functions: The initial image does not have the spatial and textural patterns that a real image scene should have. The purpose of the perturbation is to modify the initial image to achieve the desired spatial and textural properties. To do so, objective functions are needed to guide the perturbation.

Therefore, we define two objective functions to represent the spatial and textural components of the real image scene. Since a single image band is often used to model the spatial and textural properties of a real image scene, we construct the two objective functions based on grayscale imagery in the following. For multispectral imagery, a principal component analysis (PCA) can be applied first, and the objective functions are then calculated using the major principal component.

The first objective function, denoted by $O_1$, characterizes the spatial autocorrelation of an image scene using a semivariogram model. $O_1$ is calculated as the root mean squared difference between the semivariances of the real image scene and those of the simulated image scene over a set of lags:

$O_1 = \sqrt{\frac{1}{L}\sum_{l=1}^{L}\left(\gamma_S(l) - \gamma_R(l)\right)^2}$   (1)

where $l = 1, \ldots, L$ are the spatial lags, and $\gamma_S(l)$ and $\gamma_R(l)$ are the empirical semivariances at lag $l$ for the simulated image scene and the real image scene, respectively.

The second objective function, denoted by $O_2$, quantifies the textural pattern of an image scene using a texture measure. Specifically, the texture measure is defined as the local entropy of a 9 × 9 image window centred on each pixel. $O_2$ is calculated as the root mean squared difference between the pixel-wise entropy measures of the real image scene and the simulated image scene:

$O_2 = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(E_S(u_i) - E_R(u_i)\right)^2}$   (2)

where $u_i$ is the spatial location of pixel $i$ $(i = 1, \ldots, N)$, and $E_S(u_i)$ and $E_R(u_i)$ are the entropy measures at pixel $i$ for the simulated image scene and the real image scene, respectively.

The two objective functions are then combined into one overall objective function using a suitable weight $\omega$:

$O = O_1 + \omega O_2$   (3)
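A minimal Python sketch of the two objective functions and their combination (equations 1-3) is given below, assuming single-band (grayscale) arrays. The empirical semivariogram is pooled over horizontal and vertical lags only, the local entropy uses a 9 × 9 window on an image quantized to 64 gray levels, and the function names (quantize, empirical_semivariogram, local_entropy, overall_objective) are illustrative rather than taken from the paper.

```python
import numpy as np

def quantize(img, n_levels=64):
    """Map a continuous band to integer gray levels 0..n_levels-1."""
    lo, hi = img.min(), img.max()
    levels = np.floor((img - lo) / (hi - lo + 1e-12) * n_levels)
    return np.clip(levels, 0, n_levels - 1).astype(int)

def empirical_semivariogram(img, max_lag):
    """Empirical semivariance gamma(l) for lags 1..max_lag, pooled over
    horizontal and vertical pixel pairs (an isotropic approximation)."""
    gammas = []
    for l in range(1, max_lag + 1):
        dh = img[:, l:] - img[:, :-l]          # horizontal pairs at lag l
        dv = img[l:, :] - img[:-l, :]          # vertical pairs at lag l
        diffs = np.concatenate([dh.ravel(), dv.ravel()])
        gammas.append(0.5 * np.mean(diffs ** 2))
    return np.array(gammas)

def local_entropy(levels, window=9, n_levels=64):
    """Shannon entropy (base 2) of the gray-level histogram in a window
    centred on each pixel of an already-quantized image."""
    pad = window // 2
    padded = np.pad(levels, pad, mode='reflect')
    ent = np.zeros(levels.shape, dtype=float)
    for i in range(levels.shape[0]):
        for j in range(levels.shape[1]):
            patch = padded[i:i + window, j:j + window]
            p = np.bincount(patch.ravel(), minlength=n_levels) / patch.size
            p = p[p > 0]
            ent[i, j] = -np.sum(p * np.log2(p))
    return ent

def overall_objective(sim_band, real_band, max_lag=20, weight=1.0):
    """O = O1 + w * O2, following equations (1)-(3)."""
    o1 = np.sqrt(np.mean((empirical_semivariogram(sim_band, max_lag)
                          - empirical_semivariogram(real_band, max_lag)) ** 2))
    e_sim = local_entropy(quantize(sim_band))
    e_real = local_entropy(quantize(real_band))
    o2 = np.sqrt(np.mean((e_sim - e_real) ** 2))
    return o1 + weight * o2
```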
2.3.3 Iterative Perturbation: The initial overall objective function $O^{(0)}$ is first calculated based on the initial simulated image. A sequence of iterative stochastic perturbations $(m = 1, \ldots, M)$ is then introduced to modify the initial image so as to minimize the overall objective function. Denote the simulated image scene after the m-th iteration by $\{X^{(m)}(u_1), \ldots, X^{(m)}(u_N)\}$.

At the m-th iteration, one perturbation is initiated by swapping a pair of randomly chosen pixels at $u_i$ and $u_j$. That is, $X^{(m)}(u_i)$ and $X^{(m)}(u_j)$ are updated as follows:

$X^{(m)}(u_i) = X^{(m-1)}(u_j), \qquad X^{(m)}(u_j) = X^{(m-1)}(u_i)$   (4)

The m-th overall objective function $O^{(m)}$ is then calculated based on $\{X^{(m)}(u_1), \ldots, X^{(m)}(u_N)\}$. Finally, a decision is made on whether to accept the m-th perturbation according to the following rules:

1. When $O^{(m)} \le O^{(m-1)}$, accept the m-th perturbation;

2. When $O^{(m)} > O^{(m-1)}$, accept the m-th perturbation with probability $p = \exp(-m/\lambda)$, where $\lambda$ is a constant that controls the rate at which the acceptance probability $p$ decreases as $m$ increases.

If the decision is to reject the m-th perturbation, the update in (4) is not taken; equivalently, $\{X^{(m)}(u_1), \ldots, X^{(m)}(u_N)\}$ is the same as $\{X^{(m-1)}(u_1), \ldots, X^{(m-1)}(u_N)\}$.

The above iteration process proceeds until a predefined objective function value is achieved or the maximum number of iterations is reached.
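The perturbation and acceptance rules can be written as a short annealing loop. The sketch below reuses the hypothetical overall_objective helper from the previous listing and naively recomputes the objective from scratch after each swap, so it only illustrates the control flow; the default values of lam (the λ constant) and target are placeholders, and the efficiency shortcuts discussed in Section 4 would be needed in practice. For simplicity the sketch perturbs and evaluates a single band (or the leading principal component); in the multispectral case the full spectral vectors at the two locations would be swapped together.

```python
import numpy as np

def simulated_annealing(initial_band, real_band, n_iter=1_000_000,
                        lam=2.0e5, target=1e-3, weight=1.0, seed=None):
    """Naive annealing loop for Section 2.3.3: swap random pixel pairs and
    accept or reject each swap according to the two rules above."""
    rng = np.random.default_rng(seed)
    sim = initial_band.copy()
    rows, cols = sim.shape
    obj = overall_objective(sim, real_band, weight=weight)          # O^(0)
    for m in range(1, n_iter + 1):
        r1, c1, r2, c2 = rng.integers(0, [rows, cols, rows, cols])  # random pair
        sim[r1, c1], sim[r2, c2] = sim[r2, c2], sim[r1, c1]         # eq. (4)
        new_obj = overall_objective(sim, real_band, weight=weight)  # O^(m)
        if new_obj <= obj or rng.random() < np.exp(-m / lam):
            obj = new_obj                                           # accept swap
        else:
            sim[r1, c1], sim[r2, c2] = sim[r2, c2], sim[r1, c1]     # reject: undo
        if obj <= target:                                           # stopping rule
            break
    return sim
```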
3. RESULTS

We tested our algorithm on a Landsat TM image scene. The scene has a size of 256 by 256 pixels at 28.5 m resolution (the thermal infrared band has a resolution of 57 m). For the purpose of image simulation, three bands (band 3, band 5, and band 7) were selected for this study. A false colour representation of the real image scene is shown in Figure 2(a). Two classes (forest and bare) were considered for the reference map in the simulation. The reference map shown in Figure 2(b) was generated by unsupervised classification followed by some post-processing. As introduced before, the reference map is not perfect ground truth for the real image scene. However, it reasonably estimates the underlying process that generated the real image scene and was therefore used to generate the simulated image scene.

Figure 2. (a) The real image scene, (b) the reference map

Preliminary exploratory analysis of the histograms of the spectral data of the two classes indicated that multivariate Gaussian models were sufficient for estimating the probability distributions. Therefore, the initial simulated image shown in Figure 3(a) was generated by randomly drawing independent samples from the two multivariate Gaussian models estimated from the real image scene. Figure 3(a) shows that the initial simulated image is extremely noisy due to the lack of spatial and textural patterns.
Figure 3. (a) The initial simulated image scene, (b) the final simulated image scene

Figure 4. The local entropy measures of (a) the real image scene, (b) the initial simulated image scene, and (c) the final simulated image scene

Figure 5. The semivariograms of the real image scene (solid red line), the initial simulated image scene (dotted green line), and the final simulated image scene (dotted black line)
The differences in spatial and textural patterns between the real image scene and the initial simulated image scene are further illustrated by their local entropy measures in Figure 4(a) and Figure 4(b) and by their semivariograms in Figure 5. From Figure 4(a) and (b), the local entropy measure of the initial simulated image scene is quite different from that of the real image scene. Figure 5 shows that the semivariance of the initial simulated image scene (dotted green line) is larger than that of the real image scene (solid red line) at all lags, particularly at small lags, indicating a large nugget effect for the initial simulated image scene.
After about one million perturbations using simulated annealing, the final simulated image shown in Figure 3(b) was generated, with a pre-defined small threshold value of the overall objective function achieved. Compared with the initial simulated image scene in Figure 3(a), the final simulated image scene showed strong spatial and textural patterns. More importantly, the spatial and textural patterns in the final simulated image scene appeared very similar to those observed in the real image scene. This was further confirmed by comparing their local entropy measures and semivariograms. In Figure 5, the solid red line nearly coincides with the dotted black line, indicating that the final simulated image scene achieved the targeted semivariance of the real image scene at all lags. The local entropy image in Figure 4(c) also matches that in Figure 4(a) almost perfectly. Visual comparison of the textures in Figure 2(a) and Figure 3(b) further confirmed this.
4. DISCUSSION
The simulation approach showed promising results for the Landsat TM image scene, as evidenced by the desired spatial and textural properties of the final simulated image scene. However, the final simulated image scene differed from the real image scene in some local structural patterns. For example, the linear features in the real image scene were not captured in the final simulated image. This is because the two objective functions defined in this paper cannot uniquely determine the spatial and textural properties of an image scene. For the purpose of evaluating classification algorithms, this may actually be an advantage, as more simulations can be implemented to explore different spatial and textural patterns.
This study focused on a relatively small image scene because the use of simulated annealing in the simulation process is computationally expensive for large image scenes. As each perturbation only involves two pixels, the changes in the objective functions are affected only by the local neighbourhoods of the two pixels. Hence, some implementation tips suggested by Goovaerts (1997) can be used to improve the computational efficiency.
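As one concrete illustration (an assumption about how such a shortcut could look, not necessarily the implementation used in this study), the local-entropy image only changes inside the 9 × 9 neighbourhoods around the two swapped pixels, so it can be patched locally instead of recomputed for the whole scene. The sketch reuses the hypothetical local_entropy helper from the earlier listing and operates on the quantized band.

```python
import numpy as np

def update_entropy_locally(ent, levels, swapped_pixels, window=9, n_levels=64):
    """Recompute the local-entropy image only inside the neighbourhoods
    affected by the swapped pixels, leaving the rest of `ent` untouched."""
    half = window // 2
    rows, cols = levels.shape
    for (r, c) in swapped_pixels:                   # the two swapped pixel locations
        # pixels whose 9x9 window contains (r, c)
        r0, r1 = max(r - half, 0), min(r + half + 1, rows)
        c0, c1 = max(c - half, 0), min(c + half + 1, cols)
        # enlarge the crop so every affected pixel still sees its full window
        r0p, r1p = max(r0 - half, 0), min(r1 + half, rows)
        c0p, c1p = max(c0 - half, 0), min(c1 + half, cols)
        patch_ent = local_entropy(levels[r0p:r1p, c0p:c1p], window, n_levels)
        ent[r0:r1, c0:c1] = patch_ent[r0 - r0p:r1 - r0p, c0 - c0p:c1 - c0p]
    return ent
```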
The proposed simulation framework is quite general and should be applicable to other image scenes. This study tested only a relatively simple landscape, where Gaussian models were sufficient as probability distributions. Future research should explore a variety of image scenes with different
complexities in landscape and probability distributions. More sophisticated probability models and objective functions may be needed for more complex image scenes.
5. CONCLUSIONS
Current accuracy assessment practices for classification
evaluation are limited by the difficulties in obtaining high
quality reference data and the lack of spatial representation of
classification uncertainties. In this paper, we proposed a new
simulation approach to generating remotely sensed imagery in
an attempt to obtain ideal reference data for better evaluation of
classification algorithms. The simulation approach involves
three components: 1) a real image scene, 2) a reference map,
and 3) a simulated image scene. Specifically, an initial simulated image is first generated by independently sampling from probability distributions estimated from the real image scene and the reference map. Then, the initial simulated image is perturbed using simulated annealing to create a final simulated image with spectral, spatial, and textural properties similar to those of the real image scene. When the reference map is used for
accuracy assessment of the classification of the simulated image
scene, three advantages can be found: 1) the reference map is
free from classification and registration errors with respect to the simulated image scene, 2) the reference map covers the entire sample
space, so it allows classification developers to experiment with
many different sampling schemes, and 3) it allows the spatial
representation of classification uncertainties.
REFERENCES

Burnicki, A.C., D.G. Brown, and P. Goovaerts, 2007. Simulating error propagation in land-cover change analysis: the implications of temporal dependence. Computers, Environment and Urban Systems, 31(3), pp. 282-302.

Congalton, R.G. and K. Green, 1999. Assessing the Accuracy of Remotely Sensed Data: Principles and Practices. Lewis Publishers, Boca Raton.

Erbek, F.S., C. Ozkan, and M. Taberner, 2004. Comparison of maximum likelihood classification method with supervised artificial neural network algorithms for land use activities. International Journal of Remote Sensing, 25, pp. 1733-1748.

Flygare, A-M., 1997. A comparison of contextual classification methods using Landsat TM. International Journal of Remote Sensing, 18, pp. 3835-3842.

Foody, G.M., 2002. Status of land cover classification accuracy assessment. Remote Sensing of Environment, 80, pp. 185-201.

Geman, S. and D. Geman, 1984. Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 6, pp. 721-741.

Goovaerts, P., 1997. Geostatistics for Natural Resources Evaluation. Oxford University Press, New York.

Jensen, J.R., 2004. Introductory Digital Image Processing: A Remote Sensing Perspective, 3rd edition. Prentice Hall, Piscataway.

Landgrebe, D.A., 2003. Signal Theory Methods in Multispectral Remote Sensing. John Wiley and Sons, Hoboken.

Liu, D., M. Kelly, and P. Gong, 2006. A spatial-temporal approach to monitoring forest disease spread using multi-temporal high spatial resolution imagery. Remote Sensing of Environment, 101(2), pp. 167-180.

Lu, D., P. Mausel, M. Batistella, and E. Moran, 2004. Comparison of land-cover classification methods in the Brazilian Amazon Basin. Photogrammetric Engineering and Remote Sensing, 70, pp. 723-731.

Lu, D. and Q. Weng, 2007. A survey of image classification methods and techniques for improving classification performance. International Journal of Remote Sensing, 28(5), pp. 823-870.

Tso, B. and P.M. Mather, 2001. Classification Methods for Remotely Sensed Data. Taylor and Francis, New York.