Abstract

The use of ground truth (GT) data in the learning and/or the assessment of
classification algorithms is essential. In order to obtain better decisions, these
algorithms must provide consistent results as regards the reference spectral
signatures of GT and observation on the Earth’s surface. Using a biased or a
simplified GT attached to a hyperspectral image to be partitioned does not allow a
rigorous explanation of the physical phenomena that these images reflect.
Unfortunately, this scientific problem is not always treated carefully and is generally
neglected in the relevant literature. The resulting impact on classification
algorithm design is negative, which is inconsistent with the
considerable investments made in both the development of sophisticated sensors and the
design of objective classification algorithms. Any GT must be validated according to a
rigorous protocol before any use, which is unfortunately not always the case. In
this paper, we show through two example images (Indian Pine
and Pavia University) that the associated GTs are not accurate, although these images
are frequently used without care by the remote sensing community. Through
this analysis, we also demonstrate the heterogeneity of the spectral signatures of some GT
classes by using a semi-supervised and an unsupervised classification method.
Finally, we propose a general framework for a minimum
objective assessment and validation of GT accuracy before GT data are exploited in a
classification method.
Introduction
During the last decade, hyperspectral imagery has become an appropriate Earth
observation tool for supporting decision making, and today it is considered an excellent
information source for the analysis and interpretation of imaged objects in a variety of
applications, well beyond the field of remote sensing. Indeed, it is recognized that
hyperspectral imagery allows a better characterization of physical phenomena and
a more accurate discrimination of observed materials than traditional three-band images
in the visible range (RGB) or even multispectral images (a few to tens of spectral
bands).
Aerial hyperspectral imagery provides detailed and objective information on a scene
thanks to its large spectral range (several hundreds of spectral bands covering the
visible and infrared domains) and its fine spatial resolution (a few tens of
centimeters). Owing to this richness of information, interest in hyperspectral
image (HSI) data has increased in recent years in many application fields.
Among these fields, we can mention the qualitative and quantitative inventory of
vegetation species and their spatial distribution,1,2 the early detection of vegetation
diseases3,4 and of invasive species,5,6 the identification of marine algae,7,8 and human
and animal impacts on the environment.9,10
Despite the richness of information it provides, and the wide range of current and
potential applications it encompasses, hyperspectral imagery exploitation is still a big
challenge, due to the difficulty of analyzing image data sets which can be very large in
both the spatial and spectral dimensions.11,12
To highlight and exploit the wealth of information given by hyperspectral images
(HSIs), classification is a central stage in decision making processes. It helps
summarize the image information content by assigning a unique label to similar
pixels in the image, objectively based on their spectral signatures. Classification
methods can be categorized into three families, namely supervised, semi-supervised and
unsupervised.13,14,15 Supervised methods require a priori knowledge of the ground
truth (GT) label information in the learning and assessment stages.16,17 In the case
of semi-supervised methods, knowledge of the number of classes (often given by
the GT), and/or of some threshold values, or of the number of iterations for iterative
methods, is required to perform the classification task.18,19 Lastly, unsupervised
methods objectively aggregate the objects (pixels) into classes without any prior knowledge
(neither the number of classes to discriminate, nor learning samples). They
estimate the number of classes and aggregate pixels into classes according to one or
several optimization criteria.20
Whatever the category a classification method belongs to, one always needs a
reliable GT. This knowledge is essential during the evaluation and validation
of classification results or algorithms; otherwise the assessment of classification
methods has no scientific credibility. For instance, imagine an aerial
HSI of a cultivated area for which the reference class information (GT) is wrongly
reduced to a single class (e.g. wheat). It is very likely that this image
exhibits spectral variations due to the existence of several distinct homogeneous regions,
even though it is declared homogeneous and reduced to a single region in the GT map.
These spectral variations detected by the hyperspectral imager may come from
regions in which the seeded plants did not grow uniformly for multiple reasons (plant
disease, local moisture, a path through the crop, etc.). Now, assume we want to
apply an unsupervised (no prior knowledge) classification algorithm to this
image and assess it. The chosen algorithm, without much a priori information, will probably be able
to objectively discriminate these variations and to provide several classes which
account for them, thereby highlighting informational content which is not
present in the original GT map. Conversely, forcing pixels to belong to a wrong
class during the learning stage of a supervised classification method, or assuming
fewer classes than exist in reality in the case of a semi-supervised method,
can have a strongly negative impact: the measured classification accuracy does not
meaningfully reflect the physical reality of the observed image, since pixels with very
different spectral signatures are merged into ‘virtual’ classes. Indeed, strictly speaking,
a homogeneous class must be formed of individuals or objects having the same or very
similar characteristics. Therefore, at a first level, a ground truth must take into
account the physical characteristics of the objects present in the imaged scene. Then,
at a second level, when the GT is elaborated, possibly with the help of end-users, it must
tell end-users how the classes will be forced to merge to form virtual classes: for
example, in an agricultural field, how pixels belonging to bare soil should be grouped
with those belonging to growing corn. The practical consequences of such
knowledge-based (sometimes arbitrary) class merging are not very
critical here, but they might be disastrous depending on the application area; for example,
in the medical field, imagine the consequences of confusing a tumor area with a healthy
area in a partitioned image.
Another important point is the evaluation of classification methods based on a false or
simplified GT. With such a GT, unsupervised classification methods15 are doomed to
failure and unjustifiably disqualified versus supervised or semi-supervised methods
using a biased GT, though they are likely to provide classification maps closer to the
physical reality.
To illustrate the problem addressed here, we provide in this paper an analysis of the
GT data associated with two well-known HSIs: Indian Pine (AVIRIS) and Pavia
University (ROSIS). Both images have been extensively used in the remote sensing
literature dealing with HSI pixel classification or clustering. For example, so far, more
than 200 scientific papers mention these two data sets in their abstract or keywords.
By analyzing some specific classes defined in the GT maps and, when possible, by drawing
on field observations, we demonstrate that these reference maps are ill-conditioned and
should at least be reconsidered before being used for classification
purposes.
We must specify that this paper does not aim to propose a new method
for selecting learning samples. It is rather an objective critical analysis that underlines
the use of inconsistent GT data for the assessment of classification algorithms, as
well as the incoherent results given by certain algorithms which follow the biased
GTs too closely. This scientifically worrying problem is becoming more and more
pressing and unfortunately creates a lot of confusion in the related scientific literature.
It calls into question the credibility of the contribution of new generation sensors and
of the accurate and objective analysis, with sophisticated algorithms, of the information
these sensors can acquire. It is regrettable that this problem is not systematically
avoided despite the existence of credible scientific reasons to do so. This paper underlines the
fact that no ground truth should be systematically considered as absolute. Before
any use, it must be validated according to a rigorous protocol, which is unfortunately
not always the case. It is therefore important to remember that the fineness and
richness of the data provided by the new generation of imaging sensors, and the
development of increasingly sophisticated algorithms, must contribute to more and
more objective decision making. The paper gives a comprehensive analysis and
further details of the work published in Chehdi and Cariou.22 The steps of the
proposed analysis can be used as a basic approach to validate a ground truth data
set.
The remainder of the paper is organized into two sections. The second section
presents (i) a spectral analysis of two popular HSIs based on their associated GT
maps, (ii) an assessment of the homogeneity of the GT classes using a semi-supervised
and an unsupervised classification method, (iii) a description of the
impacts of a biased GT, and (iv) a general framework to assess and validate a given
GT database. The last section provides a conclusion.
Spectral analysis of biased Ground Truth of HSIs and Impacts
In the remote sensing field, the ancillary data associated with acquired images are
sometimes called ground truth data misleadingly, because they are incorrect or
oversimplified. This problem is particularly frequent in airborne and spaceborne
remote imaging, where GT data are often used in an abusive and inappropriate
manner. Before we demonstrate this, it is important to first recall the
definitions of ground truth and the meaning of GT authenticity.
1.1 Ground truth definition
According to the Oxford English dictionary,23 there are three definitions of ground
truth, depending on its usage:
i. Information that has been checked or facts that have been collected at source.
ii. Information obtained by direct measurement at ground level, rather than by
interpretation of remotely obtained data (as aerial or satellite images, etc.), especially
as used to verify or calibrate remotely obtained data.
iii. Information obtained by direct observation of a real system, as opposed to a
model or simulation; a set of data that is considered to be accurate and reliable, and
is used to calibrate a model, algorithm, procedure, etc. Also (specifically in image
recognition technologies) information obtained by direct visual examination, especially
as used to check or calibrate an automated recognition system.
These definitions converge and bring no confusion to the interpretation of the noun
“ground truth”. They are also concordant with that given by Claval21 in the sense “that it
guarantees the authenticity of the collected observations”.
1.2 Ground truth authenticity
Since the advent of technological remote sensing means, several authors have
pointed out the risk of abandoning the precision and authenticity of the so-called
"microlevel" knowledge (e.g. Rundstrom and Kenzer24) in favor of "macrolevel"
generalization. Field work, called "intimate sensing" by Porteous,25
therefore remains a necessary complement of knowledge, even at the
macroscopic scale.
Whatever the application domain or the theme with which a “ground truth” is associated,
it must therefore guarantee the authenticity and accuracy of observations,
and must be faultless, since it serves as a reference, a model. In a decision making
framework based on image processing and analysis, a GT map must be consistent
with the corresponding image data, since the latter are bound to the physical
characteristics of the objects or real materials present in the imaged scene.
Moreover, each area declared as a homogeneous class must refer to the same
content. This GT area must therefore be coherent with the corresponding area in the
HSI that objectively represents the real scene, meaning that the pixels of a
homogeneous image region must have similar spectral features; otherwise the results
of the objective analysis of images exploited in the decision making process will never
match those of the simplified or wrong GT. This means that any analysis method using
untrue GT data will provide biased results that cannot be rigorously exploited, as well as
irrelevant conclusions.
To illustrate this, in the following subsection we present analysis results
focusing on two significant examples, namely the Indian Pine and Pavia
University datasets. These are the most widely used benchmark datasets (HSIs and
associated superimposable GT maps) in the remote sensing community
for classification purposes. For each dataset, we first present the characteristics of the
image and the corresponding GT. Then, we show the main results of the different
analyses performed to reveal the inhomogeneity of GT classes with respect to the
corresponding spectral signatures of the pixels. The average and standard deviation
of the spectral signatures of each GT class are presented. We highlight the anomalies
of these two GTs by calculating the spectral dispersions within the reference classes.
Due to space limitations, only the results of the analysis on GT classes containing a
significant number of samples are presented.
We also provide examples of classification results which highlight the need to
subdivide the classes of the original GTs to achieve better coherence with the HSIs,
based on the spectral features. Finally, we discuss the approximations made in
constructing the GT maps associated with HSIs and their negative impacts on the
analysis and interpretation of their informational content.
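As an illustration of how a supposedly homogeneous GT class can be checked for hidden sub-classes, the following Python sketch splits the pixels of one class into two spectral sub-groups. It uses a plain 2-means (Lloyd) procedure with Euclidean distance, which is a stand-in chosen here for brevity and is not the semi-supervised or unsupervised method used in this paper. The function name `two_means`, the initialization rule and the toy signatures are all assumptions made for the example.

```python
import numpy as np

def two_means(members, n_iter=20):
    """Split one GT class into two spectral sub-groups with plain 2-means.

    members : (M, B) array of spectral signatures from a single GT class.
    Returns an (M,) array of sub-group indices (0 or 1).
    """
    # Deterministic start (an assumption of this sketch): the pixel farthest
    # from the class barycenter, then the pixel farthest from that pixel.
    g = members.mean(axis=0)
    c0 = members[((members - g) ** 2).sum(axis=1).argmax()]
    c1 = members[((members - c0) ** 2).sum(axis=1).argmax()]
    centers = np.stack([c0, c1]).astype(float)
    for _ in range(n_iter):
        # Assign each pixel to the nearest center (squared Euclidean distance).
        dist = ((members[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = dist.argmin(axis=1)
        # Move each center to the mean of the pixels assigned to it.
        for k in range(2):
            if np.any(assign == k):
                centers[k] = members[assign == k].mean(axis=0)
    return assign

# Hypothetical 4-band signatures: one GT class that actually mixes
# vegetation-like and bare-soil-like pixels.
veg = np.array([[0.10, 0.12, 0.60, 0.62],
                [0.11, 0.10, 0.61, 0.60],
                [0.09, 0.11, 0.59, 0.61]])
soil = np.array([[0.30, 0.31, 0.35, 0.36],
                 [0.31, 0.30, 0.36, 0.35]])
field = np.vstack([veg, soil])
groups = two_means(field)  # vegetation and soil pixels land in different groups
```

On real data, a split that separates spatially coherent regions with clearly different average signatures would suggest that the GT class merges distinct materials; a split that is spectrally marginal would not.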
1.3 Analysis of two biased ground truths
1.3.1 Indian Pine GT classes
The AVIRIS Indian Pine HSI26 has a spatial size of 145 x 145 pixels, where each pixel
is characterized by a set of 220 spectral values (features). The spectral range is from
400 to 2499 nm. The ground spatial resolution is approximately 20 m per pixel. The
corresponding GT map is made of 16 classes.
Fig. 1 displays the HSI visualized under two different wavelength triplets in order to
highlight the variations in the regions corresponding to each original GT class, as well
as the image of class labels of the associated GT. Table 1 details the nature of each
supposedly homogeneous class and the number of pixels that compose it.
A. Spectral signature analysis of hyperspectral images
In an HSI, a pixel is characterized by its spectral signature, a set of features
corresponding to spectral bands.
Let $X = \{x_1, x_2, \ldots, x_N\}$ be the set of elements (pixels) to be partitioned. Each pixel $x_i$ is
characterized by the feature vector $A_i = A(x_i) = (a_{i1}, a_{i2}, \ldots, a_{iB})^T$, where $B$ is the
number of features (spectral bands).
Consider a partition of $X$ into $K$ indexed subsets or classes $\{C_n\}_{1 \le n \le K}$, and let
$l_i \in \{1, \ldots, K\}$ be the label associated with pixel $x_i$, so that the $n$-th class is
$C_n = \{x_i : l_i = n\}_{1 \le i \le N}$ and $|C_n| = M_n$.
The average spectral signature (barycenter) is given by:
$$g_n = \frac{1}{M_n} \sum_{x_i \in C_n} A(x_i), \tag{1}$$
The metric used here to calculate the dispersion of a class $C_n$ is the L1-norm
distance (sum of the absolute values of errors).
The total dispersion of a class $C_n$ is defined by:
$$D_n = \sum_{x_i \in C_n} d(x_i, g_n), \tag{2}$$
with $d(x_i, g_n)$ the L1-norm distance between a pixel $x_i$ and the barycenter $g_n$ of class
$C_n$:
$$d(x_i, g_n) = \sum_{k=1}^{B} \left| a_{ik} - g_{nk} \right|, \tag{3}$$
where $g_{nk}$ denotes the $k$-th component of $g_n$.
In order to account for the population size within a class, we also calculate the
average total dispersion of class $C_n$:
$$\bar{D}_n = \frac{D_n}{M_n}. \tag{4}$$
For the GT data under study, $n = 1, 2, \ldots, 16$, i.e. $K = 16$.
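The quantities in Eqs. (1) to (4) can be computed in a few lines. The following Python sketch is one possible implementation, assuming the HSI has already been flattened to an N x B array of signatures with a matching vector of GT labels; the function name `class_dispersions` and the toy data are hypothetical and are not taken from the datasets studied here.

```python
import numpy as np

def class_dispersions(A, labels):
    """Return {n: (M_n, D_n, D_n / M_n)} for each class label n.

    A      : (N, B) array, one spectral signature per row.
    labels : (N,) integer array, GT label l_i of each pixel.
    """
    results = {}
    for n in np.unique(labels):
        members = A[labels == n]                # signatures of class C_n
        g_n = members.mean(axis=0)              # barycenter, Eq. (1)
        d = np.abs(members - g_n).sum(axis=1)   # L1 distances, Eq. (3)
        D_n = d.sum()                           # total dispersion, Eq. (2)
        M_n = members.shape[0]                  # class population
        results[int(n)] = (M_n, float(D_n), float(D_n / M_n))  # Eq. (4)
    return results

# Toy data (hypothetical): 2 bands; class 1 is tight, class 2 is spread out.
A = np.array([[1.0, 1.0], [1.0, 3.0],
              [0.0, 0.0], [4.0, 0.0], [8.0, 0.0]])
labels = np.array([1, 1, 2, 2, 2])
disp = class_dispersions(A, labels)
# disp[1] is (2, 2.0, 1.0); disp[2] has D_n = 8.0 and an average of 8/3
```

For an image cube of shape (rows, cols, B), the same function applies after reshaping the cube to (rows * cols, B) and flattening the GT map, discarding unlabeled pixels first.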
Table 2 shows the total dispersion, the average total dispersion and
the dispersion rank of each GT class using the L1-norm distance for the Indian
Pine dataset. The four GT classes which exhibit the highest total dispersion are (in
decreasing order) C11, C2, C12 and C14. This ranking is different when considering the
average dispersions. Apart from C12, these classes are the ones which contain the
highest number of pixels. In the following, we limit the spectral analysis to the
C11 (Soybeans min-till) and C2 (Corn no-till) GT classes. Fig. 2 shows the selected
regions of the original image corresponding to these GT classes. Fig. 3 shows the
spectral signatures of the pixels, the average spectral signature and the standard
deviation within the C11 and C2 GT classes. The wavelengths of the first band and the
last band are 400 and 2499 nm, respectively.
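Two points of the paragraph above can be made concrete with a short Python sketch: the per-band mean and standard deviation interval plotted for a class, and the reason why ranking classes by total dispersion can differ from ranking them by average dispersion (large classes accumulate more total dispersion). The helper names and all numerical values below are hypothetical and are not the values of Table 2.

```python
import numpy as np

def signature_envelope(members):
    """Per-band mean and +/- one standard deviation of a class's signatures.

    members : (M, B) array of the spectral signatures in one GT class.
    """
    mean = members.mean(axis=0)
    std = members.std(axis=0)   # population standard deviation per band
    return mean, mean - std, mean + std

def rank_descending(values):
    """Map each key to its rank (1 = largest value)."""
    order = sorted(values, key=values.get, reverse=True)
    return {key: rank for rank, key in enumerate(order, start=1)}

# Envelope of a tiny two-pixel, two-band class (hypothetical values).
mean, lower, upper = signature_envelope(np.array([[1.0, 2.0], [3.0, 2.0]]))

# Hypothetical totals and class sizes, not values from Table 2.
total = {"C_a": 900.0, "C_b": 400.0}
size = {"C_a": 300, "C_b": 50}
average = {k: total[k] / size[k] for k in total}

rank_total = rank_descending(total)      # C_a first: it has many pixels
rank_average = rank_descending(average)  # C_b first: 8.0 per pixel vs 3.0
```

A wide interval between `lower` and `upper` across many bands is the numerical counterpart of the visual spread of signatures within a class.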
The high variations of the spectral signatures inside each GT class confirm the
dissimilarity of the pixels which form these two classes. This conclusion is consistent
with the disparity of these classes observed with just three bands of the original HSI, as
seen in Fig. 1 and Fig. 2, and no further criterion is necessary to confirm it. The most
homogeneous class for this GT is C7, even if a few pixels are distant from the class
barycenter. This is confirmed by the small standard
deviation of the spectral signatures around the average spectral signature (see Fig.
4).
Fig. 1 Original Indian Pine image. (a) and (b): visualization based on two compositions
of three different spectral bands, (26, 16, 6) and (37, 21, 5) respectively; (c) and (d): the selected
regions of images (a) and (b), respectively, corresponding to the GT class labels
given in (e).
Table 1 Data from the Indian Pine GT.
Total GT pixels: 10 336
Table 2 Total dispersion and average dispersion of Indian Pine spectral signatures per
GT class using the L1-norm distance.
Columns: Supposed GT classes; Dispersion in each class and dispersion rank; Average dispersion in each class and dispersion rank.
Fig. 2 Indian Pine original images visualized with three spectral bands (26, 16, 6)
corresponding to the label of C11, claimed as Soybeans min-till, and C2, claimed as
Corn no-till.
Fig. 3 Indian Pine GT: spectral signatures (black), average spectral signature (central
curve), and ± standard deviation interval (blue) of the C11 and C2 GT classes.
Fig. 4 Indian Pine GT: spectral signatures (black), average spectral signature (central
curve), and ± standard deviation interval (blue) of the assumed homogeneous class
C7.
B. Discussion
The above examples of the C11 and C2 classes show that some regions of the HSI which
are declared in the GT as relating to two classes of vegetation species do not exhibit
coherent and similar spectral signatures in the acquired image. The corresponding
variations can even be detected visually, using only visible bands.
One might ask whether such variations really exist in the field or are
merely artifacts, e.g. caused by the sensor itself. In fact, some answers to
the issue of heterogeneity of most original GT classes reside in the supplemental
material provided with the HSI, i.e. the observation notes and field pictures associated
with the field work of Baumgartner et al.,26 which surprisingly is barely referred to in
the HSI classification literature. This document of about 70 pages, which includes handwritten notes
taken approximately at the time of the Indian Pine flight survey as well as pictures
taken by the field specialists, contains rich information that has only partially been
reported in the GT map. For instance, let us consider the field numbered 3-10 in
the observation notes document. In Fig. [reference not found]-(a),
this field corresponds to the bottom-most, left-most field among those of the C11 class
(soybeans min-till). The vegetative canopy reported for this field in the observation
notes is soybeans, drilled in 8” rows, with a plant height of 4-5” and very little weed
infestation. In the same report, the soil characteristics also mention a minimum tillage
system, not freshly tilled, with corn residues on the surface. These observations,
which are only partly reported in the name of the C11 GT class, seem to indicate that
the same uniform soil and vegetation conditions hold over the whole field.
However, this is not the case, as can be seen from the picture of this field taken
during the field observations. This picture, shown in Fig. [reference not found]-(a)
and available from the field work,25 was taken from the north end of the field, looking
southeast. It clearly exhibits local variations along
lineaments traversing the north part of the field (particularly the WSW-ENE
lineament). The first line of trees and bushes in the background corresponds to the east end
of the field. Fig. [reference not found]-(b) shows the C11 class regions
overlaid on a Google Earth archive image acquired three months before the
hyperspectral acquisition. The two orange lines in Fig. [reference not found]-(b)
delineate approximately the field of view of the picture in
Fig. [reference not found]-(a). We can notice that the local variations of grey
levels in this image are in accordance with those observed in the HSI.
As said above, this crop field is declared by the GT map to be uniformly grown with
soybeans on a minimum tillage soil. However, the central part of the picture in
Fig. [reference not found]-(a), showing brown areas (probably bare
soil), partly contradicts the original GT class map. Besides, this area is very likely to
correspond to the lineaments detectable in both the HSI (cf. C11 in Fig. 2) and
Fig. [reference not found]-(b).
The variations of the spectral signatures in classes C11 and C2 (cf. Fig. 3) probably
have two origins: the first is the influence of the nature and moisture of the
soil, because the vegetation is not at a very advanced stage of growth, and the
second is the inclusion in these classes of objects of different natures.
From this example, it is clear that the users and developers of classification
algorithms must pay great attention to the truthfulness of the GT maps provided
with the HSI.