A CLUSTERING APPROACH TO VECTOR MATHEMATICAL MORPHOLOGY
Constantin Vertan, Marius Malciu, Titus Zaharia, Vasile Buzuloiu
Applied Electronics Department, Bucuresti “Politehnica” University
fax: + 40 1 312. 31. 93
email: vertan@alpha.imag.pub.ro, marius@alpha.imag.pub.ro, zaharia@alpha.imag.pub.ro,
buzuloiu@alpha.imag.pub.ro
Mailing address: Constantin Vertan, PO Box 35-57, Bucuresti 35, Romania
ABSTRACT
The processing and analysis of vector
valued signals has become a major field of interest
during the last decade. The direct extension of the
classical processing methods for scalar signals is
not always possible; mathematical morphology,
viewed as a processing technique for gray-level
images, is one such technique that cannot be
extended easily. In this paper we present an
approach to vector mathematical morphology
based on clustering techniques in the signal
sample space.
1. INTRODUCTION
Signal processing evolved from analysing
time signals to the study of 2-D signals (images)
and of the multidimensional ones. In the last
decades a growing interest has been devoted to
the emerging vector valued (or multichannel,
multispectral, multicomponent) multidimensional
signals. Typical examples of such signals are (but
are not restricted to) color images, where every
pixel (image sample) is represented by a three-
component vector. The components are (generally)
the amounts of pure red, green and blue that
compose the local color. Other commonly used
vector valued signals are remote sensing images,
thermal detection images, and stereo and
quadraphonic sound samples (represented by
vectors having 2 to 7 components). For all these
signals the vector components are naturally,
although not obviously, correlated. This is why
separate component processing, using the
standard scalar techniques, is not appropriate.
2. MULTICHANNEL PROCESSING
APPROACHES
The linear processing approach uses the
same integral transform techniques as in the
scalar case, in conjunction with common
decorrelation procedures applied for the
separation of the individual channels. Typically,
the Karhunen-Loeve and the Discrete Cosine
transforms are used for the decorrelation [1], [2]. The
decorrelated components are then processed
separately.
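As a minimal numpy sketch of this decorrelation step (the function name and the synthetic data below are illustrative, not taken from the paper), the Karhunen-Loeve transform amounts to projecting the samples onto the eigenvectors of the channel covariance matrix:

```python
import numpy as np

def klt_decorrelate(pixels):
    """Decorrelate the channels of a set of vector valued samples.

    pixels: (N, c) array of N samples with c channels (e.g. c = 3 for RGB).
    Returns the decorrelated samples and the transform matrix, so each
    channel can afterwards be processed separately by scalar techniques.
    """
    centered = pixels - pixels.mean(axis=0)
    cov = np.cov(centered, rowvar=False)       # (c, c) channel covariance
    _, eigvecs = np.linalg.eigh(cov)           # KLT basis = its eigenvectors
    return centered @ eigvecs, eigvecs

# Example: two strongly correlated channels become decorrelated.
rng = np.random.default_rng(0)
x = rng.normal(size=(1000, 1))
data = np.hstack([x, x + 0.1 * rng.normal(size=(1000, 1))])
decorr, _ = klt_decorrelate(data)
off_diag = np.cov(decorr, rowvar=False)[0, 1]  # near zero after the transform
```
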
A significant part of the nonlinear
processing techniques is based on order
statistics filters. Regardless of the particular filter
type, a common processing step is the sorting, in
ascending order, of the signal samples within the
filter window. Although the scalar ordering is
obvious and very simple, there is no way to extend
it to vector valued signals so as to obtain an
ordering relation that is both mathematically
correct (according to Definition 1) and topology
preserving.
Definition 1: An ordering relation (x ≺ y)
must be reflexive (x ≺ x), transitive (x ≺ y and
y ≺ z imply x ≺ z) and antisymmetric (x ≺ y and
y ≺ x imply x = y).
An ordering relation in a vector space that
satisfies Definition 1 is the lexicographic ordering,
expressed as stated in Definition 2.
Definition 2: The lexicographic ordering
relation is expressed by:

x ≺ y ⇔ ∃ k ∈ [1, n] such that x_i = y_i, ∀ i = 1, …, k − 1, and x_k < y_k.
Two problems arise when applying it: the space
topology is not preserved, and an importance
criterion must be found for deciding the order of
the components.
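Definition 2 is exactly the way Python compares tuples; the small sketch below (illustrative, not from the paper) also exhibits the topology problem mentioned above:

```python
def lex_less(x, y):
    """Lexicographic ordering (Definition 2): x < y iff, at the first
    index k where the components differ, x[k] < y[k]."""
    return tuple(x) < tuple(y)  # Python tuples compare lexicographically

# The relation is a total order, but the topology is not preserved:
# (99, 0, 1) is almost identical to (100, 0, 0) as a point in RGB space,
# yet the lexicographic order places the distant color (99, 255, 255)
# between them.
a, b, c = (100, 0, 0), (99, 255, 255), (99, 0, 1)
assert lex_less(c, b) and lex_less(b, a)
```
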
Barnett [3] has investigated the possible
types of multidimensional orderings and retained
four variants: the marginal ordering, that is
equivalent to the separate ordering of each
component; the partial ordering, which uses
convex hull like sets; the conditional ordering,
which is the scalar ordering of a single component;
the reduced ordering, that performs the ordering of
vectors according to some scalars, computed from
the components of each vector. The reduced
ordering offers the best results, being intensively
used [4], [5], [6]. The computed scalar is either the
generalized distance from each data sample to
some fixed reference point (the average, the
marginal median) [3], or a sum of distances from
each data sample to some fixed characteristic
points [5], or the sum of inter-sample distances
[4] or angles [6]. These computed scalar values
allow a total vector ordering (in the first two
situations) or just the determination of the vector
median of the samples (in the third case). Two main
difficulties are connected to this approach:
the choice of appropriate fixed points (or the local
adaptation of the computed scalar [5]) according to
the particular type of degradation, and the color
preservation as a measure of quality. The color
preservation implies that the filter output is a
sample from the set of input values (that means
that the average of values will not preserve colors).
The adaptation of the filter is a difficult task that
cannot be accomplished without heuristics and
intensive computation.
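As an illustration of the reduced ordering by the sum of inter-sample distances, a minimal numpy sketch of the Vector Median Filter output [4] for one window position follows (an illustrative implementation, not the authors' code):

```python
import numpy as np

def vector_median(window):
    """Vector Median Filter output for one window position [4]: the
    input sample minimizing the sum of distances to all the others.
    Because the output is one of the input samples, no new colors are
    created (the 'color preservation' property)."""
    w = np.asarray(window, dtype=float)
    # dists[i, j] = Euclidean distance between samples i and j
    dists = np.linalg.norm(w[:, None, :] - w[None, :, :], axis=-1)
    return window[int(np.argmin(dists.sum(axis=1)))]

# A single impulsive outlier is rejected: the output is a true sample.
pixels = [(10, 10, 10), (12, 11, 10), (11, 12, 12), (255, 0, 255)]
print(vector_median(pixels))  # one of the three similar pixels
```
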
3. MATHEMATICAL MORPHOLOGY
OVERVIEW
The theoretical foundations of
mathematical morphology have been constructed
by Serra [7], by extending the set operators known
as the Minkowski set addition and subtraction.
Primarily used as a processing and analysis tool for
binary images (easily interpreted as sets),
mathematical morphology was extended to
multidimensional scalar valued signals by some
algebraic reformulations of the basic definitions.
The fundamental operators, called dilation and
erosion, are algebraically defined as stated in
Definition 3.
Definition 3: The dilation and the erosion
are operators over a complete lattice that commute
with sup and inf, respectively.
According to this definition, the condition to be
fulfilled for the existence of the morphological
operators is the possibility of defining an ordering
relation in the signal value space. For scalar signals
the problem is trivial, but for vector valued signals
we encounter the problems mentioned above. The
use of some reduced ordering can be considered
[8], but the results show no dramatic
improvement over the separate component
processing (marginal ordering), and the exact
conditions (a complete lattice induced by an
ordering relation) are still not fulfilled.
In what follows we will describe how
another (more intuitive) approach to the definition
of mathematical morphology operators is possible,
by unsupervised clustering in the signal value
space. This theory will be applied to color images
(although there exists no limitation regarding the
number of vector components and the signal type).
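A small illustrative sketch of why the marginal approach, although lattice-theoretically valid, is unsatisfactory for color images: the componentwise supremum can be a color that appears nowhere in the input.

```python
def marginal_dilation(window):
    """Marginal (componentwise) dilation: take the sup on each channel
    separately.  Mathematically valid (the product lattice is complete),
    but the result may be a vector absent from the input."""
    return tuple(max(p[c] for p in window) for c in range(len(window[0])))

# A pure red and a pure green pixel dilate to yellow -- a 'false color'
# that exists nowhere in the input window.
window = [(255, 0, 0), (0, 255, 0)]
assert marginal_dilation(window) == (255, 255, 0)
assert marginal_dilation(window) not in window
```
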
4. MORPHOLOGICAL OPERATIONS BY
CLUSTERING
The idea behind this approach is a very
simple one: the selected pixel values (vector valued
data) have to be partitioned into classes,
corresponding to their position in the sample
space. Typically we will use three classes: a central
class, corresponding to the ordinary pixels, and
two extremum classes that will gather the outlier
pixels, generated by the noise. The clustering is
performed by classical algorithms, such as Basic
Isodata [9], using a least mean square error (LMSE)
or cluster compactness optimality criterion.
Each class may be characterized by some
prototype (or centroid) that can be computed from
the condition of optimality. We will define the
result of the pseudo-dilation as the class centroid
that lies closest to the marginal maximum and the
pseudo-erosion as the class centroid that lies
closest to the marginal minimum. We may also
notice that a median-like-filtering can be performed
on the basis of the prototype of the remaining
class (the “central” one).
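The pseudo-operators above can be sketched as follows. This is a minimal illustration assuming numpy, with a plain k-means standing in for a full Basic Isodata implementation and three classes, as in the text; the function names are ours, not the authors'.

```python
import numpy as np

def kmeans(data, k=3, iters=20, seed=0):
    """Plain k-means (a simplified Basic Isodata): returns k centroids."""
    rng = np.random.default_rng(seed)
    centroids = data[rng.choice(len(data), k, replace=False)].astype(float)
    for _ in range(iters):
        # assign each sample to its nearest centroid, then recompute
        labels = np.argmin(
            np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=-1),
            axis=1)
        for i in range(k):
            if np.any(labels == i):
                centroids[i] = data[labels == i].mean(axis=0)
    return centroids

def pseudo_dilation(window, k=3):
    """Class centroid closest to the marginal (componentwise) maximum."""
    data = np.asarray(window, dtype=float)
    cents = kmeans(data, k)
    target = data.max(axis=0)
    return cents[int(np.argmin(np.linalg.norm(cents - target, axis=1)))]

def pseudo_erosion(window, k=3):
    """Class centroid closest to the marginal (componentwise) minimum."""
    data = np.asarray(window, dtype=float)
    cents = kmeans(data, k)
    target = data.min(axis=0)
    return cents[int(np.argmin(np.linalg.norm(cents - target, axis=1)))]
```

A median-like filtering, as noted above, would instead return the centroid of the remaining ("central") class.
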
It is obvious that the class centroid is an
“artificial” value, since there is a very small
probability of having it as an input pixel value. If
the creation of new colors in the resulting image is
not desired, the class centroid can be approximated
by its nearest data sample. The use of these two
variants produces significantly different results:
since the centroid is somehow equivalent to a
mean value, its effect will be an increased
smoothing (shown by smaller values of the NMSE
(Definition 4)) and a slightly bigger color difference
(shown by an increased MCRE, (Definition 5)) than
the nearest neighbour of the centroid.
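The color-preserving variant can be sketched as a simple nearest-sample lookup (an illustrative helper, with hypothetical names and data):

```python
import numpy as np

def nearest_sample(centroid, window):
    """Approximate a class centroid by its nearest input sample, so that
    the filter output is always an existing color (color preservation)."""
    data = np.asarray(window, dtype=float)
    idx = int(np.argmin(np.linalg.norm(data - np.asarray(centroid), axis=1)))
    return window[idx]

# The 'artificial' centroid (9, 9, 9) is replaced by the closest true pixel.
window = [(10, 10, 10), (8, 9, 9), (30, 0, 5)]
assert nearest_sample((9, 9, 9), window) == (8, 9, 9)
```
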
Any unsupervised clustering algorithm
may be used. Typical examples are the Basic
Isodata (k-means or LBG) [9] and the family of
hierarchical clustering algorithms (the Pairwise
Nearest Neighbour (PNN) [10] and the Modified
Pairwise Nearest Neighbour (MPNN) [11]). The
Basic Isodata is relatively well known and
produces good and stable results. Its major
disadvantages are the undetermined running time
(given by the number of iterations through the data
set until stability) and the large memory
requirements (all vectors must be stored, since the
clustering is performed on the entire set). The
hierarchical clustering methods require a constant
running time, since the vectors are processed in a
single pass. The principle is the following: a larger
number of classes than desired is constructed and
then the classes are successively reduced by
merging. The PNN starts with an initial number of
classes equal to the number of vectors to be
clustered; at each processing step, the closest two
classes (in terms of the reduction of the
partitioning mean square error) are merged. This
means that (N − k)(N − k + 1)/2 pairs of classes
must be examined at step k (N being the number of
vectors), so the method requires a large amount of
computation.
The MPNN algorithm, originally
introduced in [11], uses a slight modification of the
PNN, considerably saving computation time.
The idea is quite simple: the MPNN starts with a
number of classes equal to the desired one; each of
these classes is initialized with a vector from the
data set to be clustered. If the data set contains N
vectors to be partitioned in C classes, the
algorithm requires N-C identical steps. At each
step, a new class (the C+1-th) is formed with a new
vector from the data set and the two closest
classes within the C+1 classes are merged, forming
a new class (and the total number of classes
becomes again C). The decision of optimal merging
is taken with respect to the LMSE criterion of
clustering quality. According to this criterion, the
cluster quality is measured by the mean square
error of the samples with respect to their class
prototypes, that is:

J = \sum_{i=1}^{C} \sum_{j=1}^{N} u_{ij} \| x_j - y_i \|^2

(x_j are the data vectors, y_i are the class
prototypes and u_{ij} are the values of the
membership matrix of the data set: 1 if x_j belongs
to class i and 0 otherwise). Expressing the quality J
before and after the class merging, we obtain the
quality degradation caused by merging the classes
i and j as:

\Delta_{ij} = \frac{n_i n_j}{n_i + n_j} \| y_i - y_j \|^2 ,

where n_i and n_j are the numbers of vectors
assigned to the classes i and j. The merging
decision is taken so that \Delta_{ij} is minimal over
all pairs of classes. After merging, the prototypes
and the number of vectors in each class are
recomputed.
The process is repeated until all the
vectors of the data set have been considered. We
have to notice that the overall method is not
optimal, even if each individual merge is optimal,
and that the clustering result depends strongly on
the order of the vectors in the data set. From a
computational point of view, it can be shown that
the MPNN (obviously much faster than the PNN)
requires about the number of operations performed
by two iterations of the classic Basic Isodata. We
may also notice that the individual vectors do not
need to be stored, since the prototypes and the
numbers of vectors assigned to the classes
completely determine the result.
5. EXPERIMENTAL RESULTS
The tests were conducted on true color
images (24 bits per pixel, i.e. 8 bits for coding each
color component); two classic images were used:
“fish” and “waterfall”, each having 256 by 256
pixels. On each image different types of noise were
superimposed: impulsive (salt & pepper) noise
affecting 4% of the pixels of each color component,
with no interchannel correlation; normal (Gaussian)
noise with zero mean and variance 25; and a
mixture of 2% impulsive noise and zero mean,
variance 25 normal noise. The image quality is
measured by two criteria, defined below.
Definition 4: The Normalized Mean Square
Error (NMSE) is defined as:

NMSE = \frac{\sum_{i=1}^{H} \sum_{j=1}^{W} \| f(i,j) - \hat{f}(i,j) \|_2^2}{\sum_{i=1}^{H} \sum_{j=1}^{W} \| f(i,j) \|_2^2} .

Definition 5: The Mean Chromaticity Error
(MCRE) is defined as:

MCRE = \frac{1}{HW} \sum_{i=1}^{H} \sum_{j=1}^{W} \left\| \frac{f(i,j)}{\| f(i,j) \|_1} - \frac{\hat{f}(i,j)}{\| \hat{f}(i,j) \|_1} \right\|_2 .

In the definitions above, H and W are the image
dimensions in pixels, \| \cdot \|_2 is the L^2 norm,
\| \cdot \|_1 is the L^1 norm, and f(i,j) and \hat{f}(i,j)
are the original and the computed values of the
pixel at location (i,j).
On each image a morphological filtering
by opening (erosion followed by dilation) was
performed; the results were compared for the
marginal ordering induced morphology, the
reduced ordering induced morphology (by
weighted Euclidean distances to the marginal
extremes) [8], the Vector Median Filter [4] and the
pseudo-morphological operators defined by
clustering (with the Basic Isodata and MPNN
algorithms). The quality measures for the
“waterfall” image are presented in Table 1; for the
“fish” image the results are of the same type. Some
images are presented in Figures 1 and 2 (image
luminance).
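As a minimal illustration of the MPNN update described in Section 4 (our own sketch, assuming numpy; not the authors' implementation), each incoming vector temporarily forms a (C+1)-th class and the pair of classes with minimal merging cost \Delta_{ij} is merged:

```python
import numpy as np

def mpnn_step(protos, counts, new_vec):
    """One MPNN step: the new vector forms a temporary (C+1)-th class,
    then the two classes with minimal merging cost
        delta_ij = n_i * n_j / (n_i + n_j) * ||y_i - y_j||^2
    are merged.  Only prototypes and counts are stored, never the data."""
    protos = np.vstack([np.asarray(protos, dtype=float),
                        np.asarray(new_vec, dtype=float)])
    counts = np.append(np.asarray(counts), 1)
    best, pair = np.inf, (0, 1)
    for i in range(len(protos)):
        for j in range(i + 1, len(protos)):
            cost = (counts[i] * counts[j] / (counts[i] + counts[j])
                    * np.sum((protos[i] - protos[j]) ** 2))
            if cost < best:
                best, pair = cost, (i, j)
    i, j = pair
    # merged prototype = count-weighted mean of the two prototypes
    merged = ((counts[i] * protos[i] + counts[j] * protos[j])
              / (counts[i] + counts[j]))
    protos[i], counts[i] = merged, counts[i] + counts[j]
    return np.delete(protos, j, axis=0), np.delete(counts, j)

# Hypothetical usage: seed C classes with the first C vectors, then feed
# the remaining N - C vectors one by one.
protos = np.array([[0.0, 0.0], [10.0, 10.0], [20.0, 20.0]])
counts = np.array([1, 1, 1])
protos, counts = mpnn_step(protos, counts, [0.5, 0.5])
```
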
Table 1: Image quality measures (“waterfall” image, 256 x 256 pixels, true color)

                                             4% Impulsive noise    Mixture noise
                                             NMSE %    MCRE %      NMSE %    MCRE %
Noise degraded image                         31.048     4.072      29.281    19.298
VMF, 3x3 window                              13.533     2.187      15.800     8.881
Marginal ordering open, 3x3 window           15.179     2.179      18.381    24.163
Reduced ordering open, 3x3 window            14.843     2.212      16.131     8.566
Basic Isodata, 3x3 window, prot. approx.     12.621     2.365      15.420     7.759
Basic Isodata, 5x5 window, prot. approx.     16.288     2.643      17.812     8.879
MPNN, 3x3 window                             13.838     3.380      18.093    10.596

Figure 1: 4% impulsive noise degraded “waterfall”
Figure 2: Cluster opening of Figure 1

6. CONCLUSIONS
In this paper we presented a novel
approach to the mathematical morphology (and,
more generally, to the filtering) of vector valued
multidimensional signals. The examples used were
color images (for the sake of simplicity and
convenient presentation), but the method is valid
for any type of signal. The clustering approach
produces good results for noise elimination: the
impulsive noise is totally removed (as
morphological filters usually do), but a better
behaviour was noticed in the presence of normal
noise. This normal noise filtering capability comes
from the computation of the class prototypes by
average-type relations.
The novel MPNN clustering algorithm
was found appropriate for this task: a slight
migration of the class prototypes (compared with
the prototypes computed by the Basic Isodata)
towards the marginal extremes was noticed. This
effect (not desired in a general clustering problem)
is welcome for the implementation of the pseudo-
morphological operators. The computing time is
similar to that of standard multichannel filtering
methods (such as the Vector Median Filter) and the
required memory is very small (since the MPNN
does not need the entire data set, only a one-by-
one evaluation of the vectors). A hardware
implementation or a VLSI integration of the MPNN
clustering algorithm and of the subsequent filtering
seems at hand, and this problem deserves to be
addressed in the future.
7. REFERENCES
[1] B. R. Hunt: “Karhunen-Loeve multispectral
image restoration, part I: theory”, IEEE Trans. on
ASSP, vol. 32, no. 3, pp. 592-599, June 1984
[2] I. Pitas, A. N. Venetsanopoulos: Nonlinear
Digital Filters - Principles and Applications,
Norwell MA:Kluwer Academic Publ., 1990
[3] V. Barnett: “The ordering of multivariate data”,
J. R. Stat. Soc., A, vol. 139, part 3, pp. 318-343, 1976
[4] J. Astola, P. Haavisto, Y. Neuvo: “Vector
Median Filters”, Proc. IEEE, vol. 78, pp. 678-689,
Apr. 1990
[5] K. Tang, J. Astola, Y. Neuvo: “Nonlinear
Multivariate Image Filtering Techniques”, IEEE
Trans. on Image Processing, vol. 4, no. 6, pp. 788-798, June 1995
[6] P.E. Trahanias, A. N. Venetsanopoulos: “Vector
Directional Filters - A New Class of Multichannel
Image Processing Filters”, IEEE Trans. on Image
Processing, vol.2, no. 4, pp. 528-534, Oct. 1993
[7] J. Serra: Image Analysis and Mathematical
Morphology, London: Academic Press, 1982
[8] C. Vertan, V. Popescu, V. Buzuloiu:
“Morphological like Operators for Color Images”,
accepted at EUSIPCO’ 96, Trieste, Italy
[9] A. Gersho, R. M. Gray: Vector Quantization and
Signal Compression, Boston: Kluwer Academic
Publ., 1992
[10] W. H. Equitz: “A New Vector Quantization
Clustering Algorithm”, IEEE Trans. on ASSP, vol.
37, no. 10, pp. 1568-1575, Oct. 1989
[11] D. Comaniciu: “An Efficient Clustering
Algorithm for Vector Quantization”, Proc. of the
9th Scandinavian Conference on Image Analysis,
Uppsala, Sweden, June 1995, pp. 423-430