Segmentation by Thresholding

Lecture 1 from "Digital Image Analysis, Selected Themes"
INF 386, V-2003

Fritz Albregtsen
Department of Informatics
University of Oslo

03.03.2003

Segmentation

• Segmentation is one of the most important components of a complete image analysis system.
• Segmentation creates regions and objects in images.
• There are two categories of methods, based on two different principles, namely similarity and discontinuity.
• In region based segmentation we find those pixels that are similar.
• Thresholding splits the image histogram. Pixels belonging to the same class get the same label. These pixels are not necessarily neighbours.
• In edge based segmentation we find the basic elements of edges, i.e. edge pixels (or even line pixels, corner pixels, etc.), based on local estimates of the gradient. In the next steps we thin broad edges and join edge fragments together into edge chains. Then we have a (partial) region border.

Introduction to Thresholding

• Automatic thresholding is important in applications where speed or the physical conditions prevent human interaction.
• In bi-level thresholding, the histogram of the image is usually assumed to have one valley between two peaks, the peaks representing objects and background, respectively.
• Thresholding is usually a pre-process for various pattern recognition techniques.
• Thresholding may also be a pre-process for adaptive filtering, adaptive compression, etc.

Parametric versus Non-parametric

• Parametric techniques:
  — Estimate the parameters of two distributions from the given histogram.
  — It may be difficult or impossible to establish a reliable model.
• Non-parametric case:
  — Separate the two gray level classes in an optimum manner according to some criterion:
    ∗ between-class variance
    ∗ divergence
    ∗ entropy
    ∗ conservation of moments.
  — Non-parametric methods are more robust, and usually faster.

Automatic versus Interactive

• Distinction between automatic methods and interactive methods.
• Automatic means that the user does not have to specify any parameters.
• There are no truly automatic methods; there are always built-in parameters.
• Distinction between supervised (with training) and unsupervised (clustering).

Global and Non-contextual?

• Global methods use a single threshold for the entire image.
• Local methods optimize a new threshold for a number of blocks or sub-images.
• Global methods put severe restrictions on
  — the gray level characteristics of objects and background
  — the uniformity in lighting and detection
• The fundamental framework of the global methods is also applicable to local sub-images.
• Non-contextual methods rely only on the gray level histogram of the image.
• Contextual methods make use of the geometrical relations between pixels.

Bi-level thresholding

• The histogram is assumed to be twin-peaked. Let $P_1$ and $P_2$ be the a priori probabilities of background and foreground ($P_1 + P_2 = 1$), and let the two distributions be given by $b(z)$ and $f(z)$. The complete histogram is given by

  $p(z) = P_1\,b(z) + P_2\,f(z)$

• The probabilities of mis-classifying a pixel, given a threshold $t$:

  $E_1(t) = \int_t^{\infty} b(z)\,dz, \qquad E_2(t) = \int_{-\infty}^{t} f(z)\,dz$

• The total error is:

  $E(t) = P_1 \int_t^{\infty} b(z)\,dz + P_2 \int_{-\infty}^{t} f(z)\,dz$

• Differentiating with respect to the threshold $t$:

  $\frac{\partial E}{\partial t} = 0 \;\Rightarrow\; P_1\,b(T) = P_2\,f(T)$

Bi-level thresholding (continued)

• For Gaussian distributions:

  $\frac{P_1}{\sqrt{2\pi}\,\sigma_1} e^{-\frac{(T-\mu_1)^2}{2\sigma_1^2}} = \frac{P_2}{\sqrt{2\pi}\,\sigma_2} e^{-\frac{(T-\mu_2)^2}{2\sigma_2^2}}$

• We get a quadratic equation (a numerical sketch follows below):

  $(\sigma_1^2 - \sigma_2^2)\,T^2 + 2(\mu_1\sigma_2^2 - \mu_2\sigma_1^2)\,T + \sigma_1^2\mu_2^2 - \sigma_2^2\mu_1^2 + 2\sigma_1^2\sigma_2^2 \ln\frac{P_1\sigma_2}{P_2\sigma_1} = 0$

• Two thresholds may be necessary!
• If the two variances are equal ($\sigma_B^2 = \sigma_F^2 = \sigma^2$):

  $T = \frac{\mu_1 + \mu_2}{2} + \frac{\sigma^2}{\mu_1 - \mu_2} \ln\frac{P_2}{P_1}$

• If the a priori probabilities $P_1$ and $P_2$ are also equal:

  $T = \frac{\mu_1 + \mu_2}{2}$

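As a numerical illustration, the sketch below solves this quadratic with NumPy. The function name and the example parameter values are mine, not from the lecture; complex roots would signal that the two-Gaussian model does not fit.

```python
import numpy as np

def gaussian_thresholds(mu1, mu2, s1, s2, P1, P2):
    """Candidate thresholds separating N(mu1, s1^2) and N(mu2, s2^2)
    with a priori probabilities P1 and P2 (roots of the quadratic above)."""
    if np.isclose(s1, s2):
        # Equal variances: a single threshold suffices.
        return np.array([0.5 * (mu1 + mu2)
                         + s1**2 / (mu1 - mu2) * np.log(P2 / P1)])
    A = s1**2 - s2**2
    B = 2.0 * (mu1 * s2**2 - mu2 * s1**2)
    C = (s1**2 * mu2**2 - s2**2 * mu1**2
         + 2.0 * s1**2 * s2**2 * np.log((P1 * s2) / (P2 * s1)))
    return np.roots([A, B, C])  # two roots: both thresholds may be needed

# Hypothetical example: dark background, brighter and broader object class.
print(gaussian_thresholds(mu1=60, mu2=160, s1=15, s2=25, P1=0.7, P2=0.3))
```
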
The method of Ridler and Calvard

• Initial threshold value, $t_0$, equal to the average brightness.
• The threshold value for the $(k+1)$-th iteration is given by (see the sketch below)

  $t_{k+1} = \frac{\mu_1(t_k) + \mu_2(t_k)}{2} = \frac{1}{2}\left[ \frac{\sum_{z=0}^{t_k} z\,p(z)}{\sum_{z=0}^{t_k} p(z)} + \frac{\sum_{z=t_k+1}^{G-1} z\,p(z)}{\sum_{z=t_k+1}^{G-1} p(z)} \right]$

• $\mu_1(t_k)$ is the mean value of the gray levels below the previous threshold $t_k$, and $\mu_2(t_k)$ is the mean value of the gray levels above the previous threshold.
• Note that $\mu_1(t)$ and $\mu_2(t)$ are the a posteriori mean values, estimated from overlapping and truncated distributions. The a priori $\mu_1$ and $\mu_2$ are unknown to us.
• The correctness of the estimated threshold depends on the extent of the overlap, as well as on the correctness of the $P_1 \approx P_2$ assumption.

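A minimal NumPy sketch of this iteration, assuming an integer gray level histogram; the function name, the iteration cap, and the guards against empty classes are my additions:

```python
import numpy as np

def ridler_calvard(hist, max_iter=100):
    """Iterate t_{k+1} = (mu1(t_k) + mu2(t_k)) / 2 until the threshold is stable."""
    p = np.asarray(hist, dtype=float)
    p /= p.sum()                           # normalized histogram p(z)
    z = np.arange(p.size)
    t = int(round((z * p).sum()))          # t0 = average brightness
    for _ in range(max_iter):
        mu1 = (z[:t + 1] * p[:t + 1]).sum() / max(p[:t + 1].sum(), 1e-12)
        mu2 = (z[t + 1:] * p[t + 1:]).sum() / max(p[t + 1:].sum(), 1e-12)
        t_new = int(round(0.5 * (mu1 + mu2)))
        if t_new == t:                     # threshold stable: stop
            break
        t = t_new
    return t
```
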
The method of Otsu

• Maximizes the a posteriori between-class variance $\sigma_B^2(t)$, given by

  $\sigma_B^2(t) = P_1(t)\,[\mu_1(t) - \mu_0]^2 + P_2(t)\,[\mu_2(t) - \mu_0]^2$

• The sum of the within-class variance $\sigma_W^2$ and the between-class variance $\sigma_B^2$ is equal to the total variance $\sigma_0^2$:

  $\sigma_W^2 + \sigma_B^2 = \sigma_0^2$

• Maximizing $\sigma_B^2$ $\Leftrightarrow$ minimizing $\sigma_W^2$.
• The expression for $\sigma_B^2(t)$ reduces to

  $\sigma_B^2(t) = P_1(t)\,\mu_1^2(t) + P_2(t)\,\mu_2^2(t) - \mu_0^2 = \frac{[\mu_0 P_1(t) - \mu(t)]^2}{P_1(t)\,[1 - P_1(t)]}$

  where $\mu(t) = \sum_{z=0}^{t} z\,p(z) = P_1(t)\,\mu_1(t)$ is the cumulative first moment.
• The optimal threshold $T$ is found by a sequential search for the maximum of $\sigma_B^2(t)$ over the values of $t$ where $0 < P_1(t) < 1$ (see the sketch below).

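The search vectorizes neatly, since cumulative sums give $P_1(t)$ and the cumulative moment $\mu(t)$ for every $t$ at once. A sketch (names are mine):

```python
import numpy as np

def otsu_threshold(hist):
    """Sequential search for the t maximizing sigma_B^2(t), 0 < P1(t) < 1."""
    p = np.asarray(hist, dtype=float)
    p /= p.sum()
    z = np.arange(p.size)
    mu0 = (z * p).sum()                    # overall mean
    P1 = np.cumsum(p)                      # P1(t) for every t
    mu = np.cumsum(z * p)                  # cumulative first moment mu(t)
    valid = (P1 > 0) & (P1 < 1)
    sigma_B2 = np.full(p.size, -np.inf)
    sigma_B2[valid] = ((mu0 * P1[valid] - mu[valid])**2
                       / (P1[valid] * (1.0 - P1[valid])))
    return int(np.argmax(sigma_B2))
```
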
The method of Reddi

• The method of Reddi et al. is based on the same assumptions as the method of Otsu, maximizing the a posteriori between-class variance $\sigma_B^2(t)$.
• We may write $\sigma_B^2 = P_1(t)\,\mu_1^2(t) + P_2(t)\,\mu_2^2(t) - \mu_0^2$, i.e.

  $\sigma_B^2(t) = \frac{\left[\sum_{z=0}^{t} z\,p(z)\right]^2}{\sum_{z=0}^{t} p(z)} + \frac{\left[\sum_{z=t+1}^{G-1} z\,p(z)\right]^2}{\sum_{z=t+1}^{G-1} p(z)} - \mu_0^2$

• Differentiating $\sigma_B^2$ and setting $\partial \sigma_B^2(t) / \partial t = 0$, we find a solution for

  $\frac{\sum_{z=0}^{T} z\,p(z)}{\sum_{z=0}^{T} p(z)} + \frac{\sum_{z=T+1}^{G-1} z\,p(z)}{\sum_{z=T+1}^{G-1} p(z)} = 2T$

• This may be written as $\mu_1(T) + \mu_2(T) = 2T$, where $\mu_1$ and $\mu_2$ are the mean values below and above the threshold.
• Exhaustive sequential search gives the same result as Otsu's method.
• Starting with a threshold $t_0 = \mu_0$, fast convergence is obtained, equivalent to the ad hoc technique of Ridler and Calvard.

Maximizing inter-class variance for M thresholds

• The interclass variance reaches a maximum when

  $\mu(0, t_1) + \mu(t_1, t_2) = 2t_1$
  $\mu(t_1, t_2) + \mu(t_2, t_3) = 2t_2$
  $\vdots$
  $\mu(t_{M-1}, t_M) + \mu(t_M, G) = 2t_M$

  where $\mu(t_i, t_j)$ is the mean value between neighbouring thresholds $t_i$ and $t_j$.
• Starting with an arbitrary set of initial thresholds $t_1, \ldots, t_M$, we iteratively compute a new set of thresholds $t_1', \ldots, t_M'$ by

  $t_1' = \frac{1}{2}\left(\mu(0, t_1) + \mu(t_1, t_2)\right)$
  $\vdots$
  $t_M' = \frac{1}{2}\left(\mu(t_{M-1}, t_M) + \mu(t_M, G)\right)$

• The process is repeated until all thresholds are stable (see the sketch below).
• This procedure has a very fast convergence.
• It gives the same numerical results as the exhaustive search technique of Otsu, but is orders of magnitude faster in multi-level thresholding!

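A sketch of the multi-threshold iteration. The equally spaced initial thresholds are my choice (the slide only requires an arbitrary starting set), and the guard against empty classes is an added safeguard:

```python
import numpy as np

def class_mean(p, z, lo, hi):
    """Mean gray level over the half-open class interval (lo, hi]."""
    sel = slice(lo + 1, hi + 1)
    return (z[sel] * p[sel]).sum() / max(p[sel].sum(), 1e-12)

def multi_level_thresholds(hist, M, max_iter=100):
    """Iterate t_i' = (mu(t_{i-1}, t_i) + mu(t_i, t_{i+1})) / 2 for M thresholds."""
    p = np.asarray(hist, dtype=float)
    p /= p.sum()
    z = np.arange(p.size)
    G = p.size
    # Arbitrary initial thresholds: equally spaced over the gray scale.
    t = np.linspace(0, G - 1, M + 2).astype(int)[1:-1].tolist()
    for _ in range(max_iter):
        bounds = [-1] + t + [G - 1]
        t_new = [int(round(0.5 * (class_mean(p, z, bounds[i], bounds[i + 1])
                                  + class_mean(p, z, bounds[i + 1], bounds[i + 2]))))
                 for i in range(M)]
        if t_new == t:                     # all thresholds stable
            break
        t = t_new
    return t
```
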
A "minimum error" method

• Kittler and Illingworth (1985) assume a mixture of two Gaussian distributions (five unknown parameters). Find the $T$ that minimizes the KL distance between the observed histogram and the model distribution:

  $J(t) = 1 + 2\left[P_1(t)\ln\sigma_1(t) + P_2(t)\ln\sigma_2(t)\right] - 2\left[P_1(t)\ln P_1(t) + P_2(t)\ln P_2(t)\right]$

• As $t$ varies, the model parameters change. Compute $J(t)$ for all $t$; find the minimum (see the sketch below).
• The criterion function has local minima at the boundaries of the gray scale.
• An unfortunate starting value for an iterative search may cause the iteration to terminate at a nonsensical threshold value.
• The a posteriori model parameters will represent biased estimates. Correctness relies on small overlap. Cho et al. (1989) have given an improvement.

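A sketch that computes J(t) for all t from cumulative moments and returns the minimum; the small epsilon floors that keep the logarithms finite for nearly empty classes are my additions:

```python
import numpy as np

def min_error_threshold(hist):
    """Compute J(t) for all t and return the minimizing threshold."""
    p = np.asarray(hist, dtype=float)
    p /= p.sum()
    z = np.arange(p.size)
    P1 = np.cumsum(p)
    m = np.cumsum(z * p)                   # cumulative first moments
    s = np.cumsum(z**2 * p)                # cumulative second moments
    P2 = 1.0 - P1
    mu1 = m / np.maximum(P1, 1e-12)
    mu2 = (m[-1] - m) / np.maximum(P2, 1e-12)
    var1 = s / np.maximum(P1, 1e-12) - mu1**2
    var2 = (s[-1] - s) / np.maximum(P2, 1e-12) - mu2**2
    sigma1 = np.sqrt(np.maximum(var1, 1e-12))
    sigma2 = np.sqrt(np.maximum(var2, 1e-12))
    J = (1.0 + 2.0 * (P1 * np.log(sigma1) + P2 * np.log(sigma2))
             - 2.0 * (P1 * np.log(np.maximum(P1, 1e-12))
                      + P2 * np.log(np.maximum(P2, 1e-12))))
    # Ignore the degenerate local minima at the ends of the gray scale.
    J[~((P1 > 0) & (P2 > 0))] = np.inf
    return int(np.argmin(J))
```
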
Uniform error thresholding

• The uniform error threshold is given by

  $E_1(t) = E_2(t)$

• Suppose we knew the background area $\alpha(t)$, and also which pixels belonged to object and background.
• For a given threshold $t$, let
  $p(t)$ = fraction of background pixels with gray level above $t$,
  $q(t)$ = fraction of object pixels with gray level above $t$.
• The uniform error threshold is then found when

  $p(t) = 1 - q(t)$, or equivalently $\phi - 1 = 0$, where $\phi = p + q$.

• Now define
  $a$ = Prob (pixel gray level $> t$)
  $b$ = Prob (two neighbouring pixels both $> t$)
  $c$ = Prob (four neighbouring pixels all $> t$)
• Assuming that border effects may be neglected, we may find these probabilities by examining all $2 \times 2$ neighbourhoods throughout the image.
• Alternatively, the above probabilities may be written

  $a = \alpha p + (1 - \alpha) q$
  $b = \alpha p^2 + (1 - \alpha) q^2$
  $c = \alpha p^4 + (1 - \alpha) q^4$

Uniform error thresholding - II

• Now we note that

  $\frac{b^2 - c}{a^2 - b} = \frac{(\alpha^2 - \alpha)p^4 + 2\alpha(1 - \alpha)p^2q^2 + \left[(1 - \alpha)^2 - (1 - \alpha)\right]q^4}{(\alpha^2 - \alpha)p^2 + 2\alpha(1 - \alpha)pq + \left[(1 - \alpha)^2 - (1 - \alpha)\right]q^2} = \frac{(p^2 - q^2)^2}{(p - q)^2} = (p + q)^2 = \phi^2$

• Select the gray level $t$ where $|\phi - 1|$ is a minimum.
• $\phi - 1$ is a monotonically decreasing function, so a root-finding algorithm may be used instead of an exhaustive search.
• No assumptions are made about the underlying distributions, or about the a priori probabilities; only the estimates of $a$, $b$ and $c$ for each trial value of $t$ are needed.
• Instead of one pass through the whole image for each trial value of $t$, the probabilities may be tabulated for all possible values of $t$ in one initial pass.
• For a given $2 \times 2$ neighbourhood, the four pixels are sorted in order of increasing gray level, $g_1, g_2, g_3, g_4$. Then for all thresholds $t < g_1$, the neighbourhood has four single pixels, six pairs and one 4-tuple $> t$. We may set up the scheme:

  Threshold range  | singles | pairs | 4-tuples
  t < g1           |    4    |   6   |    1
  g1 ≤ t < g2      |    3    |   3   |    0
  g2 ≤ t < g3      |    2    |   1   |    0
  g3 ≤ t < g4      |    1    |   0   |    0

• In a single pass through the image, a table may be formed in this way, giving estimates of $a$, $b$ and $c$ for all values of $t$ (see the sketch below).

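A compact sketch of the tabulation and the search for min |φ − 1|, assuming an 8-bit image. For simplicity it uses the non-overlapping 2×2 blocks of the image and an exhaustive scan over t rather than root finding; both choices are mine:

```python
import numpy as np

def uniform_error_threshold(img, G=256):
    """Tabulate a(t), b(t), c(t) from 2x2 blocks in one pass,
    then pick the t minimizing |phi - 1|, with phi^2 = (b^2-c)/(a^2-b)."""
    f = np.asarray(img)
    f = f[:f.shape[0] // 2 * 2, :f.shape[1] // 2 * 2]      # trim to even size
    blocks = (f.reshape(f.shape[0] // 2, 2, f.shape[1] // 2, 2)
               .transpose(0, 2, 1, 3).reshape(-1, 4))      # one 2x2 block per row
    n = blocks.shape[0]
    # Minima of the 6 pixel pairs per block; a pair is > t iff its minimum is > t.
    pairs = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
    pair_min = np.concatenate([np.minimum(blocks[:, i], blocks[:, j])
                               for i, j in pairs])

    def frac_above(values, total):
        h = np.bincount(values.ravel(), minlength=G)
        return (total - np.cumsum(h)) / total              # fraction with value > t

    a = frac_above(blocks, 4 * n)          # single pixels
    b = frac_above(pair_min, 6 * n)        # pairs
    c = frac_above(blocks.min(axis=1), n)  # 4-tuples
    num, den = b**2 - c, a**2 - b
    valid = np.abs(den) > 1e-12
    phi = np.sqrt(np.maximum(num[valid] / den[valid], 0.0))
    return int(np.arange(G)[valid][np.argmin(np.abs(phi - 1.0))])
```
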
Maximum correlation thresholding

• Brink (1989) maximized the correlation between the original gray level image $f$ and the thresholded image $g$.
• The gray levels of the two classes in the thresholded image may be represented by the two a posteriori average values $\mu_1(t)$ and $\mu_2(t)$:

  $\mu_1(t) = \sum_{z=0}^{t} z\,p(z) \Big/ \sum_{z=0}^{t} p(z), \qquad \mu_2(t) = \sum_{z=t+1}^{G-1} z\,p(z) \Big/ \sum_{z=t+1}^{G-1} p(z)$

• The correlation coefficient has a very smooth behaviour, and starting with the overall average gray level value, the optimal threshold may be found by a steepest ascent search for the value $T$ which maximizes the correlation coefficient $\rho_{fg}(t)$ (see the sketch below):

  $\rho_{fg}(T) = \max_{t = 0, \ldots, G-1} \rho_{fg}(t)$

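The lecture does not spell out $\rho_{fg}(t)$, so the sketch below simply builds, for each candidate t, the two-level image with class means $\mu_1(t)$ and $\mu_2(t)$ and computes the ordinary correlation coefficient with NumPy. An exhaustive scan replaces the steepest ascent search for brevity:

```python
import numpy as np

def max_correlation_threshold(img):
    """Exhaustive version of Brink's criterion: maximize corr(f, g(t))."""
    f = np.asarray(img).ravel().astype(float)
    best_t, best_rho = 0, -1.0
    for t in range(int(f.min()), int(f.max())):
        low, high = f <= t, f > t
        g = np.where(low, f[low].mean(), f[high].mean())   # two-level image
        rho = np.corrcoef(f, g)[0, 1]
        if rho > best_rho:
            best_t, best_rho = t, rho
    return best_t
```
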
Entropy-based methods

• Kapur et al. proposed a thresholding algorithm based on Shannon entropy.
• For two distributions separated by a threshold $t$, the sum of the two class entropies is

  $\psi(t) = -\sum_{z=0}^{t} \frac{p(z)}{P_1(t)} \ln\frac{p(z)}{P_1(t)} - \sum_{z=t+1}^{G-1} \frac{p(z)}{1 - P_1(t)} \ln\frac{p(z)}{1 - P_1(t)}$

• Using

  $H_t = -\sum_{z=0}^{t} p(z)\ln p(z), \qquad H_G = -\sum_{z=0}^{G-1} p(z)\ln p(z)$

  the sum of the two entropies may be written as

  $\psi(t) = \ln\left[P_1(t)\,(1 - P_1(t))\right] + \frac{H_t}{P_1(t)} + \frac{H_G - H_t}{1 - P_1(t)}$

• The discrete value $T$ of $t$ which maximizes $\psi(t)$ is the selected threshold (see the sketch below).

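A sketch of the Kapur criterion using the $H_t$ / $H_G$ form above; the handling of empty histogram bins (0 · ln 0 := 0) is the only addition:

```python
import numpy as np

def kapur_threshold(hist):
    """Maximize psi(t) = ln[P1(1-P1)] + Ht/P1 + (HG - Ht)/(1 - P1)."""
    p = np.asarray(hist, dtype=float)
    p /= p.sum()
    P1 = np.cumsum(p)
    with np.errstate(divide='ignore', invalid='ignore'):
        plogp = np.where(p > 0, p * np.log(p), 0.0)        # 0 ln 0 := 0
    Ht = -np.cumsum(plogp)                 # partial entropies H_t
    HG = Ht[-1]                            # total entropy H_G
    valid = (P1 > 0) & (P1 < 1)
    psi = np.full(p.size, -np.inf)
    psi[valid] = (np.log(P1[valid] * (1 - P1[valid]))
                  + Ht[valid] / P1[valid]
                  + (HG - Ht[valid]) / (1 - P1[valid]))
    return int(np.argmax(psi))
```
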
Two-feature entropy

• Abutaleb (1989) proposed a thresholding method based on 2-D entropy. For two distributions and a threshold pair $(s, t)$, where $s$ and $t$ denote gray level and average gray level, the entropies are

  $H_1(st) = -\sum_{i=0}^{s} \sum_{j=0}^{t} \frac{p_{ij}}{P_{st}} \ln\frac{p_{ij}}{P_{st}}, \qquad H_2(st) = -\sum_{i=s+1}^{G-1} \sum_{j=t+1}^{G-1} \frac{p_{ij}}{1 - P_{st}} \ln\frac{p_{ij}}{1 - P_{st}}$

  where $P_{st} = \sum_{i=0}^{s} \sum_{j=0}^{t} p_{ij}$.
• The sum of the two entropies is now

  $\psi(s, t) = H_1(st) + H_2(st) = \ln\left[P_{st}(1 - P_{st})\right] + \frac{H_{st}}{P_{st}} + \frac{H_{GG} - H_{st}}{1 - P_{st}}$

  where the total system entropy $H_{GG}$ and the partial entropy $H_{st}$ are given by

  $H_{GG} = -\sum_{i=0}^{G-1} \sum_{j=0}^{G-1} p_{ij} \ln p_{ij}, \qquad H_{st} = -\sum_{i=0}^{s} \sum_{j=0}^{t} p_{ij} \ln p_{ij}$

• The discrete pair $(S, T)$ which maximizes $\psi(s, t)$ gives the threshold values which maximize the loss of entropy, and thereby the gain in information, by introducing the two thresholds (see the sketch below).
• A much faster alternative is to treat the two features $s$ and $t$ separately.
• In most cases, this gives an appreciable improvement over the single feature entropy method of Kapur et al. (1985).

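A sketch of the two-dimensional criterion. The 3×3 averaging window for the second feature and the use of SciPy's uniform_filter are my choices; as on the slide, only the two "diagonal" quadrants of the 2-D histogram enter ψ(s, t):

```python
import numpy as np
from scipy.ndimage import uniform_filter

def abutaleb_thresholds(img, G=256):
    """2-D entropy: joint histogram of (gray level, local average),
    maximize psi(s,t) = ln[Pst(1-Pst)] + Hst/Pst + (HGG - Hst)/(1 - Pst)."""
    img = np.asarray(img).astype(int)
    avg = uniform_filter(img.astype(float), size=3).round().astype(int)
    p = np.bincount((img * G + avg).ravel(), minlength=G * G)
    p = (p / p.sum()).reshape(G, G)        # joint probabilities p_ij
    with np.errstate(divide='ignore', invalid='ignore'):
        plogp = np.where(p > 0, p * np.log(p), 0.0)
    # 2-D cumulative sums give Pst and Hst for every (s, t) at once.
    Pst = p.cumsum(axis=0).cumsum(axis=1)
    Hst = -plogp.cumsum(axis=0).cumsum(axis=1)
    HGG = Hst[-1, -1]
    valid = (Pst > 0) & (Pst < 1)
    psi = np.full((G, G), -np.inf)
    psi[valid] = (np.log(Pst[valid] * (1 - Pst[valid]))
                  + Hst[valid] / Pst[valid]
                  + (HGG - Hst[valid]) / (1 - Pst[valid]))
    s, t = np.unravel_index(np.argmax(psi), psi.shape)
    return int(s), int(t)
```
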
Preservation of moments

• The observed image $f$ is seen as a blurred version of a thresholded image $g$ with gray levels $z_1$ and $z_2$.
• Find the threshold $T$ such that if all below-threshold values in $f$ are replaced by $z_1$, and all above-threshold values are replaced by $z_2$, then the first three moments are preserved.
• The $i$-th moment may be computed from the normalized histogram $p(z)$ by

  $m_i = \sum_{j=0}^{G-1} p_j\,(z_j)^i, \qquad i = 1, 2, 3.$

• Let $P_1(t)$ and $P_2(t)$ denote the a posteriori fractions of below-threshold and above-threshold pixels in $f$.
• We want to preserve the moments, and we have

  $P_1(t) + P_2(t) = 1$

  $\sum_{j=1}^{2} P_j(t)\,(z_j)^i = m_i' = m_i = \sum_{j=0}^{G-1} p_j\,(z_j)^i, \qquad i = 1, 2, 3.$

• Solving the four equations will give the threshold $T$.

Solving the equations

• In the bi-level case, the equations are solved as follows:

  $c_d = \begin{vmatrix} m_0 & m_1 \\ m_1 & m_2 \end{vmatrix}, \qquad c_0 = \frac{1}{c_d}\begin{vmatrix} -m_2 & m_1 \\ -m_3 & m_2 \end{vmatrix}, \qquad c_1 = \frac{1}{c_d}\begin{vmatrix} m_0 & -m_2 \\ m_1 & -m_3 \end{vmatrix}$

  $z_1 = \frac{1}{2}\left[-c_1 - (c_1^2 - 4c_0)^{1/2}\right], \qquad z_2 = \frac{1}{2}\left[-c_1 + (c_1^2 - 4c_0)^{1/2}\right]$

  $P_d = \begin{vmatrix} 1 & 1 \\ z_1 & z_2 \end{vmatrix}, \qquad P_1 = \frac{1}{P_d}\begin{vmatrix} 1 & 1 \\ m_1 & z_2 \end{vmatrix}$

• The optimal threshold, $T$, is then chosen as the $P_1$-tile (or the gray level value closest to the $P_1$-tile) of the histogram of $f$ (see the sketch below).

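A direct transcription of this solution into NumPy; searchsorted on the cumulative histogram picks the gray level closest to the P1-tile:

```python
import numpy as np

def moment_threshold(hist):
    """Moment-preserving bi-level threshold: solve for z1, z2, P1 as above,
    then take the P1-tile of the histogram."""
    p = np.asarray(hist, dtype=float)
    p /= p.sum()
    z = np.arange(p.size, dtype=float)
    m0, m1, m2, m3 = 1.0, (z*p).sum(), (z**2*p).sum(), (z**3*p).sum()
    cd = m0 * m2 - m1**2
    c0 = (-m2 * m2 + m1 * m3) / cd         # det [[-m2, m1], [-m3, m2]] / cd
    c1 = (m1 * m2 - m0 * m3) / cd          # det [[m0, -m2], [m1, -m3]] / cd
    z1 = 0.5 * (-c1 - np.sqrt(c1**2 - 4.0 * c0))
    z2 = 0.5 * (-c1 + np.sqrt(c1**2 - 4.0 * c0))
    P1 = (z2 - m1) / (z2 - z1)             # det [[1, 1], [m1, z2]] / P_d
    return int(np.searchsorted(np.cumsum(p), P1))   # T = P1-tile of histogram
```
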
Exponential convex hull

• "Convex deficiency" is obtained by subtracting the histogram from its convex hull.
• This may work even if no "valley" exists.
• The upper concavity of histogram tail regions can often be eliminated by considering $\ln\{p(z)\}$ instead of the histogram $p(z)$.
• In the $\ln\{p(z)\}$-domain, upper concavities are produced by bimodality or shoulders, not by the tail of a normal or exponential distribution, nor by the extension of the histogram.
• Transform the histogram $p(z)$ by $\ln\{p(z)\}$, compute the convex hull $h(k)$, and transform the convex hull back to the histogram domain by $h_e(k) = \exp(h(k))$.
• The threshold is found by a sequential search for the maximum of the exponential convex hull deficiency (see the sketch below).

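A sketch of the whole procedure: the upper convex hull of the points (k, ln p(k)) via a monotone-chain scan, then the deficiency in the histogram domain. The small floor for empty bins is my addition:

```python
import numpy as np

def exp_hull_threshold(hist, eps=1e-9):
    """Threshold at the maximum of the exponential convex hull deficiency
    he(k) - p(k), where he(k) = exp(upper hull of ln p)."""
    p = np.asarray(hist, dtype=float)
    p /= p.sum()
    logp = np.log(np.maximum(p, eps))      # guard empty bins (assumption)
    hull = []                              # indices of upper-hull vertices
    for k in range(p.size):
        while len(hull) >= 2:
            i, j = hull[-2], hull[-1]
            # Pop j if it lies on or below the chord from i to k.
            if (logp[j] - logp[i]) * (k - i) <= (logp[k] - logp[i]) * (j - i):
                hull.pop()
            else:
                break
        hull.append(k)
    h = np.interp(np.arange(p.size), hull, logp[hull])  # hull in log domain
    deficiency = np.exp(h) - p             # he(k) - p(k)
    return int(np.argmax(deficiency))
```
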