IN 384, H-2001

SEGMENTATION METHODS

Fritz Albregtsen
21/11 2001


INTRODUCTION

• Automatic thresholding is important in applications where speed or the physical conditions prevent human interaction.

• In bi-level thresholding, the histogram of the image is usually assumed to have one valley between two peaks, the peaks representing objects and background, respectively.

• Thresholding is usually a pre-process for various pattern recognition techniques.

• Thresholding may also be a pre-process for adaptive filtering, adaptive compression, etc.
Automatic versus Interactive

• Automatic means that the user does not have to specify any parameters.

• There are no truly automatic methods; there are always built-in parameters.

• Distinction between automatic methods and interactive methods.

• Distinction between supervised (with training) and unsupervised (clustering) methods.


Parametric versus Non-parametric

• Parametric techniques:
  – Estimate the parameters of two distributions from the given histogram.
  – Difficult or impossible to establish a reliable model.

• Non-parametric case:
  – Separate the two gray level classes in an optimum manner according to some criterion:
    ∗ between-class variance
    ∗ divergence
    ∗ entropy
    ∗ conservation of moments.
  – Non-parametric methods are more robust, and usually faster.
Global and Non-contextual ?

• Global methods use a single threshold for the entire image.

• Local methods optimize a new threshold for a number of blocks or sub-images.

• Global methods put severe restrictions on
  – the gray level characteristics of objects and background
  – the uniformity in lighting and detection

• The fundamental framework of the global methods is also applicable to local sub-images.

• Contextual methods make use of the geometrical relations between pixels.

• Non-contextual methods rely only on the gray level histogram of the image.


Bi-level thresholding

• The histogram is assumed to be twin-peaked. Let P1 and P2 be the a priori probabilities of background and foreground (P1 + P2 = 1). The two distributions are given by b(z) and f(z). The complete histogram is given by

  p(z) = P1 · b(z) + P2 · f(z)

• The probabilities of mis-classifying a pixel, given a threshold t:

  E1(t) = ∫_t^∞ b(z) dz   (a background pixel classified as foreground)
  E2(t) = ∫_{−∞}^t f(z) dz   (a foreground pixel classified as background)

• The total error is:

  E(t) = P1 · E1(t) + P2 · E2(t) = P1 · ∫_t^∞ b(z) dz + P2 · ∫_{−∞}^t f(z) dz
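To make the error criterion concrete, here is a minimal Python sketch (not from the original lecture) that evaluates E(t) for an assumed pair of Gaussian class distributions and finds the minimizing threshold by a simple grid search; all parameter values are arbitrary examples.

import numpy as np
from scipy.stats import norm

# Assumed example parameters (not from the lecture):
# background b(z) ~ N(mu_b, sigma_b), foreground f(z) ~ N(mu_f, sigma_f)
P1, mu_b, sigma_b = 0.7, 80.0, 12.0     # a priori probability and parameters of background
P2, mu_f, sigma_f = 0.3, 160.0, 20.0    # a priori probability and parameters of foreground

def total_error(t):
    # E(t) = P1 * integral_t^inf b(z) dz + P2 * integral_-inf^t f(z) dz
    E1 = norm.sf(t, loc=mu_b, scale=sigma_b)    # background pixels above t
    E2 = norm.cdf(t, loc=mu_f, scale=sigma_f)   # foreground pixels below t
    return P1 * E1 + P2 * E2

ts = np.linspace(0, 255, 1024)
errors = np.array([total_error(t) for t in ts])
T = ts[np.argmin(errors)]
print("threshold minimizing E(t):", T)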

Bi-level thresholding (cont.)

• Differentiate with respect to the threshold t:

  ∂E/∂t = 0  ⇒  P1 · b(T) = P2 · f(T)

• For Gaussian distributions this becomes

  (P1 / (√(2π)·σ1)) · e^(−(T−µ1)²/(2σ1²)) = (P2 / (√(2π)·σ2)) · e^(−(T−µ2)²/(2σ2²))

• We get a quadratic equation:

  (σ1² − σ2²)·T² + 2·(µ1σ2² − µ2σ1²)·T + σ1²µ2² − σ2²µ1² + 2σ1²σ2² · ln(P1σ2 / (P2σ1)) = 0

• Two thresholds may be necessary !

• If the two variances are equal, σB² = σF² = σ²:

  T = (µ1 + µ2)/2 + (σ² / (µ1 − µ2)) · ln(P2 / P1)

• If the a priori probabilities P1 and P2 are equal:

  T = (µ1 + µ2)/2

• The correctness of the estimated threshold depends on the extent of the overlap, as well as on the correctness of the P1 ≈ P2 assumption.
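A minimal Python sketch of this Gaussian case, assuming the five parameters P1, µ1, σ1, P2, µ2, σ2 are known (in practice they must be estimated); it solves the quadratic above and falls back to the equal-variance formula. Names and example values are illustrative only.

import numpy as np

def gaussian_threshold(P1, mu1, s1, P2, mu2, s2):
    # Roots of (s1^2 - s2^2) T^2 + 2 (mu1 s2^2 - mu2 s1^2) T
    #          + s1^2 mu2^2 - s2^2 mu1^2 + 2 s1^2 s2^2 ln(P1 s2 / (P2 s1)) = 0
    if np.isclose(s1, s2):
        # Equal variances: a single threshold
        return [(mu1 + mu2) / 2 + (s1**2 / (mu1 - mu2)) * np.log(P2 / P1)]
    a = s1**2 - s2**2
    b = 2 * (mu1 * s2**2 - mu2 * s1**2)
    c = s1**2 * mu2**2 - s2**2 * mu1**2 + 2 * s1**2 * s2**2 * np.log(P1 * s2 / (P2 * s1))
    return sorted(np.roots([a, b, c]).real)   # two candidate thresholds (assumes real roots)

print(gaussian_threshold(0.7, 80.0, 12.0, 0.3, 160.0, 20.0))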
The method of Ridler and Calvard

• The initial threshold value, t0, is set equal to the average brightness.

• The threshold value for the k+1-th iteration is given by

  t_{k+1} = (µ1(tk) + µ2(tk)) / 2
          = (1/2) · [ Σ_{z=0}^{tk} z·p(z) / Σ_{z=0}^{tk} p(z) + Σ_{z=tk+1}^{G−1} z·p(z) / Σ_{z=tk+1}^{G−1} p(z) ]

• µ1(tk) is the mean value of the gray values below the previous threshold tk, and µ2(tk) is the mean value of the gray values above the previous threshold.

• Note that µ1(t) and µ2(t) are the a posteriori mean values, estimated from overlapping and truncated distributions. The a priori µ1 and µ2 are unknown to us.
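A minimal Python sketch of this iteration, operating on a gray level histogram given as a numpy array of counts; the convergence tolerance and variable names are assumptions of the sketch, not part of the original method description.

import numpy as np

def ridler_calvard(hist, eps=0.5):
    # hist: array of counts for gray levels 0..G-1 (assumed non-degenerate)
    z = np.arange(len(hist))
    p = hist / hist.sum()                       # normalized histogram p(z)
    t = float((z * p).sum())                    # t0 = average brightness
    while True:
        low, high = z <= t, z > t
        mu1 = (z[low] * p[low]).sum() / p[low].sum()      # mean below threshold
        mu2 = (z[high] * p[high]).sum() / p[high].sum()   # mean above threshold
        t_new = 0.5 * (mu1 + mu2)
        if abs(t_new - t) < eps:
            return t_new
        t = t_new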
The method of Otsu

• Maximizes the a posteriori between-class variance σB²(t), given by

  σB²(t) = P1(t)·[µ1(t) − µ0]² + P2(t)·[µ2(t) − µ0]²

• We may write

  σB²(t) = P1(t)·µ1²(t) + P2(t)·µ2²(t) − µ0²

• This may be written as

  σB²(t) = [Σ_{z=0}^{t} z·p(z)]² / Σ_{z=0}^{t} p(z) + [Σ_{z=t+1}^{G−1} z·p(z)]² / Σ_{z=t+1}^{G−1} p(z) − µ0²

• The expression for σB²(t) reduces to

  σB²(t) = [µ0·P1(t) − P1(t)·µ1(t)]² / ( P1(t)·[1 − P1(t)] )

• The sum of the within-class variance σW² and the between-class variance σB² is equal to the total variance σ0²:

  σW² + σB² = σ0²

• Maximizing σB² is equivalent to minimizing σW².

• The optimal threshold T is found by a sequential search for the maximum of σB²(t) for values of t where 0 < P1(t) < 1.
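A minimal Python sketch of this sequential search over a normalized histogram, using the reduced expression for σB²(t) above; it is illustrative, not the original implementation.

import numpy as np

def otsu_threshold(hist):
    # hist: array of counts for gray levels 0..G-1
    z = np.arange(len(hist))
    p = hist / hist.sum()
    P1 = np.cumsum(p)                      # P1(t) for t = 0..G-1
    m = np.cumsum(z * p)                   # cumulative first moment, equal to P1(t)*mu1(t)
    mu0 = m[-1]                            # total mean
    valid = (P1 > 0) & (P1 < 1)            # require 0 < P1(t) < 1
    sigma_B2 = np.zeros_like(p)
    sigma_B2[valid] = (mu0 * P1[valid] - m[valid])**2 / (P1[valid] * (1 - P1[valid]))
    return int(np.argmax(sigma_B2))        # threshold T maximizing sigma_B^2(t)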
The method of Reddi

• The method of Reddi et al. is based on the same assumptions as the method of Otsu, maximizing the a posteriori between-class variance σB²(t).

• Differentiating σB²(t) and setting dσB²(t)/dt = 0, we find a solution for

  Σ_{z=0}^{T} z·p(z) / Σ_{z=0}^{T} p(z) + Σ_{z=T+1}^{G−1} z·p(z) / Σ_{z=T+1}^{G−1} p(z) = 2T

  that is,

  µ1(T) + µ2(T) = 2T,

  where µ1 and µ2 are the mean values below and above the threshold.

• An exhaustive sequential search gives the same result as Otsu's method.

• Starting with a threshold t0 = µ0, fast convergence is obtained, equivalent to the ad hoc technique of Ridler and Calvard.
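A minimal Python sketch of the corresponding direct search for the t that best satisfies µ1(t) + µ2(t) = 2t (the iterative variant started at t0 = µ0 is essentially the Ridler and Calvard sketch shown earlier); illustrative only.

import numpy as np

def reddi_threshold(hist):
    z = np.arange(len(hist))
    p = hist / hist.sum()
    P1, m = np.cumsum(p), np.cumsum(z * p)
    P2, m_hi = 1 - P1, m[-1] - m
    valid = (P1 > 0) & (P2 > 0)
    mu1, mu2 = m[valid] / P1[valid], m_hi[valid] / P2[valid]
    # choose the t that best satisfies mu1(t) + mu2(t) = 2t
    return int(z[valid][np.argmin(np.abs(mu1 + mu2 - 2 * z[valid]))])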
A “minimum error” method

• Assuming Gaussian distributions, we have five unknown parameters. Kittler and Illingworth optimized a criterion function related to the average pixel classification error rate, based on estimated a posteriori parameters:

  J(t) = 1 + 2·[P1(t)·ln σ1(t) + P2(t)·ln σ2(t)] − 2·[P1(t)·ln P1(t) + P2(t)·ln P2(t)]

• As t is varied, the model parameters change.

• Compute J(t) for all t, and find the minimum value.

• The criterion function has local minima at the boundaries of the gray scale.

• The a posteriori model parameters will represent biased estimates, as the tails of the overlapping distributions are truncated. Thus, the correctness of the estimated threshold relies on this overlap being small.
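A minimal Python sketch that evaluates J(t) over a normalized histogram and returns the t with the smallest value; skipping values of t where a class is empty or has zero variance is an assumption of this sketch, motivated by the boundary minima noted above.

import numpy as np

def min_error_threshold(hist):
    z = np.arange(len(hist))
    p = hist / hist.sum()
    P1 = np.cumsum(p)
    m1 = np.cumsum(z * p)
    m2c = np.cumsum(z**2 * p)
    J = np.full(len(p), np.inf)
    for t in range(len(p) - 1):
        P2 = 1.0 - P1[t]
        if P1[t] <= 0 or P2 <= 0:
            continue
        mu1 = m1[t] / P1[t]
        mu2 = (m1[-1] - m1[t]) / P2
        var1 = m2c[t] / P1[t] - mu1**2              # a posteriori variance below t
        var2 = (m2c[-1] - m2c[t]) / P2 - mu2**2     # a posteriori variance above t
        if var1 <= 0 or var2 <= 0:
            continue
        J[t] = 1 + 2 * (P1[t] * np.log(np.sqrt(var1)) + P2 * np.log(np.sqrt(var2))) \
                 - 2 * (P1[t] * np.log(P1[t]) + P2 * np.log(P2))
    return int(np.argmin(J))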
Maximum correlation thresholding

• Brink (1989) maximized the correlation between the original gray level image f and the thresholded image g.

• The gray levels of the two classes in the thresholded image may be represented by the two a posteriori average values µ1(t) and µ2(t):

  µ1(t) = Σ_{z=0}^{t} z·p(z) / Σ_{z=0}^{t} p(z)

  µ2(t) = Σ_{z=t+1}^{G−1} z·p(z) / Σ_{z=t+1}^{G−1} p(z)

• The correlation coefficient has a very smooth behaviour, and starting with the overall average gray level value, the optimal threshold may be found by a steepest ascent search for the value T which maximizes the correlation coefficient ρ_fg(t):

  ρ_fg(T) = max_{t=0,...,G−1} ρ_fg(t)

• An unfortunate starting value for an iterative search may cause the iteration to terminate at a nonsensical threshold value.
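A minimal Python sketch of an exhaustive version of this criterion: instead of the steepest ascent search described above, it simply computes ρ_fg(t) for every candidate t directly from the image. It assumes a non-constant image with integer gray levels; names are illustrative.

import numpy as np

def correlation_threshold(image):
    f = image.astype(float).ravel()
    best_t, best_rho = None, -np.inf
    for t in range(int(f.min()), int(f.max())):
        low = f <= t
        g = np.where(low, f[low].mean(), f[~low].mean())   # classes replaced by mu1(t), mu2(t)
        rho = np.corrcoef(f, g)[0, 1]                      # correlation between f and g
        if rho > best_rho:
            best_t, best_rho = t, rho
    return best_t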
Entropy-based methods

• Kapur et al. proposed a thresholding algorithm based on Shannon entropy.

• For two distributions separated by a threshold t, the sum of the two class entropies is

  ψ(t) = − Σ_{z=0}^{t} ( p(z)/P1(t) ) · ln( p(z)/P1(t) ) − Σ_{z=t+1}^{G−1} ( p(z)/(1 − P1(t)) ) · ln( p(z)/(1 − P1(t)) )

• Using

  H_t = − Σ_{z=0}^{t} p(z)·ln(p(z)),    H_G = − Σ_{z=0}^{G−1} p(z)·ln(p(z)),

  the sum of the two entropies may be written as

  ψ(t) = ln[ P1(t)·(1 − P1(t)) ] + H_t / P1(t) + (H_G − H_t) / (1 − P1(t)).

• The discrete value T of t which maximizes ψ(t) is now the selected threshold.
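A minimal Python sketch of this maximization over a histogram of counts; skipping empty bins in the entropy sums is an implementation assumption of the sketch.

import numpy as np

def kapur_threshold(hist):
    p = hist / hist.sum()
    logp = np.zeros_like(p)
    logp[p > 0] = np.log(p[p > 0])        # ln p(z); bins with p(z) = 0 contribute nothing
    H = np.cumsum(-p * logp)              # H_t = -sum_{z<=t} p(z) ln p(z)
    HG = H[-1]                            # H_G, entropy of the whole histogram
    P1 = np.cumsum(p)
    psi = np.full(len(p), -np.inf)
    valid = (P1 > 0) & (P1 < 1)
    psi[valid] = (np.log(P1[valid] * (1 - P1[valid]))
                  + H[valid] / P1[valid]
                  + (HG - H[valid]) / (1 - P1[valid]))
    return int(np.argmax(psi))            # T maximizing psi(t)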
Preservation of moments

• Gives a threshold without iteration or search.

• The observed image f may be seen as a blurred version of the thresholded image g with gray levels z1 and z2.

• The method selects a threshold T such that if all below-threshold values in f are replaced by z1, and all above-threshold values are replaced by z2, then the first three moments are preserved.

• The i-th moment m_i of the image f may be computed from the normalized histogram p(z) by

  m_i = Σ_{j=0}^{G−1} p_j·(z_j)^i,    i = 1, 2, 3.

• Let P1(t) and P2(t) denote the a posteriori fractions of the below-threshold and above-threshold pixels in f. Then the moments of g are given by

  m'_i = Σ_{j=1}^{2} P_j(t)·(z_j)^i,    i = 1, 2, 3.

• We want to preserve the moments,

  Σ_{j=1}^{2} P_j(t)·(z_j)^i = m'_i = m_i = Σ_{j=0}^{G−1} p_j·(z_j)^i,    i = 1, 2, 3,

  and we have

  P1(t) + P2(t) = 1.

• Solving these four equations will give the threshold T.
Solving the equations

• In the bi-level case, the equations are solved as follows:

  cd = | m0  m1 |      c0 = (1/cd) · | −m2  m1 |      c1 = (1/cd) · | m0  −m2 |
       | m1  m2 |                    | −m3  m2 |                    | m1  −m3 |

  z1 = (1/2) · [ −c1 − (c1² − 4·c0)^(1/2) ]
  z2 = (1/2) · [ −c1 + (c1² − 4·c0)^(1/2) ]

  Pd = | 1   1  |      P1 = (1/Pd) · | 1   1  |
       | z1  z2 |                    | m1  z2 |

  where | · | denotes the determinant, and m0 = 1 for a normalized histogram.

• The optimal threshold, T, is then chosen as the P1-tile (or the gray level value closest to the P1-tile) of the histogram of f.
Exponential convex hull

• This may work even where no “valley” exists in the histogram.

• The “convex deficiency” is obtained by subtracting the histogram from its convex hull.

• Upper concavity of histogram tail regions can often be eliminated by considering ln{p(z)} instead of the histogram p(z).

• In the ln{p(z)}-domain, upper concavities can be produced by bimodality and by shoulders, but not by the tail of a normal or exponential distribution, nor by an extension of the histogram.

• Transform the histogram p(z) by ln{p(z)}, compute the convex hull, and transform the convex hull back to the histogram domain by h_e(k) = exp(h(k)).

• The threshold is found by a sequential search for the maximum exponential convex hull deficiency.
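A minimal Python sketch of this procedure; clamping empty bins to a small probability before taking logarithms, and computing the upper convex hull with a monotone-chain pass, are assumptions of the sketch.

import numpy as np

def upper_hull(y):
    # Upper convex hull of the points (x, y(x)), x = 0..len(y)-1,
    # evaluated at every x by linear interpolation between hull vertices.
    pts = []                                   # hull vertices as (x, y)
    for x, v in enumerate(y):
        while len(pts) >= 2:
            (x1, y1), (x2, y2) = pts[-2], pts[-1]
            # drop pts[-1] if it lies on or below the chord from pts[-2] to (x, v)
            if (y2 - y1) * (x - x1) <= (v - y1) * (x2 - x1):
                pts.pop()
            else:
                break
        pts.append((x, v))
    xs, ys = zip(*pts)
    return np.interp(np.arange(len(y)), xs, ys)

def exponential_hull_threshold(hist):
    p = hist / hist.sum()
    logp = np.log(np.maximum(p, 1e-12))        # ln p(z), with empty bins clamped
    h = upper_hull(logp)                       # convex hull in the log domain
    deficiency = np.exp(h) - p                 # exponential convex hull deficiency
    return int(np.argmax(deficiency))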
SEGMENTATION

• Segmentation is one of the most important elements of a complete image analysis system.

• We can find (parts of) the outlines of the regions in an image by edge detection.

• In segmentation we extract the regions and objects that are later to be described and recognized.

• We consider two categories of methods, based on two properties of pixels in images, namely similarity and discontinuity.

  – In region-based segmentation (thresholding, growing, split-and-merge) we extract the pixels that resemble each other. We then have all the pixels of the object.

  – In edge-based segmentation we find basic elements
    ∗ edge, line and corner points, ...
    In the next step these are thinned and linked into
    ∗ edges, lines, corners, ...
    We then have the outline of the objects.


Region-based segmentation

• Segmentation partitions an image into regions.

• We can segment by thresholding.

• We will now find the regions directly.

• Let R be the entire image, and assume that it consists of the regions Ri, i ∈ {1, ..., n}.

• Together the regions make up the whole image:

  ∪_{i=1}^{n} Ri = R

• The regions are disjoint:

  Ri ∩ Rj = ∅,   i ≠ j

• A predicate P is satisfied by all the pixels in each region:

  P(Ri) = TRUE for i ∈ {1, ..., n}

• Two neighboring regions differ with respect to the predicate P:

  P(Ri ∪ Rj) = FALSE for i ≠ j,

  where P(R) is a predicate over the pixels in the set R and ∅ is the empty set.
Region growing

• Sow some “seed points” in the image, and grow regions by including neighboring pixels in the region if they have properties (gray level, color, texture, etc.) that fall within a given tolerance (see the sketch below).

• Problems:
  – choice of “seed points”
  – choice of properties and tolerances
  – stopping criteria

• Make a histogram of the properties. Choose points that have a frequently occurring feature vector.

• Given a priori knowledge: use it!

• The method is often application dependent.

• Textured objects become homogeneous if we use a suitable texture feature.
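A minimal Python sketch of gray level region growing from a single seed point, using 4-connectivity and a fixed tolerance around the seed's gray level; the homogeneity test and all names are illustrative assumptions.

import numpy as np
from collections import deque

def grow_region(image, seed, tol=10):
    # image: 2-D gray level array; seed: (row, col); tol: allowed deviation from the seed value
    rows, cols = image.shape
    region = np.zeros((rows, cols), dtype=bool)
    seed_value = float(image[seed])
    queue = deque([seed])
    region[seed] = True
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):     # 4-neighbors
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols and not region[nr, nc]
                    and abs(float(image[nr, nc]) - seed_value) <= tol):
                region[nr, nc] = True
                queue.append((nr, nc))
    return region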
“Split and merge”

• Split = divide a region into two disjoint regions.

• Merge = join two regions.

• Given a region Ri and a predicate P:
  – Split if P(Ri) = FALSE
  – “Merge” Rj and Rk if P(Rj ∪ Rk) = TRUE

• Stop when no further splitting or merging is possible.

• Special case (see the sketch below):
  – split into four quadrants if P(Ri) = FALSE
  – merge neighboring regions if P(Rj ∪ Rk) = TRUE
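A minimal Python sketch of the quadtree special case; the predicate (a gray level range test), the stopping block size, and the merge step (a single greedy pass over edge-adjacent leaf blocks, merging pairwise with the same predicate) are simplifying assumptions, not the full split-and-merge algorithm.

import numpy as np

def predicate(block, tol=20):
    # Homogeneity predicate P(R): gray level range within a tolerance (an assumed example)
    return block.max() - block.min() <= tol

def split(image, r0, r1, c0, c1, blocks, min_size=2):
    region = image[r0:r1, c0:c1]
    if predicate(region) or (r1 - r0) <= min_size or (c1 - c0) <= min_size:
        blocks.append([r0, r1, c0, c1])
        return
    rm, cm = (r0 + r1) // 2, (c0 + c1) // 2
    for rr0, rr1, cc0, cc1 in ((r0, rm, c0, cm), (r0, rm, cm, c1),
                               (rm, r1, c0, cm), (rm, r1, cm, c1)):
        split(image, rr0, rr1, cc0, cc1, blocks, min_size)

def adjacent(a, b):
    # True if two axis-aligned blocks [r0, r1, c0, c1] share an edge
    share_rows = a[0] < b[1] and b[0] < a[1]
    share_cols = a[2] < b[3] and b[2] < a[3]
    return (share_rows and (a[3] == b[2] or b[3] == a[2])) or \
           (share_cols and (a[1] == b[0] or b[1] == a[0]))

def split_and_merge(image):
    blocks = []
    split(image, 0, image.shape[0], 0, image.shape[1], blocks)
    labels = list(range(len(blocks)))                    # one region label per leaf block
    pixels = [image[b[0]:b[1], b[2]:b[3]].ravel() for b in blocks]
    for i in range(len(blocks)):
        for j in range(i + 1, len(blocks)):
            if labels[i] != labels[j] and adjacent(blocks[i], blocks[j]) \
                    and predicate(np.concatenate((pixels[i], pixels[j]))):
                old, new = labels[j], labels[i]
                labels = [new if l == old else l for l in labels]
    return blocks, labels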
“Merging of regions”

• Given two regions R1 and R2 with perimeters P1 and P2. Let
  C = the length of the common boundary
  D = the length of the part of C where δ(ampl.) < ε1

• “Merge” if

  D / min(P1, P2) > ε2,    ε2 ≈ 1/2

  This prevents merging of regions of equal size.

• Further merging if

  D / C > ε3,    ε3 ≈ 3/4

• Works for intensity images with few objects and little texture.

• Textured regions become homogeneous if we use a suitable texture feature.

• Use region growing in texture feature images.


Hough transform

• Assume that we have performed an edge detection and a thresholding of the resulting image.

• We then have n points that, for example, describe an object.

• We want to find subsets of these points that lie on straight lines.

• Consider a point (xi, yi) and a straight line yi = a·xi + b.

• Each point (xi, yi) in the image domain corresponds to a straight line in the parameter domain (the ab-plane), given by

  b = −xi·a + yi
Hough transform II

• Partition the ab-plane into accumulator cells

  A[a, b],   a ∈ [a_min, a_max];   b ∈ [b_min, b_max]

• Accumulation:

  set all A[a,b] := 0;
  for each point (x,y)
      where grad(x,y) > T
  begin
      for all possible values of a
      begin
          b = -xa + y;
          A[a,b] := A[a,b] + 1;
      end
  end

• A value A[i, j] = M corresponds to M points in the xy-plane lying on the line

  y = a_i·x + b_j

• This representation cannot handle vertical lines !


Hough transform III

• Uses the normal representation of a straight line, given by

  x·cos θ + y·sin θ = ρ

• Each point (xi, yi) in the xy-plane now gives rise to a sinusoid in the ρθ-plane.

• M collinear points lying on the line x·cos θj + y·sin θj = ρi give M curves that intersect at the point (ρi, θj) in the parameter plane.

• Local maxima => significant lines
Hough transform IV

• Partition the ρθ-plane into accumulator cells

  A[ρ, θ],   ρ ∈ [ρ_min, ρ_max];   θ ∈ [θ_min, θ_max]

• The range of θ is ±90° from the x-axis:
  – a horizontal line has θ = 90°, ρ ≥ 0, or θ = −90°, ρ ≤ 0
  – a vertical line has θ = 0°, ρ ≥ 0

• The range of ρ is ±√2·D, where D is the side length of the image.


Hough transform, problems

• The length and position of a line segment cannot be found.

• Collinear line segments cannot be distinguished from each other.

• How do we find the local maxima in the parameter plane that correspond to connected line segments in the image plane?
  – A high threshold => short line segments are lost.
  – A low threshold => detection of non-connected pixels.

• Solution:
  – Local thresholding.
  – Traverse the pixels that correspond to a maximum in the parameter plane. Check that gaps < tolerance.
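A minimal Python sketch of the (ρ, θ) accumulation described above; discretizing θ in 1° steps over ±90° and rounding ρ to integer cells are assumptions of the sketch.

import numpy as np

def hough_lines(edges, n_theta=181):
    # edges: 2-D boolean array, True where grad(x,y) > T
    rows, cols = edges.shape
    thetas = np.deg2rad(np.linspace(-90, 90, n_theta))        # theta in [-90, 90] degrees
    diag = int(np.ceil(np.sqrt(rows**2 + cols**2)))
    rhos = np.arange(-diag, diag + 1)                         # rho in [-sqrt(2) D, sqrt(2) D]
    A = np.zeros((len(rhos), len(thetas)), dtype=int)         # accumulator A[rho, theta]
    ys, xs = np.nonzero(edges)
    for x, y in zip(xs, ys):
        rho = np.round(x * np.cos(thetas) + y * np.sin(thetas)).astype(int)
        A[rho + diag, np.arange(len(thetas))] += 1            # one vote per theta cell
    return A, rhos, thetas

# Peaks of A (local maxima) correspond to significant lines x cos(theta) + y sin(theta) = rho.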
Hough transform, local gradient

• Given a gradient magnitude image g(x, y) that contains a line segment.

• Simple algorithm:

  for all g(xi, yi) > T do
      for all θ do
          ρ := xi·cos θ + yi·sin θ
          increment A(ρ, θ)

• SIMPLIFIED:
  Given the gradient magnitude g(x, y) and the gradient components gx and gy, or

  φ_g(x, y) = arctan(gy / gx)

• Algorithm:

  for all g(xi, yi) > T do
      ρ := xi·cos(φ_g(x, y)) + yi·sin(φ_g(x, y))
      increment A(ρ, φ_g(x, y))


Hough transform, circles

• Parametrization of a circle in the xy-plane:

  (x − a)² + (y − b)² = c²

• We thus get a 3-dimensional parameter space to search in.

• Simple procedure, given an accumulator A[i, j, k] (see the sketch below):

  set all A[a,b,c] := 0;
  for each point (x,y)
      where grad(x,y) > T
  begin
      for all possible a and b
      begin
          c = sqrt((x-a)**2 + (y-b)**2);
          A[a,b,c] := A[a,b,c] + 1;
      end;
  end;

• Other, more elegant suggestions ?
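A minimal Python sketch of this circle accumulation; taking the candidate centres (a, b) on the pixel grid and quantizing the radius c to integers up to an assumed maximum are choices of the sketch, not of the original procedure.

import numpy as np

def hough_circles(edges, c_max=60):
    # edges: 2-D boolean array, True where grad(x,y) > T
    rows, cols = edges.shape
    A = np.zeros((rows, cols, c_max + 1), dtype=int)      # accumulator A[a, b, c]
    aa, bb = np.mgrid[0:rows, 0:cols]                     # candidate centres (a, b), here as (row, col)
    for r, s in zip(*np.nonzero(edges)):                  # each edge point
        c = np.round(np.sqrt((r - aa)**2 + (s - bb)**2)).astype(int)
        keep = c <= c_max
        np.add.at(A, (aa[keep], bb[keep], c[keep]), 1)    # one vote per centre (a, b)
    return A

# A large A[a, b, c] means many edge points lie on the circle with centre (a, b) and radius c.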