3.1 Image Processing: Filtering
EECS 4422/5323 Computer Vision
J. Elder
Image Processing vs Computer Vision
❖ What is the difference between image processing and computer vision?
❖ Image processing maps an image to a different version of the image.
❖ Computer vision maps one or more images to inferences about the visual scene.
❖ Image processing operations are often required as pre-processing for computer vision algorithms.
Outline
❖ Point Operators
❖ Linear Filters
❖ Nonlinear Filters
Point Operators
❖ Image processing point operators transform each pixel independently of the values of other pixels.
❖ A general image processing operator is a function that takes one or more input images and produces an output image. In the continuous domain, this can be denoted as

g(x) = h(f(x))   or   g(x) = h(f0(x), . . . , fn(x)),   (3.1)

where x is in the D-dimensional domain of the functions (usually D = 2 for images) and the functions f and g operate over some range, which can either be scalar or vector-valued, e.g., for color images or 2D motion.
❖ For discrete (sampled) images, the domain consists of a finite number of pixel locations, x = (i, j), and we can write

g(i, j) = h(f(i, j)).   (3.2)

❖ Figure 3.3 shows how an image can be represented either by its color (appearance), as a grid of numbers, or as a two-dimensional function (surface plot).
❖ Two commonly used point processes are multiplication and addition with a constant,

g(x) = a f(x) + b.   (3.3)

The parameters a > 0 and b are often called the gain and bias parameters; these parameters are said to control contrast and brightness, respectively (Figures 3.2b-c). An image's luminance characteristics can also be summarized by its key (average luminance) and range (Kopf, Uyttendaele, Deussen et al. 2007).
[Figure: the same image shown as a grid of pixel intensity values and as a two-dimensional surface plot over its domain and range.]
Examples
❖ Contrast/brightness adjustment:

g(x) = a f(x) + b

where the gain a controls contrast and the bias b controls brightness.
❖ Inverse gamma - undo the compressive gamma mapping applied in the sensor so that pixel intensities are (approximately) proportional to the light irradiance at the sensor:

g(x) = x^γ

(note that textbook Eqn. 3.7 has this backwards)
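A minimal MATLAB sketch of these two point operators; the gain, bias, and gamma values (and the demo image) are illustrative, not from the slides:

a = 1.5;  b = 0.1;                          % gain (contrast) and bias (brightness)
gamma = 2.2;                                % typical compressive sensor/display gamma
f = im2double(imread('cameraman.tif'));     % any grayscale image, values in [0,1]
g1 = min(max(a*f + b, 0), 1);               % contrast/brightness, clamped to [0,1]
g2 = f.^gamma;                              % inverse gamma: ~linear in irradiance
subplot(1,3,1); imshow(f);  title('original');
subplot(1,3,2); imshow(g1); title('gain/bias');
subplot(1,3,3); imshow(g2); title('inverse gamma');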
Histogram Equalization
❖ The colours in most images are not uniformly distributed across the gamut.
❖ Redistribution of these colours to be uniform is called histogram equalization.
❖ One trick for finding such a mapping is the same one that people use to generate random samples from a probability density function: first compute the cumulative distribution function.
❖ Think of the original histogram h(I) as the distribution of grades in a class after some exam. How can we map a particular grade to its corresponding percentile, so that students at the 75th percentile scored better than 3/4 of their classmates? The answer is to integrate the distribution h(I) to obtain the cumulative distribution c(I),

c(I) = (1/N) Σ_{i=0}^{I} h(i) = c(I-1) + (1/N) h(I),   (3.9)

where N is the number of pixels in the image or students in the class. For any given grade or intensity, we can look up its corresponding percentile c(I) and determine the final value that pixel should take. When working with eight-bit pixel values, the I and c axes are rescaled to [0, 255].
[Figure: input image with its per-channel (B, G, R, Y) intensity histograms, the cumulative distribution c(I) used as the input-to-output intensity mapping, and the equalized output image.]
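A minimal MATLAB sketch of this mapping (the built-in histeq performs the same operation; this version just makes the cumulative-distribution trick explicit; the demo image is a stand-in):

f = imread('pout.tif');             % any 8-bit grayscale image
h = imhist(f);                      % 256-bin histogram h(I)
c = cumsum(h) / numel(f);           % cumulative distribution c(I), Eq. (3.9)
lut = uint8(255 * c);               % rescale c to [0, 255] as a lookup table
g = lut(double(f) + 1);             % map each pixel through its percentile
subplot(1,2,1); imshow(f); subplot(1,2,2); imshow(g);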
End of Lecture
Sept 24, 2018
Outline
❖ Point Operators
❖ Linear Filters
❖ Nonlinear Filters
Linear Filters
❖ Locally adaptive histogram equalization is an example of a neighborhood operator or local operator, which uses a collection of pixel values in the vicinity of a given pixel to determine its final output value (Figure 3.10). In addition to performing local tone adjustment, neighborhood operators can be used to filter images in order to add soft blur, sharpen details, accentuate edges, or remove noise (Figure 3.11b-d). In this section, we look at linear filtering operators, which involve weighted combinations of pixels in small neighborhoods. In Section 3.3, we look at non-linear operators such as morphological filters and distance transforms.
❖ Many image processing operations involve linear combinations of the pixels within a finite neighbourhood of a pixel.
❖ Typically, the same set of weights is applied at each pixel: the output pixel's value is determined as a weighted sum of input pixel values (Figure 3.10).
❖ The pattern of weights is called a linear filter. It is the most commonly used type of neighborhood operator.
❖ When applied at all locations in the image, this can be expressed as a correlation:

g(i, j) = Σ_{k,l} f(i + k, j + l) h(k, l).   (3.12)

The entries in the weight kernel or mask h(k, l) are often called the filter coefficients. The above correlation operator can be more compactly notated as

g = f ⊗ h.   (3.13)

[MATLAB function: xcorr]
❖ or alternatively as a convolution. A common variant on this formula is

g(i, j) = Σ_{k,l} f(i - k, j - l) h(k, l) = Σ_{k,l} f(k, l) h(i - k, j - l),   (3.14)

g = f ∗ h,   (3.15)

where the sign of the offsets in f has been reversed. This is called the convolution operator. [MATLAB functions: conv, conv2, convn]
❖ Impulse response function: h is called the impulse response function because the kernel function h, convolved with an impulse signal δ(i, j) (an image that is 0 everywhere except at the origin), reproduces itself, h ∗ δ = h, whereas correlation produces the reflected signal. (Try this yourself to verify that it is so.)
❖ In fact, Equation (3.14) can be interpreted as the superposition (addition) of shifted impulse response functions h(i - k, j - l) multiplied by the input pixel values f(k, l). Convolution has additional nice properties, e.g., it is both commutative and associative. As well, the Fourier transform of two convolved images is the product of their individual Fourier transforms (Section 3.4).
[Figure 3.10 Neighborhood filtering (convolution): the image on the left is convolved with the filter in the middle to yield the image on the right. The light blue pixels indicate the source neighborhood for the light green destination pixel.]
[Figure 3.12 One-dimensional signal convolution as a sparse matrix-vector multiply, g = Hf.]
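A small MATLAB sketch contrasting the two operations (filter2 computes the correlation of Eq. (3.12), conv2 the convolution of Eq. (3.14); the asymmetric kernel here is an arbitrary example that makes the difference visible):

f = magic(5);                                  % toy "image"
h = [1 2 3; 0 0 0; -1 -2 -3];                  % asymmetric kernel
g_corr = filter2(h, f, 'same');                % correlation, Eq. (3.12)
g_conv = conv2(f, h, 'same');                  % convolution, Eq. (3.14)
isequal(conv2(f, rot90(h, 2), 'same'), g_corr) % flipping the kernel turns one into the other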
Linear Shift Invariant Operators
❖ Both correlation and convolution are linear shift-invariant (LSI) operators, which obey both the superposition principle (3.5),

๏ Superposition: h ∘ (f0 + f1) = h ∘ f0 + h ∘ f1,   (3.16)

and the shift invariance principle,

๏ Shift invariance: g(i, j) = f(i + k, j + l)  ⟺  (h ∘ g)(i, j) = (h ∘ f)(i + k, j + l),   (3.17)

which means that shifting a signal commutes with applying the operator (∘ stands for the LSI operator). Another way to think of shift invariance is that the operator "behaves the same everywhere".
❖ Occasionally, a shift-variant version of correlation or convolution may be used, e.g.,

g(i, j) = Σ_{k,l} f(i - k, j - l) h(k, l; i, j),   (3.18)

where h(k, l; i, j) is the convolution kernel at pixel (i, j). For example, such a spatially varying kernel can be used to model blur in an image due to variable depth-dependent defocus.
❖ Correlation and convolution can both be written as a matrix-vector multiply, if we first convert the two-dimensional images f(i, j) and g(i, j) into raster-ordered vectors f and g,

g = Hf,   (3.19)

where the (sparse) H matrix contains the convolution kernels. Figure 3.12 shows how a one-dimensional convolution can be represented in matrix-vector form. (The continuous version of convolution can be written as g(x) = ∫ f(x - u) h(u) du.)

Padding (border effects)
❖ The astute reader will notice that the matrix multiply shown in Figure 3.12 suffers from boundary effects, i.e., the results of filtering the image in this form will lead to a darkening of the corner pixels. This is because the original image is effectively being padded with 0 values wherever the convolution kernel extends beyond the original image boundaries.
Handling Borders
❖ What do we do near the border of the image, where the kernel (filter) 'falls off' the edge?
[Figure: kernel h overlapping the border of image f.]
Handling Borders
❖ Padding options
๏ Zero-padding - ignore kernel weights that fall outside image
๏ Clamp - extend boundary values of image
๏ Cyclic - toroidally wrap around
๏ Mirror - reflect pixels across image edge
❖ Alternatively, we can crop the image and return only the ‘valid’ portion
๏ e.g., MATLAB conv2(...,shape) returns a subsection of the two-dimensional convolution, as
specified by the shape parameter:
✦ 'full' - returns the full two-dimensional convolution (default).
✦ 'same' - returns the central part of the convolution, of the same size as A.
✦ 'valid' - returns only those parts of the convolution for which the kernel lies entirely within the image.
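A MATLAB sketch of these choices (padarray and imfilter are in the Image Processing Toolbox; the 'replicate', 'circular', and 'symmetric' boundary options correspond to the clamp, cyclic, and mirror options above):

f = im2double(imread('cameraman.tif'));
h = ones(5) / 25;                         % 5x5 box filter
g_full  = conv2(f, h, 'full');            % zero-padded; larger than f
g_same  = conv2(f, h, 'same');            % zero-padded; same size as f
g_valid = conv2(f, h, 'valid');           % cropped to the fully 'valid' region
g_clamp  = imfilter(f, h, 'replicate');   % clamp: extend boundary values
g_cyclic = imfilter(f, h, 'circular');    % cyclic: toroidal wrap-around
g_mirror = imfilter(f, h, 'symmetric');   % mirror: reflect pixels across the edge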
Separable Filters
❖ Given a general 2D kernel of size (m, n) pixels, application at each pixel of the image
involves m*n multiplies.
❖ For an M*N image, the total number of multiplies for the convolution is M*N*m*n.
❖ However, certain special 2D kernels can be decomposed into two 1D kernels, reducing the number of multiplies at a pixel to m + n. Such kernels are called separable.
❖ Example: 2D axis-aligned Gaussian kernel

h(x, y) = 1/(2π σx σy) exp( -( x²/(2σx²) + y²/(2σy²) ) )
        = [ 1/(√(2π) σx) exp( -x²/(2σx²) ) ] · [ 1/(√(2π) σy) exp( -y²/(2σy²) ) ]
MATLAB function
conv2(h1, h2, A)
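A quick MATLAB check of the separable speed-up (kernel size and sigma are illustrative; conv2(h1, h2, A) applies the column kernel and then the row kernel):

sigma = 2;
x  = -ceil(3*sigma):ceil(3*sigma);
h1 = exp(-x.^2 / (2*sigma^2));  h1 = h1 / sum(h1);   % unit-area 1D Gaussian
f  = rand(512);
g2d  = conv2(f, h1' * h1, 'same');     % direct 2D convolution: m*n multiplies per pixel
gsep = conv2(h1, h1, f, 'same');       % separable form: m + n multiplies per pixel
max(abs(g2d(:) - gsep(:)))             % agrees to floating-point precision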
Example Separable Filters
❖ Separable linear filters and their horizontal 1D kernels (Figure 3.14):

(a) box, K = 5:   (1/K²) · (K×K matrix of 1s)  =  (1/K)[1 1 ··· 1] applied along each axis
(b) bilinear:     (1/16)[1 2 1; 2 4 2; 1 2 1]  =  outer product of (1/4)[1 2 1] with itself
(c) "Gaussian":   (1/256)[1 4 6 4 1; 4 16 24 16 4; 6 24 36 24 6; 4 16 24 16 4; 1 4 6 4 1]  =  outer product of (1/16)[1 4 6 4 1] with itself
(d) Sobel:        (1/8)[-1 0 1; -2 0 2; -1 0 1]  =  (1/4)[1 2 1]ᵀ · (1/2)[-1 0 1]
(e) corner:       (1/4)[1 -2 1; -2 4 -2; 1 -2 1]  =  outer product of (1/2)[-1 2 -1] with itself

❖ Kernels (a)-(c) perform smoothing; the Sobel (d) and corner (e) kernels perform edge detection.
[Figure 3.14 Separable linear filters: for each image (a)-(e), we show the 2D filter kernel (top), the corresponding horizontal 1D kernel (middle), and the filtered image (bottom). The filtered Sobel and corner images are signed, scaled up by 2x and 4x, respectively, and added to a gray offset before display.]
❖ The Sobel operator above emphasizes horizontal edges. The corner detector (Figure 3.14e) looks for simultaneous horizontal and vertical edges; as you can see, however, it responds not only to the corners of the square but also to diagonal edges. Better corner detectors, or at least interest point detectors that are more rotationally invariant, are described in Section 4.1.

Gaussian Derivatives
❖ Local difference filters like the Sobel filter estimate local intensity gradients.
❖ But the restriction to a 3x3 neighbourhood of the image makes the results noisy.
❖ The Sobel and corner operators are simple examples of band-pass and oriented filters. More sophisticated kernels can be created by first smoothing the image with a (unit area) Gaussian filter,

G(x, y; σ) = 1/(2πσ²) exp( -(x² + y²)/(2σ²) ),   (3.24)

and then taking the first or second derivatives (Marr 1982; Witkin 1983; Freeman and Adelson 1991). Such filters are known collectively as band-pass filters, since they filter out both low and high frequencies.
❖ A more general and smooth family of filters are the Gaussian derivatives, which can be derived by taking partial spatial derivatives of the 2D Gaussian function. [MATLAB function: mvnpdf]
❖ Example: Laplacian of Gaussian (LoG). The (undirected) second derivative of a two-dimensional image,

∇²f = ∂²f/∂x² + ∂²f/∂y²,   (3.25)

is known as the Laplacian operator. Blurring an image with a Gaussian and then taking its Laplacian is equivalent to convolving directly with the Laplacian of Gaussian (LoG) filter,

∇²G(x, y; σ) = ( (x² + y²)/σ⁴ - 2/σ² ) G(x, y; σ).   (3.26)
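A MATLAB sketch of LoG filtering (fspecial('log', ...) builds the LoG kernel directly, which is equivalent to Gaussian blurring followed by the Laplacian; sigma is illustrative):

f = im2double(imread('cameraman.tif'));
sigma = 2;
hsize = 2*ceil(3*sigma) + 1;                 % support wide enough for +/-3 sigma
h_log = fspecial('log', hsize, sigma);       % Laplacian of Gaussian kernel, Eq. (3.26)
g = imfilter(f, h_log, 'replicate');         % signed band-pass response
imshow(g, []);                               % rescale the signed output for display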
Steerable Filters
❖ The five-point Laplacian is just a compact approximation to this more sophisticated LoG filter. Likewise, the Sobel operator is a simple approximation to a directional or oriented filter, which can be obtained by smoothing with a Gaussian (or some other filter) and then taking a directional derivative ∇û = ∂/∂û, which is obtained by taking the dot product between the gradient field ∇ and a unit direction û = (cos θ, sin θ),

û · ∇(G ∗ f) = ∇û(G ∗ f) = (∇û G) ∗ f.   (3.27)

❖ To detect contours in the image, we typically use oriented derivative filters, formed by taking directional derivatives of the Gaussian function.
❖ Note that the smoothed directional derivative filter

Gû = u Gx + v Gy = u ∂G/∂x + v ∂G/∂y,   where û = (u, v) = (cos θ, sin θ),   (3.28)

is an example of a steerable filter, since the value of an image convolved with Gû can be computed by first convolving the image with the pair of filters (Gx, Gy) and then steering the filter (potentially locally) by multiplying this gradient field with a unit vector û (Freeman and Adelson 1991). The advantage of this approach is that a whole family of filters can be evaluated with very little cost.
❖ In other words, the Gaussian derivative filter in the direction û is a weighted sum of the Gaussian derivatives in the x and y directions: Gû ∗ f = u (∂G/∂x ∗ f) + v (∂G/∂y ∗ f).
❖ What about steering a directional second derivative filter ∇û · ∇û G, which is the result of taking a (smoothed) directional derivative and then taking the directional derivative again? For example, Gxx is the second directional derivative in the x direction. At first glance, it would appear that the steering trick will not work, since for every direction û we need to compute a different first directional derivative. Somewhat surprisingly, Freeman and Adelson (1991) showed that, for directional Gaussian derivatives, it is possible (see the next slide).
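A minimal MATLAB sketch of steering a first-order Gaussian derivative; the basis kernels are the analytic derivatives of G in Eq. (3.24), and sigma and theta are illustrative:

sigma = 2;  r = ceil(3*sigma);
[x, y] = meshgrid(-r:r);
G  = exp(-(x.^2 + y.^2) / (2*sigma^2)) / (2*pi*sigma^2);   % Eq. (3.24)
Gx = -x/sigma^2 .* G;                                      % dG/dx basis filter
Gy = -y/sigma^2 .* G;                                      % dG/dy basis filter
f  = im2double(imread('cameraman.tif'));
fx = imfilter(f, Gx, 'replicate');   % convolve once per basis filter...
fy = imfilter(f, Gy, 'replicate');
theta = pi/4;                        % ...then any orientation is nearly free
g_theta = cos(theta)*fx + sin(theta)*fy;   % Eq. (3.28)
imshow(g_theta, []);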
[Figure 3.16 Fourth-order steerable filter (Freeman and Adelson 1991) © 1991 IEEE: (a) test image containing bars (lines) and step edges at different orientations; (b) average oriented energy; (c) dominant orientation; (d) oriented energy as a function of angle (polar plot).]

What filters are steerable?
❖ It turns out that Gaussian derivatives of all orders are steerable with a finite number of basis functions.
❖ For example, a Gaussian 2nd derivative requires 3 basis functions:

Gûû = u² Gxx + 2uv Gxy + v² Gyy.   (3.29)

❖ Moreover, the basis functions are separable (or superpositions of separable functions): each of the basis filters, while not itself necessarily separable, can be computed as a linear combination of a small number of separable filters (Freeman and Adelson 1991).
❖ This remarkable result makes it possible to construct directional derivative filters of increasingly greater directional selectivity, i.e., filters that only respond to edges that have strong local consistency in orientation (Figure 3.15). Furthermore, higher order steerable filters can respond to potentially more than a single edge orientation at a given location, and they can respond to both bar edges (thin lines) and the classic step edges (Figure 3.16). To do this, however, full Hilbert transform pairs need to be used for second-order and higher filters, as described in Freeman and Adelson (1991).
❖ Steerable filters are often used to construct both feature descriptors (Section 4.1.3) and edge detectors (Section 4.2). While the filters developed by Freeman and Adelson (1991) are best suited for detecting linear (edge-like) structures, more recent work by Koethe (2003) shows how a combined 2 × 2 boundary tensor can be used to encode both edge and junction features. Exercise 3.12 has you implement such steerable filters and apply them to finding edge and corner features.
Application: Edge Detection
Elder & Zucker 1998
End of Lecture
Sept 26, 2018
Integral Images
❖ If a diversity of box filters are to be employed, it can be very efficient to derive these from the integral image s(i, j), which is the 2D analog of a 1D cumulative sum - the summed area table (Crow 1984), just the running sum of all the pixel values from the origin:

s(i, j) = Σ_{k=0}^{i} Σ_{l=0}^{j} f(k, l).   (3.30)

❖ This is efficiently computed using a recursive (raster-scan) algorithm:

s(i, j) = s(i-1, j) + s(i, j-1) - s(i-1, j-1) + f(i, j).   (3.31)

The image s(i, j) is also often called an integral image (see Figure 3.17) and can actually be computed using only two additions per pixel if separate row sums are used (Viola and Jones 2004).
❖ Now, for example, a rectangular box average of arbitrary size and shape can be computed using just 4 additions/subtractions on the integral image. To find the summed area (integral) inside a rectangle [i0, i1] × [j0, j1], we simply combine four samples from the summed area table:

S(i0 . . . i1, j0 . . . j1) = Σ_{i=i0}^{i1} Σ_{j=j0}^{j1} f(i, j) = s(i1, j1) - s(i1, j0-1) - s(i0-1, j1) + s(i0-1, j0-1).   (3.32)

❖ A potential disadvantage of summed area tables is that they require log M + log N extra bits in the accumulation image compared to the original image, where M and N are the image width and height. Extensions of summed area tables can also be used to approximate other convolution kernels (Wolberg (1990, Section 6.5.2) contains a review).
❖ In computer vision, summed area tables have been used in face detection (Viola and Jones 2004) to compute simple multi-scale low-level features. Such features, which consist of adjacent rectangles of positive and negative values, are also known as boxlets (Simard, Bottou, Haffner et al. 1999).
[Figure 3.17 Summed area tables: (a) original image; (b) summed area table via (3.31); (c) an area sum S computed by combining the four values at the rectangle corners via (3.32). Positive values are shown in bold and negative values in italics.]
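A minimal MATLAB sketch (cumsum along each dimension builds the integral image; a zero row and column are prepended so the four-sample formula needs no special cases at the image border; the rectangle is illustrative):

f = double(imread('cameraman.tif'));
s = cumsum(cumsum(f, 1), 2);         % integral image s(i, j), Eq. (3.30)
s = padarray(s, [1 1], 0, 'pre');    % prepend zero row/column
i0 = 50; i1 = 80; j0 = 100; j1 = 140;
S = s(i1+1, j1+1) - s(i1+1, j0) - s(i0, j1+1) + s(i0, j0);   % Eq. (3.32)
S == sum(sum(f(i0:i1, j0:j1)))       % matches the brute-force sum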
Application: Face Detection
Viola & Jones 2001
Recursive Filters
❖ The efficient raster-scan computation structure used to compute the integral image is an example of a recursive filter.
❖ Also known as infinite-impulse response (IIR) filters.
❖ Unfortunately, Gaussian derivatives do not have a recursive implementation.
❖ However, there are efficient recursive approximations (Deriche 1990).
[Figures from Deriche (1990): shapes of the recursive smoothing operator (solid line) vs. the Gaussian filter (dashed line), and of the recursive derivative operator (solid line) vs. the first derivative of the Gaussian filter (dashed line).]
Optimal Linear Filters
❖ For some problems and under some conditions, it can be proven that linear filtering
yields an optimal solution.
๏ Example: estimation of the mean irradiance from a surface in the scene.
Let f (x, y) = g(x, y) + n(x, y) be a noisy image patch,
where g(x, y) is the true irradiance from the patch
and n(x, y) is random noise added by the sensor.
If n(x, y) is additive Gaussian, independent and identically distributed (IID), then
f̄ = (1/n) Σ_{x,y} f(x, y)

is an optimal (unbiased and efficient) estimator of ḡ = (1/n) Σ_{x,y} g(x, y), where n is the number of pixels in the patch.
๏ Notes:
This is a box filter, which can be implemented using integral images.
f̄ minimizes the mean squared deviation: f̄ = arg min_f̂ (1/n) Σ_{x,y} ( f̂ - f(x, y) )².
Outline
❖ Point Operators
❖ Linear Filters
❖ Nonlinear Filters
Nonlinear Filters
❖ For many problems/conditions, linear filtering is provably sub-optimal.
๏ Example: shot noise.
[Figure 3.18 (detail): image + shot noise; after linear filtering with a Gaussian lowpass filter.]
๏ Can we do better than this?
Median Filters
❖ A median filter simply replaces the pixel value with the median value in its neighbourhood. [MATLAB function: medfilt2]
❖ It is a good choice for shot (heavy-tailed) noise, as the median value is not affected by extreme noise values.
❖ Can be computed in linear time.
❖ Reduces blurring of edges.
[Figure 3.19 Median and bilateral filtering: (a) median pixel (green), median = 4; (b) selected α-trimmed mean pixels, α-mean = 4.6; (c) domain filter (numbers along edge are pixel distances); (d) range filter.]
[Figure: image + shot noise; Gaussian lowpass filtered; median filtered.]

Median Filters
❖ While averaging minimizes the squared deviation, median filtering minimizes the absolute (L1) error:

f̄ = arg min_f̂ (1/n) Σ_{x,y} | f̂ - f(x, y) |
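A quick MATLAB comparison on shot noise (imnoise's 'salt & pepper' option stands in for the shot noise shown on the slides; noise density and sigma are illustrative):

f  = im2double(imread('cameraman.tif'));
fn = imnoise(f, 'salt & pepper', 0.05);   % add shot noise
g_lin = imgaussfilt(fn, 2);               % Gaussian low-pass: smears the outliers
g_med = medfilt2(fn, [3 3]);              % median filter: rejects them
subplot(1,3,1); imshow(fn);    title('shot noise');
subplot(1,3,2); imshow(g_lin); title('Gaussian');
subplot(1,3,3); imshow(g_med); title('median');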
Bilateral Filters
❖ Gaussian linear filters provide a nice way of grading the weights of neighbouring
pixels so that closer pixels have more influence than more distant pixels.
❖ Median filters provide a nice way of reducing the influence of outlier values.
❖ Can we somehow combine these two things?
Bilateral Filters
❖ What if we simply reject pixels whose values differ too much from the central pixel value? This is the essential idea in bilateral filtering, which was first popularized in the computer vision community by Tomasi and Manduchi (1998). Chen, Paris, and Durand (2007) and Paris, Kornprobst, Tumblin et al. (2008) cite similar earlier work (Aurich and Weule 1995; Smith and Brady 1997) as well as the wealth of subsequent applications in computer vision and computational photography.
❖ In the bilateral filter, the output pixel value depends on a weighted combination of neighboring pixel values:

g(i, j) = Σ_{k,l} f(k, l) w(i, j, k, l) / Σ_{k,l} w(i, j, k, l).   (3.34)

❖ The weighting coefficient w(i, j, k, l) depends on the product of a domain kernel (Figure 3.19c),

d(i, j, k, l) = exp( -((i - k)² + (j - l)²) / (2σd²) ),   (3.35)

and a data-dependent range kernel (Figure 3.19d),

r(i, j, k, l) = exp( -‖f(i, j) - f(k, l)‖² / (2σr²) ).   (3.36)

❖ When multiplied together, these yield the data-dependent bilateral weight function

w(i, j, k, l) = exp( -((i - k)² + (j - l)²) / (2σd²) - ‖f(i, j) - f(k, l)‖² / (2σr²) ).   (3.37)

❖ Figure 3.20 shows an example of the bilateral filtering of a noisy step edge. Note how the domain kernel is the usual Gaussian, the range kernel measures appearance (intensity) similarity to the center pixel, and the bilateral filter kernel is a product of these two. Notice that the range filter (3.36) uses the vector distance between the center and the neighboring pixel.
[Figure 3.18 Median and bilateral filtering: (a) original image with Gaussian noise; (b) Gaussian filtered; (c) median filtered; (d) bilaterally filtered; (e) original image with shot noise; (f) Gaussian filtered; (g) median filtered; (h) bilaterally filtered. Note that the bilateral filter fails to remove the shot noise because the noisy pixels are too different from their neighbors.]
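A direct (brute-force) MATLAB sketch of Eqs. (3.34)-(3.37) for a grayscale image; sigma_d and sigma_r are illustrative, and the loops favor clarity over speed (newer Image Processing Toolbox releases provide imbilatfilt for practical use):

f = im2double(imread('cameraman.tif'));
sigma_d = 3;  sigma_r = 0.1;  r = ceil(2*sigma_d);
[dx, dy] = meshgrid(-r:r);
d = exp(-(dx.^2 + dy.^2) / (2*sigma_d^2));           % domain kernel, Eq. (3.35)
fp = padarray(f, [r r], 'replicate');
g = zeros(size(f));
for i = 1:size(f, 1)
    for j = 1:size(f, 2)
        patch = fp(i:i+2*r, j:j+2*r);                % neighborhood of pixel (i, j)
        rk = exp(-(patch - f(i,j)).^2 / (2*sigma_r^2));  % range kernel, Eq. (3.36)
        w  = d .* rk;                                % bilateral weights, Eq. (3.37)
        g(i,j) = sum(w(:) .* patch(:)) / sum(w(:));  % Eq. (3.34)
    end
end
imshow(g);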
Bilateral Filters - Example
[Figure 6 from Durand & Dorsey (2002), after Tomasi & Manduchi (1998): bilateral filtering of a noisy step edge, showing the input, the spatial (domain) kernel f, the influence (range) function g in the intensity domain for the central pixel, the combined weight f·g for the central pixel, and the output. Colors are used only to convey shape.]
❖ The accompanying text (Durand & Dorsey 2002) gives bilateral filtering a robust statistical interpretation: the output is an estimator based on a weighted average of the data, i.e., a W-estimator (Hampel et al. 1986), and its iterative formulation is an instance of iteratively reweighted least squares. M-estimators and W-estimators are essentially equivalent and solve the same energy minimization problem, min Σ ρ(I_s - I_p). Tomasi and Manduchi (1998) only outlined the principle of bilateral filters and then focused on the results obtained using Gaussians.
Anisotropic Diffusion
❖ Iterative application of bilateral filtering leads to a smoothing process equivalent to a popular iterated edge-preserving smoothing technique due to Perona & Malik (1990) called anisotropic diffusion.
❖ Bilateral (and other) filters can be applied in an iterative fashion, especially if an appearance more like a "cartoon" is desired (Tomasi and Manduchi 1998). When iteratively applied, a much smaller neighborhood can often be used.
❖ e.g., for a 4-neighbourhood, using only the four nearest neighbors (restricting |k - i| + |l - j| ≤ 1 in (3.34)), observe that the domain kernel reduces to

d(i, j, k, l) = exp( -((i - k)² + (j - l)²) / (2σd²) )   (3.38)
             = 1                      if |k - i| + |l - j| = 0,
             = η = e^{-1/(2σd²)}      if |k - i| + |l - j| = 1.   (3.39)

❖ and so we can re-write (3.34) as

f^(t+1)(i, j) = [ f^(t)(i, j) + η Σ_{k,l} f^(t)(k, l) r(i, j, k, l) ] / [ 1 + η Σ_{k,l} r(i, j, k, l) ]
             = f^(t)(i, j) + η/(1 + ηR) Σ_{k,l} r(i, j, k, l) [ f^(t)(k, l) - f^(t)(i, j) ],   (3.40)

where R = Σ_{(k,l)} r(i, j, k, l), (k, l) are the N4 neighbors of (i, j), and we have made the iterative nature of the filtering explicit.
❖ As Barash (2002) notes, (3.40) is the same as the discrete anisotropic diffusion equation first proposed by Perona and Malik (1990b). Since its original introduction, anisotropic diffusion has been extended and applied to a wide range of problems (Nielsen, Florack, and Deriche 1997; Black, Sapiro, Marimont et al. 1998; Weickert, ter Haar Romeny, and Viergever 1998).
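A MATLAB sketch of the four-neighbor iteration (3.40); eta, sigma_r, and the iteration count are illustrative, and circshift handles the N4 neighbors compactly (with wrap-around borders for brevity):

f = im2double(imread('cameraman.tif'));
sigma_r = 0.1;  eta = 0.2;  T = 20;
for t = 1:T
    num = f;  den = ones(size(f));
    for shift = {[0 1], [0 -1], [1 0], [-1 0]}       % N4 neighbors
        fn = circshift(f, shift{1});                 % neighbor values
        r  = exp(-(f - fn).^2 / (2*sigma_r^2));      % range weight, Eq. (3.36)
        num = num + eta * r .* fn;                   % numerator of Eq. (3.40)
        den = den + eta * r;                         % denominator of Eq. (3.40)
    end
    f = num ./ den;
end
imshow(f);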
Anisotropic Diffusion Example
❖ But note that lim_{t→∞} f^(t)(i, j) = constant.
[Figure: anisotropic diffusion results for increasing t.]
End of Lecture
Oct 1, 2018
Morphological Filters

❖ Binary image processing often arises in practice, e.g., when converting a scanned grayscale document into a binary image for further processing such as optical character recognition.

❖ The most common binary image operations are called morphological operations, since they change the shape of the underlying binary objects.

❖ To perform such an operation, we first convolve the binary image with a binary structuring element and then select a binary output value depending on the thresholded result of the convolution. (This is not the usual way in which these operations are described, but it is a nice simple way to unify the processes.)

๏ Convolve with a local filter s called a structuring element:

    c = f ⊗ s

The structuring element can be any shape, from a simple 3 × 3 box filter to more complicated disc structures; it can even correspond to a particular shape that is being sought for in the image. Let c = f ⊗ s be the integer-valued count of the number of 1s inside each structuring element as it is scanned over the image, and let S be the size of the structuring element (number of pixels).

๏ Threshold the result:

    θ(c, t) = { 1 if c ≥ t,
              { 0 otherwise.                                   (3.41)

๏ Example: boxcar structuring element of size S = 9:

    s = [ 1 1 1
          1 1 1
          1 1 1 ]

❖ The standard operations used in binary morphology include:
• dilation: dilate(f, s) = θ(c, 1);
• erosion: erode(f, s) = θ(c, S);
• majority: maj(f, s) = θ(c, S/2);
• opening: open(f, s) = dilate(erode(f, s), s);
• closing: close(f, s) = erode(dilate(f, s), s).

❖ Figure 3.21 shows a close-up of the convolution c = f ⊗ s of a binary image f with a structuring element s and the resulting images for the operations described above. As we can see, dilation grows (thickens) the 1s, while erosion shrinks (thins) them; opening and closing compose the two (erode-then-dilate and dilate-then-erode, respectively). The effects of majority are a subtle rounding of sharp corners. Opening fails to eliminate the dot, since it is not wide enough.

Figure 3.21 Binary image morphology: (a) original image; (b) dilation; (c) erosion; (d) majority; (e) opening; (f) closing. The structuring element for all examples is a 5 × 5 square.

❖ MATLAB functions: imdilate, imerode, imopen, imclose
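As a concrete sketch of this convolve-and-threshold formulation (the test image text.png and the inline helper functions are illustrative assumptions; the toolbox routines listed above are the standard way to do this):

    % Sketch: binary morphology as count-and-threshold, per (3.41).
    % Note conv2 zero-pads at the border; the toolbox functions
    % handle borders differently.
    f = imread('text.png') > 0;          % any binary (0/1) input image
    s = ones(3);                         % boxcar structuring element, S = 9
    S = nnz(s);
    count = @(f) conv2(double(f), s, 'same');   % c = f (x) s, count of 1s
    theta = @(c, t) c >= t;                     % theta(c, t) from (3.41)

    dilated  = theta(count(f), 1);       % dilate(f,s) = theta(c, 1)
    eroded   = theta(count(f), S);       % erode(f,s)  = theta(c, S)
    majority = theta(count(f), S/2);     % maj(f,s)    = theta(c, S/2)
    opened   = theta(count(eroded),  1); % open(f,s)  = dilate(erode(f,s), s)
    closed   = theta(count(dilated), S); % close(f,s) = erode(dilate(f,s), s)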
EECS 4422/5323 Computer Vision
!35
J. Elder
The Distance Transform

❖ The distance transform is useful in quickly precomputing the distance to a curve or set of points using a two-pass raster algorithm (Rosenfeld and Pfaltz 1966; Danielsson 1980; Borgefors 1986; Paglieroni 1992; Breu, Gil, Kirkpatrick et al. 1995; Felzenszwalb and Huttenlocher 2004a; Fabbri, Costa, Torelli et al. 2008).

❖ It has many applications, including level sets (Section 5.1.4), fast chamfer matching (binary image alignment) (Huttenlocher, Klanderman, and Rucklidge 1993), feathering in image stitching and blending (Section 9.3.2), and nearest point alignment (Section 12.2.1).

❖ The distance transform D(i, j) of a binary image b(i, j) is defined as follows. Let d(k, l) be some distance metric between pixel offsets. Two commonly used metrics include the city block or Manhattan distance

    d₁(k, l) = |k| + |l|                                       (3.43)

and the Euclidean distance

    d₂(k, l) = √(k² + l²).                                     (3.44)

❖ The distance transform is then defined as

    D(i, j) = min_{k,l : b(k,l)=0} d(i − k, j − l),            (3.45)

i.e., it is the distance to the nearest background pixel whose value is 0.

❖ The D₁ city block distance transform can be efficiently computed using a forward and a backward pass of a simple raster-scan algorithm, as shown in Figure 3.22. During the forward pass, each non-zero pixel in b is replaced by the minimum of 1 + the distance of its north or west neighbor. During the backward pass, the same occurs, except that the minimum is both over the current value D and 1 + the distance of the south and east neighbors (Figure 3.22).

❖ MATLAB function: bwdist
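A one-line usage sketch (assuming the Image Processing Toolbox; note that bwdist measures the distance to the nearest *nonzero* pixel, so the complement of b gives the transform of (3.45)):

    D1 = bwdist(~b, 'cityblock');   % D1 city block distance transform of b
    D2 = bwdist(~b);                % D2 Euclidean distance (default method)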
EECS 4422/5323 Computer Vision
!36
J. Elder
Computing the Distance Transform

❖ City block
๏ Forward-backward two-pass raster scan
✦ Initialize:

    [m, n] = size(b);
    b = double(b);       % so that b can hold Inf
    b(b > 0) = Inf;      % foreground starts at infinity; background stays 0

✦ Forward pass (top-left to bottom-right):

    for j = 2:n
        if b(1,j) > 0
            b(1,j) = 1 + b(1,j-1);
        end
    end
    for i = 2:m
        if b(i,1) > 0
            b(i,1) = 1 + b(i-1,1);
        end
        for j = 2:n
            if b(i,j) > 0
                b(i,j) = 1 + min(b(i-1,j), b(i,j-1));
            end
        end
    end

✦ Backward pass (bottom-right to top-left):

    for j = n-1:-1:1
        if b(m,j) > 0
            b(m,j) = min(b(m,j), 1 + b(m,j+1));
        end
    end
    for i = m-1:-1:1
        if b(i,n) > 0
            b(i,n) = min(b(i,n), 1 + b(i+1,n));
        end
        for j = n-1:-1:1
            if b(i,j) > 0
                b(i,j) = min([b(i,j), 1 + b(i+1,j), 1 + b(i,j+1)]);
            end
        end
    end

[Figure 3.22 number grids omitted: input image, forward pass, backward pass, final distance transform]

Figure 3.22 City block distance transform: (a) original binary image; (b) top to bottom (forward) raster sweep: green values are used to compute the orange value; (c) bottom to top (backward) raster sweep: green values are merged with old orange value; (d) final distance transform.

❖ Efficiently computing the Euclidean distance transform is more complicated. Here, just keeping the minimum scalar distance to the boundary during the two passes is not sufficient. Instead, a vector-valued distance consisting of both the x and y coordinates of the distance to the boundary must be kept and compared using the squared distance (hypotenuse) rule. As well, larger search regions need to be used to obtain reasonable results. Rather than explaining the algorithm (Danielsson 1980; Borgefors 1986) in more detail, we leave it as an exercise for the motivated reader (Exercise 3.13).

❖ Figure 3.11g shows a distance transform computed from a binary image. Notice how the values grow away from the black (ink) regions and form ridges in the white area of the original image. Because of this linear growth from the starting boundary pixels, the distance transform is also sometimes known as the grassfire transform, since it describes the time at which a fire starting inside the black region would consume any given pixel, or a chamfer, because it resembles similar shapes used in woodworking and industrial design. The ridges in the distance transform become the skeleton (or medial axis transform (MAT)) of the region where the transform is computed, and consist of pixels that are of equal distance to two (or more) boundaries (Tek and Kimia 2003; Sebastian and Kimia 2005).

❖ A useful extension of the basic distance transform is the signed distance transform, which computes distances to boundary pixels for all the pixels (Lavallée and Szeliski 1995). The simplest way to create this is to compute the distance transforms for both the original binary image and its complement and to negate one of them before combining. Because such distance fields tend to be smooth, it is possible to store them more compactly (with minimal loss in relative accuracy) using a spline defined over a quadtree or octree data structure (Lavallée and Szeliski 1995; Szeliski and Lavallée 1996; Frisken, Perry, Rockwood et al. 2000).

EECS 4422/5323 Computer Vision
!37
J. Elder
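Following the construction described above (two unsigned transforms, one negated), a minimal sketch of the signed distance transform using bwdist; the sign convention (positive on the foreground) is a common choice, not mandated by the slides:

    % Sketch: signed distance transform from two unsigned transforms.
    % Assumes a binary image and the Image Processing Toolbox (bwdist).
    b = imread('text.png') > 0;     % any binary input image
    sd = bwdist(~b) - bwdist(b);    % positive on foreground pixels,
                                    % negative on background pixels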
Outline
❖ Point Operators
❖ Linear Filters
❖ Nonlinear Filters
EECS 4422/5323 Computer Vision
!38
J. Elder