Chapter 5
Segmentation
周立晴
d99922027
Movie Special Effects
Segmentation (1/2)
• 5.1 Active Contours
• 5.2 Split and Merge
• 5.3 Mean Shift and Mode Finding
• 5.4 Normalized Cuts
• 5.5 Graph Cuts and Energy-based Methods
Segmentation (2/2)
5.1 Active Contours
• Snakes
• Scissors
• Level Sets
5.1.1 Snakes
• Snakes are a two-dimensional generalization
of the 1D energy-minimizing splines.
• One may visualize the snake as a rubber
band of arbitrary shape that is deforming
with time trying to get as close as possible to
the object contour.
Snakes
• Minimize an energy functional with internal, image, and constraint terms
• The curve is represented by parametric equations v(s) = (x(s), y(s))
• Minimization leads to a PDE (an iterative update of the curve)
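For reference, the classical snake energy functional (shown only as a figure on the slide) has the form
E_snake = ∫₀¹ [ E_int(v(s)) + E_image(v(s)) + E_con(v(s)) ] ds
where v(s) = (x(s), y(s)) is the parametric curve.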
Snakes
• Internal spline energy (a contour/stretching term plus a curvature term):
o s: arc length
o v_s, v_ss: first-order and second-order derivatives of the curve
o α, β: first-order and second-order weighting functions
• Discretized form of internal spline energy (see below)
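Following the original snakes formulation (the slide shows these only as images), the internal energy and its discretized form are, up to constant factors,
E_int = ( α(s) ‖v_s(s)‖² + β(s) ‖v_ss(s)‖² ) / 2
E_int(i) = α_i ‖v_i − v_{i−1}‖² / (2h²) + β_i ‖v_{i−1} − 2v_i + v_{i+1}‖² / (2h⁴)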
Snakes
• External spline energy (image and constraint terms; see the form below):
o Line term: attracted to dark ridges
o Edge term: attracted to strong gradients
o Termination term: attracted to line terminations
• In practice, most systems only use the edge term, which can be estimated directly from the image gradient.
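A common way to write the overall image energy referenced above is
E_image = w_line · E_line + w_edge · E_edge + w_term · E_term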
Snakes
• Line Functional
o The simplest useful image functional is the image intensity itself: E_line = I(x, y)
o The snake will be attracted either to light lines or to dark lines depending on the sign of w_line.
Snakes
• Edge functional
o The snake is attracted to contours with large image gradients.
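In the original snakes formulation this edge functional is commonly written as
E_edge = −|∇I(x, y)|²
(or −|∇(G_σ ∗ I)|² when the image is first smoothed).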
Snakes
• Termination Function
o Use the curvature of level lines in a slightly smoothed image.
Snakes
• User-placed constraints can also be added.
o f: the snake points
o d: anchor points
Snakes
• To minimize the energy function, we must solve the discrete Euler equations,
where f_x(i) = ∂E_ext/∂x_i and f_y(i) = ∂E_ext/∂y_i.
Snakes
• The above Euler equations can be written in matrix
form as
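In the standard formulation (the slide shows it as an image), the matrix form is
A x + f_x(x, y) = 0
A y + f_y(x, y) = 0
where A is the pentadiagonal matrix arising from the discretized internal energy.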
Snakes
• Assume that f_x and f_y are constant during a time step.
• γ is the step size.
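Under this assumption the standard semi-implicit update is
x_t = (A + γI)⁻¹ ( γ x_{t−1} − f_x(x_{t−1}, y_{t−1}) )
y_t = (A + γI)⁻¹ ( γ y_{t−1} − f_y(x_{t−1}, y_{t−1}) )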
Snakes
• The matrix (A + γI) is a pentadiagonal banded matrix, so the linear system can be solved with a banded LU decomposition in O(n).
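A minimal sketch of one such update step, assuming constant α and β, a closed contour, and external forces fx, fy already sampled at the current control points (a dense solve is shown for clarity; a banded LU factorization makes it O(n)):

```python
import numpy as np

def snake_step(x, y, fx, fy, alpha=0.1, beta=0.1, gamma=1.0):
    """One semi-implicit update: x_t = (A + gamma*I)^-1 (gamma*x_{t-1} - fx).

    x, y, fx, fy are 1-D arrays of length n (closed contour assumed)."""
    n = len(x)
    A = np.zeros((n, n))
    for i in range(n):
        # Pentadiagonal stencil from the discretized internal energy.
        A[i, i] = 2 * alpha + 6 * beta
        A[i, (i - 1) % n] = A[i, (i + 1) % n] = -alpha - 4 * beta
        A[i, (i - 2) % n] = A[i, (i + 2) % n] = beta
    M = A + gamma * np.eye(n)
    # Dense solve for clarity; a banded LU solver gives the O(n) behavior.
    x_new = np.linalg.solve(M, gamma * x - fx)
    y_new = np.linalg.solve(M, gamma * y - fy)
    return x_new, y_new
```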
Snakes
• Because regular snakes have a tendency to shrink,
it is usually better to initialize them by drawing the
snake outside the object of interest to be tracked.
B-spline Approximations
• Snakes sometimes exhibit too many degrees of
freedom, making it more likely that they can get
trapped in local minima during their evolution.
• Use B-spline approximations to control the snake
with fewer degrees of freedom.
Shape Prior
5.1.2 Dynamic snakes and
CONDENSATION
• In many applications of active contours, the object
of interest is being tracked from frame to frame as it
deforms and evolves.
• In this case, it makes sense to use estimates from the previous frame to predict and constrain the new estimates.
Elastic Nets and Slippery Springs
• Application to the TSP (Traveling Salesman Problem):
Elastic Nets and Slippery Springs (cont’d)
• Probabilistic interpretation:
o i: each snake node
o j: each city
o σ: standard deviation of the Gaussian
o d_ij: Euclidean distance between a tour point f(i) and a city location d(j)
Elastic Nets and Slippery Springs (cont’d)
• The tour f(s) is initialized as a small circle around the
mean of the city points and σ is progressively
lowered.
• Slippery spring: this allows the association between
constraints (cities) and curve (tour) points to evolve
over time.
Snakes
5.1.3 Scissors
• Scissors draw an optimal curve (a minimum-cost path) that clings to high-contrast edges as the user traces a rough outline.
• Semi-automatic segmentation
• Algorithm:
o Step 1: Associate edges that are likely to be boundary elements.
o Step 2: Continuously recompute the lowest cost path between the
starting point and the current mouse location using Dijkstra’s algorithm.
Scissors
• Let l(p, q) represent the local cost on the directed link from pixel p to a neighboring pixel q:
l(p, q) = w_Z · f_Z(q) + w_D · f_D(p, q) + w_G · f_G(q)
• Weights of w_Z = 0.43, w_D = 0.43, and w_G = 0.14 seem to work well on a wide range of images.
Scissors
• f_Z: Laplacian zero-crossing
• ΔI = ∂²I/∂x² + ∂²I/∂y²
Scissors
• I_L(q) is the Laplacian of an image I at pixel q
Scissors
• f_G: gradient magnitude
• The gradient magnitude G is
G = √( (∂I/∂x)² + (∂I/∂y)² )
• The gradient is scaled and inverted so high
gradients produce low costs and vice-versa.
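One common normalization (as in the intelligent scissors formulation; the exact rescaling may differ) is
f_G(q) = 1 − G(q) / max(G)
so the strongest edges have zero cost.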
Scissors
• f_D: gradient direction
• Let D(p) be the unit vector perpendicular to the gradient direction at point p:
D(p) = ( I_y(p), −I_x(p) ) / √( I_x(p)² + I_y(p)² )
Scissors
• Cut along the gradient direction -> minimize angle
between cut direction and gradient direction
Scissors
• Solve optimal path by Dijkstra’s algorithm.
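As an illustration, a minimal sketch of this search over a pixel grid, assuming a hypothetical cost(p, q) function that implements the local cost l(p, q) above:

```python
import heapq

def dijkstra(cost, start, shape):
    """Shortest-path search from the seed pixel over an 8-connected grid.

    cost(p, q): local link cost for neighboring pixels p, q (user-supplied).
    Returns a predecessor map; follow it from the current mouse position
    back to the seed to obtain the optimal scissors path."""
    h, w = shape
    dist = {start: 0.0}
    prev = {}
    pq = [(0.0, start)]
    while pq:
        d, p = heapq.heappop(pq)
        if d > dist.get(p, float("inf")):
            continue  # stale queue entry
        y, x = p
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                if dy == 0 and dx == 0:
                    continue
                q = (y + dy, x + dx)
                if not (0 <= q[0] < h and 0 <= q[1] < w):
                    continue
                nd = d + cost(p, q)
                if nd < dist.get(q, float("inf")):
                    dist[q] = nd
                    prev[q] = p
                    heapq.heappush(pq, (nd, q))
    return prev
```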
Scissors
• Figure: worked example of Dijkstra’s algorithm, panels (a)–(e). Starting from the seed node s (cost 0), all other node costs are initialized to ∞ and are progressively relaxed to their shortest-path values as the search front expands.
Scissors
5.1.4 Level Sets
• If the active contour is based on a parametric curve of the form f(s), then as the shape changes dramatically (topology changes), curve reparameterization may also be required.
• Level sets use a 2D embedding function φ(x, t) instead of the curve f(s).
Level Sets
• Given an interface Γ in Rⁿ of codimension one, bounding an open region Ω, we wish to analyze and compute its subsequent motion under a velocity field v.
• The idea is to define a smooth function φ(x, t) that represents the interface as the set where φ(x, t) = 0.
• Here x = (x₁, ..., xₙ) ∈ Rⁿ.
Level Sets
• The level set function φ has the following properties:
φ(x, t) < 0 for x ∈ Ω
φ(x, t) = 0 for x ∈ ∂Ω = Γ(t)
φ(x, t) > 0 for x ∉ Ω
Level Sets
• Again, our contour is the zero level set: φ(x, t) = 0.
• The motion is analyzed by convecting the φ values with the velocity field v:
∂φ/∂t + v · ∇φ = 0
Level Sets
• Actually, only the normal component of v is needed:
v_N = v · ∇φ / |∇φ|
∂φ/∂t + v_N |∇φ| = 0
• Here |∇φ| = √( Σ_{i=1}^{n} φ_{x_i}² )
Level Sets
• An example is the geodesic active contour:
o g(I): snake edge potential (gradient)
o φ: signed distance function away from the curve
o div: divergence operator
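One common form of the geodesic active contour evolution (the slide shows it only as an image) is
∂φ/∂t = |∇φ| div( g(I) ∇φ / |∇φ| ) = g(I) |∇φ| κ + ∇g(I) · ∇φ
where κ = div(∇φ/|∇φ|) is the curvature of the level set.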
Level Sets
• The main goal of g(I) is to stop the evolving curve when it arrives at the object’s boundaries.
g = 1 / (1 + |∇Î|²)
• where Î is a smoothed version of I (Gaussian blur)
Level Sets
• Geometric interpretation of the attraction force in 1D: the original edge signal, its smoothed version, and the derived stopping function are shown; the evolving contour is attracted to the valley created by ∇g · ∇φ.
Level Sets
• According to g(I), the first term can straighten the curve
and the second term encourages the curve to migrate
towards minima of g(I).
• Level sets are still susceptible to local minima.
• An alternative approach is to use the energy
measurement inside and outside the segmented regions.
Level Sets
5.2 Split and Merge
• Watershed
• Region splitting and merging
• Graph-based Segmentation
• k-means clustering
• Mean Shift
5.2.1 Watershed
• An efficient way to compute such regions is to start
flooding the landscape at all of the local minima
and to label ridges wherever differently evolving
components meet.
• Watershed segmentation is often used with user-supplied manual marks corresponding to the centers of the desired components.
Watershed
5.2.2 Region Splitting
(Divisive Clustering)
• Step 1: Compute a histogram for the whole image.
• Step 2: Find a threshold that best separates the large peaks in the histogram.
• Step 3: Repeat until regions are either fairly uniform or below a certain size.
5.2.3 Region Merging
(Agglomerative Clustering)
• Various criteria for merging regions:
o Relative boundary lengths and the strength of the visible edges at these boundaries
o Distance between closest points and farthest points
o Average color difference, or merging of regions that are too small
5.2.4 Graph-based
Segmentation
• We define an undirected graph G = (V, E), where each image pixel p_i has a corresponding vertex v_i ∈ V.
• The edge set E is constructed by connecting pairs of pixels that are neighbors in an 8-connected sense.
• w((v_i, v_j)) = |I(v_i) − I(v_j)|
Graph-based
Segmentation
• This algorithm uses relative dissimilarities between
regions to determine which ones should be merged.
• Internal difference for any region R:
o MST(R): minimum spanning tree of R
o w(e): intensity differences of an edge in MST(R)
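Following Felzenszwalb and Huttenlocher, the internal difference referenced above is
Int(R) = max_{e ∈ MST(R, E)} w(e)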
Graph-based
Segmentation
• Difference between two adjacent regions:
• Minimum internal difference of these two regions:
o τ(R): heuristic region penalty
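These quantities are commonly defined (again following Felzenszwalb and Huttenlocher) as
Dif(R₁, R₂) = min_{v_i ∈ R₁, v_j ∈ R₂, (v_i, v_j) ∈ E} w((v_i, v_j))
MInt(R₁, R₂) = min( Int(R₁) + τ(R₁), Int(R₂) + τ(R₂) ),  with τ(R) = k / |R|
where k is a user-set scale parameter.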
Graph-based
Segmentation
• This algorithm uses relative dissimilarities between
regions to determine which ones should be merged.
• If Dif(R1, R2) < Mint(R1, R2) then merge these two
adjacent regions.
• The input is a graph G = (V, E), with n vertices and m edges.
• The output is a segmentation of V into components S = (C₁, ..., C_r).
Graph-based
Segmentation
1. Sort E into π = (o₁, ..., o_m) by non-decreasing edge weight.
2. Start with a segmentation S⁰, where each vertex v_i is in its own component.
3. Repeat Step 4 for q = 1, ..., m.
4. Construct S^q given S^(q−1) as follows. Let C_i^(q−1) be the component of S^(q−1) containing v_i and C_j^(q−1) the component containing v_j. If C_i^(q−1) ≠ C_j^(q−1) and w(o_q) ≤ MInt(C_i^(q−1), C_j^(q−1)), then S^q is obtained from S^(q−1) by merging C_i^(q−1) and C_j^(q−1). Otherwise S^q = S^(q−1).
5. Return S = S^m.
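A minimal sketch of this merging loop (not the authors’ implementation) using a union-find structure, assuming a precomputed edge list edges = [(w, i, j), ...] and the region penalty τ(R) = k/|R| with a hypothetical user parameter k:

```python
class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n
        self.internal = [0.0] * n  # Int(C): largest MST edge weight so far

    def find(self, a):
        while self.parent[a] != a:
            self.parent[a] = self.parent[self.parent[a]]  # path halving
            a = self.parent[a]
        return a

def segment(edges, n, k=300.0):
    uf = UnionFind(n)
    for w, i, j in sorted(edges):          # Step 1: non-decreasing weight
        a, b = uf.find(i), uf.find(j)
        if a == b:
            continue
        # Merge only if w <= MInt = min(Int(a)+tau(a), Int(b)+tau(b)).
        mint = min(uf.internal[a] + k / uf.size[a],
                   uf.internal[b] + k / uf.size[b])
        if w <= mint:
            uf.parent[b] = a
            uf.size[a] += uf.size[b]
            uf.internal[a] = max(uf.internal[a], uf.internal[b], w)
    return [uf.find(i) for i in range(n)]  # component label per vertex
```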
Graph-based
Segmentation
5.2.5 Probabilistic
Aggregation
Gray level similarity:
• Minimal external difference between Ri and Rj:
o Δi+ = min_k |Δik|
o Δij = Īi − Īj, where Īi and Īj are the average intensities of regions Ri and Rj, respectively
• Average intensity difference:
o Δi- = Σ_k(τik Δik) / Σ_k(τik), where τik is the boundary length between regions Ri and Rk
Probabilistic Aggregation
• The pairwise statistics σlocal+ and σlocal- are used to compute the likelihoods pij that two regions should be merged.
• pij = Lij+ / (Lij+ + Lij-)
• Lij± ∼ N(0, σij±)
• σij± = σlocal± + σscale
• σscale = σnoise / min(|Ωi|, |Ωj|)
• |Ωi| is the number of pixels in Ri
Probabilistic Aggregation
• Definition of strong coupling:
o C: a subset of V
o φ: usually set to 0.2
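The strong-coupling criterion is commonly written as: a node i is strongly coupled to C ⊂ V if
Σ_{j∈C} pij / Σ_{j∈V} pij > φ
with φ usually set to 0.2.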
Probabilistic Aggregation
Probabilistic Aggregation
5.3 Mean Shift and Mode
Finding
5.3.1 K-means
• K-means:
o Step 1: Guess centers. Given the number of clusters k to find, choose k samples as the initial cluster centers; call this set Y.
o Step 2: Given centers, find groups. With Y fixed, assign each sample to its nearest center, yielding the clustering U with the least squared error Emin.
o Step 3: Given groups, find new centers. With Y and U fixed, compute the squared error Emin’. If Emin = Emin’, stop: we have the final clusters.
o Step 4: Repeat until the centers do not change. If Emin ≠ Emin’, use U to compute new cluster centers Y’. Go to Step 2 and find a new clustering U’, iteratively.
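A minimal sketch of these four steps, assuming the samples are rows of an array X (e.g., pixel colors):

```python
import numpy as np

def kmeans(X, k, iters=100, seed=0):
    rng = np.random.default_rng(seed)
    Y = X[rng.choice(len(X), k, replace=False)]      # Step 1: guess centers
    for _ in range(iters):
        d = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
        U = d.argmin(1)                               # Step 2: assign groups
        Y_new = np.array([X[U == c].mean(0) if np.any(U == c) else Y[c]
                          for c in range(k)])         # Step 3: new centers
        if np.allclose(Y_new, Y):                     # Step 4: stop if stable
            break
        Y = Y_new
    return Y, U
```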
5.3.2 Mean Shift
• Mean shift segmentation is the inverse of the
watershed algorithm => find the peaks (modes) and
then expand the region.
Mean Shift
• Step 1: Use kernel density estimation to estimate the
density function given a sparse set of samples.
o f(x): density function
o x_i: input samples
o k(r): kernel function or Parzen window
o h: width of the kernel
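The kernel density estimate referenced in Step 1 is commonly written as
f(x) = Σ_i k( ‖x − x_i‖² / h² )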
Mean Shift
• Step 2: Starting at some guess for a local maximum
yk, mean shift computes the gradient of the density
estimate f(x) at yk and takes an uphill step in that
direction.
• G(x) = −k′(‖x‖²)
Mean Shift
The location of y_k at each iteration can be expressed by the following formula:
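A standard form of this mean-shift update (the slide shows it as an image) is
y_{k+1} = [ Σ_i x_i G( (y_k − x_i) / h ) ] / [ Σ_i G( (y_k − x_i) / h ) ]
i.e., a kernel-weighted mean of the samples near y_k.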
Repeat Step 2 until convergence or for a finite number of steps.
• Step 3: The remaining points can then be classified
based on the nearest evolution path.
Mean Shift
Mean Shift
• Other kernels can also be used:
o Epanechnikov kernel (converges in a finite number of steps)
o Gaussian (normal) kernel (slower, but gives better results)
Mean Shift
• Joint domain: use spatial domain and range
domain to segment color image.
• Kernel of joint domain (five-dimensional):
o xr: (L*, u*, v*) in range domain
o xs: (x, y) in spatial domain
o hr, hs: color and spatial widths of kernel
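The joint-domain kernel is commonly written (following Comaniciu and Meer, assuming a three-dimensional color range) as
K_{hs,hr}(x) = ( C / (hs² hr³) ) · k( ‖x^s / hs‖² ) · k( ‖x^r / hr‖² )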
Mean Shift
o M: minimum region size; regions with fewer pixels than M are eliminated
Intuitive Description
• Figure (animation over several slides): a distribution of identical billiard balls, with a region of interest, its center of mass, and the mean shift vector marked at each step.
• Objective: find the densest region.
5.4 Normalized Cuts
• Normalized cuts examine the affinities between
nearby pixels and try to separate groups that are
connected by weak affinities.
• Not a plain minimum cut, which would tend to isolate single pixels.
Normalized Cuts
• To find the minimum cut between two groups A and B (see below):
• A better measure of segmentation is to find the minimum normalized cut (see below):
o assoc(A, V) = Σ_{i∈A, j∈V} w_ij
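The cut and normalized-cut measures referenced above are, following Shi and Malik,
cut(A, B) = Σ_{i∈A, j∈B} w_ij
Ncut(A, B) = cut(A, B) / assoc(A, V) + cut(A, B) / assoc(B, V)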
5.4 Normalized Cuts
• Pixel-wise affinity weight for pixels within a radius ‖x_i − x_j‖ < r:
o Fi, Fj: feature vectors that consist of intensities, colors, or oriented filter
histograms
o xi, xj: pixel locations
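The affinity weight itself typically takes the form (shown as an image on the slide)
w_ij = exp( −‖F_i − F_j‖² / σ_F² − ‖x_i − x_j‖² / σ_s² )  if ‖x_i − x_j‖ < r,  and 0 otherwise.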
Normalized Cuts
Normalized Cuts
• But computing the optimal normalized cut is NP-complete. The following is a faster (approximate) method.
• Minimizing the normalized cut can be expressed as a Rayleigh quotient:
o x is the indicator vector where xi = +1 iff i ∈ A and xi = -1 iff i ∈ B.
o y = ((1 + x) - b(1 - x)) / 2
• (Compare with the standard Rayleigh quotient, whose global minimum/maximum can be found by solving an eigenvalue problem.)
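The quotient itself has the standard form
min_y  y^T (D − W) y / (y^T D y)
subject to y^T D 1 = 0, with y taking values in {1, −b}; relaxing y to real values turns this into an eigenvalue problem.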
Normalized Cuts
o x is the indicator vector where xi = +1 iff i ∈ A and xi = -1 iff i ∈ B.
o y = ((1 + x) - b(1 - x)) / 2
• Example with seven nodes (nodes with xi = +1 belong to A, the rest to B):

node   x     1+x   -b(1-x)   y
1      +1    2     0         1
2      -1    0     -2b       -b
3      +1    2     0         1
4      +1    2     0         1
5      -1    0     -2b       -b
6      -1    0     -2b       -b
7      +1    2     0         1
Normalized Cuts
o x is the indicator vector where xi = +1 iff i ∈ A and xi = -1 iff i ∈ B.
o y = ((1 + x) - b(1 - x)) / 2
o W: weight matrix [wij]
o D: diagonal matrix whose diagonal entries are the corresponding row sums of W
• It is equivalent to solving a regular eigenvalue
problem:
o N = D^(-1/2) W D^(-1/2), called the normalized affinity matrix
o z = D^(1/2) y
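Concretely (a standard restatement): the generalized eigensystem (D − W) y = λ D y becomes the regular eigenvalue problem (I − N) z = λ z with z = D^(1/2) y, and the eigenvector with the second-smallest eigenvalue gives the relaxed normalized-cut solution.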
Normalized Cuts
Normalized Cuts
5.5 Graph Cuts
Graph Cuts
Graph Cuts
Graph Cuts
• Assume that O and B denote the subsets of pixels marked as “object” and “background” seeds.
• Define G = (V, E).
• V = P ∪ {S, T}, where P is the set of nodes corresponding to pixels of the image.
• Each pair of neighboring pixels (p, q) in N is connected by an 8-connected neighborhood link.
• E = N ∪ ⋃_{p∈P} { (p, S), (p, T) }
Graph Cuts
• w(p, q) = B(p, q) for (p, q) ∈ N   … (weights between pixels)
• B(p, q) ∝ exp( −(I_p − I_q)² / (2σ²) ) · 1 / dist(p, q)
• w(p, S) = λ · R_p(bkg) for p ∈ P, p ∉ O ∪ B;   K for p ∈ O;   0 for p ∈ B
• w(p, T) = λ · R_p(obj) for p ∈ P, p ∉ O ∪ B;   0 for p ∈ O;   K for p ∈ B
• K = 1 + max_{p∈P} Σ_{q:(p,q)∈N} B(p, q)
• λ is a user-defined variable
• R_p(obj) = −ln Pr(I_p | O)
• R_p(bkg) = −ln Pr(I_p | B)
Graph Cuts
• Finding the minimum cut on G = (V, E) segments the image into two partitions: “object” and “background”.
Graph Cuts (min-cut)
• Min-cut is the same as max-flow.
• Figure: worked max-flow example on a small network with source S, sink T, and intermediate nodes a–e. Edge labels show flow/capacity; successive slides push flow along augmenting paths until no augmenting path remains, and the saturated edges then form the minimum cut.
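To make the min-cut / max-flow connection concrete, a minimal Edmonds-Karp sketch (the example capacities in the comment are illustrative, not those from the slide figure):

```python
from collections import deque

def max_flow(cap, s, t):
    """cap: dict[u][v] -> capacity. Returns the max flow value from s to t."""
    flow = 0
    residual = {u: dict(vs) for u, vs in cap.items()}
    for u, vs in cap.items():
        for v in vs:
            residual.setdefault(v, {}).setdefault(u, 0)  # reverse edges
    while True:
        # BFS for a shortest augmenting path in the residual graph.
        prev = {s: None}
        q = deque([s])
        while q and t not in prev:
            u = q.popleft()
            for v, c in residual[u].items():
                if c > 0 and v not in prev:
                    prev[v] = u
                    q.append(v)
        if t not in prev:
            return flow  # no augmenting path left: flow value = min cut
        # Find the bottleneck along the path, then push flow.
        bottleneck, v = float("inf"), t
        while prev[v] is not None:
            bottleneck = min(bottleneck, residual[prev[v]][v])
            v = prev[v]
        v = t
        while prev[v] is not None:
            residual[prev[v]][v] -= bottleneck
            residual[v][prev[v]] += bottleneck
            v = prev[v]
        flow += bottleneck

# e.g. max_flow({'S': {'a': 3, 'b': 5}, 'a': {'T': 4}, 'b': {'T': 2}}, 'S', 'T')
```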
Review
• 5.1 Active Contours
o Snake
o Scissors
o Level sets
• 5.2 Split and Merge
o Watershed
o Region splitting (histogram)
o Region merging
o Graph-based
o K-means
• 5.3 Mean Shift and Mode Finding
• 5.4 Normalized Cuts
• 5.5 Graph Cuts and Energy-based Methods
Representations
• Image
o Surface function in 3D
o Graph
o Set of values (histogram)
• Segments
o Line/path
o Surface function in 3D
o Groups of Nodes
o Clusters
Segmentation Criteria
• Object Edge
o Gradient, Laplacian
• Location
o Connected shape
o close proximity
• Shape of segment
o Derivative of line
• Similarity within segment
o K-means, mean-shift
o Weights for Normalized cuts, Graph cut
• Difference with external area
• User-defined criteria
Optimization of Criteria
• PDE: Gradient-descent / Calculus of variations
• Dijkstra’s algorithm
• K-means
• Mean-shift
• Rayleigh’s Quotient
• Min-cut
Evaluation
• The Berkeley Segmentation Dataset
Medical Segmentation
• END