Chapter 5 Segmentation
Movie Special Effects
d99922027

Segmentation
• 5.1 Active Contours
• 5.2 Split and Merge
• 5.3 Mean Shift and Mode Finding
• 5.4 Normalized Cuts
• 5.5 Graph Cuts and Energy-based Methods

5.1 Active Contours
• Snakes
• Scissors
• Level Sets

5.1.1 Snakes
• Snakes are a two-dimensional generalization of the 1D energy-minimizing splines.
• One may visualize the snake as a rubber band of arbitrary shape that deforms over time, trying to get as close as possible to the object contour.

Snakes
• Snakes minimize an energy functional defined over a parametric curve.
• The minimization leads to a partial differential equation (PDE).

Snakes
• Internal spline energy: contour curvature
  E_int = ∫ α(s) |v_s(s)|² + β(s) |v_ss(s)|² ds
o s: arc length
o v_s, v_ss: first-order and second-order derivatives of the curve
o α, β: first-order and second-order weighting functions
• Discretized form of the internal spline energy:
  E_int = Σ_i α(i) |f(i+1) − f(i)|² / h² + β(i) |f(i+1) − 2f(i) + f(i−1)|² / h⁴

Snakes
• External spline energy: image constraints
o Line term: attracts the snake to dark ridges
o Edge term: attracts the snake to strong gradients
o Termination term: attracts the snake to line terminations
• In practice, most systems only use the edge term, which can be estimated directly from the image gradient.

Snakes
• Line functional
o The simplest useful image functional is the image intensity itself:
  E_line = I(x, y)
o The snake is attracted either to light lines or to dark lines, depending on the sign of w_line.

Snakes
• Edge functional
o The snake is attracted to contours with large image gradients:
  E_edge = −|∇I(x, y)|²

Snakes
• Termination functional
o Uses the curvature of level lines in a slightly smoothed image.

Snakes
• User-placed constraints, such as springs between snake points and anchor points, can also be added.
o f: the snake points
o d: anchor points

Snakes
• To minimize the energy function, we must solve the Euler equations
  A x + f_x(x, y) = 0 and A y + f_y(x, y) = 0
• where f_x(i) = ∂E_ext/∂x_i and f_y(i) = ∂E_ext/∂y_i.

Snakes
• The above Euler equations can be written in matrix form and solved with a semi-implicit step:
  x_t = (A + γI)⁻¹ (γ x_{t−1} − f_x(x_{t−1}, y_{t−1}))
  y_t = (A + γI)⁻¹ (γ y_{t−1} − f_y(x_{t−1}, y_{t−1}))
• Assume that f_x and f_y are constant during a time step.
• γ is the step size.

Snakes
• The matrix (A + γI) is a pentadiagonal banded matrix, so the linear system can be solved by LU decomposition in O(n).

Snakes
• Because regular snakes have a tendency to shrink, it is usually better to initialize them by drawing the snake outside the object of interest to be tracked.

B-spline Approximations
• Snakes sometimes exhibit too many degrees of freedom, making it more likely that they get trapped in local minima during their evolution.
• B-spline approximations control the snake with fewer degrees of freedom.

Shape Prior

5.1.2 Dynamic Snakes and CONDENSATION
• In many applications of active contours, the object of interest is tracked from frame to frame as it deforms and evolves.
• In this case, it makes sense to use estimates from the previous frame to predict and constrain the new estimates.
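The semi-implicit snake update described above can be sketched in a few lines of numpy. This is a minimal sketch, assuming constant α and β, a closed contour, and a precomputed external force field (fx, fy); the function name and defaults are illustrative:

```python
import numpy as np

def snake_step(x, y, fx, fy, alpha=1.0, beta=1.0, gamma=1.0):
    """One semi-implicit snake iteration:
    x_t = (A + gamma*I)^-1 (gamma*x_prev - fx), likewise for y."""
    n = len(x)
    A = np.zeros((n, n))
    for i in range(n):
        # Pentadiagonal stencil from the discretized internal energy
        # (first-order membrane term alpha, second-order thin-plate term beta),
        # with wrap-around indices for a closed contour.
        A[i, i] += 2 * alpha + 6 * beta
        A[i, (i - 1) % n] += -alpha - 4 * beta
        A[i, (i + 1) % n] += -alpha - 4 * beta
        A[i, (i - 2) % n] += beta
        A[i, (i + 2) % n] += beta
    M = A + gamma * np.eye(n)
    # A banded LU solver would make this O(n); a dense solve is shown for brevity.
    return np.linalg.solve(M, gamma * x - fx), np.linalg.solve(M, gamma * y - fy)
```

With zero external force the internal energy dominates, so a free snake slowly shrinks toward its centroid. This is exactly the behavior that motivates initializing the snake outside the object of interest.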
Elastic Nets and Slippery Springs
• Applying snakes to the TSP (Traveling Salesman Problem): an elastic net is a closed tour attracted to the city locations.

Elastic Nets and Slippery Springs (cont'd)
• Probabilistic interpretation:
o i: each snake node
o j: each city
o σ: standard deviation of the Gaussian
o d_ij: Euclidean distance between a tour point f(i) and a city location d(j)

Elastic Nets and Slippery Springs (cont'd)
• The tour f(s) is initialized as a small circle around the mean of the city points, and σ is progressively lowered.
• Slippery springs: the association between constraints (cities) and curve (tour) points is allowed to evolve over time.

5.1.3 Scissors
• Scissors draw a better curve (an optimal path) that clings to high-contrast edges as the user draws a rough outline.
• Semi-automatic segmentation.
• Algorithm:
o Step 1: Associate edges that are likely to be boundary elements.
o Step 2: Continuously recompute the lowest-cost path between the starting point and the current mouse location using Dijkstra's algorithm.

Scissors
• Let l(p, q) represent the local cost on the directed link from pixel p to a neighboring pixel q:
  l(p, q) = w_Z · f_Z(q) + w_D · f_D(p, q) + w_G · f_G(q)
• Weights of w_Z = 0.43, w_D = 0.43, and w_G = 0.14 seem to work well on a wide range of images.

Scissors
• f_Z: Laplacian zero-crossing
  ∇²I = ∂²I/∂x² + ∂²I/∂y²
• I_L(q) is the Laplacian of image I at pixel q; f_Z(q) is low at zero-crossings of I_L, which are likely edge locations.

Scissors
• f_G: gradient magnitude
• The gradient magnitude is
  G = sqrt( (∂I/∂x)² + (∂I/∂y)² )
• The gradient is scaled and inverted so that high gradients produce low costs and vice versa:
  f_G = 1 − G / max(G)

Scissors
• f_D: gradient direction
• Let D(p) be the unit vector perpendicular to the gradient direction at point p:
  D(p) = ( I_y(p), −I_x(p) ) / sqrt( I_x(p)² + I_y(p)² )
• Cutting along the gradient direction minimizes the angle between the cut direction and the gradient direction, so f_D penalizes sharp changes in boundary direction.

Scissors (recap)
• Let l(p, q) represent the local cost on the directed link from pixel p to a neighboring pixel q:
  l(p, q) = w_Z · f_Z(q) + w_D · f_D(p, q) + w_G · f_G(q)

Scissors
• Solve for the optimal path with Dijkstra's algorithm.
• Worked example: panels (a)-(e) of the figure show Dijkstra's algorithm expanding from the seed s over a small grid of link costs, replacing each node's cumulative cost (initially ∞) until the lowest-cost paths to all nodes are known.

5.1.4 Level Sets
• If active contours are based on parametric curves of the form f(s), then when the shape changes dramatically (topology changes), curve reparameterization may also be required.
• Level sets use a 2D embedding function φ(x, t) instead of the curve f(s).

Level Sets
• Given an interface Γ in R^n of codimension one, bounding an open region Ω, we wish to analyze and compute its subsequent motion under a velocity field v.
• The idea is to define a smooth function φ(x, t) that represents the interface as the set where φ(x, t) = 0.
• Here x = (x_1, ..., x_n) ∈ R^n.

Level Sets
• The level set function φ has the following properties:
  φ(x, t) < 0 for x ∈ Ω
  φ(x, t) = 0 for x ∈ ∂Ω = Γ(t)
  φ(x, t) > 0 for x ∉ Ω

Level Sets
• Again, our contour is the zero level set φ(x, t) = 0.
• The motion is analyzed by convecting the φ values with the velocity field v:
  ∂φ/∂t + v · ∇φ = 0

Level Sets
• Actually, only the normal component of v is needed:
  v_n = v · ∇φ / |∇φ|
  ∂φ/∂t + v_n |∇φ| = 0
• Here |∇φ| = ( Σ_{i=1}^n φ_{x_i}² )^{1/2}.

Level Sets
• An example is the geodesic active contour:
  ∂φ/∂t = |∇φ| div( g(I) ∇φ / |∇φ| )
o g(I): snake edge potential (gradient-based)
o φ: signed distance function away from the curve
o div: divergence

Level Sets
• The main goal of g(I) is to stop the evolving curve when it arrives at the object boundaries.
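The edge-stopping behavior of g(I) can be sketched in numpy. A minimal sketch, assuming a small separable Gaussian blur and the common choice g = 1 / (1 + |∇Î|²); the function name and σ are illustrative:

```python
import numpy as np

def edge_stopping(I, sigma=1.0):
    """Compute g = 1 / (1 + |grad(I_smoothed)|^2) for a 2D image I."""
    # Crude separable Gaussian blur with a small fixed-radius kernel.
    r = int(3 * sigma)
    x = np.arange(-r, r + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    k /= k.sum()
    blurred = np.apply_along_axis(lambda m: np.convolve(m, k, mode='same'), 0, I)
    blurred = np.apply_along_axis(lambda m: np.convolve(m, k, mode='same'), 1, blurred)
    # Finite-difference gradient of the smoothed image.
    gy, gx = np.gradient(blurred)
    # g is near 1 in flat regions and near 0 on strong edges,
    # which is what halts the evolving level-set contour.
    return 1.0 / (1.0 + gx**2 + gy**2)
```

On a step edge, g stays close to 1 in the flat regions and drops sharply at the discontinuity, creating the "valley" that attracts and stops the contour.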
  g = 1 / (1 + |∇Î|²)
• where Î is a smoothed (Gaussian-blurred) version of I.

Level Sets
• Geometric interpretation of the attraction force in 1D: given the original edge signal, its smoothed version, and the derived stopping function, the evolving contour is attracted to the valley created by ∇g · ∇φ.

Level Sets
• Through g(I), the first term straightens the curve and the second term encourages the curve to migrate towards minima of g(I).
• Level sets are still susceptible to local minima.
• An alternative approach is to use an energy measured inside and outside the segmented regions.

5.2 Split and Merge
• Watershed
• Region splitting and merging
• Graph-based segmentation
• k-means clustering
• Mean shift

5.2.1 Watershed
• An efficient way to compute watershed regions is to start flooding the landscape at all of the local minima and to label ridges wherever differently evolving components meet.
• Watershed segmentation is often used with user-placed marks corresponding to the centers of the desired components.

5.2.2 Region Splitting (Divisive Clustering)
• Step 1: Compute a histogram for the whole image.
• Step 2: Find a threshold that best separates the large peaks in the histogram.
• Step 3: Repeat on each region until the regions are either fairly uniform or below a certain size.

5.2.3 Region Merging (Agglomerative Clustering)
• Possible criteria for merging regions:
o Relative boundary lengths and the strength of the visible edges at these boundaries
o Distance between closest points and farthest points
o Average color difference, or merging regions that are too small

5.2.4 Graph-based Segmentation
• Define an undirected graph G = (V, E), where each image pixel p_i has a corresponding vertex v_i ∈ V.
• The edge set E is constructed by connecting pairs of pixels that are neighbors in an 8-connected sense.
• Edge weight: w(v_i, v_j) = |I(v_i) − I(v_j)|

Graph-based Segmentation
• This algorithm uses relative dissimilarities between regions to determine which ones should be merged.
• Internal difference of a region R:
  Int(R) = max_{e ∈ MST(R)} w(e)
o MST(R): minimum spanning tree of R
o w(e): intensity difference across an edge in MST(R)

Graph-based Segmentation
• Difference between two adjacent regions:
  Dif(R1, R2) = min_{v_i ∈ R1, v_j ∈ R2, (v_i, v_j) ∈ E} w(v_i, v_j)
• Minimum internal difference of the two regions:
  MInt(R1, R2) = min( Int(R1) + τ(R1), Int(R2) + τ(R2) )
o τ(R): heuristic region penalty, e.g. τ(R) = k / |R|

Graph-based Segmentation
• If Dif(R1, R2) < MInt(R1, R2), merge the two adjacent regions.
• The input is a graph G = (V, E) with n vertices and m edges.
• The output is a segmentation of V into components S = (C_1, ..., C_r).

Graph-based Segmentation
1. Sort E into π = (o_1, ..., o_m) by non-decreasing edge weight.
2. Start with a segmentation S^0, where each vertex v_i is in its own component.
3. Repeat Step 4 for q = 1, ..., m.
4. Construct S^q from S^{q−1} as follows. Let o_q = (v_i, v_j), let C_i^{q−1} be the component of S^{q−1} containing v_i, and let C_j^{q−1} be the component containing v_j. If C_i^{q−1} ≠ C_j^{q−1} and w(o_q) ≤ MInt(C_i^{q−1}, C_j^{q−1}), then S^q is obtained from S^{q−1} by merging C_i^{q−1} and C_j^{q−1}. Otherwise S^q = S^{q−1}.
5. Return S = S^m.

5.2.5 Probabilistic Aggregation
• Gray-level similarity:
• Minimal external difference between R_i and its neighbors:
o Δ_i^+ = min_k |Δ_ik|
o Δ_ik = Ī_i − Ī_k, where Ī_i and Ī_k are the average intensities of regions R_i and R_k, respectively
• Average intensity difference:
o Δ_i^− = Σ_k (τ_ik Δ_ik) / Σ_k (τ_ik), where τ_ik is the boundary length between regions R_i and R_k
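The graph-based merging algorithm above can be sketched with a union-find structure. A minimal sketch, assuming scalar intensity-difference weights and the penalty τ(R) = k/|R|, where k is a hypothetical tuning constant:

```python
class DisjointSet:
    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n
        self.int_diff = [0.0] * n  # Int(C): max MST edge weight seen so far

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b, w):
        a, b = self.find(a), self.find(b)
        if self.size[a] < self.size[b]:
            a, b = b, a
        self.parent[b] = a
        self.size[a] += self.size[b]
        # Edges arrive in sorted order, so w is the largest MST edge
        # of the merged component, i.e. its new Int(C).
        self.int_diff[a] = w

def segment(n, edges, k=300.0):
    """edges: list of (weight, i, j). Merge whenever w <= MInt(C_i, C_j)."""
    ds = DisjointSet(n)
    for w, i, j in sorted(edges):
        a, b = ds.find(i), ds.find(j)
        if a != b:
            mint = min(ds.int_diff[a] + k / ds.size[a],
                       ds.int_diff[b] + k / ds.size[b])
            if w <= mint:
                ds.union(a, b, w)
    return ds
```

Processing edges in non-decreasing order is what makes the bookkeeping cheap: the current edge weight is automatically the maximum MST edge of the merged component, so Int(C) never has to be recomputed.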
• p_ij = L_ij^+ / (L_ij^+ + L_ij^−)
• L_ij^± ∼ N(0, σ_ij^±)
• σ_ij^± = σ_local^± + σ_scale^±
• σ_scale^± depends on min(|Ω_i|, |Ω_j|), the size of the smaller region
• |Ω_i| is the number of pixels in R_i

Probabilistic Aggregation
• Definition of strong coupling: a node i is strongly coupled to a subset C of V if
  Σ_{j∈C} p_ij / Σ_{j∈V} p_ij > φ
o C: a subset of V
o φ: usually set to 0.2

5.3 Mean Shift and Mode Finding

5.3.1 K-means
• K-means:
o Step 1: Guess centers. Given the number of clusters k it is supposed to find, choose k samples as the cluster centers; call the set of centers Y.
o Step 2: Given centers, find groups. With Y fixed, compute the squared error for all pixels to obtain the clustering U with the least squared error E_min.
o Step 3: Given groups, find new centers. With Y and U fixed, compute the squared error E_min'. If E_min = E_min', stop; we have the final clusters.
o Step 4: Repeat until the centers do not change. If E_min ≠ E_min', use U to find new cluster centers Y'. Go to Step 2 and find a new clustering U', iteratively.

5.3.2 Mean Shift
• Mean shift segmentation is the inverse of the watershed algorithm: find the peaks (modes) and then expand the regions.

Mean Shift
• Step 1: Use kernel density estimation to estimate the density function given a sparse set of samples:
  f(x) = Σ_i k( ||x − x_i||² / h² )
o f(x): density function
o x_i: input samples
o k(r): kernel function (Parzen window)
o h: width of the kernel

Mean Shift
• Step 2: Starting at some guess y_k for a local maximum, mean shift computes the gradient of the density estimate f(x) at y_k and takes an uphill step in that direction, using
  g(r) = −k'(r)
• The location of y_k is updated by the weighted mean
  y_{k+1} = Σ_i x_i g(||y_k − x_i||² / h²) / Σ_i g(||y_k − x_i||² / h²)
• Repeat Step 2 until convergence, or for a finite number of steps.
• Step 3: The remaining points can then be classified based on the nearest evolution path.
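The Step 2 iteration above can be sketched with a Gaussian kernel, for which the derived weight g is also Gaussian. A minimal sketch; the function name, bandwidth h, and convergence tolerance are illustrative:

```python
import numpy as np

def mean_shift_mode(x0, samples, h=1.0, iters=50):
    """Follow the mean-shift vector from x0 to a density mode."""
    y = np.asarray(x0, dtype=float)
    samples = np.asarray(samples, dtype=float)
    for _ in range(iters):
        # Gaussian weights g(||y - x_i||^2 / h^2) for every sample.
        w = np.exp(-np.sum((samples - y) ** 2, axis=1) / (2 * h ** 2))
        # Weighted mean of the samples = next position y_{k+1}.
        y_new = (w[:, None] * samples).sum(axis=0) / w.sum()
        if np.linalg.norm(y_new - y) < 1e-6:
            break
        y = y_new
    return y
```

Starting several points and grouping those whose paths converge to the same mode gives the segmentation described in Step 3.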
Mean Shift
• Commonly used kernels:
o Epanechnikov kernel (converges in a finite number of steps)
o Gaussian (normal) kernel (slower, but better results)

Mean Shift
• Joint domain: use the spatial domain and the range domain together to segment color images.
• Kernel of the joint domain (five-dimensional):
o x_r: (L*, u*, v*) in the range domain
o x_s: (x, y) in the spatial domain
o h_r, h_s: color and spatial widths of the kernel
o M: regions with fewer pixels than the threshold M are eliminated

Intuitive Description
• A sequence of slides illustrates mean shift on a distribution of identical billiard balls: a region of interest is placed, its center of mass is computed, and the mean-shift vector moves the region toward the center of mass; repeating this finds the densest region.

5.4 Normalized Cuts
• Normalized cuts examine the affinities between nearby pixels and try to separate groups that are connected by weak affinities.
• Not a plain min-cut, which would otherwise isolate single pixels.

Normalized Cuts
• The minimum cut between two groups A and B minimizes
  cut(A, B) = Σ_{i∈A, j∈B} w_ij
• A better measure of segmentation is the minimum normalized cut:
  Ncut(A, B) = cut(A, B)/assoc(A, V) + cut(A, B)/assoc(B, V)
o assoc(A, V) = Σ_{i∈A, j∈V} w_ij

Normalized Cuts
• Pixel-wise affinity weight for pixels within a radius ||x_i − x_j|| < r:
  w_ij = exp( −||F_i − F_j||²/σ_F² − ||x_i − x_j||²/σ_s² )
o F_i, F_j: feature vectors consisting of intensities, colors, or oriented filter histograms
o x_i, x_j: pixel locations

Normalized Cuts
• Computing the optimal normalized cut is NP-complete; the following is a faster approximate method.
• Minimizing the normalized cut can be expressed as a Rayleigh quotient:
  min_x Ncut(x) = min_y yᵀ(D − W)y / (yᵀ D y)
o x: indicator vector with x_i = +1 iff i ∈ A and x_i = −1 iff i ∈ B
o y = ((1 + x) − b(1 − x)) / 2
o W: weight matrix [w_ij]
o D: diagonal matrix whose diagonal entries are the corresponding row sums of W
• Compare with the standard Rayleigh quotient, whose global minimum/maximum can be solved by eigen-decomposition.

Normalized Cuts
• Example of the indicator vector x and the derived vector y, for A = {1, 3, 4, 7} and B = {2, 5, 6}:

  node | x  | 1+x | −b(1−x) | y
   1   | +1 |  2  |    0    |  1
   2   | −1 |  0  |  −2b    | −b
   3   | +1 |  2  |    0    |  1
   4   | +1 |  2  |    0    |  1
   5   | −1 |  0  |  −2b    | −b
   6   | −1 |  0  |  −2b    | −b
   7   | +1 |  2  |    0    |  1

Normalized Cuts
• It is equivalent to solving a regular eigenvalue problem:
  N z = λ z
o N = D^{−1/2} W D^{−1/2}, called the normalized affinity matrix
o z = D^{1/2} y

5.5 Graph Cuts

Graph Cuts
• Let O and B denote the subsets of pixels marked as "object" and "background" seeds.
• Define G = (V, E).
• V = P ∪ {S, T}, where P is the set of nodes corresponding to pixels of the image.
• Each pair of neighboring pixels (p, q) in N is connected by an n-link (8-connected neighborhood).
• E = N ∪ ⋃_{p∈P} { (S, p), (p, T) }

Graph Cuts
• n-link weights between pixels:
  w(p, q) = B(p, q) for (p, q) ∈ N
  B(p, q) ∝ exp( −(I_p − I_q)² / (2σ²) ) · 1/dist(p, q)
• t-link weights to the source S:
  w(S, p) = λ · R_p("bkg")  if p ∈ P, p ∉ O ∪ B
  w(S, p) = K               if p ∈ O
  w(S, p) = 0               if p ∈ B
• t-link weights to the sink T:
  w(p, T) = λ · R_p("obj")  if p ∈ P, p ∉ O ∪ B
  w(p, T) = 0               if p ∈ O
  w(p, T) = K               if p ∈ B
• K = 1 + max_{p∈P} Σ_{q:(p,q)∈N} B(p, q)
• λ is a user-defined variable
• R_p("obj") = −ln Pr(I_p | O)
• R_p("bkg") = −ln Pr(I_p | B)

Graph Cuts
• Finding the minimum cut on G = (V, E) segments the image into "object" and "background" partitions.

Graph Cuts (min-cut)
• Min-cut is the same as max-flow.
• Worked example: a sequence of slides runs max-flow on a small graph from source S to sink T, pushing flow along augmenting paths (edge labels show flow/capacity) until no augmenting path remains; the saturated edges then define the minimum cut.

Review
• 5.1 Active Contours
o Snakes
o Scissors
o Level sets
• 5.2 Split and Merge
o Watershed
o Region splitting (histogram)
o Region merging
o Graph-based
o K-means
• 5.3 Mean Shift and Mode Finding
• 5.4 Normalized Cuts
• 5.5 Graph Cuts and Energy-based Methods

Representations
• Image
o Surface function in 3D
o Graph
o Set of values (histogram)
• Segments
o Line/path
o Surface function in 3D
o Groups of nodes
o Clusters

Segmentation Criteria
• Object edge
o Gradient, Laplacian
• Location
o Connected shape
o Close proximity
• Shape of segment
o Derivative of line
• Similarity within segment
o K-means, mean shift
o Weights for normalized cuts, graph cuts
• Difference with external area
• User-defined criteria

Optimization of Criteria
• PDE: gradient descent / calculus of variations
• Dijkstra's algorithm
• K-means
• Mean shift
• Rayleigh quotient
• Min-cut

Evaluation
• The Berkeley Segmentation Dataset

Medical Segmentation

• END