Chapter 5
Segmentation
周立晴
d99922027
Movie Special Effects
Segmentation (1/2)
• 5.1 Active Contours
• 5.2 Split and Merge
• 5.3 Mean Shift and Mode Finding
• 5.4 Normalized Cuts
• 5.5 Graph Cuts and Energy-based Methods
Segmentation (2/2)
5.1 Active Contours
• Snakes
• Scissors
• Level Sets
5.1.1 Snakes
• Snakes are a two-dimensional generalization
of the 1D energy-minimizing splines.
• One may visualize the snake as a rubber
band of arbitrary shape that is deforming
with time trying to get as close as possible to
the object contour.
Snakes
• Minimize an energy functional with internal, image, and constraint terms
• The curve is represented by parametric equations v(s) = (x(s), y(s))
• Minimization leads to a PDE (an iterative update of the curve)
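For reference, the classical snake energy functional (shown only as a figure on the slide) has the form
E_snake = ∫₀¹ [ E_int(v(s)) + E_image(v(s)) + E_con(v(s)) ] ds
where v(s) = (x(s), y(s)) is the parametric curve.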
Snakes
• Internal spline energy (a contour/stretching term plus a curvature term):
o s: arc length
o v_s, v_ss: first-order and second-order derivatives of the curve
o α, β: first-order and second-order weighting functions
• Discretized form of internal spline energy (see below)
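Following the original snakes formulation (the slide shows these only as images), the internal energy and its discretized form are, up to constant factors,
E_int = ( α(s) ‖v_s(s)‖² + β(s) ‖v_ss(s)‖² ) / 2
E_int(i) = α_i ‖v_i − v_{i−1}‖² / (2h²) + β_i ‖v_{i−1} − 2v_i + v_{i+1}‖² / (2h⁴)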
Snakes
• External spline energy (image and constraint terms; see the form below):
o Line term: attracted to dark ridges
o Edge term: attracted to strong gradients
o Termination term: attracted to line terminations
• In practice, most systems only use the edge term, which can be estimated directly from the image gradient.
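A common way to write the overall image energy referenced above is
E_image = w_line · E_line + w_edge · E_edge + w_term · E_term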
Snakes
• Line Functional
o The simplest useful image functional is the image intensity itself: E_line = I(x, y)
o The snake will be attracted either to light lines or to dark lines depending on the sign of w_line.
Snakes
• Edge functional
o The snake is attracted to contours with large image gradients.
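In the original snakes formulation this edge functional is commonly written as
E_edge = −|∇I(x, y)|²
(or −|∇(G_σ ∗ I)|² when the image is first smoothed).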
Snakes
• Termination Function
o Use the curvature of level lines in a slightly smoothed image.
Snakes
• User-placed constraints can also be added.
o f: the snake points
o d: anchor points
Snakes
• To minimize the energy function, we must solve the discrete Euler equations,
where f_x(i) = ∂E_ext/∂x_i and f_y(i) = ∂E_ext/∂y_i.
Snakes
• The above Euler equations can be written in matrix
form as
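In the standard formulation (the slide shows it as an image), the matrix form is
A x + f_x(x, y) = 0
A y + f_y(x, y) = 0
where A is the pentadiagonal matrix arising from the discretized internal energy.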
Snakes
• Assume that f_x and f_y are constant during a time step.
• γ is the step size.
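Under this assumption the standard semi-implicit update is
x_t = (A + γI)⁻¹ ( γ x_{t−1} − f_x(x_{t−1}, y_{t−1}) )
y_t = (A + γI)⁻¹ ( γ y_{t−1} − f_y(x_{t−1}, y_{t−1}) )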
Snakes
• The matrix (A + γI) is a pentadiagonal banded matrix, so the linear system can be solved with a banded LU decomposition in O(n).
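A minimal sketch of one such update step, assuming constant α and β, a closed contour, and external forces fx, fy already sampled at the current control points (a dense solve is shown for clarity; a banded LU factorization makes it O(n)):

```python
import numpy as np

def snake_step(x, y, fx, fy, alpha=0.1, beta=0.1, gamma=1.0):
    """One semi-implicit update: x_t = (A + gamma*I)^-1 (gamma*x_{t-1} - fx).

    x, y, fx, fy are 1-D arrays of length n (closed contour assumed)."""
    n = len(x)
    A = np.zeros((n, n))
    for i in range(n):
        # Pentadiagonal stencil from the discretized internal energy.
        A[i, i] = 2 * alpha + 6 * beta
        A[i, (i - 1) % n] = A[i, (i + 1) % n] = -alpha - 4 * beta
        A[i, (i - 2) % n] = A[i, (i + 2) % n] = beta
    M = A + gamma * np.eye(n)
    # Dense solve for clarity; a banded LU solver gives the O(n) behavior.
    x_new = np.linalg.solve(M, gamma * x - fx)
    y_new = np.linalg.solve(M, gamma * y - fy)
    return x_new, y_new
```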
Snakes
• Because regular snakes have a tendency to shrink,
it is usually better to initialize them by drawing the
snake outside the object of interest to be tracked.
B-spline Approximations
• Snakes sometimes exhibit too many degrees of
freedom, making it more likely that they can get
trapped in local minima during their evolution.
• Use B-spline approximations to control the snake
with fewer degrees of freedom.
Shape Prior
5.1.2 Dynamic snakes and
CONDENSATION
• In many applications of active contours, the object
of interest is being tracked from frame to frame as it
deforms and evolves.
• In this case, it makes sense to use estimates from the previous frame to predict and constrain the new estimates.
Elastic Nets and Slippery Springs
• Application to the TSP (Traveling Salesman Problem):
Elastic Nets and Slippery Springs (cont’d)
• Probabilistic interpretation:
o i: each snake node
o j: each city
o σ: standard deviation of the Gaussian
o d_ij: Euclidean distance between a tour point f(i) and a city location d(j)
Elastic Nets and Slippery Springs (cont’d)
• The tour f(s) is initialized as a small circle around the
mean of the city points and σ is progressively
lowered.
• Slippery spring: this allows the association between
constraints (cities) and curve (tour) points to evolve
over time.
Snakes
5.1.3 Scissors
• Scissors draw an optimal curve (a minimum-cost path) that clings to high-contrast edges as the user traces a rough outline.
• Semi-automatic segmentation
• Algorithm:
o Step 1: Associate edges that are likely to be boundary elements.
o Step 2: Continuously recompute the lowest cost path between the
starting point and the current mouse location using Dijkstra’s algorithm.
Scissors
• Let l(p, q) represent the local cost on the directed link from pixel p to a neighboring pixel q:
l(p, q) = w_Z · f_Z(q) + w_D · f_D(p, q) + w_G · f_G(q)
• Weights of w_Z = 0.43, w_D = 0.43, and w_G = 0.14 seem to work well on a wide range of images.
Scissors
• f_Z: Laplacian zero-crossing
• ΔI = ∂²I/∂x² + ∂²I/∂y²
Scissors
• I_L(q) is the Laplacian of an image I at pixel q
Scissors
• f_G: gradient magnitude
• The gradient magnitude G is
G = √( (∂I/∂x)² + (∂I/∂y)² )
• The gradient is scaled and inverted so high
gradients produce low costs and vice-versa.
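One common normalization (as in the intelligent scissors formulation; the exact rescaling may differ) is
f_G(q) = 1 − G(q) / max(G)
so the strongest edges have zero cost.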
Scissors
• f_D: gradient direction
• Let D(p) be the unit vector perpendicular to the gradient direction at point p:
D(p) = ( I_y(p), −I_x(p) ) / √( I_x(p)² + I_y(p)² )
Scissors
• Cut along the gradient direction -> minimize angle
between cut direction and gradient direction
Scissors
• Solve optimal path by Dijkstra’s algorithm.
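As an illustration, a minimal sketch of this search over a pixel grid, assuming a hypothetical cost(p, q) function that implements the local cost l(p, q) above:

```python
import heapq

def dijkstra(cost, start, shape):
    """Shortest-path search from the seed pixel over an 8-connected grid.

    cost(p, q): local link cost for neighboring pixels p, q (user-supplied).
    Returns a predecessor map; follow it from the current mouse position
    back to the seed to obtain the optimal scissors path."""
    h, w = shape
    dist = {start: 0.0}
    prev = {}
    pq = [(0.0, start)]
    while pq:
        d, p = heapq.heappop(pq)
        if d > dist.get(p, float("inf")):
            continue  # stale queue entry
        y, x = p
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                if dy == 0 and dx == 0:
                    continue
                q = (y + dy, x + dx)
                if not (0 <= q[0] < h and 0 <= q[1] < w):
                    continue
                nd = d + cost(p, q)
                if nd < dist.get(q, float("inf")):
                    dist[q] = nd
                    prev[q] = p
                    heapq.heappush(pq, (nd, q))
    return prev
```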
Scissors
• Figure: worked example of Dijkstra’s algorithm, panels (a)–(e). Starting from the seed node s (cost 0), all other node costs are initialized to ∞ and are progressively relaxed to their shortest-path values as the search front expands.
Scissors
5.1.4 Level Sets
• If the active contour is based on a parametric curve of the form f(s), then as the shape changes dramatically (topology changes), curve reparameterization may also be required.
• Level sets use a 2D embedding function φ(x, t) instead of the curve f(s).
Level Sets
• Given an interface Γ in Rⁿ of codimension one, bounding an open region Ω, we wish to analyze and compute its subsequent motion under a velocity field v.
• The idea is to define a smooth function φ(x, t) that represents the interface as the set where φ(x, t) = 0.
• Here x = (x₁, ..., xₙ) ∈ Rⁿ.
Level Sets
• The level set function φ has the following properties:
φ(x, t) < 0 for x ∈ Ω
φ(x, t) = 0 for x ∈ ∂Ω = Γ(t)
φ(x, t) > 0 for x ∉ Ω
Level Sets
• Again, our contour is the zero level set: φ(x, t) = 0.
• The motion is analyzed by convecting the φ values with the velocity field v:
∂φ/∂t + v · ∇φ = 0
Level Sets
• Actually, only the normal component of v is needed:
v_N = v · ∇φ / |∇φ|
∂φ/∂t + v_N |∇φ| = 0
• Here |∇φ| = √( Σ_{i=1}^{n} φ_{x_i}² )
Level Sets
• An example is the geodesic active contour:
o g(I): snake edge potential (gradient)
o φ: signed distance function away from the curve
o div: divergence operator
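One common form of the geodesic active contour evolution (the slide shows it only as an image) is
∂φ/∂t = |∇φ| div( g(I) ∇φ / |∇φ| ) = g(I) |∇φ| κ + ∇g(I) · ∇φ
where κ = div(∇φ/|∇φ|) is the curvature of the level set.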
Level Sets
• The main goal of g(I) is to stop the evolving curve when it arrives at the object’s boundaries.
g = 1 / (1 + |∇Î|²)
• where Î is a smoothed version of I (Gaussian blur)
Level Sets
• Geometric interpretation of the attraction force in 1D: the original edge signal, its smoothed version, and the derived stopping function are shown; the evolving contour is attracted to the valley created by ∇g · ∇φ.
Level Sets
• According to g(I), the first term can straighten the curve
and the second term encourages the curve to migrate
towards minima of g(I).
• Level sets are still susceptible to local minima.
• An alternative approach is to use the energy
measurement inside and outside the segmented regions.
Level Sets
5.2 Split and Merge
• Watershed
• Region splitting and merging
• Graph-based Segmentation
• k-means clustering
• Mean Shift
5.2.1 Watershed
• An efficient way to compute such regions is to start
flooding the landscape at all of the local minima
and to label ridges wherever differently evolving
components meet.
• Watershed segmentation is often used with user-supplied manual marks corresponding to the centers of the desired components.
Watershed
5.2.2 Region Splitting
(Divisive Clustering)
• Step 1: Compute a histogram for the whole image.
• Step 2: Find a threshold that best separates the large peaks in the histogram.
• Step 3: Repeat until regions are either fairly uniform or below a certain size.
5.2.3 Region Merging
(Agglomerative Clustering)
• Various criteria for merging regions:
o Relative boundary lengths and the strength of the visible edges at these boundaries
o Distance between closest points and farthest points
o Average color difference, or merging of regions that are too small
5.2.4 Graph-based
Segmentation
• We define an undirected graph G = (V, E), where each image pixel p_i has a corresponding vertex v_i ∈ V.
• The edge set E is constructed by connecting pairs of pixels that are neighbors in an 8-connected sense.
• w((v_i, v_j)) = |I(v_i) − I(v_j)|
Graph-based
Segmentation
• This algorithm uses relative dissimilarities between
regions to determine which ones should be merged.
• Internal difference for any region R:
o MST(R): minimum spanning tree of R
o w(e): intensity differences of an edge in MST(R)
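Following Felzenszwalb and Huttenlocher, the internal difference referenced above is
Int(R) = max_{e ∈ MST(R, E)} w(e)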
Graph-based
Segmentation
• Difference between two adjacent regions:
• Minimum internal difference of these two regions:
o τ(R): heuristic region penalty
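These quantities are commonly defined (again following Felzenszwalb and Huttenlocher) as
Dif(R₁, R₂) = min_{v_i ∈ R₁, v_j ∈ R₂, (v_i, v_j) ∈ E} w((v_i, v_j))
MInt(R₁, R₂) = min( Int(R₁) + τ(R₁), Int(R₂) + τ(R₂) ),  with τ(R) = k / |R|
where k is a user-set scale parameter.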
Graph-based
Segmentation
• This algorithm uses relative dissimilarities between
regions to determine which ones should be merged.
• If Dif(R1, R2) < Mint(R1, R2) then merge these two
adjacent regions.
• The input is a graph G = (V, E), with n vertices and m edges.
• The output is a segmentation of V into components S = (C₁, ..., C_r).
Graph-based
Segmentation
1. Sort E into π = (o₁, ..., o_m) by non-decreasing edge weight.
2. Start with a segmentation S⁰, where each vertex v_i is in its own component.
3. Repeat Step 4 for q = 1, ..., m.
4. Construct S^q given S^(q−1) as follows. Let C_i^(q−1) be the component of S^(q−1) containing v_i and C_j^(q−1) the component containing v_j. If C_i^(q−1) ≠ C_j^(q−1) and w(o_q) ≤ MInt(C_i^(q−1), C_j^(q−1)), then S^q is obtained from S^(q−1) by merging C_i^(q−1) and C_j^(q−1). Otherwise S^q = S^(q−1).
5. Return S = S^m.
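A minimal sketch of this merging loop (not the authors’ implementation) using a union-find structure, assuming a precomputed edge list edges = [(w, i, j), ...] and the region penalty τ(R) = k/|R| with a hypothetical user parameter k:

```python
class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n
        self.internal = [0.0] * n  # Int(C): largest MST edge weight so far

    def find(self, a):
        while self.parent[a] != a:
            self.parent[a] = self.parent[self.parent[a]]  # path halving
            a = self.parent[a]
        return a

def segment(edges, n, k=300.0):
    uf = UnionFind(n)
    for w, i, j in sorted(edges):          # Step 1: non-decreasing weight
        a, b = uf.find(i), uf.find(j)
        if a == b:
            continue
        # Merge only if w <= MInt = min(Int(a)+tau(a), Int(b)+tau(b)).
        mint = min(uf.internal[a] + k / uf.size[a],
                   uf.internal[b] + k / uf.size[b])
        if w <= mint:
            uf.parent[b] = a
            uf.size[a] += uf.size[b]
            uf.internal[a] = max(uf.internal[a], uf.internal[b], w)
    return [uf.find(i) for i in range(n)]  # component label per vertex
```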
Graph-based
Segmentation
5.2.5 Probabilistic
Aggregation
Gray level similarity:
• Minimal external difference between Ri and Rj:
o Δi+ = min_k |Δik|
o Δij = Īi − Īj, where Īi and Īj are the average intensities of regions Ri and Rj, respectively
• Average intensity difference:
o Δi- = Σ_k(τik Δik) / Σ_k(τik), where τik is the boundary length between regions Ri and Rk
Probabilistic Aggregation
• The pairwise statistics σlocal+ and σlocal- are used to compute the likelihoods pij that two regions should be merged.
• pij = Lij+ / (Lij+ + Lij-)
• Lij± ∼ N(0, σij±)
• σij± = σlocal± + σscale
• σscale = σnoise / min(|Ωi|, |Ωj|)
• |Ωi| is the number of pixels in Ri
Probabilistic Aggregation
• Definition of strong coupling:
o C: a subset of V
o φ: usually set to 0.2
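The strong-coupling criterion is commonly written as: a node i is strongly coupled to C ⊂ V if
Σ_{j∈C} pij / Σ_{j∈V} pij > φ
with φ usually set to 0.2.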
Probabilistic Aggregation
Probabilistic Aggregation
5.3 Mean Shift and Mode
Finding
5.3.1 K-means
• K-means:
o Step 1: Guess centers. Given the number of clusters k to find, choose k samples as the initial cluster centers; call this set Y.
o Step 2: Given centers, find groups. With Y fixed, assign each sample to its nearest center, yielding the clustering U with the least squared error Emin.
o Step 3: Given groups, find new centers. With Y and U fixed, compute the squared error Emin’. If Emin = Emin’, stop: we have the final clusters.
o Step 4: Repeat until the centers do not change. If Emin ≠ Emin’, use U to compute new cluster centers Y’. Go to Step 2 and find a new clustering U’, iteratively.
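A minimal sketch of these four steps, assuming the samples are rows of an array X (e.g., pixel colors):

```python
import numpy as np

def kmeans(X, k, iters=100, seed=0):
    rng = np.random.default_rng(seed)
    Y = X[rng.choice(len(X), k, replace=False)]      # Step 1: guess centers
    for _ in range(iters):
        d = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
        U = d.argmin(1)                               # Step 2: assign groups
        Y_new = np.array([X[U == c].mean(0) if np.any(U == c) else Y[c]
                          for c in range(k)])         # Step 3: new centers
        if np.allclose(Y_new, Y):                     # Step 4: stop if stable
            break
        Y = Y_new
    return Y, U
```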
5.3.2 Mean Shift
• Mean shift segmentation is the inverse of the
watershed algorithm => find the peaks (modes) and
then expand the region.
Mean Shift
• Step 1: Use kernel density estimation to estimate the
density function given a sparse set of samples.
o f(x): density function
o x_i: input samples
o k(r): kernel function or Parzen window
o h: width of the kernel
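The kernel density estimate referenced in Step 1 is commonly written as
f(x) = Σ_i k( ‖x − x_i‖² / h² )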
Mean Shift
• Step 2: Starting at some guess for a local maximum
yk, mean shift computes the gradient of the density
estimate f(x) at yk and takes an uphill step in that
direction.
• G(x) = −k′(‖x‖²)
Mean Shift
The location of y_k at each iteration can be expressed by the following formula:
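A standard form of this mean-shift update (the slide shows it as an image) is
y_{k+1} = [ Σ_i x_i G( (y_k − x_i) / h ) ] / [ Σ_i G( (y_k − x_i) / h ) ]
i.e., a kernel-weighted mean of the samples near y_k.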
Repeat Step 2 until convergence or for a finite number of steps.
• Step 3: The remaining points can then be classified
based on the nearest evolution path.
Mean Shift
Mean Shift
• Other kernels can also be used:
o Epanechnikov kernel (converges in a finite number of steps)
o Gaussian (normal) kernel (slower, but gives better results)
Mean Shift
• Joint domain: use spatial domain and range
domain to segment color image.
• Kernel of joint domain (five-dimensional):
o xr: (L*, u*, v*) in range domain
o xs: (x, y) in spatial domain
o hr, hs: color and spatial widths of kernel
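The joint-domain kernel is commonly written (following Comaniciu and Meer, assuming a three-dimensional color range) as
K_{hs,hr}(x) = ( C / (hs² hr³) ) · k( ‖x^s / hs‖² ) · k( ‖x^r / hr‖² )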
Mean Shift
o M: minimum region size; regions with fewer pixels than M are eliminated
Intuitive Description
• Figure (animation over several slides): a distribution of identical billiard balls, with a region of interest, its center of mass, and the mean shift vector marked at each step.
• Objective: find the densest region.
5.4 Normalized Cuts
• Normalized cuts examine the affinities between
nearby pixels and try to separate groups that are
connected by weak affinities.
• Not a plain minimum cut, which would tend to isolate single pixels.
Normalized Cuts
• To find the minimum cut between two groups A and B (see below):
• A better measure of segmentation is to find the minimum normalized cut (see below):
o assoc(A, V) = Σ_{i∈A, j∈V} w_ij
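The cut and normalized-cut measures referenced above are, following Shi and Malik,
cut(A, B) = Σ_{i∈A, j∈B} w_ij
Ncut(A, B) = cut(A, B) / assoc(A, V) + cut(A, B) / assoc(B, V)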
5.4 Normalized Cuts
• Pixel-wise affinity weight for pixels within a radius ‖x_i − x_j‖ < r:
o Fi, Fj: feature vectors that consist of intensities, colors, or oriented filter
histograms
o xi, xj: pixel locations
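The affinity weight itself typically takes the form (shown as an image on the slide)
w_ij = exp( −‖F_i − F_j‖² / σ_F² − ‖x_i − x_j‖² / σ_s² )  if ‖x_i − x_j‖ < r,  and 0 otherwise.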
Normalized Cuts
Normalized Cuts
• But computing the optimal normalized cut is NP-complete. The following is a faster (approximate) method.
• Minimizing the normalized cut can be expressed as a Rayleigh quotient:
o x is the indicator vector where xi = +1 iff i ∈ A and xi = -1 iff i ∈ B.
o y = ((1 + x) - b(1 - x)) / 2
• (Compare with the standard Rayleigh quotient, whose global minimum/maximum can be found by solving an eigenvalue problem.)
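The quotient itself has the standard form
min_y  y^T (D − W) y / (y^T D y)
subject to y^T D 1 = 0, with y taking values in {1, −b}; relaxing y to real values turns this into an eigenvalue problem.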
Normalized Cuts
o x is the indicator vector where xi = +1 iff i ∈ A and xi = -1 iff i ∈ B.
o y = ((1 + x) - b(1 - x)) / 2
• Example with seven nodes (nodes with xi = +1 belong to A, the rest to B):

node   x     1+x   -b(1-x)   y
1      +1    2     0         1
2      -1    0     -2b       -b
3      +1    2     0         1
4      +1    2     0         1
5      -1    0     -2b       -b
6      -1    0     -2b       -b
7      +1    2     0         1
Normalized Cuts
o x is the indicator vector where xi = +1 iff i ∈ A and xi = -1 iff i ∈ B.
o y = ((1 + x) - b(1 - x)) / 2
o W: weight matrix [wij]
o D: diagonal matrix whose diagonal entries are the corresponding row sums of W
• It is equivalent to solving a regular eigenvalue
problem:
o N = D^(-1/2) W D^(-1/2), called the normalized affinity matrix
o z = D^(1/2) y
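Concretely (a standard restatement): the generalized eigensystem (D − W) y = λ D y becomes the regular eigenvalue problem (I − N) z = λ z with z = D^(1/2) y, and the eigenvector with the second-smallest eigenvalue gives the relaxed normalized-cut solution.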
Normalized Cuts
Normalized Cuts
5.5 Graph Cuts
Graph Cuts
Graph Cuts
Graph Cuts
• Assume that O and B denote the subsets of pixels marked as “object” and “background” seeds.
• Define G = (V, E).
• V = P ∪ {S, T}, where P is the set of nodes corresponding to pixels of the image.
• Each pair of neighboring pixels (p, q) in N is connected by an 8-connected neighborhood link.
• E = N ∪ ⋃_{p∈P} { (p, S), (p, T) }
Graph Cuts
• w(p, q) = B(p, q) for (p, q) ∈ N   … (weights between pixels)
• B(p, q) ∝ exp( −(I_p − I_q)² / (2σ²) ) · 1 / dist(p, q)
• w(p, S) = λ · R_p(bkg) for p ∈ P, p ∉ O ∪ B;   K for p ∈ O;   0 for p ∈ B
• w(p, T) = λ · R_p(obj) for p ∈ P, p ∉ O ∪ B;   0 for p ∈ O;   K for p ∈ B
• K = 1 + max_{p∈P} Σ_{q:(p,q)∈N} B(p, q)
• λ is a user-defined variable
• R_p(obj) = −ln Pr(I_p | O)
• R_p(bkg) = −ln Pr(I_p | B)
Graph Cuts
• Finding the minimum cut on G = (V, E) segments the image into two partitions: “object” and “background”.
Graph Cuts (min-cut)
• Min-cut is the same as max-flow.
• Figure: worked max-flow example on a small network with source S, sink T, and intermediate nodes a–e. Edge labels show flow/capacity; successive slides push flow along augmenting paths until no augmenting path remains, and the saturated edges then form the minimum cut.
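To make the min-cut / max-flow connection concrete, a minimal Edmonds-Karp sketch (the example capacities in the comment are illustrative, not those from the slide figure):

```python
from collections import deque

def max_flow(cap, s, t):
    """cap: dict[u][v] -> capacity. Returns the max flow value from s to t."""
    flow = 0
    residual = {u: dict(vs) for u, vs in cap.items()}
    for u, vs in cap.items():
        for v in vs:
            residual.setdefault(v, {}).setdefault(u, 0)  # reverse edges
    while True:
        # BFS for a shortest augmenting path in the residual graph.
        prev = {s: None}
        q = deque([s])
        while q and t not in prev:
            u = q.popleft()
            for v, c in residual[u].items():
                if c > 0 and v not in prev:
                    prev[v] = u
                    q.append(v)
        if t not in prev:
            return flow  # no augmenting path left: flow value = min cut
        # Find the bottleneck along the path, then push flow.
        bottleneck, v = float("inf"), t
        while prev[v] is not None:
            bottleneck = min(bottleneck, residual[prev[v]][v])
            v = prev[v]
        v = t
        while prev[v] is not None:
            residual[prev[v]][v] -= bottleneck
            residual[v][prev[v]] += bottleneck
            v = prev[v]
        flow += bottleneck

# e.g. max_flow({'S': {'a': 3, 'b': 5}, 'a': {'T': 4}, 'b': {'T': 2}}, 'S', 'T')
```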
Review
• 5.1 Active Contours
o Snake
o Scissors
o Level sets
• 5.2 Split and Merge
o Watershed
o Region splitting (histogram)
o Region merging
o Graph-based
o K-means
• 5.3 Mean Shift and Mode Finding
• 5.4 Normalized Cuts
• 5.5 Graph Cuts and Energy-based Methods
Representations
• Image
o Surface function in 3D
o Graph
o Set of values (histogram)
• Segments
o Line/path
o Surface function in 3D
o Groups of Nodes
o Clusters
Segmentation Criteria
• Object Edge
o Gradient, Laplacian
• Location
o Connected shape
o close proximity
• Shape of segment
o Derivative of line
• Similarity within segment
o K-means, mean-shift
o Weights for Normalized cuts, Graph cut
• Difference with external area
• User-defined criteria
Optimization of Criteria
• PDE: Gradient-descent / Calculus of variations
• Dijkstra’s algorithm
• K-means
• Mean-shift
• Rayleigh’s Quotient
• Min-cut
Evaluation
• The Berkeley Segmentation Dataset
Medical Segmentation
• END