Interactive Graph Cuts for Segmentation in N-D Images
Yuri Boykov, Marie-Pierre Jolly
Mohit Gupta
02/15/2006
Advanced Perception

Image Segmentation
Images: LabelMe Database

Focus of this work: Foreground vs. Background (Supervised Method)
**The framework presented can be easily extended to multi-label segmentation
**Multi-label segmentation is solved by iteratively solving many 2-label sub-problems (Boykov, Veksler, Zabih)
Images: Yin Li et al. (Lazy Snapping)

Taxonomy + Previous Work
Fully Automatic Methods:
• Eigenvector-based approaches (Normalized Cuts, Perona and Freeman algorithm, etc.)
Supervised Methods:
• Boundary-based approaches: use of an ‘evolving curve’ (snakes, intelligent scissors, image snapping)
   Boundary = shortest path between graph vertices
   Edge weights depend on the ‘boundary-ness’: a high probability of boundary means a low weight
• Region-based approaches: hints about foreground vs. background (intelligent paint, GrabCut)

Case for a Supervised, Region Based Method
• Fully automatic: never perfect; hard to get crisp boundaries for inherently ambiguous, low-contrast images
• Boundary-based supervised: a complex object is the user’s nightmare
"Whenever something can be done in two ways, someone will be confused. Whenever something is a matter of taste, discussions can drag on forever." -- Bjarne Stroustrup

Interactive Graph Cuts: Overview
• Supervised, region-based segmentation technique
• What does the user do? Provides clues as to the ‘desired’ segmentation: foreground and background ‘seed pixels’
• Energy of the segmentation: the lower the energy, the better the segmentation

Problem Formulation
Input
• Set of pixels P, neighborhood system N of pixel pairs {p,q}
• Hard constraints: ‘clues’ for segmentation
• Objective function: soft constraint (energy)
Output
• Assignment vector A = (A1, … , Ap , … , A|P|), where each Ap ∈ {0,1}
• A defines a segmentation of the image

Objective Function: Energy of the Segmentation
To do: minimize the energy E(A) while satisfying the hard constraints
• Regional term R(A): cost for assigning labels to individual pixels
• Boundary term B(A): cost for making the boundary pass between two given pixels

Graph Cuts and Image Segmentation
[Figure: image with seeds and the corresponding graph, showing the object terminal s, the background terminal t, and n-links with weights w_pq]
Corresponding graph
• A node for every pixel
• Two terminal nodes
• n-links and t-links
• Edge weights assigned to the n-links and t-links
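The equations for the energy and for the edge weights did not survive in the slide text. As a hedged reconstruction, the standard form from the Boykov and Jolly paper is sketched below. The symbols λ, R_p, B_{p,q}, δ, σ, K and the seed sets O (object) and B (background) are taken from that paper rather than from the text above, and the two labels are written "obj" and "bkg" instead of 0 and 1.

```latex
\begin{align*}
E(A) &= \lambda \, R(A) + B(A) \\
R(A) &= \sum_{p \in P} R_p(A_p) \\
B(A) &= \sum_{\{p,q\} \in N} B_{p,q} \, \delta(A_p, A_q),
\qquad
\delta(A_p, A_q) =
\begin{cases}
1 & \text{if } A_p \neq A_q \\
0 & \text{otherwise}
\end{cases}
\end{align*}

% Typical choices for the individual costs (I_p is the intensity of pixel p):
\begin{align*}
R_p(\text{obj}) &= -\ln \Pr(I_p \mid O), \qquad
R_p(\text{bkg}) = -\ln \Pr(I_p \mid B) \\
B_{p,q} &\propto \exp\!\left( -\frac{(I_p - I_q)^2}{2\sigma^2} \right)
\cdot \frac{1}{\operatorname{dist}(p,q)}
\end{align*}

% Edge weights of the graph (s = object terminal, t = background terminal):
\begin{align*}
w_{pq} &= B_{p,q} \quad \text{for } \{p,q\} \in N \\
w_{ps} &=
\begin{cases}
\lambda \, R_p(\text{bkg}) & p \notin O \cup B \\
K & p \in O \\
0 & p \in B
\end{cases}
\qquad
w_{pt} =
\begin{cases}
\lambda \, R_p(\text{obj}) & p \notin O \cup B \\
0 & p \in O \\
K & p \in B
\end{cases} \\
K &= 1 + \max_{p \in P} \sum_{q : \{p,q\} \in N} B_{p,q}
\end{align*}
```

With these weights, cutting a t-link pays the regional cost of the label the pixel receives, cutting an n-link pays the boundary cost, and K is large enough that a seed is never separated from its own terminal.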
Min-Cut and Energy Minimization
[Figure: image with seeds, the corresponding graph (object terminal s, background terminal t, n-links with weights w_pq), a cut on that graph, and the resulting segmentation]
• A cut C on the graph defines a segmentation A = (A1, … , Ap , … , A|P|)
• **|C| = E(A)**, so min |C| = min E(A)
• The min-cut gives the ‘optimal’ segmentation!

Results
• Low values of λ: boundary term dictates; region “shrinking”
• High values of λ: region term dictates; neighborhood info ignored

We don’t want toy results…
• Start with a few ‘obj’ and ‘bckg’ seeds
• Refinement in trouble places
Images: Yin Li et al. (Lazy Snapping)

Interactivity
• New ‘obj’/‘bckg’ pixel added/deleted
• Only two links change: real-time re-computation possible
• Interactive!

More Results
Images: Yin Li et al. (Lazy Snapping), Boykov and Jolly

Performance
• Efficient polynomial-time algorithms available for min-cut
• Segmentation problem: sparse graphs
• Computations in the “blink of an eye”
• Efficiency measured in terms of amount of human effort
• Bell example takes about a minute

Markov Random Fields (MRFs)
• S = discrete set of sites, S = {1, …, m}
• Ld = discrete set of labels, e.g. {1, …, L}
• Neighborhood system N
• A labeling assigns a label to every site: f = {f1, …, fm}, where fi is the label of site i
• Neighborhood conditional independence: F is an MRF on S w.r.t. N iff
   P(f) > 0
   P(fi | fS-{i}) = P(fi | fNi)

Lazy Snapping (SIGGRAPH 2004)
Yin Li, Jian Sun, Chi-Keung Tang, Heung-Yeung Shum (MSRA)
• ‘Refine’ the coarse results returned by graph-cut based methods (Boykov and Jolly)
• Extremely fast
• Commercial system
• User friendly

Modus Operandi: How does it work?
• Supervised, region-based method
• Three-step process:
   1. Pre-segmentation (super-pixels)
   2. Object marking step (a la Boykov and Jolly)
   3. Local refinement step

Pre-segmentation
• Exploit spatial coherency: replace pixels by super-pixels
• ‘Aggressive’ unsupervised segmentation preserves color coherency within regions

Boundary Editing
• Object boundary from the previous step is represented as a polygon
• The polygon can be edited for local refinement
• Local refinement gives new constraints
• Another optimization for a better fit
• The Dij term penalizes a node pair far away from the polygon, which brings the boundary closer to the polygon
• β = 1: boundary snaps to the polygon
**Important: The optimized boundary snaps to the object boundary even though the polygon vertices may not be on it.

citius altius fortius…
• Faster, easier, accurat’er’
The pretty SIGGRAPH video

Multi-way graph cuts (Boykov, Veksler, Zabih)
Images: LabelMe Database
Slides: Yuri Boykov

Multi-way graph cuts
BAD NEWS: NP-hard problem (3 or more labels)
• Two labels can be solved in polynomial time via s-t cuts
GOOD NEWS: α-expansion approximation algorithm
• Guaranteed approximation quality (2-approx)

α-expansion move
Basic idea: break the multi-way cut computation into a sequence of binary s-t cuts (α vs. other labels)
**Iteratively, each label competes with other labels for space in the image

α-expansion algorithm
1. Start with any initial solution
2. For each label “α” in any (e.g. random) order:
   • Compute the optimal α-expansion move (s-t graph cuts)
   • Decline the move if there is no energy decrease
3. Stop when no expansion move would decrease the energy
A 2-approx algorithm for an NP-hard problem! Converges in a few iterations!

Video Segmentation
Video Object Cut and Paste (SIGGRAPH 2005)
Yin Li, Jian Sun, Heung-Yeung Shum (MSRA)

Climbing the dimension ladder: 2D to 3D
• Video segmentation: 3D graph on a volume of frames
• Edges for ‘temporal coherence’ in addition to ‘spatial coherence’
• ‘obj’ + ‘bckg’ seeds on a few key-frames
Image: Yin Li et al.

What’s new? Temporal Coherence
• Our old friend, the segmentation energy, with an additional term for temporal neighbors

Local Refinement: Problem
• ‘obj’ / ‘bckg’ color models are built globally from key frames
• The segmentation might get confused between the two!

Local Refinement: Solution
1. Mark windows around problem areas in key frames
2. Windows propagate through the middle frames using a feature tracking algorithm
3. Apply 2D pixel-level graph cut segmentation
• Important: seeds are generated automatically

Video Segmentation: Results
Images: Boykov and Jolly
Another (obviously pretty) SIGGRAPH video

Conclusion
Merits
• Globally optimum segmentation when the cost function is clearly defined (normalized cuts only give an approximate solution)
• Very fast: interactive rates
• Natural extension to N-dimensional images
• Easily editable: more user friendly
Possible Improvement
• Automatic seed selection: from a coarse (LabelMe) kind of segmentation
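
As a closing illustration of the two-label, seed-based formulation, here is a minimal sketch of segmentation by a single s-t min-cut. It is not the authors' implementation. The assumptions are: a 1-D "image", a toy regional term (absolute difference from the mean seed intensity, instead of a histogram-based likelihood), a Gaussian boundary weight, and a plain Edmonds-Karp max-flow in place of a specialised solver. The names `segment_1d`, `lam` and `sigma` are hypothetical.

```python
import math
from collections import deque


def segment_1d(intensity, obj_seeds, bkg_seeds, lam=1.0, sigma=10.0):
    """Label each pixel 1 (object) or 0 (background) with one s-t min-cut."""
    n = len(intensity)
    s, t = n, n + 1                       # object and background terminals
    cap = [dict() for _ in range(n + 2)]  # residual capacities cap[u][v]

    def add_edge(u, v, w):
        # An undirected link is modelled as two arcs of equal capacity.
        cap[u][v] = cap[u].get(v, 0.0) + w
        cap[v][u] = cap[v].get(u, 0.0) + w

    # n-links: cheap to cut where neighbouring intensities differ a lot.
    for p in range(n - 1):
        diff = intensity[p] - intensity[p + 1]
        add_edge(p, p + 1, math.exp(-diff * diff / (2.0 * sigma * sigma)))

    # K exceeds the total n-link weight at any pixel (only n-links exist so far),
    # so cutting a seed's t-link is never worthwhile.
    K = 1.0 + max(sum(cap[p].values()) for p in range(n))

    # Toy intensity models (an assumption): mean intensity of each seed set.
    mu_obj = sum(intensity[p] for p in obj_seeds) / len(obj_seeds)
    mu_bkg = sum(intensity[p] for p in bkg_seeds) / len(bkg_seeds)

    # t-links: hard constraints for seeds, regional penalties otherwise.
    for p in range(n):
        if p in obj_seeds:
            add_edge(s, p, K)
        elif p in bkg_seeds:
            add_edge(p, t, K)
        else:
            add_edge(s, p, lam * abs(intensity[p] - mu_bkg))  # cost of "bkg"
            add_edge(p, t, lam * abs(intensity[p] - mu_obj))  # cost of "obj"

    def reachable_from_source():
        # Breadth-first search over arcs with remaining capacity.
        parent = {s: None}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 1e-12 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    # Edmonds-Karp: augment along shortest paths until t is unreachable.
    while True:
        parent = reachable_from_source()
        if t not in parent:
            break
        flow, v = float("inf"), t
        while parent[v] is not None:      # find the bottleneck capacity
            flow = min(flow, cap[parent[v]][v])
            v = parent[v]
        v = t
        while parent[v] is not None:      # push the flow along the path
            u = parent[v]
            cap[u][v] -= flow
            cap[v][u] += flow
            v = u

    # Pixels still connected to s lie on the object side of the min-cut.
    return [1 if p in parent else 0 for p in range(n)]


labels = segment_1d([10, 12, 11, 200, 205, 198], obj_seeds={4}, bkg_seeds={0})
print(labels)  # [0, 0, 0, 1, 1, 1]
```

Adding or removing a seed only changes that pixel's two t-links, which is why the slides describe re-computation as fast enough for interactive use; this sketch simply recomputes from scratch.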