
By: Rahul Suresh
Dr. Stan Birchfield
Dr. Adam Hoover
Dr. Brian Dean
Related work
Background theory:
◦ Image as a graph
◦ Kruskal’s Minimum Spanning Tree
◦ MST based segmentation
Our algorithm
Conclusion and future work
Dividing an image into disjoint regions such
that similar pixels are grouped together
Image Courtesy: [3]
Image segmentation involves the division of
image I into K regions: R1, R2, R3, …, RK such that:
Every pixel must be
assigned to a region
Regions must be disjoint
Size: 1 pixel to the
entire image itself
Pixels within a region share certain
characteristics that are not found in pixels of
another region.
f is a function that returns TRUE if the region
under consideration is homogeneous
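The conditions above can be checked mechanically. The following is a minimal sketch, assuming regions are given as sets of pixel coordinates and that the homogeneity function f is supplied by the caller; the name `is_valid_segmentation` and the intensity-range predicate are illustrative, not part of the original slides.

```python
from itertools import combinations


def is_valid_segmentation(regions, all_pixels, f):
    """Check the segmentation conditions for a list of pixel sets.

    regions    : list of sets of pixel coordinates (hypothetical representation)
    all_pixels : iterable of every pixel coordinate in the image
    f          : homogeneity predicate, f(region) -> bool
    """
    # Every pixel must be assigned to a region.
    covers = set().union(*regions) == set(all_pixels)
    # Regions must not share pixels.
    disjoint = all(a.isdisjoint(b) for a, b in combinations(regions, 2))
    # Every region must be homogeneous according to f.
    homogeneous = all(f(r) for r in regions)
    return covers and disjoint and homogeneous


# Toy 2x2 grayscale image: dark top row, bright bottom row.
image = {(0, 0): 10, (0, 1): 12, (1, 0): 200, (1, 1): 205}


def small_range(region):
    # Assumed homogeneity test: intensities differ by at most 10.
    values = [image[p] for p in region]
    return max(values) - min(values) <= 10
```

Splitting the toy image by rows satisfies all three conditions, while splitting it by columns fails the homogeneity test.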
Biomedical applications
◦ Used as a preprocessing step to identify anatomical
regions for medical diagnosis/analysis.
Brain Tissue MRI
Segmentation [1]
CT Jaw
segmentation [2]
Object recognition systems:
◦ Lower level features such as color and texture are
used to segment the image
◦ Only relevant segments (subset of pixels) are fed to
the object recognition system.
Saves computational cost, especially for large
scale recognition systems
As a preprocessing step in face and iris recognition
Face segmentation
Iris Segmentation
Astronomy: Preprocessing step before further
Segmentation of Nebula [4]
(Manual segmentations from BSDS)
Which segmentation is “correct”?
“Correctness”: Are similar pixels grouped
together and dissimilar pixels grouped separately?
Granularity: Extent of resolution of the segmentation
◦ Consider example in the previous image
There is ambiguity in defining “good”/
“optimal” segmentation.
An image can have multiple segmentations.
◦ Makes evaluation/benchmarking of segmentation
algorithms hard
Some of the popular image segmentation
approaches are:
Split and Merge approaches
Mean Shift and k-means
Spectral theory and normalized cuts
Minimum spanning tree
Split and Merge
◦ Iteratively split
▪ If evidence of a boundary exists
◦ Iteratively merge
▪ Based on similarity
Quad-tree used
Image Courtesy: [5]
Mean shift and k-means are related.
Mean shift:
◦ Represent each pixel as a vector [color, texture, space]
◦ Define a window around every point.
1. Update the point to the mean of all the points
within the window.
2. Repeat until convergence.
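The two mean-shift steps above can be sketched as follows. This is a minimal illustration, assuming one-dimensional features (for example a grayscale intensity) and a flat window of fixed radius; the function name and the parameters `radius`, `tol` and `max_iter` are hypothetical.

```python
def mean_shift_point(x, points, radius, tol=1e-6, max_iter=100):
    """Move x to the mean of its window until it stops moving."""
    for _ in range(max_iter):
        # 1. Update the point to the mean of all points within the window.
        window = [p for p in points if abs(p - x) <= radius]
        new_x = sum(window) / len(window)
        shift = abs(new_x - x)
        x = new_x
        # 2. Repeat until convergence.
        if shift < tol:
            break
    return x


points = [0.8, 1.0, 1.2, 4.8, 5.0, 5.2]
# Each point climbs to the mode of its own cluster.
modes = [mean_shift_point(p, points, radius=1.0) for p in points]
```

Pixels whose points converge to the same mode are grouped into one segment, so the number of segments does not have to be chosen in advance.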
K-means:
◦ Represent each pixel as a vector [color, texture, space]
◦ Choose K initial cluster centers
1. Assign every pixel to its closest cluster center.
2. Recompute the means of all the clusters
3. Repeat 1-2 until convergence.
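The three k-means steps above can be sketched as follows, assuming one-dimensional features and caller-supplied initial centers; the function name and the `max_iter` guard are illustrative.

```python
def kmeans(points, centers, max_iter=100):
    """Plain k-means on 1-D features; returns (centers, clusters)."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(max_iter):
        # 1. Assign every point to its closest cluster center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # 2. Recompute the mean of every cluster (an empty cluster keeps its center).
        new_centers = [
            sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)
        ]
        # 3. Repeat until the centers stop changing.
        if new_centers == centers:
            break
        centers = new_centers
    return centers, clusters


final_centers, final_clusters = kmeans([1.0, 2.0, 3.0, 10.0, 11.0, 12.0], [0.0, 5.0])
```

The result depends on the initial centers passed in, which is the sensitivity noted in the comparison with mean shift.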
Difference between K means and mean-shift:
◦ In K-means, K has to be known beforehand
◦ K-means sensitive to initial choice of cluster centers
Represent image as a graph.
Using graph cuts, partition the image into regions.
In spectral theory and normalized cuts,
▪ Eigenvalues/eigenvectors of the Laplacian matrix are used to
determine the cut
Use Minimum Spanning Tree to segment
◦ Proposed by Felzenszwalb & Huttenlocher in 2004.
◦ Uses a variant of Kruskal’s MST algorithm to segment images
◦ Very efficient: O(N log N) time
Discussed in detail in the next section
Graph G=(V,E) is an abstract data type containing
a set of vertices V and edges E.
Useful operations using a graph:
◦ See if path exists between any 2 vertices
◦ Find connected components
◦ Check for cycles
◦ Find the shortest path between any 2 vertices
◦ Compute minimum spanning tree
◦ Graph partition based on cuts
Graph algorithms are useful in image processing
Image graph:
◦ Pixels/group of pixels form vertices.
◦ Vertices connected to form edges
◦ Edge weight represents dissimilarity between vertices
Types of image graph:
◦ Image grid
◦ Complete graph
◦ Nearest neighbor graph
Image grid:
◦ Edges: every vertex (pixel) is connected with its 4
(or 8) x-y neighbors.
◦ No. of edges m = O(N) [Graph operations are efficient]
◦ Fails to capture global properties
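A minimal sketch of building the 4-connected image grid, assuming a grayscale image stored as a list of rows and the absolute intensity difference as edge weight; the function name `grid_edges` and the `(weight, u, v)` tuple layout are illustrative choices.

```python
def grid_edges(image):
    """Return (weight, u, v) edges of the 4-connected grid graph.

    Pixel (x, y) is mapped to vertex index y * width + x. Each pixel is
    linked to its right and bottom neighbor, so every edge appears once.
    """
    height, width = len(image), len(image[0])
    edges = []
    for y in range(height):
        for x in range(width):
            u = y * width + x
            if x + 1 < width:   # right neighbor
                edges.append((abs(image[y][x] - image[y][x + 1]), u, u + 1))
            if y + 1 < height:  # bottom neighbor
                edges.append((abs(image[y][x] - image[y + 1][x]), u, u + width))
    return edges


edges = grid_edges([[1, 2, 3],
                    [4, 5, 6]])
```

An h x w image yields h(w-1) + w(h-1) edges, which is fewer than 2N, so the edge count stays linear in the number of pixels.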
Complete graph:
◦ Edges: Connect every vertex (pixel) with every other vertex
◦ No. of edges m = O(N²)
◦ Captures global properties
◦ Graph operations are very expensive
Nearest neighbor graph:
◦ Compromise between grid (fails to capture global
properties) and complete graph (too many edges).
◦ Represent every vertex as a combination of color and
x-y features. [e.g. (R, G, B, x, y)]
◦ Find the K = O(1) nearest neighbors of each pixel using
approximate nearest neighbor (ANN) search
◦ Edges: Connect every pixel to K nearest neighbors
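A minimal sketch of the nearest neighbor graph, assuming (R, G, B, x, y) feature vectors. For brevity it uses an exact brute-force search, which costs O(N²) and stands in for the ANN search named above; the function name `knn_edges` is hypothetical.

```python
import math


def knn_edges(features, k):
    """Connect every vertex to its k nearest neighbors in feature space.

    Returns directed (distance, i, j) edges. Brute force is used here
    only as a placeholder for an approximate nearest neighbor search.
    """
    edges = []
    for i, fi in enumerate(features):
        dists = sorted(
            (math.dist(fi, fj), j) for j, fj in enumerate(features) if j != i
        )
        for d, j in dists[:k]:
            edges.append((d, i, j))
    return edges


# Two dark pixels next to each other and one bright pixel farther away.
features = [(0, 0, 0, 0, 0), (0, 0, 0, 1, 0), (255, 255, 255, 5, 5)]
nn_edges = knn_edges(features, k=1)
```

Because color and position share one feature vector, a neighbor may be far away in the image plane if its color is close, which is how this graph captures some non-local structure with only O(N) edges.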
Tree is a graph which is:
◦ Connected
◦ Has no cycles
Spanning tree: contains all the vertices of graph G
◦ A graph can have multiple spanning trees
Minimum spanning tree is a spanning tree which
has the least sum of weights of edges
• Sorting: O(m log m) time
• FindSet and Merge: O(m α(N)) time [α is the inverse Ackermann function, which grows very slowly]
OVERALL TIME: O(m log m)
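The sorting and FindSet/Merge steps costed above fit together as in the following sketch of Kruskal's algorithm. It assumes edges are `(weight, u, v)` tuples over vertices `0..n-1`; the class and function names are illustrative.

```python
class DisjointSet:
    """Union-find with path compression and union by size."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def merge(self, a, b):
        a, b = self.find(a), self.find(b)
        if a == b:
            return a
        if self.size[a] < self.size[b]:
            a, b = b, a
        self.parent[b] = a          # attach the smaller tree to the larger
        self.size[a] += self.size[b]
        return a


def kruskal(n, edges):
    """Return the edges of a minimum spanning tree (or forest)."""
    ds = DisjointSet(n)
    tree = []
    for w, u, v in sorted(edges):       # O(m log m)
        if ds.find(u) != ds.find(v):    # adding the edge forms no cycle
            ds.merge(u, v)
            tree.append((w, u, v))
    return tree


mst = kruskal(4, [(1, 0, 1), (4, 0, 2), (3, 1, 2), (2, 2, 3), (5, 1, 3)])
```

The sort dominates the running time, since the union-find operations are nearly constant per edge.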
Use Minimum Spanning Tree to segment image.
In Kruskal’s MST algorithm,
◦ Edges are sorted in ascending order of weights
◦ Edges are added in order to the spanning tree as long as
a cycle is not formed.
◦ All vertices added to ONE spanning tree
If Kruskal’s is applied directly to image
◦ We will end up with ONE segment (entire image)
Variant of Kruskal’s used in image segmentation.
Create an image grid graph.
Sort edges in the increasing order of weights
For every edge ei in E,
1. If FindSet(ui) ≠ FindSet(vi) AND IsSimilar(ui ,vi)=TRUE
Merge(FindSet(ui) ,FindSet(vi) )
Instead of one MST, we end up with a forest of
K trees
Each tree represents a region
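The variant above can be sketched as Kruskal's loop with an extra similarity test. This is a minimal illustration in which `is_similar` is a caller-supplied placeholder for IsSimilar that receives the edge weight and the two region roots (an assumption; the slides pass the vertices).

```python
def segment(n, edges, is_similar):
    """Kruskal-style merging that stops short of a single spanning tree.

    Returns one label per vertex; vertices with equal labels belong to
    the same tree of the resulting forest, i.e. the same region.
    """
    parent = list(range(n))

    def find_set(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for w, u, v in sorted(edges):
        ru, rv = find_set(u), find_set(v)
        # Merge only if the edge joins two different regions AND they are similar.
        if ru != rv and is_similar(w, ru, rv):
            parent[rv] = ru
    return [find_set(i) for i in range(n)]


# 1-D "image" 10, 12, 11, 200, 202 with edges between consecutive pixels.
row_edges = [(2, 0, 1), (1, 1, 2), (189, 2, 3), (2, 3, 4)]
labels = segment(5, row_edges, lambda w, ru, rv: w <= 10)
```

With a fixed weight threshold as the similarity test, the strong edge between 11 and 200 is rejected and two trees remain.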
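We add an edge ei connecting regions Ru and
Rv to a tree only if:
D(Ru, Rv) ≤ min( Int(Ru) + k/|Ru| , Int(Rv) + k/|Rv| )
• D(Ru, Rv): edge weight connecting vertices u and v
• Int(Ri): maximum edge weight in region Ri
• k: constant that controls granularity; |Ri| is the number of pixels in region Ri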
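The merge test published by Felzenszwalb and Huttenlocher (2004) compares the connecting edge weight against each region's internal difference plus a size-dependent slack k/|R|. A minimal sketch, assuming the caller tracks Int and the size of each region; the function and argument names are illustrative.

```python
def fh_should_merge(d, int_u, size_u, int_v, size_v, k):
    """Felzenszwalb-Huttenlocher merge test.

    d            : weight of the edge connecting the two regions
    int_u, int_v : maximum edge weight inside each region (0 for one pixel)
    size_u/v     : number of pixels in each region
    k            : scale constant; larger k favors larger regions
    """
    return d <= min(int_u + k / size_u, int_v + k / size_v)
```

For two single pixels the slack equals k, so almost any edge is accepted; as regions grow the slack shrinks toward zero and the test approaches a comparison against Int alone.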
Drawback 1: LEAK
Felzenszwalb and Huttenlocher 2004
Drawback 2: SENSITIVITY TO PARAMETER k
◦ Notice how granularity changes by varying k
• k is arbitrary
• k is affected by the size of the image
Constructing image grid
Sort edges in ascending order
For every edge
If Merge criterion is satisfied
Improve upon the drawbacks of the MST algorithm:
◦ Addressing leak:
▪ Represent regions as Gaussian distributions.
▪ Use bidirectional Mahalanobis distance to compare Gaussians.
◦ Overcome sensitivity to parameter k:
▪ Propose a parameter τ that is independent of image size
▪ Works well for τ between 2 and 2.5
▪ Provide a mathematical intuition for it.
Propose an approximation that enables real-time
Check if D(u,v) < Int(Ru) && D(u,v) < Int(Rv)
• Leak can happen
▪ Represent each region as a Gaussian
▪ Check if the Gaussians are similar:
▪ Mahalanobis distance is less than 2.5
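A minimal sketch of the Gaussian similarity check, under several assumptions that are not stated on the slides: regions are grayscale, so each Gaussian is one-dimensional (mean and variance); "bidirectional" is taken to mean that each mean is measured against the other region's spread and the larger of the two distances is kept; and a hypothetical variance floor `min_var` keeps single-pixel regions from dividing by zero.

```python
import math


def mahalanobis_1d(x, mean, var, min_var=1.0):
    """Distance of x from a 1-D Gaussian, in standard deviations."""
    return abs(x - mean) / math.sqrt(max(var, min_var))


def bidirectional_mahalanobis(mean_a, var_a, mean_b, var_b):
    """Assumed definition: the larger of the two one-directional distances."""
    d_ab = mahalanobis_1d(mean_a, mean_b, var_b)  # mean of A seen from B
    d_ba = mahalanobis_1d(mean_b, mean_a, var_a)  # mean of B seen from A
    return max(d_ab, d_ba)


def gaussians_similar(mean_a, var_a, mean_b, var_b, tau=2.5):
    """Regions are similar if their distance is below the threshold tau."""
    return bidirectional_mahalanobis(mean_a, var_a, mean_b, var_b) < tau
```

Unlike a test on a single edge weight, this compares statistics of the whole regions, so one weak edge between two otherwise different regions is not enough to merge them.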
Initialize Vertices:
◦ Every pixel is mapped to a vertex
◦ Information about vertex vi is stored at the ‘i’th entry of the
disjoint set data structure D.
The ‘i’th entry in D contains following information:
◦ Root node
◦ Zeroth, first and second order moments
◦ List of all the edges connected to vertex vi
Initialize Edges:
◦ Between neighboring pixels in x-y space
◦ Number of edges m= O(N)
◦ Use List to maintain edges
Edge weight:
◦ Euclidean distance between pixels to begin with
◦ Mahalanobis distance between Gaussians as region grows
Note that Euclidean distance is a special instance of
Mahalanobis distance
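The special-case relationship can be checked numerically. The sketch below assumes a diagonal covariance (independent channels), which is a simplification chosen for brevity; with unit variances the Mahalanobis distance reduces to the Euclidean distance.

```python
import math


def mahalanobis_diag(x, mean, variances):
    """Mahalanobis distance for a diagonal covariance matrix."""
    return math.sqrt(
        sum((xi - mi) ** 2 / vi for xi, mi, vi in zip(x, mean, variances))
    )


pixel, center = (3.0, 4.0, 0.0), (0.0, 0.0, 0.0)
unit = mahalanobis_diag(pixel, center, (1.0, 1.0, 1.0))      # identity covariance
scaled = mahalanobis_diag(pixel, center, (9.0, 16.0, 1.0))   # wider Gaussian
```

With identity covariance the result equals `math.dist(pixel, center)`; with larger variances the same offset counts as a smaller distance.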
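While adding edge ei to the MST, regions Ru
and Rv are merged if the following criterion is satisfied:
[Merge criterion equation; annotations: bidirectional Mahalanobis distance between the regions; threshold τ, for which around 2.5 is a good value; a term that forces small regions to merge]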
Merging regions Ru and Rv
◦ Update information at the root node of the disjoint set data structure (similar to MST)
◦ Updating information about root node, zeroth, first and second
order moment is easy
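Updating the moments is easy because they are plain sums: merging two regions adds their moments, and the Gaussian parameters follow from the totals. A minimal sketch for grayscale values, with a hypothetical `Moments` record standing in for the per-root entry.

```python
from dataclasses import dataclass


@dataclass
class Moments:
    n: int = 0        # zeroth-order moment: pixel count
    s1: float = 0.0   # first-order moment: sum of values
    s2: float = 0.0   # second-order moment: sum of squared values

    @classmethod
    def of_pixel(cls, value):
        return cls(1, float(value), float(value) ** 2)

    def merged(self, other):
        # Merging two regions is a constant-time addition of moments.
        return Moments(self.n + other.n, self.s1 + other.s1, self.s2 + other.s2)

    def mean(self):
        return self.s1 / self.n

    def variance(self):
        m = self.mean()
        return self.s2 / self.n - m * m


region = Moments.of_pixel(2).merged(Moments.of_pixel(4)).merged(Moments.of_pixel(6))
```

For color images the same idea applies with a vector of sums and a matrix of outer-product sums in place of `s1` and `s2`.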
However, after merging Ru and Rv
◦ All edges connected to either Ru or Rv have to be updated w.r.t. (Ru
∪ Rv )
◦ The edges have to be re-sorted.
◦ The above operations will slow down the overall running time to O(N²).
To speed up weight update that needs to be
performed after every iteration,
◦ For every region in the DSDS, we store the pointers to all
the edges connected to it.
◦ When 2 regions are merged, we merge their neighbor
lists also
◦ Assuming that the number of neighbors for every region
is constant, every iteration of merging neighbors can
also be accomplished in O(1) time
Re-sorting the edges after every merge is an
expensive operation.
◦ We use skip lists to maintain edges.
◦ A skip list is a data structure that helps maintain a sorted list.
▪ Every insert, delete and search operation takes O(log N)
expected time.
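A minimal skip list sketch supporting insertion, removal and in-order traversal, so that an edge whose weight changed can be removed and re-inserted without re-sorting the whole list. The class layout, the fixed `MAX_LEVEL` and the promotion probability of 0.5 are illustrative choices, not taken from the slides.

```python
import random


class SkipList:
    MAX_LEVEL = 16

    def __init__(self, seed=None):
        self._rng = random.Random(seed)
        # A node is [key, forward_pointers]; the head holds no key.
        self._head = [None, [None] * self.MAX_LEVEL]
        self._level = 1

    def _random_level(self):
        level = 1
        while level < self.MAX_LEVEL and self._rng.random() < 0.5:
            level += 1
        return level

    def _predecessors(self, key):
        """For each level, the last node whose key is smaller than key."""
        update = [self._head] * self.MAX_LEVEL
        node = self._head
        for i in reversed(range(self._level)):
            while node[1][i] is not None and node[1][i][0] < key:
                node = node[1][i]
            update[i] = node
        return update

    def insert(self, key):
        update = self._predecessors(key)
        level = self._random_level()
        self._level = max(self._level, level)
        new = [key, [None] * level]
        for i in range(level):
            new[1][i] = update[i][1][i]
            update[i][1][i] = new

    def remove(self, key):
        update = self._predecessors(key)
        target = update[0][1][0]
        if target is None or target[0] != key:
            return False
        for i in range(len(target[1])):
            if update[i][1][i] is target:
                update[i][1][i] = target[1][i]
        return True

    def __iter__(self):
        node = self._head[1][0]
        while node is not None:
            yield node[0]
            node = node[1][0]
```

Keys can be `(weight, u, v)` tuples, in which case iteration yields the edges in ascending order of weight.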
Although the asymptotic running time is
O(N log N), it is still slower than MST
Do not update weights or re-sort edges after
every iteration.
This runs in O(NlogN) time
◦ Speed comparable to MST
Our experiments show that the approximated
algorithm still improves upon the drawbacks of MST:
◦ Leak
◦ Sensitivity to parameter k
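A minimal sketch of such a single-pass approximation: edges are sorted once by their initial weight and never updated, while the merge test uses the current Gaussian statistics of the two regions. Several details are assumptions made for illustration: grayscale values, a variance floor `min_var`, the bidirectional distance taken as the larger of the two one-directional distances, and the omission of any extra term for small regions.

```python
import math


def segment_gaussian(values, edges, tau=2.5, min_var=1.0):
    """One pass over pre-sorted edges with Gaussian region statistics."""
    n = len(values)
    parent = list(range(n))
    count = [1] * n                        # zeroth-order moments
    s1 = [float(v) for v in values]        # first-order moments
    s2 = [float(v) * v for v in values]    # second-order moments

    def find_set(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def stats(r):
        mean = s1[r] / count[r]
        return mean, max(s2[r] / count[r] - mean * mean, min_var)

    for _, u, v in sorted(edges):          # sorted once, never re-sorted
        ru, rv = find_set(u), find_set(v)
        if ru == rv:
            continue
        (mean_u, var_u), (mean_v, var_v) = stats(ru), stats(rv)
        # Dividing by the smaller spread gives the larger of the two distances.
        d = abs(mean_u - mean_v) / math.sqrt(min(var_u, var_v))
        if d < tau:
            parent[rv] = ru
            count[ru] += count[rv]
            s1[ru] += s1[rv]
            s2[ru] += s2[rv]
    return [find_set(i) for i in range(n)]


signal = [10, 11, 12, 11, 200, 201, 199]
signal_edges = [(abs(signal[i] - signal[i + 1]), i, i + 1) for i in range(len(signal) - 1)]
signal_labels = segment_gaussian(signal, signal_edges)
```

On this toy signal the dark and bright runs end up as two regions, and the only work beyond plain Kruskal is a constant-time statistics update per merge.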
Tested the algorithm on:
◦ Synthetic images
◦ Berkeley Segmentation Dataset
Compared its performance with MST based
segmentation algorithm
Our algorithm overcomes leak!
Gradient ramp
MST merges the
entire image into
1 region
Our algorithm
has 2 stable regions
Effect of parameter tau on granularity
MST based
k is arbitrary! Varies
for different image
Our algorithm
• τ represents the distance
between Gaussians!
• Best results when
2 < τ < 2.5
We ran the segmentation
exhaustively for multiple
values of τ (test image: Mona Lisa)
Studied the effect of τ on
the number of regions
Notice that curve flattens
for τ in the range 2-2.5.
◦ Represents “stable” regions
◦ Segmentation unaffected by
parameter change
cf. Yu 2007
Notice the flat
We ran the algorithm exhaustively on
Berkeley Segmentation dataset.
Our algorithm produced more “correct”
segmentations than MST segmentations.
Segmentation was sharper in our algorithm.
Some specific examples are illustrated in the
subsequent slides
Notice the Man on the Hill
Notice the face of the man kneeling down!
Notice how a small leak has
merged grass with part of bison’s
Proposed a new segmentation algorithm that
improves upon the drawbacks of MST:
◦ Leak
▪ Represent regions as Gaussians
▪ Use bidirectional Mahalanobis distance to compare Gaussians
◦ Sensitivity to parameter k
▪ τ = 2.5 works well across the images tested; it represents the
normalized distance between Gaussian distributions
▪ Shown experimentally to be “stable”
In the worst-case scenario,
◦ the naïve version of our algorithm runs in O(N²) time.
◦ Using a skip list improves this to O(N log N), but it is still not as fast
as MST
An approximated algorithm is proposed:
◦ Runs in O(NlogN) time and speed comparable to MST
◦ Still overcomes the two drawbacks of MST-based segmentation
Pre-processing: Homomorphic filtering
Efficiency: Speed up the original version of
the algorithm using more sophisticated
priority queues.
Benchmarking: Mathematical study of the
accuracy of our segmentation algorithm on
BSDS dataset
[5] Stan Birchfield. Image Segmentation. Lecture Notes