Vision Topics Seminar Mean Shift

Egorov Svetlana
Based on: D. Comaniciu, P. Meer: Mean Shift Analysis and Applications,
IEEE Int. Conf. Computer Vision (ICCV'99), Kerkyra, Greece,
1197-1203, 1999
1
Presentation plan
• Motivation and Goal
• Intro: problem formulation, previous methods
overview
• The base paper on mean shift in detail
• Recent modifications and improvements
• Recent applications
2
Presentation goals
• Present the mean shift algorithm as a
common technique for two Computer Vision
tasks:
– Image filtering and discontinuity-preserving smoothing
– Clustering/segmentation
• Highlight the pros, cons and tradeoffs of this
method and compare it with previous methods.
• Review recent modifications and improvements.
• Present possible applications of the method,
with emphasis on one specific case.
3
Segmentation methods –
overview
• As presented in “Segmentation and low-level grouping”
by Bill Freeman, MIT, the following methods exist for
segmentation:
• Background subtraction
– Estimate the background using a moving average and subtract
from the current frame to extract the foreground.
• K-means clustering
– The k-means algorithm clusters n objects into k partitions
(k < n) based on their attributes.
• Mean-shift algorithm (focus of this presentation).
• Normalized cuts
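The background subtraction item in the list above can be illustrated with a minimal sketch. It assumes grayscale frames stored as NumPy arrays and uses an exponential moving average as the "moving average"; the function names, the update rate `alpha` and the `threshold` are illustrative choices, not values from the cited lecture.

```python
import numpy as np

def moving_average_background(frames, alpha=0.05):
    """Estimate the background as an exponential moving average of the frames."""
    background = frames[0].astype(float)
    for frame in frames[1:]:
        # Blend each new frame into the running estimate (alpha is an assumed rate).
        background = (1.0 - alpha) * background + alpha * frame
    return background

def foreground_mask(frame, background, threshold=25.0):
    """Mark pixels that differ from the background by more than the threshold."""
    return np.abs(frame.astype(float) - background) > threshold

# Toy data: a static 8x8 scene, then a bright 2x2 object in the current frame.
frames = [np.full((8, 8), 100.0) for _ in range(20)]
current = frames[-1].copy()
current[2:4, 2:4] = 200.0

background = moving_average_background(frames)
mask = foreground_mask(current, background)
```

Only the four object pixels end up in the mask, because the rest of the current frame matches the averaged background.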
4
Mean-shift – motivation and
intuitive description
• Given a distribution of points, mean shift is a
procedure for finding the densest region.
• Example for simple 2D case (see next slide):
– Start from an arbitrary point in the distribution
– The region of interest is a circle centered on this point
– On each iteration find the center of mass of the
region of interest
– Move the circle to the center of mass
– Continue the iterations until convergence
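The steps above can be sketched in a few lines of Python. This is a minimal illustration, assuming 2D points in a NumPy array and a circular window of fixed radius; the function name, the tolerance and the toy data are assumptions made for the example.

```python
import numpy as np

def mean_shift_mode(points, start, radius, tol=1e-6, max_iter=500):
    """Move a circular window to the center of mass of the points it covers
    until the window stops moving."""
    center = np.asarray(start, dtype=float)
    for _ in range(max_iter):
        inside = np.linalg.norm(points - center, axis=1) <= radius
        if not inside.any():
            break  # empty window: there is nothing to move toward
        new_center = points[inside].mean(axis=0)  # center of mass
        if np.linalg.norm(new_center - center) < tol:
            return new_center
        center = new_center
    return center

# Two dense blobs; starting near the second one leads to its densest region.
rng = np.random.default_rng(0)
points = np.vstack([
    rng.normal([0.0, 0.0], 0.5, size=(300, 2)),
    rng.normal([5.0, 5.0], 0.5, size=(300, 2)),
])
mode = mean_shift_mode(points, start=[4.0, 4.0], radius=1.5)
```

The result depends on the start point: a window started near the first blob would converge to that blob instead.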
5
Intuitive Description
[Figure: a distribution of identical billiard balls with a circular region of interest; the mean shift vector points from the center of the region to its center of mass. Objective: find the densest region.]
From “Mean Shift Theory and Applications”, presentation for “Advanced Topics
in Computer Vision” course, Weizmann Institute.
6
Mean-shift – algorithm formal
definition.
• The Basic Mean Shift Algorithm is formulated
according to the following paper:
D. Comaniciu, P. Meer: Mean Shift Analysis and Applications,
IEEE Int. Conf. Computer Vision (ICCV'99), Kerkyra, Greece,
1197-1203, 1999
13
Mean-shift – algorithm formal
definition.
• Given: a set of n points in the d-dimensional
space:
$\{x_i\}_{i=1 \ldots n}$
• Model: we assume a non-parametric statistical
model, i.e. there is a probability density function
(PDF) associated with the set of points, without
any assumptions on its parameters.
• Goal: for any given point, find the closest local mode
of the density function.
14
Non-parametric density gradient
estimation
[Diagram: data (discrete PDF representation) → non-parametric density estimation / non-parametric density GRADIENT estimation (mean shift) → PDF analysis]
• Non-parametric – no assumption about the form of the PDF (e.g. normal distribution)
• The density gradient is estimated instead of the density itself.
From “Mean Shift Theory and Applications”, presentation for “Advanced Topics
in Computer Vision” course, Weizmann Institute.
15
Kernels
• The notion of a kernel is used in the PDF gradient estimation method
(also referred to in statistics as the Parzen window method)
• A kernel is a non-negative, real-valued, integrable function K satisfying
the following requirements
Kernel properties:
• Normalized: $\int_{R^d} K(x)\,dx = 1$
• Symmetric: $\int_{R^d} x\,K(x)\,dx = 0$
• Exponential weight decay: $\lim_{\|x\| \to \infty} \|x\|^d K(x) = 0$
• $\int_{R^d} x x^T K(x)\,dx = cI$
From “Mean Shift Theory and Applications”, presentation for “Advanced Topics
in Computer Vision” course, Weizmann Institute.
16
Kernel - examples
In practice one of the following forms is used, where k( ) is a kernel profile:
$K(x) = c \prod_{j=1}^{d} k(x^{(j)})$
or
$K(x) = c\,k(\|x\|)$
where $x^{(j)}$ are the individual coordinates of x.
Examples:
• Epanechnikov kernel: $K_E(x) = \begin{cases} c\,(1 - \|x\|^2) & \|x\| \le 1 \\ 0 & \text{otherwise} \end{cases}$
• Uniform kernel: $K_U(x) = \begin{cases} c & \|x\| \le 1 \\ 0 & \text{otherwise} \end{cases}$
• Normal kernel: $K_N(x) = c \cdot \exp\left(-\frac{1}{2}\|x\|^2\right)$
From “Mean Shift Theory and Applications”, presentation for “Advanced Topics
in Computer Vision” course, Weizmann Institute.
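The three example kernels can be written as short NumPy functions. This sketch assumes the radially symmetric form, evaluates each kernel row-wise on an array of points, and leaves the normalization constant `c` as a parameter, since the slide does not give its value; the function names are illustrative.

```python
import numpy as np

def epanechnikov_kernel(x, c=1.0):
    """c * (1 - ||x||^2) inside the unit ball, 0 outside."""
    r2 = np.sum(np.square(x), axis=-1)
    return np.where(r2 <= 1.0, c * (1.0 - r2), 0.0)

def uniform_kernel(x, c=1.0):
    """c inside the unit ball, 0 outside."""
    r2 = np.sum(np.square(x), axis=-1)
    return np.where(r2 <= 1.0, c, 0.0)

def normal_kernel(x, c=1.0):
    """c * exp(-||x||^2 / 2), non-zero everywhere."""
    r2 = np.sum(np.square(x), axis=-1)
    return c * np.exp(-0.5 * r2)

# Three 2D sample points at distances 0, 0.6 and 2 from the origin.
samples = np.array([[0.0, 0.0], [0.6, 0.0], [2.0, 0.0]])
epanechnikov_values = epanechnikov_kernel(samples)
uniform_values = uniform_kernel(samples)
normal_values = normal_kernel(samples)
```

The Epanechnikov and uniform kernels have finite support, so points outside the unit ball contribute nothing, while the normal kernel gives every point a small non-zero weight.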
17
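Mean-shift – algorithm (cont.)
• The multivariate kernel density estimate obtained
with kernel K(x) and window radius h, computed
at the point x:
$\hat{f}(x) = \frac{1}{n h^d} \sum_{i=1}^{n} K\left(\frac{x - x_i}{h}\right)$
• The optimum kernel yielding the asymptotic minimum mean
integrated square error (AMISE) is the Epanechnikov
kernel:
$K_E(x) = \begin{cases} \frac{1}{2} c_d^{-1} (d+2)(1 - x^T x) & \text{if } x^T x < 1 \\ 0 & \text{otherwise} \end{cases}$
where $c_d$ is the volume of the unit d-dimensional sphere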
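As a hedged illustration, the kernel density estimate $\hat{f}(x) = \frac{1}{n h^d} \sum_i K_E((x - x_i)/h)$ with the normalized Epanechnikov kernel $K_E(x) = \frac{1}{2} c_d^{-1}(d+2)(1 - x^T x)$ can be computed as follows. The function names are illustrative, and the closed form used for $c_d$ is the standard volume of the unit d-ball.

```python
import math
import numpy as np

def unit_ball_volume(d):
    """Volume c_d of the unit d-dimensional sphere."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1)

def epanechnikov_kde(x, data, h):
    """Kernel density estimate at x with the Epanechnikov kernel and radius h."""
    n, d = data.shape
    u2 = np.sum(((x - data) / h) ** 2, axis=1)  # squared scaled distances
    k = np.where(u2 < 1.0,
                 0.5 * (d + 2) / unit_ball_volume(d) * (1.0 - u2),
                 0.0)
    return k.sum() / (n * h ** d)

# One sample at the origin in 1D: the estimate is the kernel itself.
data = np.array([[0.0]])
density_at_center = epanechnikov_kde(np.array([0.0]), data, h=1.0)
```

In one dimension $c_1 = 2$, so the kernel reduces to $0.75\,(1 - x^2)$ on $[-1, 1]$, which integrates to 1.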
18
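Mean-shift – algorithm (cont.)
• Density gradient estimate for the Epanechnikov
kernel:
$\hat{\nabla} f(x) = \frac{n_x}{n h^d c_d} \cdot \frac{d+2}{h^2} \left( \frac{1}{n_x} \sum_{x_i \in S_h(x)} (x_i - x) \right)$
where $S_h(x)$ is a sphere of radius h centered on x and
containing $n_x$ data points.
• The sample mean shift is given by:
$M_h(x) = \frac{1}{n_x} \sum_{x_i \in S_h(x)} x_i - x$
The first term is the center of mass of the points within the
sphere, when all the points are equally weighted.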
19
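Mean-shift – algorithm (cont.)
• Mean shift relation to f(x) and its gradient:
$M_h(x) = \frac{h^2}{d+2} \cdot \frac{\hat{\nabla} f(x)}{\hat{f}(x)}$
where $\hat{f}(x) = \frac{n_x}{n h^d c_d}$ is the density estimate obtained with the uniform kernel.
The mean shift vector has the same direction as the density gradient estimate.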
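The relation between the mean shift vector and the density gradient can be checked numerically. The sketch below assumes the relation $M_h(x) = \frac{h^2}{d+2} \cdot \hat{\nabla} f(x) / \hat{f}(x)$, with the gradient taken from the Epanechnikov estimate and the denominator $n_x / (n h^d c_d)$ from the uniform kernel. The gradient is approximated with central finite differences; the function names, the random data and the step `eps` are illustrative.

```python
import math
import numpy as np

def unit_ball_volume(d):
    """Volume c_d of the unit d-dimensional sphere."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1)

def epanechnikov_kde(x, data, h):
    """Kernel density estimate at x with the Epanechnikov kernel."""
    n, d = data.shape
    u2 = np.sum(((x - data) / h) ** 2, axis=1)
    k = np.where(u2 < 1.0,
                 0.5 * (d + 2) / unit_ball_volume(d) * (1.0 - u2),
                 0.0)
    return k.sum() / (n * h ** d)

def mean_shift_vector(x, data, h):
    """Sample mean of the points inside S_h(x), minus x; also returns n_x."""
    inside = np.linalg.norm(data - x, axis=1) < h
    return data[inside].mean(axis=0) - x, int(inside.sum())

rng = np.random.default_rng(1)
data = rng.normal(size=(500, 2))
x = np.array([0.7, -0.4])
h = 1.0
n, d = data.shape

shift, n_x = mean_shift_vector(x, data, h)
uniform_density = n_x / (n * h ** d * unit_ball_volume(d))

# Central finite differences of the Epanechnikov density estimate.
eps = 1e-6
grad = np.array([
    (epanechnikov_kde(x + eps * e, data, h)
     - epanechnikov_kde(x - eps * e, data, h)) / (2 * eps)
    for e in np.eye(d)
])
predicted_shift = h ** 2 / (d + 2) * grad / uniform_density
```

The two vectors agree up to rounding because the Epanechnikov estimate is quadratic in x while the set of points inside the window stays fixed, which makes the central difference exact.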
20
Mean-shift properties
• An estimate of the normalized gradient can be obtained by computing
the sample mean shift in a uniform kernel centered on x.
• The mean shift vector has the direction of the gradient of the density
estimate at x when this estimate is obtained with the Epanechnikov
kernel.
• The mean shift vector always points in
the direction of the maximum increase in the density and
can define a path leading to a density mode.
• The mean shift procedure, obtained by successive computation of
the mean shift vector Mh(x) and translation of the window Sh(x) by
Mh(x), is guaranteed to converge.
21
Processing in joint Spatial-Range
Domain
• An image is typically represented as a 2-dimensional lattice of r-dimensional vectors
(pixels)
– r is 1 in the gray level case, 3 for color images, or r >
3 in the multi-spectral case (frequencies beyond the
visible light range)
• The space of the lattice is the spatial domain
• The gray level, color, or spectral information is
represented in the range domain.
• After a normalization with global parameters σs
and σr, the location and range vectors are
concatenated to obtain a joint spatial-range domain of
dimension d = r + 2.
22
Processing in joint Spatial-Range
Domain (cont.)
• The discussed method applies the mean
shift procedure for the data points in the
joint spatial-range domain.
• Each data point becomes associated with
a point of convergence, which represents
the local mode of the density in the d-dimensional space.
23
Mean shift applications - Discontinuity
preserving filtering
• The output of the mean shift filter for an image
pixel is the range information carried by the point
of convergence.
• Filtering procedure:
– $\{x_j\}_{j=1 \ldots n}$ - original image (normalized with σs and σr)
– $\{z_j\}_{j=1 \ldots n}$ - filtered image
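A minimal sketch of such a filter for a grayscale image follows. It assumes a uniform kernel with a unit-radius window in the normalized joint domain; the function name, the tolerance and the toy image are illustrative. Every pixel is shifted to its point of convergence, and the output keeps the pixel's own position but takes the range value of that convergence point. The brute-force neighbor search is only suitable for tiny images.

```python
import numpy as np

def mean_shift_filter(image, sigma_s, sigma_r, tol=1e-3, max_iter=50):
    """Discontinuity-preserving smoothing of a grayscale image (sketch)."""
    rows, cols = image.shape
    yy, xx = np.mgrid[0:rows, 0:cols]
    # Joint spatial-range points (row, column, gray level), normalized.
    points = np.column_stack([
        yy.ravel() / sigma_s,
        xx.ravel() / sigma_s,
        image.ravel().astype(float) / sigma_r,
    ])
    filtered = np.empty(rows * cols)
    for j, x_j in enumerate(points):
        y = x_j.copy()
        for _ in range(max_iter):
            inside = np.sum((points - y) ** 2, axis=1) <= 1.0
            if not inside.any():
                break  # guard against an empty window caused by rounding
            y_next = points[inside].mean(axis=0)
            converged = np.linalg.norm(y_next - y) < tol
            y = y_next
            if converged:
                break
        # Keep the pixel location, take the range part of the convergence point.
        filtered[j] = y[2] * sigma_r
    return filtered.reshape(rows, cols)

# Noisy step edge: dark left half, bright right half.
rng = np.random.default_rng(0)
image = np.full((8, 8), 50.0)
image[:, 4:] = 200.0
image += rng.uniform(-5.0, 5.0, size=image.shape)
filtered = mean_shift_filter(image, sigma_s=2.0, sigma_r=20.0)
```

Because the two halves are far apart in the range dimension relative to σr, no window ever mixes pixels from both sides, so the noise is averaged within each half while the edge stays sharp.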
24
Computational complexity
• The lattice structure of the spatial domain is used
for the efficient search of the points
.
• This search can be limited to a rectangular window
of size 2x2 in the normalized space, which
corresponds to
image pixels
• The arithmetic complexity of mean shift filtering is
about
ops per image pixel.
where kc is the mean number of iterations
to convergence.
25
Filtering - example
[Figure: original image and filtered image. Example from Comaniciu & Meer.]
26
Mean shift applications - Segmentation
Segmentation divides the image into segments or clusters
The arithmetic complexity of the segmentation is similar to
that of the mean shift filtering.
27
Segmentation examples
[Figure: original image, segmented image and corresponding contours. Example from Comaniciu & Meer.]
28
Segmentation examples
[Figure: original image and segmented image. Example from Comaniciu & Meer.]
29
Mean Shift - recent modifications
and improvements
• One of the recent modifications to the basic mean shift is PAMS,
the path assigned mean shift algorithm:
The path assigned mean shift algorithm: A new fast mean shift
implementation for colour image segmentation
Pooransingh, A.; Radix, C.-A.; Kokaram, A.;
15th IEEE International Conference on Image Processing, 2008. ICIP 2008.
• According to this paper, the mean shift method is effective in high
density regions but proves to be computationally expensive for
multidimensional data sets.
• The goal of the method proposed in this paper is a fast
mean shift method capable of processing multidimensional data
sets easily.
30
General mean-shift (GMS) method for YUV colour
space (revised algorithm).
The main computational load is the calculation of the
mean shift vector, mc(U,V). The computational cost is
$O(n^2)$, where n is the size of the data set.
31
Fast mean-shift methods.
• A number of modifications were proposed to improve
complexity:
– Use of a single metric to represent each data point
– Hierarchical clustering method: repeatedly applying
the mean shift with increasingly large bandwidths, with each
step initialized from the results of the previous one.
– Neighbourhood consistency algorithm:
Step 1: Partition: The original data set is decomposed
into a number of local subsets of similar size, and the centre
of each subset is calculated.
Step 2: Clustering: The mean shift is calculated for each sample
rather than the whole data set to find a single class for each
sample
32
Path Assigned Mean Shift (PAMS)
– main idea
• For any random start point, the mean shift vector always
points to the mode point
• In the PAMS assignment, all points along the path
toward the mode point are assigned to that final mode
value.
– points already assigned modes are eliminated from the mean
shift process and are not traversed in the future
33
Path Assigned Mean Shift Algorithm in the Colour
Domain
• The complexity is reduced to $O(\varphi^2)$, where φ is the total
number of unassigned points per iteration.
34
Example illustrating GMS vs. PAMS
[Figure: general mean shift (GMS) result and PAMS result]
35
Comparison of segmentation results between different algorithms
[Figure: (a), (b) original; (c), (d) GMS; (e), (f) other fast mean shift method; (g), (h) PAMS]
36
Mean Shift - Recent Applications.
• One of the recent mean shift applications is presented in the following paper:
Region-based mean shift tracking: Application to face tracking
Vilaplana, V.; Marques, F.;
15th IEEE International Conference on Image Processing, 2008. ICIP 2008.
Refer to the Appendix for details.
Face tracking:
• Face tracking is a task required by applications such as video indexing,
visual surveillance, human-computer interaction, or facial expression
recognition. In these applications, it is necessary to detect the faces, track
them from frame to frame and analyze the tracks, e.g. to understand the
object’s behavior.
• Tracking methods are organized in three groups, based on the model
selected to describe the shape:
– Point tracking
– Kernel tracking
– Silhouette tracking
37
Face tracking - example
Examples from http://gps-tsc.upc.es/imatge/_Veronica/RegionBasedMeanShift.html
38
Conclusions
• Mean shift is a useful method for low-level tasks such as filtering or
segmentation. Minor details of the background are eliminated, while
object discontinuities are preserved.
• The method is non-parametric, i.e. it doesn’t assume any model for the
underlying density function.
• The method works in the joint spatial-range domain.
• The mean shift method is guaranteed to converge.
• The scaling factors (σs and σr) have a major impact on the algorithm's
performance and should be adjusted to the nature of the objects.
• The basic mean shift is computationally expensive. Some efficient
modifications, with improved complexity and the same quality, were
proposed recently. One example is Path Assigned Mean Shift (PAMS).
• Another possible application of mean shift is face tracking.
Consistent tracking can be achieved by combining mean shift with
an image partition into regions.
39
References
[1] D. Comaniciu, P. Meer: Mean Shift Analysis and Applications,
IEEE Int. Conf. Computer Vision (ICCV'99), Kerkyra, Greece, 1197-1203, 1999
[2] Segmentation and low-level grouping. Bill Freeman, MIT.
[3] The path assigned mean shift algorithm: A new fast mean shift implementation
for colour image segmentation
Pooransingh, A.; Radix, C.-A.; Kokaram, A.;
15th IEEE International Conference on Image Processing, 2008. ICIP 2008.
[4] Region-based mean shift tracking: Application to face tracking
Vilaplana, V.; Marques, F.;
15th IEEE International Conference on Image Processing, 2008. ICIP 2008.
[5] D. Comaniciu, P. Meer: Mean Shift: A Robust Approach toward Feature Space
Analysis, IEEE Trans. Pattern Analysis Machine Intell., Vol. 24, No. 5, 603-619,
2002
[6] “Mean Shift Theory and Applications”, PowerPoint slides for “Advanced Topics in
Computer Vision” course, Weizmann Institute.
40
Appendix – Region based
Face Tracking
41
Mean shift (revised)
• X – n-dimensional space
• S – a finite set, the sample data
• Kernel: $K(x) = k(\|x\|^2)$, where k( ) is the kernel profile
• w : S → [0,∞) – a weight function
• The sample mean with kernel K at a point x from X:
$m(x) = \frac{\sum_{s \in S} K(s - x)\, w(s)\, s}{\sum_{s \in S} K(s - x)\, w(s)}$
• Mean shift is m(x) − x
• The repeated movement of data points to the sample
mean is called the mean shift algorithm.
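The weighted sample mean can be sketched as below, assuming the common definition $m(x) = \sum_s K(s - x)\,w(s)\,s \,/\, \sum_s K(s - x)\,w(s)$ with $K(x) = k(\|x\|^2)$. The function names, the Gaussian profile and the bandwidth parameter `h` are illustrative additions for the example.

```python
import numpy as np

def gaussian_profile(r2):
    """Kernel profile k applied to the squared distance."""
    return np.exp(-0.5 * r2)

def sample_mean(x, samples, weights, profile, h=1.0):
    """Weighted sample mean m(x) with kernel K(x) = k(||x||^2 / h^2)."""
    k = profile(np.sum((samples - x) ** 2, axis=1) / h ** 2)
    kw = k * weights
    return (kw[:, None] * samples).sum(axis=0) / kw.sum()

# Three samples; the third one carries twice the weight of the others.
samples = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0]])
weights = np.array([1.0, 1.0, 2.0])
x = np.array([0.0, 0.0])

m = sample_mean(x, samples, weights, gaussian_profile)
shift = m - x  # the mean shift at x
```

Here the shift points more strongly toward the third sample, because both samples away from the origin get the same kernel value but the third has the larger weight.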
42
Mean shift (cont.)
• Let T be a finite set, and m(T) = {m(t) : t ∈ T}.
• The full mean shift procedure iterates and evolves T until
it finds a fixed point T = m(T).
• The weights w(s) can be fixed or re-evaluated after each
iteration and may also be a function of the current set T.
• Kernels define an influence zone for each point x in T
and can be scaled to modify their spatial extent.
43
Mean shift for tracking
• In object tracking, the evolving set T typically consists
of just one point, the object centroid.
• A sample corresponds to the spatial coordinates of a
pixel x, and has an associated sample weight w(x),
which defines how likely the pixel with color I(x) belongs
to an object model.
• The mean shift seeks the mode of the kernel density
computed with these weights.
• Implementation requires defining:
– The kernel (scale and shape),
– An object model,
– The weight function,
– The shape of the final object.
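The tracking case, where the evolving set is a single centroid, can be sketched as follows. The sketch assumes a precomputed weight image and a flat rectangular kernel of fixed size; the function name, the window half-size and the stopping tolerance are illustrative and do not come from a specific tracker.

```python
import numpy as np

def track_centroid(weight_image, start, half_size, tol=0.5, max_iter=20):
    """Shift a rectangular window to the weighted centroid of the pixels
    it covers until the centroid moves by less than tol pixels."""
    rows, cols = weight_image.shape
    cy, cx = float(start[0]), float(start[1])
    for _ in range(max_iter):
        r0 = max(int(round(cy)) - half_size, 0)
        r1 = min(int(round(cy)) + half_size + 1, rows)
        c0 = max(int(round(cx)) - half_size, 0)
        c1 = min(int(round(cx)) + half_size + 1, cols)
        window = weight_image[r0:r1, c0:c1]
        total = window.sum()
        if total <= 0:
            break  # no object evidence inside the window
        ys, xs = np.mgrid[r0:r1, c0:c1]
        new_cy = (ys * window).sum() / total
        new_cx = (xs * window).sum() / total
        moved = np.hypot(new_cy - cy, new_cx - cx)
        cy, cx = new_cy, new_cx
        if moved < tol:
            break
    return cy, cx

# Weight image with a 6x6 "object" of weight 1; start the search off-center.
weight_image = np.zeros((40, 40))
weight_image[20:26, 22:28] = 1.0
centroid = track_centroid(weight_image, start=(15, 18), half_size=8)
```

The first window covers only part of the object, so the centroid moves toward it, and the next window covers the whole object and settles on its center.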
44
Kernel selection considerations
• The basic mean shift requires isotropic kernels (e.g.
Epanechnikov or Gaussian) and assumes constant
object scale and orientation during the tracking
• However, objects may have complex shapes whose
scale and orientation constantly change. This leads to
using generalized kernels
45
Kernel selection considerations
• Two main parameters for Kernel selection are scale and
shape, both should be adjusted to the tracked object
• Scale:
– The kernel scale determines the size of the window where
sample weights are examined and is a crucial parameter in
the mean shift algorithm.
– Changes in the object scale require adjusting the kernel
bandwidth to consistently track the object.
• Shape:
– In the basic formulation, radially symmetric kernels which
are isotropic in shape are used. However, objects often
have anisotropic structure and, therefore, anisotropic
symmetric kernels like rectangles or ellipses are frequently
used.
46
Object model and weight image
• The tracked object is modeled as a class conditional
color distribution P(I(x)/O) that estimates, for each pixel
with color I(x), the probability of the color of the pixel,
given that the pixel belongs to the tracked object O.
– The object distribution is learned off-line from training images or
during the initialization.
– The model is commonly built with histograms in a particular color
space.
• The weight function measures, for each pixel, some
feature related to its similarity to the object model.
– Example: the object histogram is compared with a histogram
of colors observed within the current mean shift target window
– To adapt to background variation, the background
model is continuously recomputed.
47
Final shape definition
• The tracking output at each frame is usually the object
centroid and a rectangle which has the size of the last
iteration window. This rectangle is used as an estimate
of the object extent.
48
Region-based mean shift for tracking
• The approach of Vilaplana & Marques combines mean shift
with the use of regions.
• Regions are useful to compute the weight image and to
define precisely the contours of the tracked objects and
provide a natural mechanism to initialize the search in
the next frame.
• The algorithm works with pixels that lie within a subimage defined by a rectangular search window W and an
image partition P.
49
Region based method – Kernel selection
• Kernel scale:
– At each frame, the size of the rectangular
search window is defined as the size of the
bounding box of the object O found in the
previous frame, scaled by a fixed factor
(constant).
– The window size is the same for all iterations
within a frame, except for occasional cases
when the search window size is
underestimated.
50
Region based method – Kernel selection
• Shape:
– The image partition P is fitted to the search window W
to define the kernel shape. The kernel extent is
defined by all the regions R in partition P that are
completely included in W:
– At each iteration, the kernel scale changes according
to the size of the tracked object and its shape takes
into account the color homogeneity observed in the
image since it is defined by the regions in the partition
51
Region based partition and kernel
Example by Vilaplana & Marques
52
Object model and weight image
• The object is modeled as a class conditional color
distribution computed with a histogram in the YCbCr
color space.
– YCbCr is a more efficient way of encoding RGB information
• Given a pixel x with color I(x), the probability of the pixel
given the object is p(I(x)/O) = hO(I(x)), where hO is the
object histogram.
• The histogram is generated from the object segmented
in the first frame.
53
Object model and weight image (cont.)
• The weight w(x) associated with the pixel x is the
probability that the pixel represents the object, given its
color:
$w(x) = P(O/I(x)) = \frac{p(I(x)/O)\,P(O)}{p(I(x))}$
– P(O) – probability that the pixel belongs to the object
– p(I(x)) – probability that the pixel has color I(x):
$p(I(x)) = p(I(x)/O)\,P(O) + p(I(x)/B)\,P(B)$
where P(B) is the probability that the pixel is part of the background
• Each region Ri in the fitted partition is assigned a weight
value, which is computed as the average of the
individual weights of the pixels that form that region:
$w(R_i) = \frac{1}{|R_i|} \sum_{x \in R_i} w(x)$
54
Object model and weight update
• The object model p(I(x)/O) (i.e. object histogram) is
recomputed at each frame, using the object segmented
in the previous frame
• p(I(x)) which depends on the background, is estimated,
building a histogram hW of the pixels that are within the
current search window W, which is recomputed at every
iteration
– avoids tracking failure when the background scene changes.
• The value of p(O) is estimated as the ratio between the
sizes in pixels of the object detected in the previous
frame and the kernel.
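The weight computation can be sketched with quantized colors and normalized histograms. The sketch assumes Bayes' rule, w(x) = p(I(x)/O) · P(O) / p(I(x)), with p(I(x)/O) read from the object histogram and p(I(x)) from the search-window histogram, and it averages pixel weights per region. All names, the toy histograms and the clipping to [0, 1] are illustrative assumptions.

```python
import numpy as np

def pixel_weights(colors, object_hist, window_hist, p_object):
    """Per-pixel weight p(I(x)/O) * P(O) / p(I(x)) from normalized histograms."""
    likelihood = object_hist[colors]  # p(I(x)/O)
    evidence = window_hist[colors]    # p(I(x))
    weights = np.zeros(colors.shape, dtype=float)
    np.divide(likelihood * p_object, evidence, out=weights, where=evidence > 0)
    return np.clip(weights, 0.0, 1.0)  # assumed safeguard against estimation noise

def region_weights(weights, region_labels):
    """Average pixel weight per region; labels are assumed to be 0..k-1,
    each used at least once."""
    sums = np.bincount(region_labels.ravel(), weights=weights.ravel())
    counts = np.bincount(region_labels.ravel())
    return sums / counts

# Toy 2x3 image with colors quantized into 4 histogram bins.
colors = np.array([[0, 0, 2],
                   [1, 3, 2]])
object_hist = np.array([0.8, 0.2, 0.0, 0.0])
window_hist = np.array([0.4, 0.2, 0.3, 0.1])
p_object = 0.5  # e.g. ratio of object size to kernel size

weights = pixel_weights(colors, object_hist, window_hist, p_object)
region_labels = np.array([[0, 0, 1],
                          [0, 1, 1]])
per_region = region_weights(weights, region_labels)
```

Region 0 collects the pixels whose colors are typical of the object and gets a high weight, while region 1 contains only colors absent from the object histogram and gets weight 0.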
55
Final shape definition
• Once the mean shift converges, the fitted partition in the
last search window is used to define the final shape of
the object (three-step procedure):
– Initial object mask
– Shape matching
– Final object mask
Example by Vilaplana & Marques
56
Region-based mean shift tracking: Results
Region-based mean shift tracking is compared with basic mean shift, and
the new method demonstrates superior performance.
57