Nonlinear Mean Shift for Clustering over Analytic Manifolds

Raghav Subbarao and Peter Meer
Department of Electrical and Computer Engineering
Rutgers University, Piscataway
NJ 08854, USA
rsubbara,meer@caip.rutgers.edu
Abstract
The mean shift algorithm is widely applied for nonparametric clustering in Euclidean spaces. Recently, mean shift
was generalized for clustering on matrix Lie groups. We
further extend the algorithm to a more general class of
nonlinear spaces, the set of analytic manifolds. As examples, two specific classes of frequently occurring parameter
spaces, Grassmann manifolds and Lie groups, are considered. When the algorithm proposed here is restricted to matrix Lie groups the previously proposed method is obtained.
The algorithm is applied to a variety of robust motion segmentation problems and multibody factorization. The motion segmentation method is robust to outliers, does not
require any prior specification of the number of independent motions and simultaneously estimates all the motions
present.
1. Introduction

The mean shift algorithm [6, 9] is a popular nonparametric clustering method. It has been applied to problems such as image segmentation [9], tracking [3, 7, 10, 12, 21] and robust fusion [5, 8]. The theoretical properties of mean shift have also been studied, e.g., [13].

A major limitation of the original mean shift is that it can be applied only to points in Euclidean spaces. In [17] mean shift was extended to a particular class of nonlinear spaces, matrix Lie groups. The resulting algorithm was used to develop a robust motion segmentation method which simultaneously estimated the number of motions present and their parameters. Even restricting ourselves to motion segmentation problems, a number of motion models exist for which the parameter spaces are not Lie groups. Examples of such motions include 3D translations viewed from calibrated cameras [18] and multibody factorization [20].

A more general class of nonlinear spaces is the set of analytic manifolds. Lie groups are examples of analytic manifolds, but not all analytic manifolds are Lie groups. In both motion models mentioned above, the parameter spaces are examples of Grassmann manifolds, but are not Lie groups. For such problems, the algorithm of [17] is not applicable. We propose a more general mean shift algorithm which applies to any analytic manifold. If the manifold under consideration is a matrix Lie group, our algorithm is the same as [17]; therefore, [17] is a special case of the algorithm proposed here.

The paper is organized as follows. In Section 2 we briefly discuss the relevant theory of analytic manifolds. Section 3 reviews the standard mean shift of [9] and in Section 4 we derive the general noneuclidean mean shift algorithm. We also present the computational details for two frequently occurring classes of parameter spaces, Grassmann manifolds and matrix Lie groups. In Section 5 we apply this algorithm to several motion segmentation problems.

2. Analytic Manifolds
A manifold is a topological space that is locally similar
(homeomorphic) to an Euclidean space. Intuitively, we can
think of a manifold as a continuous surface lying in a higher
dimensional Euclidean space. Analytic manifolds satisfy
some further conditions of smoothness [4]. From now onwards, we restrict ourselves to analytic manifolds and by
manifold we mean analytic manifold.
The tangent space, Tx at x, is the plane tangent to the
surface of the manifold at that point. The tangent space can
be thought of as the set of allowable velocities for a point
constrained to move on the manifold. For d-dimensional
manifolds, the tangent space is a d-dimensional vector
space. An example of a two-dimensional manifold embedded in R³, with the tangent space T_x, is shown in Figure 1. The solid arrow ∆ is a tangent at x. As T_x is a vector space, we can define an inner product g_x. The inner product induces a norm for tangents ∆ ∈ T_x as ‖∆‖²_x = g_x(∆, ∆).
It should be noted that the inner product and norm vary with
x and this dependence is indicated by the subscripts.
Figure 1. Example of a manifold. The tangent space at the point x
is also shown.
The distance between two points on the manifold is given
in terms of the lengths of curves between them. The length
of any curve is defined by an integral over norms of tangents [2]. The curve with minimum length is known as the
geodesic and the length of the geodesic is the intrinsic distance. Parameter spaces occurring in computer vision problems usually have well studied geometries and closed form formulae for the intrinsic distance are available.
Tangents (on the tangent space) and geodesics (on the
manifold) are closely related. For each tangent ∆ ∈ Tx ,
there is a unique geodesic starting at x with initial velocity ∆. The exponential map, expx , maps ∆ to the point
on the manifold reached by this geodesic. The logarithm map is the inverse of the exponential map, log_x = exp_x^{−1}.
The exponential and logarithm operators vary as the point
x moves. These concepts are illustrated in Figure 1, where
x, y are points on the manifold and ∆ ∈ Tx . The dotted
line shows the geodesic starting at x and ending at y. This
geodesic has an initial velocity ∆ and consequently, y and
∆ satisfy expx (∆) = y and logx (y) = ∆. The specific
forms of these operators depend on the manifold and we
discuss explicit formulae for them in later sections.
The operator, expx is usually onto but not one-to-one.
For any y on the manifold, if there exist many ∆ ∈ Tx
satisfying expx (∆) = y, logx (y) is chosen as the tangent
with the smallest norm.
For a smooth, real valued function f defined on the manifold, the gradient of f at x, ∇f ∈ T_x, is defined to be the unique tangent vector satisfying

    g_x(∇f, ∆) = ∂_∆ f    (1)
for any ∆ ∈ Tx , where ∂∆ is the directional derivative
along ∆. This gradient has the property of representing the
tangent of maximum increase.
Note, we represent the points on manifolds by small bold
letters, e.g., x, y. In some of our examples, the manifold
consists of matrices and each point represents a matrix. Although matrices are conventionally represented by capital
bold letters, when we consider them to be points on a manifold, we denote them by small letters. This should not be a
problem, since any matrix can be represented as a vector by
rearranging its elements into a single column.
A point on the Grassmann manifold, G_{N,k}, represents a k-dimensional subspace of R^N and may be represented by an orthonormal basis as an N × k matrix, i.e., xᵀx = I_{k×k}. Since many bases span the same subspace, this representation of points on G_{N,k} is not unique [11].
For Riemannian manifolds, it is possible to define g_x and log_x such that

    d(x, y) = √( g_x(log_x(y), log_x(y)) ) = ‖log_x(y)‖_x    (2)

and such a metric is known as a Riemannian metric.
Matrix Lie groups are subgroups of GL(n, R), which is
the group of n × n nonsingular matrices. Matrix Lie groups
are examples of Riemannian manifolds [15].
3. Mean Shift Algorithm
Given n data points x_i, i = 1, . . . , n, lying in the Euclidean space R^d, the kernel density estimate

    f̂_k(x) = (c_{k,h}/n) Σ_{i=1}^n k( ‖x − x_i‖²/h² )    (3)

based on a profile function k satisfying

    k(z) > 0,  z ≥ 0    (4)

is a nonparametric estimator of the density at x. The constant c_{k,h} is chosen to ensure that f̂_k integrates to one. Define g(z) = −k′(z). Taking the gradient of (3), it can be shown that

    m_h(x) = C ∇f̂_k(x)/f̂_g(x) = [ Σ_{i=1}^n x_i g(‖x − x_i‖²/h²) ] / [ Σ_{i=1}^n g(‖x − x_i‖²/h²) ] − x    (5)

where C is a positive constant and m_h(x) is the mean shift vector. The expression (5) shows that the mean shift vector is proportional to the normalized density gradient estimate.
The iteration

    x^(j+1) = m_h(x^(j)) + x^(j)    (6)

is a gradient ascent technique converging to a stationary point of the density. Saddle points can be detected and removed, to obtain only the modes.
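For concreteness, the Euclidean iteration (5)-(6) can be sketched in a few lines of numpy. This is an illustrative implementation with a Gaussian profile, not the authors' code; the function name and convergence threshold are our choices.

```python
import numpy as np

def euclidean_mean_shift(points, h, eps=1e-6, max_iter=100):
    """Run the mean shift iteration (6) from every data point, using the
    Gaussian profile k(z) = exp(-z/2), for which g(z) = -k'(z) = k(z)/2."""
    modes = []
    for start in points:
        x = start.astype(float)
        for _ in range(max_iter):
            # weights g(||x - x_i||^2 / h^2); the constant 1/2 cancels in (5)
            w = np.exp(-np.sum((points - x) ** 2, axis=1) / (2 * h ** 2))
            m = w @ points / w.sum() - x   # mean shift vector (5)
            x = x + m                      # update (6)
            if np.linalg.norm(m) < eps:
                break
        modes.append(x)
    return np.array(modes)
```

Each run converges to a nearby stationary point of the density; distinct modes are then obtained by merging runs that converge to the same point.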
4. Noneuclidean Mean Shift
The weighted sum of points on the manifold is not well
defined, and the mean shift vector of (5) is not valid. In this
section, we will derive the mean shift vector as the weighted
sum of tangent vectors. Since tangent spaces are vector
spaces, a weighted average of tangents is possible and can
be used to update the mode estimate. This method is valid
over any analytic manifold.
Consider a manifold with a metric d. Given n points on the manifold, x_i, i = 1, . . . , n, the kernel density estimate with profile k and bandwidth h is

    f̂_k(x) = (c_{k,h}/n) Σ_{i=1}^n k( d²(x, x_i)/h² ).    (7)

The bandwidth h can be included in the distance function as a parameter. However, we write it in this form since it gives us a parameter to tune performance in applications. If the manifold is a Euclidean space with the Euclidean distance metric, (7) is the same as the Euclidean kernel density estimate of (3).
Estimating c_{k,h} is not always easy since it requires the integral of the profile over the manifold. Since a global scaling does not affect the position of the mode, we drop c_{k,h} from now on.
Calculating the gradient of f̂_k at x, we get

    ∇f̂_k(x) = (1/n) Σ_{i=1}^n ∇k( d²(x, x_i)/h² )
             = −(1/n) Σ_{i=1}^n g( d²(x, x_i)/h² ) ∇d²(x, x_i)/h²    (8)

where g(z) = −k′(z), as before. The gradient of the distance is taken with respect to x. Analogous to (5), which defined the mean shift vector for Euclidean spaces, define the noneuclidean mean shift vector as

    m_h(x) = − [ Σ_{i=1}^n ∇d²(x, x_i) g( d²(x, x_i)/h² ) ] / [ Σ_{i=1}^n g( d²(x, x_i)/h² ) ].    (9)

All the operations in the above equation are well defined. The gradient terms ∇d²(x, x_i) lie in the tangent space T_x, and the kernel terms g(d²(x, x_i)/h²) are scalars. The mean shift vector is a weighted sum of tangent vectors, and is itself a tangent vector in T_x. The algorithm proceeds by moving the point along the geodesic defined by the mean shift vector. The noneuclidean mean shift iteration is

    x^(j+1) = exp_{x^(j)}( m_h(x^(j)) ).    (10)

The iteration (10) updates x^(j) by moving along the geodesic defined by the mean shift vector, to get the next estimate, x^(j+1). The complete algorithm is shown below.

Algorithm: MEAN SHIFT OVER ANALYTIC MANIFOLDS
Given: points on a manifold x_i, i = 1, . . . , n
  for i ← 1 . . . n
    x ← x_i
    repeat
      m_h(x) ← − [ Σ_{i=1}^n ∇d²(x, x_i) g( d²(x, x_i)/h² ) ] / [ Σ_{i=1}^n g( d²(x, x_i)/h² ) ]
      x ← exp_x( m_h(x) )
    until ‖m_h(x)‖ < ε
    Retain x as a local mode
  Report distinct local modes.

A mean shift iteration is started at each data point by setting x = x_i. The inner loop then iteratively updates x till convergence.
This algorithm is valid for any manifold. A practical implementation requires the computation of the gradient vector ∇d²(x, x_i) and the exponential operator exp_x. We now discuss this computation for two commonly occurring classes of manifolds.

4.1. Grassmann Manifolds
As mentioned before, the Grassmann manifold, G_{N,k}, consists of N × k orthonormal matrices. Let f be a differentiable function defined on G_{N,k}. A closed form formula exists for the gradient of f, in terms of its Jacobian f_x [11]

    ∇f(x) = f_x − x xᵀ f_x = ( I_{N×N} − x xᵀ ) f_x.    (11)
As a metric, we use the arc length [1]

    d²(x, x_i) = k − tr( xᵀ x_i x_iᵀ x )    (12)

where k comes from the Grassmann manifold and tr is the trace. If we consider the distance function defined in (12) as a function of x, and use it to replace f in (11), we get

    ∇d²(x, x_i) = −( I_{N×N} − x xᵀ ) x_i x_iᵀ x.    (13)
Using (13) to substitute for ∇d²(x, x_i) in the noneuclidean mean shift equation (9), we obtain

    m_h(x) = [ Σ_{i=1}^n g( d²(x, x_i)/h² ) ( I_{N×N} − x xᵀ ) x_i x_iᵀ x ] / [ Σ_{i=1}^n g( d²(x, x_i)/h² ) ].    (14)
A tangent vector ∆ is represented as an N × k matrix and the exponential operator for Grassmann manifolds is [11]

    exp_x(∆) = x v diag(cos λ) vᵀ + u diag(sin λ) vᵀ    (15)

where u diag(λ) vᵀ is the compact SVD of ∆, which retains the k largest singular values and the corresponding singular vectors. The sin and cos act element-by-element on the singular values.
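Equations (12)-(15) contain everything needed to run mean shift on a Grassmann manifold. The following numpy sketch (our illustration, with a Gaussian profile and hypothetical function names, not the authors' implementation) performs one mean shift run started from a given point:

```python
import numpy as np

def grass_dist2(x, xi):
    # squared arc length (12): k - tr(x^T xi xi^T x)
    return x.shape[1] - np.trace(x.T @ xi @ xi.T @ x)

def grass_grad(x, xi):
    # gradient of the squared distance (13): -(I - x x^T) xi xi^T x
    return -(np.eye(x.shape[0]) - x @ x.T) @ xi @ xi.T @ x

def grass_exp(x, delta):
    # exponential map (15), via the compact SVD of the tangent delta
    u, lam, vt = np.linalg.svd(delta, full_matrices=False)
    return x @ vt.T @ np.diag(np.cos(lam)) @ vt + u @ np.diag(np.sin(lam)) @ vt

def grass_mean_shift(x, points, h, eps=1e-8, max_iter=100):
    # one run of the iteration (10) with the mean shift vector (14)
    for _ in range(max_iter):
        num, den = np.zeros_like(x), 0.0
        for xi in points:
            w = np.exp(-grass_dist2(x, xi) / (2 * h ** 2))  # Gaussian profile
            num = num - w * grass_grad(x, xi)
            den += w
        m = num / den
        x = grass_exp(x, m)
        if np.linalg.norm(m) < eps:
            break
    return x
```

Since xᵀ grass_grad(x, xi) = 0, the mean shift vector lies in the horizontal tangent space and the exponential update preserves orthonormality of the basis.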
4.2. Matrix Lie Groups
Matrix Lie groups [15] occur frequently as parameter
spaces in computer vision. The general nonlinear mean
shift algorithm does not explicitly involve logx , but for matrix Lie groups we approximate the gradient ∇d2 (x, xi ) in
terms of log_x. Let exp and log be the matrix operators

    exp(∆) = Σ_{i=0}^∞ (1/i!) ∆^i    (16)
    log(y) = Σ_{i=1}^∞ ( (−1)^{i−1}/i ) ( y − e )^i.    (17)

These are standard matrix operators which can be applied to any square matrix and no subscript is necessary to define them. They should not be confused with the manifold operators, exp_x and log_x, which are given by

    exp_x(∆) = x exp( x^{−1} ∆ )    (18)
    log_x(y) = x log( x^{−1} y )    (19)
where y is any point on the manifold and ∆ ∈ T_x. The distance function is given by

    d(x, y) = ‖ log( x^{−1} y ) ‖_F    (20)

where ‖·‖_F denotes the Frobenius norm of a matrix. As matrix Lie groups are Riemannian, this definition of d can be shown to be the distance corresponding to an inner product on T_x.
A formula for the gradient ∇d²(x, x_i) is rather complicated since we have to account for the variation of log_x as x changes. However, by ignoring this dependence, we get

    ∇d²(x, x_i) = −log_x(x_i)    (21)

which can be shown to be a first order approximation to the true gradient [17]. This derivation involves a nontrivial amount of algebra which we skip due to lack of space. Substituting this into (9), the mean shift vector for matrix Lie groups is
    m_h(x) = [ Σ_{i=1}^n g( d²(x, x_i)/h² ) log_x(x_i) ] / [ Σ_{i=1}^n g( d²(x, x_i)/h² ) ].    (22)
Since the mean shift vector and the expx operator (18) are
known, the algorithm for matrix Lie groups is the same as
that proposed in [17].
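With scipy providing the matrix operators (16) and (17), the Lie group case (18)-(22) can be sketched as follows. This is our illustration, assuming scipy.linalg.expm and logm as the matrix exponential and logarithm; for group elements close enough to the identity the matrix logarithm is real.

```python
import numpy as np
from scipy.linalg import expm, logm  # matrix operators (16) and (17)

def lie_log_x(x, y):
    # manifold logarithm (19): x log(x^{-1} y)
    return x @ logm(np.linalg.solve(x, y))

def lie_exp_x(x, delta):
    # manifold exponential (18): x exp(x^{-1} delta)
    return x @ expm(np.linalg.solve(x, delta))

def lie_dist2(x, y):
    # squared intrinsic distance, from (20)
    return np.linalg.norm(logm(np.linalg.solve(x, y)), 'fro') ** 2

def lie_mean_shift(x, points, h, eps=1e-8, max_iter=100):
    # iteration (10) with the mean shift vector (22), Gaussian profile
    for _ in range(max_iter):
        num, den = np.zeros_like(x), 0.0
        for xi in points:
            w = np.exp(-lie_dist2(x, xi) / (2 * h ** 2))
            num = num + w * lie_log_x(x, xi)
            den += w
        m = num / den
        x = lie_exp_x(x, m)
        if np.linalg.norm(m) < eps:
            break
    return x
```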
5. Experimental Results
As an application of our algorithm we use it for robust
motion segmentation and multibody factorization. Given
a data set with multiple motions, the method discussed
here simultaneously segments and estimates all the motions
present and it requires no specification of the number of motions.
To the best of our knowledge, no previous algorithm has
tackled such a general problem. The class of RANSAC algorithms is robust but tries to estimate a single motion at a time, treating all other points as outliers. Alternatively,
algorithms such as GPCA [18] estimate multiple motions
simultaneously, but are not robust since they assume each
point is an inlier for one of the motions. Both families of
algorithms require that some knowledge of the number of
motions be given by the user.
We do not present any comparisons with other algorithms for two reasons. Firstly, motion segmentation is just
one application of our novel mean shift algorithm. Our algorithm can also be used for other applications like robust
fusion over manifolds. Secondly, we are not aware of any
previous algorithm which is robust and simultaneously estimates all motions with no knowledge of the number of
motions present.
5.1. Multiple Motion Estimation
Our algorithm is similar to the method of [17] with the
only difference being that the motion parameters we cluster
are allowed to lie on any analytic manifold.
The input to the algorithm consists of a set of point
matches some of which are outliers. The algorithm has two
stages. In the first stage, the matches are randomly sampled
to generate elemental subsets. An elemental subset consists of the minimum number of points required to specify
a motion hypothesis. Depending on the problem, various methods can be used to generate motion hypotheses from the elemental subsets. A number of elemental subsets are chosen and each one is used to generate a motion hypothesis.
The sampling and hypothesis generation can be improved
by a validation step which reduces computation in the second stage [17].
In the second stage, the parameter estimates are clustered using the algorithm proposed here. The number of
dominant modes gives the number of motions in the data.
These modes correspond to the motion parameters. For
each mode, we use the method of [16] to find the corresponding inliers. Since the inliers for each motion are decided independently, it is possible for a point to be assigned
to two motions. In such a case the tie is broken by assigning
it to the motion which gives a lower error.
In all our experiments, the points were detected using a
Harris corner detector and were matched across views using
the method of [14]. For mean shift we used a Gaussian kernel and the bandwidth was chosen by a nearest neighbour
procedure.
5.2. Error Measures
We now discuss the error measures used to test the performance of our system on real data. The number of modes
should be equal to the number of motions present. However,
Mode | mot.hyp. | kde
M1   | 107      | 0.0388
M2   | 173      | 0.0338
M3   | 102      | 0.0239
M4   | 66       | 0.0199

     | M1 | M2 | M3 | Out | res     | ˆres
M1   | 26 |  0 |  0 |  1  | 0.00112 | 0.00080
M2   |  0 | 26 |  0 |  0  | 0.00006 | 0.00006
M3   |  0 |  0 | 26 |  0  | 0.00179 | 0.00136
Out  |  0 |  0 |  0 | 23  |         |
Figure 2. Mean shift over G3,1 . 3D Translational Motion. In the left figure all the points are shown while on the right the inliers returned
by the system. The table on the left contains the properties of the first four modes. Only the first three modes correspond to motions. The
table on the right compares the results with the manual segmentations.
mean shift returns all local maxima of the kernel density estimate. Therefore, for a data set with m motions the first m
modes should clearly dominate the (m + 1)th mode, so that
these extraneous modes can be pruned.
We compare the segmentation returned by the algorithm
with a manual segmentation of the motions. Fewer misclassifications imply better performance.
Let M_E be an estimated motion and p_i be the vector containing the point coordinates across all frames. For an inlier and correct motion, a relation of the form M_E(p_i) = 0 should hold. This is violated in practice due to the noise affecting p_i. We measure the residual squared error as

    res = (1/n) Σ_{i=1}^n | M_E(p_i) |².    (23)
Lower errors imply better performance. In the best case, for each motion estimate, we expect this error to go as low as

    ˆres = (1/n) Σ_{i=1}^n | M̂(p_i) |²    (24)
where, M̂ is the least squares estimate from the inliers,
which by definition, minimizes the squared error.
5.3. 3D Translational Motion
Matched points across two views satisfy the epipolar
constraint. If the camera is calibrated, the image coordi-
nates of the point can be converted to the normalized coordinates in 3D, and the epipolar constraint becomes the
essential constraint. If the point has undergone only translation with respect to the camera, the essential constraint is
of the form [18]

    tᵀ ( x₁ × x₂ ) = 0    (25)
where, x1 and x2 are the normalized coordinates in the two
frames and t is the direction of translation in 3D. Since the
constraint is homogeneous, the translation can only be estimated up to scale and t represents a line in R³ [18]. A line is a one-dimensional subspace of R³, so the translation is parameterized by the Grassmann manifold G_{3,1}, i.e., N = 3
and k = 1. An elemental subset for a motion consists of
two point matches and the hypotheses can be generated by
a cross product. The mean shift vector is given by (14).
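A minimal numpy sketch of this hypothesis generation (our illustration; the function name is ours) takes two matches in normalized coordinates and returns a unit translation direction, i.e., a point on G_{3,1}:

```python
import numpy as np

def translation_hypothesis(match1, match2):
    """Each match is a pair (x1, x2) of normalized 3D image coordinates.
    By (25), t is orthogonal to n_i = x1_i x x2_i for both matches, so
    the hypothesis is the cross product of the two normals."""
    n1 = np.cross(match1[0], match1[1])
    n2 = np.cross(match2[0], match2[1])
    t = np.cross(n1, n2)
    return (t / np.linalg.norm(t)).reshape(3, 1)  # unit vector, up to sign
```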
The motion segmentation on a real data set with three
translations is shown in Figure 2. A total of 102 corners
were detected on the objects in the first frame. Points on
the background were identified as having zero displacement
and removed. On matching these points we obtain 27, 26
and 26 inliers for the motions and 23 outliers. These outliers
were due to mismatches by the point matcher. We generated
500 motion hypotheses and clustered on the manifold G3,1 .
The results are tabulated in Figure 2. In the table on the
left the number of hypotheses which converge to each mode
and the kernel density at the mode are shown. Since the data
Mode | mot.hyp. | kde
M1   | 209      | 0.1315
M2   | 695      | 0.0830
M3   | 52       | 0.0165
M4   | 12       | 0.0024

     | M1 | M2 | M3 | Out | res    | ˆres
M1   | 32 |  0 |  0 |  0  | 7.0376 | 5.0193
M2   |  0 | 21 |  0 |  0  | 2.8520 | 0.7627
M3   |  0 |  0 | 29 |  0  | 4.2007 | 3.1648
Out  |  1 |  0 |  0 | 57  |        |
Figure 3. Mean shift over G10,3 . Multibody factorization. The left figure shows the first frame with all the points which are tracked. The
middle and right images show the second and fifth frames with only the inliers. The table on the left contains the properties of the first four
modes. Only the first three modes correspond to motions. The table on the right compares the results with the manual segmentations.
set has three motions, there are three dominant modes with
the fourth mode having fewer points. The segmentation results and motion estimates are shown on the right. Each
row represents a motion and the row labeled Out represents
outliers. The first four columns show the classification results. For example, the first row indicates that of the 27 inliers for the first motion, 26 are correctly classified and one is misclassified as an outlier. Values along the diagonal are correctly classified, while off-diagonal values are misclassifications. The last two columns show the residual errors for our estimates, res, and for the least squares estimates, ˆres.
Our algorithm’s performance, with no knowledge of the seg-
mentation, is comparable to the manually segmented least
squares estimates.
5.4. Multibody Factorization
The positions of points tracked over F frames of an uncalibrated affine camera define a feature vector in R2F . For
points sharing the same motion, these vectors lie in a four
dimensional subspace of R2F , and for planar scenes this
subspace is only three dimensional [19]. In a scene with
multiple independent motions, each motion defines a different subspace, which can be represented by a point in the
Grassmann manifold G_{2F,3}, i.e., N = 2F and k = 3. An
Mode | mot.hyp. | kde
M1   | 61       | 0.0550
M2   | 210      | 0.0547
M3   | 82       | 0.0468
M4   | 11       | 0.0155

     | M1 | M2 | M3 | Out | res    | ˆres
M1   | 16 |  0 |  0 |  0  | 2.7508 | 2.7143
M2   |  0 | 15 |  0 |  3  | 4.9426 | 4.6717
M3   |  0 |  0 | 15 |  1  | 3.2849 | 3.0860
Out  |  0 |  0 |  0 | 30  |        |
Figure 4. Mean shift over A(2). Affine motion segmentation. In the left figure all the points are shown, and on the right only the inliers are
shown. The table on the left contains the properties of the first four modes. Only the first three modes correspond to motions. The table on
the right compares the results with the manual segmentations.
elemental subset consists of the feature vectors defined by 3
points tracked across F frames. The basis can be obtained
through an SVD.
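As an illustration (the function name is ours), the basis hypothesized from an elemental subset can be obtained with numpy:

```python
import numpy as np

def subspace_hypothesis(tracks):
    """tracks: a (2F, 3) matrix whose columns are the feature vectors of
    the 3 tracked points. Returns an orthonormal basis of their span,
    i.e., a point on the Grassmann manifold G_{2F,3}."""
    u, s, vt = np.linalg.svd(tracks, full_matrices=False)
    return u[:, :3]
```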
The results of multibody factorization with three motions are shown in Figure 3. The system detected 140 corners in
the first frame. Points on the background were identified
as having zero displacement and removed. The rest of the
corners were tracked across 5 frames, therefore, F = 5 and
N = 10. The planar assumption holds due to negligible
depth variation, and each motion defines a 3-dimensional
subspace of R10 . The three motions contain 32, 21 and 29
points with 58 outliers. We generated 1000 hypotheses from
these matches and clustered them on the manifold G10,3 .
The results are organized like in the previous subsection.
The kernel density at the fourth mode is an order of magnitude below the density at the third mode. The classification
results are nearly perfect. One outlier is misclassified as an
inlier.
5.5. Affine Motion
The previous examples involved mean shift over Grassmann manifolds. We now present an example of a parameter space which is a Lie group.
An affine image transformation is given by

    M = [ A   b
          0ᵀ  1 ]    (26)
where, A is a nondegenerate 2 × 2 matrix and b ∈ R2 .
The set of all affine transformations, A(2), forms a matrix
Lie group. An affine transformation has 6 parameters and
each point match gives 2 constraints. Each elemental subset, therefore, consists of 3 point matches. The motion hypotheses are generated using least squares. As A(2) is a
matrix Lie group, the mean shift vector is given by (22).
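A least squares hypothesis generator for this case can be sketched as follows (our illustration; each match contributes two linear constraints on the six affine parameters):

```python
import numpy as np

def affine_hypothesis(src, dst):
    """Estimate the affine transform (26) from an elemental subset of
    3 point matches. src, dst: (3, 2) arrays of matched coordinates."""
    A = np.zeros((6, 6))
    b = np.zeros(6)
    for i, ((x, y), (xp, yp)) in enumerate(zip(src, dst)):
        A[2 * i]     = [x, y, 1, 0, 0, 0]   # constraint on x'
        A[2 * i + 1] = [0, 0, 0, x, y, 1]   # constraint on y'
        b[2 * i], b[2 * i + 1] = xp, yp
    p = np.linalg.lstsq(A, b, rcond=None)[0]
    # homogeneous 3x3 form of (26), a point in A(2)
    return np.array([[p[0], p[1], p[2]],
                     [p[3], p[4], p[5]],
                     [0.0,  0.0,  1.0]])
```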
We used a data set of 80 corners matched across two
frames with 3 independently moving bodies. Points on the
background were identified as having zero displacement
and removed. Some of the points on the background are
occluded in the second image and are consequently mismatched. These points do not have zero displacements and
survive as outliers in the data set. The motions contain 16,
18 and 16 inliers with 30 outliers. For clustering, 500 motion hypotheses were generated. The results of the experiment are shown in Figure 4. The images and the tables
display a similar analysis as in the previous figures.
Other common examples of Lie groups are the special
Euclidean groups, SE(3) and SE(2), which correspond
to rigid body transformations in 3D and 2D. Examples of
mean shift over these spaces can be found in [17].
6. Conclusions
We extended mean shift from standard Euclidean spaces
to a general class of nonlinear spaces, the set of analytic
manifolds. Both the standard mean shift and the extension of mean shift to matrix Lie groups [17] are special cases
of our algorithm. The new algorithm can be applied to any
analytic manifold whose geometry is understood such that
function gradients can be computed and the exp operator is
known.
As an application of our algorithm, we applied it to motion segmentation problems. Our method offers the advantages that it is robust, does not require a prior specification
of the number of structures present and simultaneously estimates and segments all the motions present.
This method of clustering is not restricted to motion segmentation and can be applied to a number of other problems. For example, Grassmann manifolds occur frequently
in linear regression and PCA. Regression in RN is equivalent to finding a one-dimensional subspace. For regression
problems with multiple structures and outliers, we can use
the above algorithm to find all the structures, by clustering
over the manifold GN,1 . A more general version of regression is PCA, which tries to estimate a k dimensional subspace of RN . In problems where multiple subspaces are
present, our algorithm can be used to cluster hypotheses
over GN,k , and simultaneously estimate all the subspaces.
Range image segmentation involves finding planes or
second order surfaces from 3D point clouds. The parameters describing these geometrical objects lie on manifolds
and a similar method can be used for segmenting them.
Previously, Euclidean mean shift was also used for the
robust fusion [5, 8]. The nonlinear mean shift algorithm
proposed here can be applied in a similar manner for robust
fusion of points lying on analytic manifolds.
References
[1] P.-A. Absil, R. Mahony, and R. Sepulchre, “Riemannian geometry of Grassmann manifolds with a view on algorithmic computation,” Acta Applicandae Mathematicae, vol. 80,
no. 2, pp. 199–220, 2003.
[2] E. Begelfor and M. Werman, “How to put probabilities on
homographies,” IEEE Trans. Pattern Anal. Machine Intell.,
vol. 27, no. 10, pp. 1666–1670, 2005.
[3] S. Birchfield and S. Rangarajan, “Spatiograms vs histograms
for region-based tracking,” in Proc. IEEE Conf. on Computer
Vision and Pattern Recognition, San Diego, CA, vol. II, June
2005, pp. 1158–1163.
[4] W. M. Boothby, An Introduction to Differentiable Manifolds
and Riemannian Geometry. Academic Press, 2002.
[5] H. Chen and P. Meer, “Robust fusion of uncertain information,” IEEE Trans. Systems, Man, Cybernetics-Part B,
vol. 35, pp. 578–586, 2005.
[6] Y. Cheng, “Mean shift, mode seeking, and clustering,” IEEE
Trans. Pattern Anal. Machine Intell., vol. 17, pp. 790–799,
1995.
[7] R. Collins, “Mean shift blob tracking through scale space,”
in Proc. IEEE Conf. on Computer Vision and Pattern Recognition, Madison, Wisconsin, vol. II, 2003, pp. 234–240.
[8] D. Comaniciu, “Variable bandwidth density-based fusion,”
in Proc. IEEE Conf. on Computer Vision and Pattern Recognition, Madison, Wisconsin, vol. I, 2003, pp. 59–66.
[9] D. Comaniciu and P. Meer, “Mean shift: A robust approach
toward feature space analysis,” IEEE Trans. Pattern Anal.
Machine Intell., vol. 24, pp. 603–619, May 2002.
[10] D. Comaniciu, V. Ramesh, and P. Meer, “Kernel-based object tracking,” IEEE Trans. Pattern Anal. Machine Intell.,
vol. 25, pp. 564–577, 2003.
[11] A. Edelman, T. A. Arias, and S. T. Smith, “The geometry of
algorithms with orthogonality constraints,” SIAM Journal on
Matrix Analysis and Applications, vol. 20, no. 2, pp. 303–
353, 1998.
[12] A. Elgammal, R. Duraiswami, and L. S. Davis, “Efficient kernel density estimation using the fast Gauss transform with applications to color modeling and tracking,” IEEE
Trans. Pattern Anal. Machine Intell., vol. 25, no. 11, pp.
1499–1504, 2003.
[13] M. Fashing and C. Tomasi, “Mean shift is a bound optimization,” IEEE Trans. Pattern Anal. Machine Intell., vol. 25,
no. 3, pp. 471–474, 2005.
[14] B. Georgescu and P. Meer, “Point matching under large image deformations and illumination changes,” IEEE Trans.
Pattern Anal. Machine Intell., vol. 26, no. 6, pp. 674–689,
2004.
[15] W. Rossmann, Lie Groups: An Introduction through Linear
Groups. Oxford University Press, 2003.
[16] R. Subbarao and P. Meer, “Heteroscedastic projection based
M-estimators,” in Workshop on Empirical Evaluation Methods in Computer Vision, San Diego, CA, June 2005.
[17] O. Tuzel, R. Subbarao, and P. Meer, “Simultaneous multiple 3D motion estimation via mode finding on Lie groups,”
in Proc. 8th Intl. Conf. on Computer Vision, Beijing, China,
vol. 1, 2005, pp. 18–25.
[18] R. Vidal and Y. Ma, “A unified algebraic approach to 2-D
and 3-D motion segmentation,” in 8th European Conference
on Computer Vision, vol. I, 2004, pp. 1–15.
[19] Y. Sugaya and K. Kanatani, “Geometric structure of degeneracy for multi-body motion segmentation,” in The 2nd Workshop on Statistical Methods in Video Processing (SMVP 2004), no. 3247 in LNCS, pp. 13–25, December 2004.
[20] L. Zelnik-Manor and M. Irani, “Degeneracies, dependencies
and their implications in multi-body and multi-sequence factorizations,” in Proc. IEEE Conf. on Computer Vision and
Pattern Recognition, Madison, Wisconsin, vol. II, 2003, pp.
287–293.
[21] Z. Zivkovic and B. Krose, “An EM-like algorithm for colorhistogram-based object tracking,” in Proc. IEEE Conf. on
Computer Vision and Pattern Recognition, Washington, DC,
vol. I, June 2004, pp. 798–803.