Nonlinear Mean Shift for Clustering over Analytic Manifolds

Raghav Subbarao and Peter Meer
Department of Electrical and Computer Engineering
Rutgers University, Piscataway NJ 08854, USA
rsubbara,meer@caip.rutgers.edu

Abstract

The mean shift algorithm is widely applied for nonparametric clustering in Euclidean spaces. Recently, mean shift was generalized for clustering on matrix Lie groups. We further extend the algorithm to a more general class of nonlinear spaces, the set of analytic manifolds. As examples, two specific classes of frequently occurring parameter spaces, Grassmann manifolds and Lie groups, are considered. When the algorithm proposed here is restricted to matrix Lie groups, the previously proposed method is obtained. The algorithm is applied to a variety of robust motion segmentation problems and to multibody factorization. The motion segmentation method is robust to outliers, does not require any prior specification of the number of independent motions and simultaneously estimates all the motions present.

1. Introduction

The mean shift algorithm [6, 9] is a popular nonparametric clustering method. It has been applied to problems such as image segmentation [9], tracking [3, 7, 10, 12, 21] and robust fusion [5, 8]. The theoretical properties of mean shift have also been studied, e.g., [13]. A major limitation of the original mean shift is that it can be applied only to points in Euclidean spaces. In [17] mean shift was extended to a particular class of nonlinear spaces, matrix Lie groups. The resulting algorithm was used to develop a robust motion segmentation method which simultaneously estimated the number of motions present and their parameters.

Even restricting ourselves to motion segmentation problems, a number of motion models exist for which the parameter spaces are not Lie groups. Examples of such motions include 3D translations viewed from calibrated cameras [18] and multibody factorization [20]. A more general class of nonlinear spaces is the set of analytic manifolds. Lie groups are examples of analytic manifolds, but not all analytic manifolds are Lie groups. In both motion models mentioned above, the parameter spaces are examples of Grassmann manifolds, but are not Lie groups. For such problems, the algorithm of [17] is not applicable. We propose a more general mean shift algorithm which applies to any analytic manifold. If the manifold under consideration is a matrix Lie group, our algorithm is the same as that of [17]; therefore, [17] is a special case of the algorithm proposed here.

The paper is organized as follows. In Section 2 we briefly discuss the relevant theory of analytic manifolds. Section 3 reviews the standard mean shift of [9] and in Section 4 we derive the general noneuclidean mean shift algorithm. We also present the computational details for two frequently occurring classes of parameter spaces, Grassmann manifolds and matrix Lie groups. In Section 5 we apply this algorithm to several motion segmentation problems.

2. Analytic Manifolds

A manifold is a topological space that is locally similar (homeomorphic) to a Euclidean space. Intuitively, we can think of a manifold as a continuous surface lying in a higher dimensional Euclidean space. Analytic manifolds satisfy some further smoothness conditions [4]. From now on, we restrict ourselves to analytic manifolds, and by manifold we mean analytic manifold.

The tangent space $T_x$ at $x$ is the plane tangent to the surface of the manifold at that point. The tangent space can be thought of as the set of allowable velocities for a point constrained to move on the manifold. For $d$-dimensional manifolds, the tangent space is a $d$-dimensional vector space. An example of a two-dimensional manifold embedded in $\mathbb{R}^3$, with the tangent space $T_x$, is shown in Figure 1. The solid arrow $\Delta$ is a tangent at $x$. As $T_x$ is a vector space, we can define an inner product $g_x$. The inner product induces a norm for tangents $\Delta \in T_x$ as $\|\Delta\|_x^2 = g_x(\Delta, \Delta)$.
It should be noted that the inner product and norm vary with $x$; this dependence is indicated by the subscripts.

Figure 1. Example of a manifold. The tangent space at the point $x$ is also shown.

The distance between two points on the manifold is given in terms of the lengths of curves between them. The length of any curve is defined by an integral over norms of tangents [2]. The curve with minimum length is known as the geodesic, and the length of the geodesic is the intrinsic distance. Parameter spaces occurring in computer vision problems usually have well-studied geometries, and closed form formulae for the intrinsic distance are available.

Tangents (on the tangent space) and geodesics (on the manifold) are closely related. For each tangent $\Delta \in T_x$, there is a unique geodesic starting at $x$ with initial velocity $\Delta$. The exponential map, $\exp_x$, maps $\Delta$ to the point on the manifold reached by this geodesic. The logarithm map is the inverse of the exponential map, $\log_x = \exp_x^{-1}$. The exponential and logarithm operators vary as the point $x$ moves. These concepts are illustrated in Figure 1, where $x, y$ are points on the manifold and $\Delta \in T_x$. The dotted line shows the geodesic starting at $x$ and ending at $y$. This geodesic has initial velocity $\Delta$ and consequently, $y$ and $\Delta$ satisfy $\exp_x(\Delta) = y$ and $\log_x(y) = \Delta$. The specific forms of these operators depend on the manifold, and we discuss explicit formulae for them in later sections. The operator $\exp_x$ is usually onto but not one-to-one. For any $y$ on the manifold, if there exist many $\Delta \in T_x$ satisfying $\exp_x(\Delta) = y$, then $\log_x(y)$ is chosen as the tangent with the smallest norm.

For a smooth, real valued function $f$ defined on the manifold, the gradient of $f$ at $x$, $\nabla f \in T_x$, is defined to be the unique tangent vector satisfying

$$g_x(\nabla f, \Delta) = \partial_\Delta f \qquad (1)$$

for any $\Delta \in T_x$, where $\partial_\Delta$ is the directional derivative along $\Delta$. The gradient represents the tangent direction of maximum increase.
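These operators are easiest to see on a familiar manifold. As an illustrative sketch (our addition, not from the paper), the exponential and logarithm maps on the unit sphere $S^2 \subset \mathbb{R}^3$ have simple closed forms:

```python
import numpy as np

# Closed-form exp/log maps on the unit sphere, a simple analytic manifold.
# The tangent space at x is the plane orthogonal to x.

def exp_x(x, delta):
    """Map a tangent vector delta at x to the point reached by its geodesic."""
    norm = np.linalg.norm(delta)
    if norm < 1e-12:
        return x
    return np.cos(norm) * x + np.sin(norm) * delta / norm

def log_x(x, y):
    """Inverse of exp_x: the smallest tangent at x whose geodesic reaches y."""
    proj = y - np.dot(x, y) * x                          # component of y orthogonal to x
    norm = np.linalg.norm(proj)
    theta = np.arccos(np.clip(np.dot(x, y), -1.0, 1.0))  # geodesic (arc) distance
    if norm < 1e-12:
        return np.zeros_like(x)
    return theta * proj / norm

x = np.array([0.0, 0.0, 1.0])          # north pole
y = np.array([1.0, 0.0, 0.0])          # point on the equator
delta = log_x(x, y)                    # tangent of norm pi/2
assert np.allclose(exp_x(x, delta), y) # exp_x(log_x(y)) = y
```

The geodesic distance between $x$ and $y$ is exactly the norm of $\log_x(y)$, illustrating the Riemannian metric of (2) below.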
Note that we represent points on manifolds by small bold letters, e.g., $x, y$. In some of our examples, the manifold consists of matrices and each point represents a matrix. Although matrices are conventionally represented by capital bold letters, when we consider them to be points on a manifold, we denote them by small letters. This should not cause confusion, since any matrix can be represented as a vector by rearranging its elements into a single column.

A point on the Grassmann manifold $G_{N,k}$ represents a $k$-dimensional subspace of $\mathbb{R}^N$ and may be represented by an orthonormal basis as an $N \times k$ matrix, i.e., $x^T x = I_{k \times k}$. Since many bases span the same subspace, this representation of points on $G_{N,k}$ is not unique [11].

For Riemannian manifolds, it is possible to define $g_x$ and $\log_x$ such that

$$d(x, y) = \sqrt{g_x(\log_x(y), \log_x(y))} = \|\log_x(y)\|_x \qquad (2)$$

and such a metric is known as a Riemannian metric. Matrix Lie groups are subgroups of $GL(n, \mathbb{R})$, the group of $n \times n$ nonsingular matrices. Matrix Lie groups are examples of Riemannian manifolds [15].

3. Mean Shift Algorithm

Given $n$ data points $x_i$, $i = 1, \ldots, n$ lying in the Euclidean space $\mathbb{R}^d$, the kernel density estimate

$$\hat f_k(x) = \frac{c_{k,h}}{n} \sum_{i=1}^{n} k\!\left(\frac{\|x - x_i\|^2}{h^2}\right) \qquad (3)$$

based on a profile function $k$ satisfying

$$k(z) > 0, \qquad z \geq 0 \qquad (4)$$

is a nonparametric estimator of the density at $x$. The constant $c_{k,h}$ is chosen to ensure that $\hat f_k$ integrates to one. Define $g(z) = -k'(z)$. Taking the gradient of (3), it can be shown that

$$m_h(x) = \frac{\sum_{i=1}^{n} x_i \, g\!\left(\frac{\|x - x_i\|^2}{h^2}\right)}{\sum_{i=1}^{n} g\!\left(\frac{\|x - x_i\|^2}{h^2}\right)} - x = C \, \frac{\nabla \hat f_k(x)}{\hat f_g(x)} \qquad (5)$$

where $C$ is a positive constant and $m_h(x)$ is the mean shift vector. The expression (5) shows that the mean shift vector is proportional to the normalized density gradient estimate. The iteration

$$x^{(j+1)} = m_h(x^{(j)}) + x^{(j)} \qquad (6)$$

is a gradient ascent technique converging to a stationary point of the density. Saddle points can be detected and removed, to retain only the modes.
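For reference, the Euclidean iteration (5)-(6) takes only a few lines. The sketch below is our own minimal illustration, assuming a Gaussian profile $k(z) = e^{-z/2}$, for which $g(z) = -k'(z) \propto k(z)$ and the constants cancel in (5):

```python
import numpy as np

# Standard Euclidean mean shift, eqs. (5)-(6), with a Gaussian profile.
def mean_shift_mode(x, points, h, tol=1e-6, max_iter=200):
    for _ in range(max_iter):
        # kernel weights g(||x - x_i||^2 / h^2), Gaussian profile
        w = np.exp(-np.sum((points - x) ** 2, axis=1) / (2.0 * h * h))
        m = w @ points / w.sum() - x      # mean shift vector, eq. (5)
        x = x + m                         # iteration, eq. (6)
        if np.linalg.norm(m) < tol:       # converged to a stationary point
            break
    return x

rng = np.random.default_rng(0)
pts = np.vstack([rng.normal(0.0, 0.1, (50, 2)),
                 rng.normal(3.0, 0.1, (50, 2))])
mode = mean_shift_mode(pts[0], pts, h=0.5)
# an iteration started near the first cluster converges to that cluster's center
```

Starting one such iteration from every data point and collecting the distinct convergence points yields the modes of the density.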
4. Noneuclidean Mean Shift

The weighted sum of points on a manifold is not well defined, so the mean shift vector of (5) is not valid. In this section we derive the mean shift vector as a weighted sum of tangent vectors. Since tangent spaces are vector spaces, a weighted average of tangents is possible and can be used to update the mode estimate. This method is valid over any analytic manifold.

Consider a manifold with a metric $d$. Given $n$ points on the manifold, $x_i$, $i = 1, \ldots, n$, the kernel density estimate with profile $k$ and bandwidth $h$ is

$$\hat f_k(x) = \frac{c_{k,h}}{n} \sum_{i=1}^{n} k\!\left(\frac{d^2(x, x_i)}{h^2}\right). \qquad (7)$$

The bandwidth $h$ could be included in the distance function as a parameter. However, we write it in this form since it gives us a parameter to tune performance in applications. If the manifold is a Euclidean space with the Euclidean distance metric, (7) is the same as the Euclidean kernel density estimate of (3). Estimating $c_{k,h}$ is not always easy since it requires integrating the profile over the manifold. Since a global scaling does not affect the positions of the modes, we drop $c_{k,h}$ from now on.

Calculating the gradient of $\hat f_k$ at $x$, we get

$$\nabla \hat f_k(x) = \frac{1}{n} \sum_{i=1}^{n} \nabla k\!\left(\frac{d^2(x, x_i)}{h^2}\right) = -\frac{1}{n h^2} \sum_{i=1}^{n} g\!\left(\frac{d^2(x, x_i)}{h^2}\right) \nabla d^2(x, x_i) \qquad (8)$$

where $g(z) = -k'(z)$, as before.

Algorithm: MEAN SHIFT OVER ANALYTIC MANIFOLDS

  Given: points on a manifold $x_i$, $i = 1, \ldots, n$
  for $i \leftarrow 1 \ldots n$
    $x \leftarrow x_i$
    repeat
      compute the mean shift vector $m_h(x)$ as in (9)
      $x \leftarrow \exp_x(m_h(x))$
    until $\|m_h(x)\| < \epsilon$
    retain $x$ as a local mode
  Report the distinct local modes.

A mean shift iteration is started at each data point by setting $x = x_i$. The inner loop then iteratively updates $x$ until convergence. This algorithm is valid for any manifold. A practical implementation requires the computation of the gradient $\nabla d^2(x, x_i)$ and the exponential operator $\exp_x$. We now discuss this computation for two commonly occurring classes of manifolds.

4.1. Grassmann Manifolds
The gradient of the distance is taken with respect to $x$. Analogous to (5), which defined the mean shift vector for Euclidean spaces, define the noneuclidean mean shift vector as

$$m_h(x) = \frac{-\sum_{i=1}^{n} \nabla d^2(x, x_i) \, g\!\left(\frac{d^2(x, x_i)}{h^2}\right)}{\sum_{i=1}^{n} g\!\left(\frac{d^2(x, x_i)}{h^2}\right)}. \qquad (9)$$

All the operations in this equation are well defined. The gradient terms $\nabla d^2(x, x_i)$ lie in the tangent space $T_x$, and the kernel terms $g(d^2(x, x_i)/h^2)$ are scalars. The mean shift vector is therefore a weighted sum of tangent vectors, and is itself a tangent vector in $T_x$. The algorithm proceeds by moving the point along the geodesic defined by the mean shift vector. The noneuclidean mean shift iteration is

$$x^{(j+1)} = \exp_{x^{(j)}}\!\left(m_h(x^{(j)})\right). \qquad (10)$$

The iteration (10) updates $x^{(j)}$ by moving along the geodesic defined by the mean shift vector, to get the next estimate $x^{(j+1)}$. The complete algorithm was summarized in the box above.

As mentioned before, the Grassmann manifold $G_{N,k}$ consists of $N \times k$ orthonormal matrices. Let $f$ be a differentiable function defined on $G_{N,k}$. A closed form formula exists for the gradient of $f$ in terms of its Jacobian $f_x$ [11]

$$\nabla f(x) = f_x - x x^T f_x = \left(I_{N \times N} - x x^T\right) f_x. \qquad (11)$$

As a metric, we use the arc length [1]

$$d^2(x, x_i) = k - \mathrm{tr}\!\left(x^T x_i x_i^T x\right) \qquad (12)$$

where $k$ is the subspace dimension of the Grassmann manifold and $\mathrm{tr}$ is the trace. If we consider the distance function defined in (12) as a function of $x$, and use it to replace $f$ in (11), we get

$$\nabla d^2(x, x_i) = -\left(I_{N \times N} - x x^T\right) x_i x_i^T x. \qquad (13)$$

Using (13) to substitute for $\nabla d^2(x, x_i)$ in the noneuclidean mean shift equation (9), we obtain

$$m_h(x) = \frac{\sum_{i=1}^{n} g\!\left(\frac{d^2(x, x_i)}{h^2}\right) \left(I_{N \times N} - x x^T\right) x_i x_i^T x}{\sum_{i=1}^{n} g\!\left(\frac{d^2(x, x_i)}{h^2}\right)}. \qquad (14)$$

A tangent vector $\Delta$ is represented as an $N \times k$ matrix, and the exponential operator for Grassmann manifolds is [11]

$$\exp_x(\Delta) = x \, v \, \mathrm{diag}(\cos \lambda) \, v^T + u \, \mathrm{diag}(\sin \lambda) \, v^T \qquad (15)$$

where $u \, \mathrm{diag}(\lambda) \, v^T$ is the compact SVD of $\Delta$, which retains the $k$ largest singular values and the corresponding singular vectors. The $\sin$ and $\cos$ act element-by-element on the singular values.
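The Grassmann formulas (12)-(15) translate almost directly into code. The following is a sketch of our own, assuming a Gaussian profile for $g$; points are $N \times k$ matrices with orthonormal columns:

```python
import numpy as np

def grassmann_dist2(x, y):
    # eq. (12): d^2(x, y) = k - tr(x^T y y^T x)
    return x.shape[1] - np.trace(x.T @ y @ y.T @ x)

def mean_shift_vector(x, points, h):
    # eqs. (13)-(14): weighted sum of the gradients (I - x x^T) x_i x_i^T x
    proj = np.eye(x.shape[0]) - x @ x.T       # projector onto the tangent space
    num, den = np.zeros_like(x), 0.0
    for xi in points:
        w = np.exp(-grassmann_dist2(x, xi) / h ** 2)   # Gaussian profile for g
        num += w * (proj @ xi @ xi.T @ x)
        den += w
    return num / den

def grassmann_exp(x, delta):
    # eq. (15): exp_x(Delta) via the compact SVD  Delta = u diag(s) v^T
    u, s, vt = np.linalg.svd(delta, full_matrices=False)
    return x @ vt.T @ np.diag(np.cos(s)) @ vt + u @ np.diag(np.sin(s)) @ vt

def ms_step(x, points, h):
    # one noneuclidean mean shift iteration, eq. (10)
    return grassmann_exp(x, mean_shift_vector(x, points, h))
```

Iterating `ms_step` from each data point and keeping the distinct limits implements the algorithm box of Section 4 for $G_{N,k}$. Note that the projector in `mean_shift_vector` guarantees the shift is a valid tangent, so `grassmann_exp` returns an orthonormal matrix.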
4.2. Matrix Lie Groups

Matrix Lie groups [15] occur frequently as parameter spaces in computer vision. The general nonlinear mean shift algorithm does not explicitly involve $\log_x$, but for matrix Lie groups we approximate the gradient $\nabla d^2(x, x_i)$ in terms of $\log_x$. Let $\exp$ and $\log$ be the matrix operators

$$\exp(\Delta) = \sum_{i=0}^{\infty} \frac{1}{i!} \Delta^i \qquad (16)$$

$$\log(y) = \sum_{i=1}^{\infty} \frac{(-1)^{i-1}}{i} (y - e)^i. \qquad (17)$$

These are standard matrix operators which can be applied to any square matrix, and no subscript is necessary to define them. They should not be confused with the manifold operators $\exp_x$ and $\log_x$, which are given by

$$\exp_x(\Delta) = x \exp\!\left(x^{-1} \Delta\right) \qquad (18)$$

$$\log_x(y) = x \log\!\left(x^{-1} y\right) \qquad (19)$$

where $y$ is any point on the manifold and $\Delta \in T_x$. The distance function is given by

$$d(x, y) = \left\|\log\!\left(x^{-1} y\right)\right\|_F \qquad (20)$$

where $\|\cdot\|_F$ denotes the Frobenius norm of a matrix. As matrix Lie groups are Riemannian, this definition of $d$ can be shown to be the distance corresponding to an inner product on $T_x$.

A formula for the exact gradient $\nabla d^2(x, x_i)$ is rather complicated, since we have to account for the variation of $\log_x$ as $x$ changes. However, by ignoring this dependence, we get

$$\nabla d^2(x, x_i) = -\log_x(x_i) \qquad (21)$$

which can be shown to be a first order approximation to the true gradient [17]. The derivation involves a nontrivial amount of algebra, which we skip due to lack of space. Substituting (21) into (9), the mean shift vector for matrix Lie groups is

$$m_h(x) = \frac{\sum_{i=1}^{n} g\!\left(\frac{d^2(x, x_i)}{h^2}\right) \log_x(x_i)}{\sum_{i=1}^{n} g\!\left(\frac{d^2(x, x_i)}{h^2}\right)}. \qquad (22)$$

Since the mean shift vector and the $\exp_x$ operator (18) are known, the algorithm for matrix Lie groups is the same as that proposed in [17].

5. Experimental Results

As an application of our algorithm we use it for robust motion segmentation and multibody factorization. Given a data set with multiple motions, the method discussed here simultaneously segments and estimates all the motions present, and it requires no specification of the number of motions.
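Before turning to the experiments, note that the matrix operators (16)-(20) of Section 4.2 can be sketched directly from their series definitions. This is a minimal numpy sketch of ours; the log series converges only when $x^{-1}y$ is close to the identity, so a production implementation would use a dedicated matrix logarithm such as `scipy.linalg.logm`:

```python
import numpy as np

def mat_exp(d, terms=30):
    # eq. (16): exp(Delta) = sum_i Delta^i / i!
    out, term = np.eye(d.shape[0]), np.eye(d.shape[0])
    for i in range(1, terms):
        term = term @ d / i
        out = out + term
    return out

def mat_log(y, terms=100):
    # eq. (17): log(y) = sum_i (-1)^(i-1) (y - e)^i / i, valid near the identity
    a = y - np.eye(y.shape[0])
    out, power = np.zeros_like(a), np.eye(y.shape[0])
    for i in range(1, terms):
        power = power @ a
        out = out + (-1.0) ** (i - 1) * power / i
    return out

def exp_x(x, delta):
    return x @ mat_exp(np.linalg.solve(x, delta))                 # eq. (18)

def log_x(x, y):
    return x @ mat_log(np.linalg.solve(x, y))                     # eq. (19)

def dist(x, y):
    return np.linalg.norm(mat_log(np.linalg.solve(x, y)), 'fro')  # eq. (20)
```

With these operators, (22) is just a weighted average of the tangents $\log_x(x_i)$, and the update follows the same `exp_x(mean shift vector)` pattern as on Grassmann manifolds.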
To the best of our knowledge, no previous algorithm has tackled such a general problem. The class of RANSAC algorithms are robust, but they try to estimate a single motion at a time, treating all other points as outliers. Alternatively, algorithms such as GPCA [18] estimate multiple motions simultaneously, but are not robust since they assume each point is an inlier for one of the motions. Both families of algorithms require that some knowledge of the number of motions be given by the user.

We do not present comparisons with other algorithms for two reasons. Firstly, motion segmentation is just one application of our novel mean shift algorithm; the algorithm can also be used for other applications such as robust fusion over manifolds. Secondly, we are not aware of any previous algorithm which is robust and simultaneously estimates all motions with no knowledge of the number of motions present.

5.1. Multiple Motion Estimation

Our algorithm is similar to the method of [17], with the only difference being that the motion parameters we cluster are allowed to lie on any analytic manifold. The input to the algorithm consists of a set of point matches, some of which are outliers. The algorithm has two stages.

In the first stage, the matches are randomly sampled to generate elemental subsets. An elemental subset consists of the minimum number of points required to specify a motion hypothesis. Depending on the problem, various methods can be used to generate motion hypotheses from the elemental subsets. A number of elemental subsets are chosen and each one is used to generate a motion hypothesis. The sampling and hypothesis generation can be improved by a validation step which reduces the computation in the second stage [17].

In the second stage, the parameter estimates are clustered using the algorithm proposed here. The number of dominant modes gives the number of motions in the data, and these modes correspond to the motion parameters.
For each mode, we use the method of [16] to find the corresponding inliers. Since the inliers for each motion are decided independently, it is possible for a point to be assigned to two motions. In such a case the tie is broken by assigning the point to the motion which gives the lower error.

In all our experiments, the points were detected using a Harris corner detector and were matched across views using the method of [14]. For mean shift we used a Gaussian kernel, and the bandwidth was chosen by a nearest neighbour procedure.

5.2. Error Measures

We now discuss the error measures used to test the performance of our system on real data. The number of modes should be equal to the number of motions present. However, mean shift returns all local maxima of the kernel density estimate. Therefore, for a data set with $m$ motions, the first $m$ modes should clearly dominate the $(m+1)$-th mode, so that the extraneous modes can be pruned.

Figure 2. Mean shift over $G_{3,1}$. 3D translational motion. In the left figure all the points are shown, while the right figure shows the inliers returned by the system. The table on the left contains the properties of the first four modes; only the first three modes correspond to motions. The table on the right compares the results with the manual segmentations.

    mode   mot.hyp.    kde
    M1       107     0.0388
    M2       173     0.0338
    M3       102     0.0239
    M4        66     0.0199

    manual \ classified    M1   M2   M3   Out  |   res      ^res
    M1                     26    0    0    1   | 0.00112  0.00080
    M2                      0   26    0    0   | 0.00006  0.00006
    M3                      0    0   26    0   | 0.00179  0.00136
    Out                     0    0    0   23   |

We compare the segmentation returned by the algorithm with a manual segmentation of the motions. Fewer misclassifications imply better performance. Let $M_E$ be an estimated motion and $p_i$ be the vector containing the point coordinates across all frames. For an inlier and the correct motion, a relation of the form $M_E(p_i) = 0$ should hold. In practice this is violated due to the noise affecting $p_i$. We measure the residual squared error as

$$\mathrm{res} = \frac{1}{n} \sum_{i=1}^{n} \left|M_E(p_i)\right|^2. \qquad (23)$$

Lower errors imply better performance.
In the best case, for each motion estimate, we expect this error to go as low as

$$\widehat{\mathrm{res}} = \frac{1}{n} \sum_{i=1}^{n} \left|\hat M(p_i)\right|^2 \qquad (24)$$

where $\hat M$ is the least squares estimate computed from the inliers, which by definition minimizes the squared error.

5.3. 3D Translational Motion

Matched points across two views satisfy the epipolar constraint. If the camera is calibrated, the image coordinates of a point can be converted to normalized coordinates in 3D, and the epipolar constraint becomes the essential constraint. If the point has undergone only translation with respect to the camera, the essential constraint is of the form [18]

$$t^T (x_1 \times x_2) = 0 \qquad (25)$$

where $x_1$ and $x_2$ are the normalized coordinates in the two frames and $t$ is the direction of translation in 3D. Since the constraint is homogeneous, the translation can only be estimated up to scale, and $t$ represents a line in $\mathbb{R}^3$ [18]. A line is a one-dimensional subspace of $\mathbb{R}^3$, so the translation is parameterized by the Grassmann manifold $G_{3,1}$, i.e., $N = 3$ and $k = 1$. An elemental subset for a motion consists of two point matches, and the hypothesis can be generated by a cross product. The mean shift vector is given by (14).

The motion segmentation on a real data set with three translations is shown in Figure 2. A total of 102 corners were detected on the objects in the first frame. Points on the background were identified as having zero displacement and removed. On matching these points we obtain 27, 26 and 26 inliers for the motions and 23 outliers. The outliers were due to mismatches by the point matcher. We generated 500 motion hypotheses and clustered on the manifold $G_{3,1}$. The results are tabulated in Figure 2. The table on the left shows the number of hypotheses which converge to each mode and the kernel density at the mode. Since the data set has three motions, there are three dominant modes, with the fourth mode having fewer points. The segmentation results and motion estimates are shown on the right. Each row represents a motion, and the row labeled Out represents outliers. The first four columns show the classification results. For example, the first row indicates that of the 27 inliers for the first motion, 26 are correctly classified and one is misclassified as an outlier. Values along the diagonal are correctly classified, while off-diagonal values are misclassifications. The last two columns show the residual errors for our estimates, res, and for the least squares estimates, ^res. Our algorithm's performance, with no knowledge of the segmentation, is comparable to the manually segmented least squares estimates.

5.4. Multibody Factorization

The positions of points tracked over $F$ frames of an uncalibrated affine camera define a feature vector in $\mathbb{R}^{2F}$. For points sharing the same motion, these vectors lie in a four dimensional subspace of $\mathbb{R}^{2F}$, and for planar scenes this subspace is only three dimensional [19]. In a scene with multiple independent motions, each motion defines a different subspace, which can be represented by a point on the Grassmann manifold $G_{2F,3}$, i.e., $N = 2F$ and $k = 3$. An elemental subset consists of the feature vectors defined by 3 points tracked across $F$ frames. The basis can be obtained through an SVD.

Figure 3. Mean shift over $G_{10,3}$. Multibody factorization. The left figure shows the first frame with all the points which are tracked. The middle and right images show the second and fifth frames with only the inliers. The table on the left contains the properties of the first four modes; only the first three modes correspond to motions. The table on the right compares the results with the manual segmentations.

    mode   mot.hyp.    kde
    M1       209     0.1315
    M2       695     0.0830
    M3        52     0.0165
    M4        12     0.0024

    manual \ classified    M1   M2   M3   Out  |  res     ^res
    M1                     32    0    0    0   | 7.0376  5.0193
    M2                      0   21    0    0   | 2.8520  0.7627
    M3                      0    0   29    0   | 4.2007  3.1648
    Out                     1    0    0   57   |

The results of multibody factorization with three motions are shown in Figure 3. The system detected 140 corners in the first frame. Points on the background were identified as having zero displacement and removed. The rest of the corners were tracked across 5 frames; therefore, $F = 5$ and $N = 10$. The planar assumption holds due to negligible depth variation, and each motion defines a 3-dimensional subspace of $\mathbb{R}^{10}$. The three motions contain 32, 21 and 29 points, with 58 outliers. We generated 1000 hypotheses from these matches and clustered them on the manifold $G_{10,3}$. The results are organized as in the previous subsection. The kernel density at the fourth mode is an order of magnitude below the density at the third mode. The classification results are nearly perfect; one outlier is misclassified as an inlier.

5.5. Affine Motion

The previous examples involved mean shift over Grassmann manifolds. We now present an example of a parameter space which is a Lie group. An affine image transformation is given by

$$M = \begin{bmatrix} A & b \\ 0^T & 1 \end{bmatrix} \qquad (26)$$

where $A$ is a nondegenerate $2 \times 2$ matrix and $b \in \mathbb{R}^2$. The set of all affine transformations, $A(2)$, forms a matrix Lie group. An affine transformation has 6 parameters and each point match gives 2 constraints. Each elemental subset therefore consists of 3 point matches. The motion hypotheses are generated using least squares. As $A(2)$ is a matrix Lie group, the mean shift vector is given by (22).

Figure 4. Mean shift over $A(2)$. Affine motion segmentation. In the left figure all the points are shown, and on the right only the inliers are shown. The table on the left contains the properties of the first four modes; only the first three modes correspond to motions. The table on the right compares the results with the manual segmentations.

    mode   mot.hyp.    kde
    M1        61     0.0550
    M2       210     0.0547
    M3        82     0.0468
    M4        11     0.0155

    manual \ classified    M1   M2   M3   Out  |  res     ^res
    M1                     16    0    0    0   | 2.7508  2.7143
    M2                      0   15    0    3   | 4.9426  4.6717
    M3                      0    0   15    1   | 3.2849  3.0860
    Out                     0    0    0   30   |

We used a data set of 80 corners matched across two frames with 3 independently moving bodies.
Points on the background were identified as having zero displacement and removed. Some of the points on the background are occluded in the second image and are consequently mismatched; these points do not have zero displacements and survive as outliers in the data set. The motions contain 16, 18 and 16 inliers, with 30 outliers. For clustering, 500 motion hypotheses were generated. The results of the experiment are shown in Figure 4. The images and the tables display a similar analysis as in the previous figures.

Other common examples of Lie groups are the special Euclidean groups, SE(3) and SE(2), which correspond to rigid body transformations in 3D and 2D. Examples of mean shift over these spaces can be found in [17].

6. Conclusions

We extended mean shift from standard Euclidean spaces to a general class of nonlinear spaces, the set of analytic manifolds. Both the standard mean shift and the extension of mean shift to matrix Lie groups [17] are special cases of our algorithm. The new algorithm can be applied to any analytic manifold whose geometry is understood well enough that function gradients can be computed and the exp operator is known.

As an application, we applied the algorithm to motion segmentation problems. Our method offers the advantages that it is robust, does not require a prior specification of the number of structures present, and simultaneously estimates and segments all the motions present.

This method of clustering is not restricted to motion segmentation and can be applied to a number of other problems. For example, Grassmann manifolds occur frequently in linear regression and PCA. Regression in $\mathbb{R}^N$ is equivalent to finding a one-dimensional subspace. For regression problems with multiple structures and outliers, we can use the above algorithm to find all the structures by clustering over the manifold $G_{N,1}$. A more general version of regression is PCA, which estimates a $k$-dimensional subspace of $\mathbb{R}^N$.
In problems where multiple subspaces are present, our algorithm can be used to cluster hypotheses over $G_{N,k}$ and simultaneously estimate all the subspaces. Range image segmentation involves finding planes or second order surfaces in 3D point clouds. The parameters describing these geometrical objects lie on manifolds, and a similar method can be used for segmenting them. Previously, Euclidean mean shift was also used for robust fusion [5, 8]. The nonlinear mean shift algorithm proposed here can be applied in a similar manner for robust fusion of points lying on analytic manifolds.

References

[1] P.-A. Absil, R. Mahony, and R. Sepulchre, "Riemannian geometry of Grassmann manifolds with a view on algorithmic computation," Acta Applicandae Mathematicae, vol. 80, no. 2, pp. 199-220, 2003.
[2] E. Begelfor and M. Werman, "How to put probabilities on homographies," IEEE Trans. Pattern Anal. Machine Intell., vol. 27, no. 10, pp. 1666-1670, 2005.
[3] S. Birchfield and S. Rangarajan, "Spatiograms vs histograms for region-based tracking," in Proc. IEEE Conf. on Computer Vision and Pattern Recognition, San Diego, CA, vol. II, June 2005, pp. 1158-1163.
[4] W. M. Boothby, An Introduction to Differentiable Manifolds and Riemannian Geometry. Academic Press, 2002.
[5] H. Chen and P. Meer, "Robust fusion of uncertain information," IEEE Trans. Systems, Man, Cybernetics-Part B, vol. 35, pp. 578-586, 2005.
[6] Y. Cheng, "Mean shift, mode seeking, and clustering," IEEE Trans. Pattern Anal. Machine Intell., vol. 17, pp. 790-799, 1995.
[7] R. Collins, "Mean shift blob tracking through scale space," in Proc. IEEE Conf. on Computer Vision and Pattern Recognition, Madison, Wisconsin, vol. II, 2003, pp. 234-240.
[8] D. Comaniciu, "Variable bandwidth density-based fusion," in Proc. IEEE Conf. on Computer Vision and Pattern Recognition, Madison, Wisconsin, vol. I, 2003, pp. 59-66.
[9] D. Comaniciu and P. Meer, "Mean shift: A robust approach toward feature space analysis," IEEE Trans. Pattern Anal. Machine Intell., vol. 24, pp. 603-619, May 2002.
[10] D. Comaniciu, V. Ramesh, and P. Meer, "Kernel-based object tracking," IEEE Trans. Pattern Anal. Machine Intell., vol. 25, pp. 564-577, 2003.
[11] A. Edelman, T. A. Arias, and S. T. Smith, "The geometry of algorithms with orthogonality constraints," SIAM Journal on Matrix Analysis and Applications, vol. 20, no. 2, pp. 303-353, 1998.
[12] A. Elgammal, R. Duraiswami, and L. S. Davis, "Efficient kernel density estimation using the fast Gauss transform with applications to color modeling and tracking," IEEE Trans. Pattern Anal. Machine Intell., vol. 25, no. 11, pp. 1499-1504, 2003.
[13] M. Fashing and C. Tomasi, "Mean shift is a bound optimization," IEEE Trans. Pattern Anal. Machine Intell., vol. 27, no. 3, pp. 471-474, 2005.
[14] B. Georgescu and P. Meer, "Point matching under large image deformations and illumination changes," IEEE Trans. Pattern Anal. Machine Intell., vol. 26, no. 6, pp. 674-689, 2004.
[15] W. Rossmann, Lie Groups: An Introduction through Linear Groups. Oxford University Press, 2003.
[16] R. Subbarao and P. Meer, "Heteroscedastic projection based M-estimators," in Workshop on Empirical Evaluation Methods in Computer Vision, San Diego, CA, June 2005.
[17] O. Tuzel, R. Subbarao, and P. Meer, "Simultaneous multiple 3D motion estimation via mode finding on Lie groups," in Proc. 10th Intl. Conf. on Computer Vision, Beijing, China, vol. 1, 2005, pp. 18-25.
[18] R. Vidal and Y. Ma, "A unified algebraic approach to 2-D and 3-D motion segmentation," in Proc. 8th European Conference on Computer Vision, vol. I, 2004, pp. 1-15.
[19] Y. Sugaya and K. Kanatani, "Geometric structure of degeneracy for multi-body motion segmentation," in Proc. 2nd Workshop on Statistical Methods in Video Processing (SMVP 2004), LNCS 3247, pp. 13-25, December 2004.
[20] L. Zelnik-Manor and M. Irani, "Degeneracies, dependencies and their implications in multi-body and multi-sequence factorizations," in Proc. IEEE Conf. on Computer Vision and Pattern Recognition, Madison, Wisconsin, vol. II, 2003, pp. 287-293.
[21] Z. Zivkovic and B. Krose, "An EM-like algorithm for color-histogram-based object tracking," in Proc. IEEE Conf. on Computer Vision and Pattern Recognition, Washington, DC, vol. I, June 2004, pp. 798-803.