Max-margin Clustering: Detecting Margins from Projections of Points on Lines

Raghuraman Gopalan¹ and Jagan Sankaranarayanan²
¹Center for Automation Research, University of Maryland, College Park, MD USA; ²NEC Labs, Cupertino, CA USA

Problem Statement
Given an unlabelled set of points forming k clusters, find a grouping with maximum separating margin among the clusters.
Prior work: (Mostly) establish feedback between different label proposals, and run a supervised classifier on it.
Goal: To understand the relation between data points and margin regions by analyzing projections of data on lines.

Two-cluster Problem
Assumptions:
Linearly separable clusters.
Kernel trick for the non-linear case.
No outliers in data (max margin exists only between clusters).
Enforce global cluster balance.
Proposition 1: SI* exists ONLY on line segments in the margin region that are perpendicular to the separating hyperplane. Such line segments directly provide cluster groupings.

Multi-cluster Problem
Location information of projected points (SI) alone is insufficient to detect margins.

Max-margin Clustering Algorithm
1. Draw lines between all pairs of points.
2. Estimate the probability of presence of margins between a pair of points $x_i$ and $x_j$ by computing $f(x_i, x_j)$.
3. Perform global clustering using $f$ between all point-pairs.

Results
[Results table and figures not recoverable.]

The Role of Distance of Projection
Defn: $D_{\min}$ of a line interval is the minimum distance of projection of points in that interval.
Proposition 2: For line intervals $Int^{*}$ in the margin region, perpendicular to the separating hyperplane, a relation between $D_{\min}$ and a $\min_i$ term holds [exact expression not recoverable].
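
The location and distance-of-projection quantities defined above can be illustrated with a minimal sketch (not the authors' implementation). It projects points onto the line through a pair of points and reports, for each point, its position along the line and its distance of projection, then takes Dmin over an interval. The names `project_onto_line` and `d_min` are hypothetical, and parametrizing the line so that t = 0 and t = 1 fall on the two chosen points is an assumption made here for convenience.

```python
import math


def project_onto_line(points, a, b):
    """Project each point onto the line through a and b.

    Returns a list of (t, d) pairs, one per point:
      t is the position of the projection along the line (0 at a, 1 at b),
      d is the distance of projection (point to its projection on the line).
    """
    direction = [bj - aj for aj, bj in zip(a, b)]
    norm_sq = sum(c * c for c in direction)
    if norm_sq == 0:
        raise ValueError("a and b must be distinct points")
    projections = []
    for p in points:
        rel = [pj - aj for pj, aj in zip(p, a)]
        t = sum(r * c for r, c in zip(rel, direction)) / norm_sq
        foot = [aj + t * c for aj, c in zip(a, direction)]
        projections.append((t, math.dist(p, foot)))
    return projections


def d_min(projections, t_lo, t_hi):
    """Minimum distance of projection over points projecting into [t_lo, t_hi].

    Returns None when no point projects into the interval.
    """
    distances = [d for t, d in projections if t_lo <= t <= t_hi]
    return min(distances) if distances else None


# Toy data (assumed for illustration): two clusters near x = 0 and x = 4,
# plus one point of a third cluster sitting far above the gap between them.
points = [(0, 0), (0, 1), (4, 0), (4, 1), (2, 5)]
proj = project_onto_line(points, (0, 0), (4, 0))

print(d_min(proj, 0.25, 0.75))  # interval in the gap between the two clusters
print(d_min(proj, 0.0, 0.25))   # interval touching the first cluster
```

In this toy setup the third-cluster point projects into the gap between the first two clusters, so location alone would not show an empty interval there; its large distance of projection is what distinguishes that interval from one lying inside a cluster.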
No-outlier assumption: the max margin between points within a cluster is $M_m$ [accompanying expression not recoverable].
Proposition 3: For line intervals $Int_{CL}$ inside a cluster, of length more than $M_m$, $\max_{Int_{CL}} D_{\min}$ is at most $M_m / 2$.
Proposition 4: An interval with SI having no projected points with distance of projection less than $D_{\min}^{*}$, written $[SI]_{D_{\min}^{*}}$, can lie only outside a cluster [defining expression for $D_{\min}^{*}$ not recoverable].

A Pair-wise Similarity Measure for Clustering
$f(x_i, x_j) = \exp\left(-\max_{D:\,Int_{ij}} D[SI]_D\right)$
$f(x_i, x_j) = 1$ iff $x_i = x_j$.
$f(x_i, x_j) \ll 1$ iff $x_i$ and $x_j$ are from different clusters and $Int_{ij}$ is perpendicular to their separating hyperplane.

Summary
Clustering: detecting margin regions.
Obtaining statistics of location and distance of projection of points that are specific to line segments in margin regions (Prop. 1 to 4).
A pair-wise similarity measure to perform clustering, which avoids some optimization-related challenges prevalent in most existing methods.

References
1. F. De la Torre and T. Kanade, "Discriminative cluster analysis", ICML, pp. 241-248, 2006. ([8] in table)
2. K. Zhang, I. W. Tsang, and J. T. Kwok, "Maximum margin clustering made practical", IEEE Trans. Neural Networks, 20(4), pp. 583-596, 2009. ([31] in table)