Lecture 21 clustering

Intro. ANN & Fuzzy Systems
Lecture 21: Clustering (2)
(C) 2001-2003 by Yu Hen Hu

Outline
• Similarity (Distance) Measures
• Distortion Criteria
• Scattering Criterion
• Hierarchical Clustering and other clustering methods

Distance Measure
• What does "similar" mean? A distance measure d(x, y) makes the notion precise.
– Norm: $d(x, y) = \|x - y\|_m = \left( \sum_{i=1}^{N} |x_i - y_i|^m \right)^{1/m}$
– Mahalanobis distance: $d(x, y) = (x - y)^T S_{xy}^{-1} (x - y)$
– Angle: $d(x, y) = x^T y / (\|x\| \cdot \|y\|)$ (the cosine of the angle between x and y, so a larger value means more similar)
• Binary and symbolic features (x and y contain only 0s and 1s):
– Tanimoto coefficient: $d(x, y) = \dfrac{x^T y}{x^T x + y^T y - x^T y}$

Clustering Criteria
• Is the current clustering assignment good enough? The most popular criterion is the mean-square error distortion measure:
$D = \sum_{i=1}^{c} \sum_{k=1}^{N} I(x_k, i) \, \|x_k - W(i)\|^2 = \sum_{i=1}^{c} \frac{1}{N_i} \sum_{x, y \in C(i)} \|x - y\|^2, \qquad N_i = \sum_{k=1}^{N} I(x_k, i)$
Here $I(x_k, i) = 1$ if $x_k$ is assigned to cluster i and 0 otherwise, $W(i)$ is the center of cluster i, and $N_i$ is the number of points in cluster i. The second form holds when $W(i)$ is the cluster mean and the inner sum runs over unordered pairs.
• Other distortion measures can also be used:
$D = \sum_{i=1}^{c} \frac{1}{N_i} \sum_{x, y \in C(i)} d(x, y)$
$D = \sum_{i=1}^{c} \frac{1}{N_i} \min_{x, y \in C(i)} d(x, y)$

Scatter Matrices
• Scatter matrices are defined in the context of analysis of variance in statistics.
• They are used in linear discriminant analysis.
• However, they can also be used to gauge the fitness of a particular clustering assignment.
• Mean vector for the i-th cluster: $m_i = \frac{1}{N_i} \sum_{k=1}^{N} I(x_k, i) \, x_k$
• Total mean vector: $m = \frac{1}{N} \sum_{i=1}^{c} N_i m_i = \frac{1}{N} \sum_{k=1}^{N} x_k$
• Scatter matrix for the i-th cluster: $S_i = \sum_{k=1}^{N} I(x_k, i) \, (x_k - m_i)(x_k - m_i)^T$
• Within-cluster scatter matrix: $S_W = \sum_{i=1}^{c} S_i$
• Between-cluster scatter matrix: $S_B = \sum_{i=1}^{c} N_i (m_i - m)(m_i - m)^T$

Scattering Criteria
• Total scatter matrix: $S_T = \sum_{k=1}^{N} (x_k - m)(x_k - m)^T = S_W + S_B$
• Note that the total scatter matrix is independent of the assignment $I(x_k, i)$.
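The scatter-matrix definitions above can be checked numerically. The following is a minimal pure-Python sketch, not code from the lecture: the helper names (`scatter_matrices`, `mat_add`, `scaled_outer`, `close`), the six sample points, and the two labelings `compact` and `mixed` are all assumptions chosen for illustration. It computes the within-cluster, between-cluster, and total scatter matrices for two different assignments of the same data and tests whether their sum matches the total scatter.

```python
# Minimal sketch: scatter matrices for a hard cluster assignment.
# All names and the sample data below are illustrative assumptions.

def mean(points):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]

def zeros(dim):
    """dim x dim matrix of zeros, stored as a list of rows."""
    return [[0.0] * dim for _ in range(dim)]

def mat_add(a, b):
    """Element-wise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scaled_outer(u, scale=1.0):
    """scale * u u^T as a list of rows."""
    return [[scale * a * b for b in u] for a in u]

def scatter(points, center):
    """Sum of (x - center)(x - center)^T over the given points."""
    s = zeros(len(center))
    for p in points:
        diff = [pi - ci for pi, ci in zip(p, center)]
        s = mat_add(s, scaled_outer(diff))
    return s

def scatter_matrices(points, labels):
    """Return (S_W, S_B, S_T) for points with one cluster label each."""
    dim = len(points[0])
    m = mean(points)                      # total mean vector
    s_w, s_b = zeros(dim), zeros(dim)
    for c in sorted(set(labels)):
        members = [p for p, lab in zip(points, labels) if lab == c]
        m_i = mean(members)               # mean vector of cluster c
        s_w = mat_add(s_w, scatter(members, m_i))
        diff = [a - b for a, b in zip(m_i, m)]
        s_b = mat_add(s_b, scaled_outer(diff, len(members)))
    return s_w, s_b, scatter(points, m)

def close(a, b, tol=1e-9):
    """True if two matrices agree element-wise within tol."""
    return all(abs(x - y) <= tol
               for ra, rb in zip(a, b) for x, y in zip(ra, rb))

# Hypothetical data: two visually separate groups of three points each.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0),
          (5.0, 5.0), (6.0, 5.0), (5.0, 6.0)]
compact = [0, 0, 0, 1, 1, 1]   # labels that follow the two groups
mixed = [0, 1, 0, 1, 0, 1]     # labels that cut across the groups

for name, labels in (("compact", compact), ("mixed", mixed)):
    s_w, s_b, s_t = scatter_matrices(points, labels)
    print(name, "S_W + S_B matches S_T:", close(mat_add(s_w, s_b), s_t))
    print(name, "S_W =", s_w)
```

The total scatter is computed from the data and the total mean only, so it is the same for both labelings, while the printed within-cluster scatter changes with the labels.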
• However, $S_W$ and $S_B$ both depend on $I(x_k, i)$.
• Desired clustering property:
– $S_W$ small
– $S_B$ large
• How can we gauge whether $S_W$ is small or $S_B$ is large? There are several ways.
• Tr. $S_W$ (trace of $S_W$): let $S_W = \sum_{m=1}^{M} \lambda_m v_m v_m^T$ be the eigenvalue decomposition of $S_W$. Then
$\mathrm{Tr}\, S_W = \sum_{m=1}^{M} \lambda_m = \sum_{i=1}^{c} \mathrm{Tr}\, S_i = \sum_{i=1}^{c} \sum_{k=1}^{N} I(x_k, i) \, \|x_k - m_i\|^2 = D$
so minimizing the trace of $S_W$ is the same as minimizing the mean-square error distortion D with the cluster means as centers.

Cluster Separating Measure (CSM)
• Similar to scattering criteria.
• $\mathrm{csm} = (m_i - m_j) / (\sigma_i + \sigma_j)$
• The larger its value, the more separable the two clusters.
• Assumes the underlying data distribution is Gaussian.
[Figure: three plot panels titled "std = 0.3, csm = 1.6667", "std = 0.5, csm = 1", and "std = 0.8, csm = 0.625"]

Hierarchical Clustering
• Merge method: initially, each $x_k$ is its own cluster. During each iteration, the nearest pair of distinct clusters is merged, until the number of clusters is reduced to 1.
• How to measure the distance between two clusters:
– $d_{\min}(C(i), C(j)) = \min d(x, y)$ over $x \in C(i)$, $y \in C(j)$ (leads to the minimum spanning tree)
– $d_{\max}(C(i), C(j)) = \max d(x, y)$ over $x \in C(i)$, $y \in C(j)$
– $d_{\mathrm{avg}}(C(i), C(j)) = \frac{1}{N_i N_j} \sum_{x \in C(i)} \sum_{y \in C(j)} d(x, y)$
– $d_{\mathrm{mean}}(C(i), C(j)) = \|m_i - m_j\|$

Hierarchical Clustering (II)
• Split method: initially, there is only one cluster. Iteratively, a cluster is split into two or more clusters, until the total number of clusters reaches a predefined goal.
• The scattering criterion can be used to decide how to split a given cluster into two or more clusters.
• Another way is to perform an m-way clustering, using, say, the k-means algorithm to split a cluster into m smaller clusters.
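The merge method and the four cluster-to-cluster distances described above can be sketched in a few lines of pure Python. This is an illustrative sketch under stated assumptions, not the lecture's implementation: the function names, the use of Euclidean distance for d(x, y), the `n_clusters` stopping parameter (the text merges all the way down to 1, which is the default here), and the sample points are all hypothetical choices. The search for the nearest pair is a brute-force scan, which is fine for a small example but slow for large data sets.

```python
import math
from itertools import combinations

# Minimal sketch of the merge (agglomerative) method.
# Names, Euclidean d(x, y), and the sample data are assumptions.

def euclidean(x, y):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def centroid(cluster):
    """Mean vector of a cluster (a list of points)."""
    n = len(cluster)
    return tuple(sum(p[d] for p in cluster) / n
                 for d in range(len(cluster[0])))

def d_min(a, b):
    """Smallest point-to-point distance between clusters a and b."""
    return min(euclidean(x, y) for x in a for y in b)

def d_max(a, b):
    """Largest point-to-point distance between clusters a and b."""
    return max(euclidean(x, y) for x in a for y in b)

def d_avg(a, b):
    """Average point-to-point distance between clusters a and b."""
    return sum(euclidean(x, y) for x in a for y in b) / (len(a) * len(b))

def d_mean(a, b):
    """Distance between the mean vectors of clusters a and b."""
    return euclidean(centroid(a), centroid(b))

def merge_clustering(points, cluster_distance=d_min, n_clusters=1):
    """Merge the nearest pair of clusters until n_clusters remain.

    Returns the final clusters and the merge history as a list of
    (cluster_a, cluster_b, distance) tuples in merge order.
    """
    clusters = [[p] for p in points]      # every point starts alone
    history = []
    while len(clusters) > n_clusters:
        # Brute-force search for the nearest pair of distinct clusters.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: cluster_distance(clusters[ij[0]], clusters[ij[1]]),
        )
        dist = cluster_distance(clusters[i], clusters[j])
        history.append((clusters[i], clusters[j], dist))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters, history

# Hypothetical data: two well-separated groups of three points.
sample = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]

for measure in (d_min, d_max, d_avg, d_mean):
    result, _ = merge_clustering(sample, measure, n_clusters=2)
    print(measure.__name__, sorted(sorted(c) for c in result))
```

Passing a different function as `cluster_distance` switches between the four measures without changing the merge loop. Running with the default `n_clusters=1` and reading the returned history gives the full merge sequence.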
