Clustering, Distance Methods & Ordination: Lecture Notes

Chapter 12 Clustering, Distance Methods, and Ordination

12.2 Similarity Measures

Commonly used distances between two p-dimensional observations x and y:

Euclidean Distance:

$$d(x, y) = \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2 + \cdots + (x_p - y_p)^2} = \sqrt{(x - y)^t (x - y)}$$

Statistical Distance:

$$d(x, y) = \sqrt{(x - y)^t S^{-1} (x - y)},$$

where S is the sample variance-covariance matrix.

Minkowski Distance:

$$d(x, y) = \left[ \sum_{i=1}^{p} |x_i - y_i|^m \right]^{1/m}$$

Canberra Metric:

$$d(x, y) = \sum_{i=1}^{p} \frac{|x_i - y_i|}{x_i + y_i}$$

Czekanowski coefficient:

$$d(x, y) = 1 - \frac{2 \sum_{i=1}^{p} \min(x_i, y_i)}{\sum_{i=1}^{p} (x_i + y_i)}$$

12.3 Hierarchical Clustering Methods

Agglomerative Hierarchical Clustering Algorithm (Grouping N Objects):

1. Start with N clusters, each containing a single entity, and an N × N symmetric matrix of distances (or similarities) $D = \{d_{ik}\}$.
2. Search the distance matrix for the nearest (most similar) pair of clusters. Let the distance between the "most similar" clusters U and V be $d_{UV}$.
3. Merge clusters U and V. Label the newly formed cluster (UV). Update the entries in the distance matrix by (a) deleting the rows and columns corresponding to clusters U and V and (b) adding a row and column giving the distances between cluster (UV) and the remaining clusters.
4. Repeat Steps 2 and 3 a total of N − 1 times. (All objects will be in a single cluster after the algorithm terminates.) Record the identity of the clusters that are merged and the levels (distances) at which the merges take place.

There are 3 linkage methods. They differ in how the distance between (UV) and any other cluster W is defined.

(I) Single Linkage:

$$d_{(UV)W} = \min\{d_{UW}, d_{VW}\}$$

(II) Complete Linkage:

$$d_{(UV)W} = \max\{d_{UW}, d_{VW}\}$$

(III) Average Linkage:

$$d_{(UV)W} = \frac{\sum_{i} \sum_{k} d_{ik}}{N_{(UV)} N_W},$$

where $d_{ik}$ is the distance between object i in the cluster (UV) and object k in the cluster W, and $N_{(UV)}$ and $N_W$ are the number of items in clusters (UV) and W, respectively.
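The distance measures of Section 12.2 and the agglomerative algorithm of Section 12.3 can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the original notes: the function names (`euclidean`, `minkowski`, `canberra`, `czekanowski`, `agglomerate`) and the toy data are assumptions made here. The statistical distance is left out to keep the example free of matrix-inversion code. For simplicity, the distance between two clusters is recomputed from the object-level distances $d_{ik}$ at each step instead of updating a matrix in place; for single and complete linkage this gives the same values as the min/max update rules above.

```python
import math


def euclidean(x, y):
    # Square root of the sum of squared coordinate differences.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def minkowski(x, y, m):
    # [sum |x_i - y_i|^m]^(1/m)
    return sum(abs(a - b) ** m for a, b in zip(x, y)) ** (1.0 / m)


def canberra(x, y):
    # Assumes non-negative coordinates. Terms with x_i + y_i == 0 are
    # skipped; that convention is an assumption of this sketch.
    return sum(abs(a - b) / (a + b) for a, b in zip(x, y) if a + b != 0)


def czekanowski(x, y):
    # 1 - 2 * sum(min(x_i, y_i)) / sum(x_i + y_i)
    num = 2.0 * sum(min(a, b) for a, b in zip(x, y))
    den = sum(a + b for a, b in zip(x, y))
    return 1.0 - num / den


def agglomerate(points, linkage="single", dist=euclidean):
    """Return the merge history as a list of (U, V, level) tuples.

    Clusters are tuples of object indices into `points`.
    """
    n = len(points)
    # Object-level distance matrix D = {d_ik}.
    d = [[dist(points[i], points[k]) for k in range(n)] for i in range(n)]

    def cluster_dist(a, b):
        pairs = [d[i][k] for i in a for k in b]
        if linkage == "single":
            return min(pairs)
        if linkage == "complete":
            return max(pairs)
        if linkage == "average":
            return sum(pairs) / (len(a) * len(b))
        raise ValueError("unknown linkage: " + linkage)

    # Step 1: start with N singleton clusters.
    clusters = [(i,) for i in range(n)]
    history = []
    # Step 4: N - 1 merges leave a single cluster.
    while len(clusters) > 1:
        # Step 2: find the nearest pair of clusters (first pair wins ties).
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                level = cluster_dist(clusters[a], clusters[b])
                if best is None or level < best[0]:
                    best = (level, a, b)
        level, a, b = best
        u, v = clusters[a], clusters[b]
        history.append((u, v, level))
        # Step 3: drop U and V, add the merged cluster (UV).
        clusters = [c for idx, c in enumerate(clusters) if idx not in (a, b)]
        clusters.append(u + v)
    return history


if __name__ == "__main__":
    # Toy data (hypothetical): four points on a line.
    points = [(0.0,), (1.0,), (3.0,), (7.0,)]
    for method in ("single", "complete", "average"):
        print(method, agglomerate(points, linkage=method))
```

On the four toy points 0, 1, 3, 7, the merge levels are 1, 2, 4 under single linkage and 1, 3, 7 under complete linkage, which shows how the choice of linkage changes the heights at which the same objects are joined.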
