The Fiedler Vector and Graph Partitioning

Barbara Ball, baljmb@aol.com
Clare Rodgers, clarerodgers@hotmail.com
College of Charleston, Graduate Math Department
Research under Dr. Amy Langville

Outline
• General Field of Data Clustering
  – Motivation
  – Importance
  – Previous Work
• Laplacian Method
  – Fiedler Vector
  – Limitations
  – Handling the Limitations

Outline
• Our Contributions
  – Experiments
    – Sorting eigenvectors
    – Testing non-symmetric matrices
  – Hypotheses
  – Implications
• Future Work
  – Non-square matrices
  – Proofs
• References

Understanding Graph Theory

[Figure: the 10-node example graph drawn with an arbitrary layout.]

Given this graph, there are no apparent clusters.

Understanding Graph Theory

[Figure: the same 10-node graph redrawn with related nodes grouped together.]

Although the clusters are now apparent, we need a better method.

Finding the Laplacian Matrix
• A = adjacency matrix
• D = degree matrix
• Find the Laplacian matrix, L = D − A. Its rows sum to zero.

[Figure: the matrices A, D, and L = D − A for the 10-node example graph.]

Behind the Scenes of the Laplacian Matrix
• Rayleigh Quotient Theorem:
  – seeks to minimize the off-diagonal elements of the matrix, or
  – equivalently, to minimize the cutset of the edges between the clusters.

[Figure: two sparsity patterns of the same matrix: unordered ("Not easily clustered") and permuted into block form ("Clusters apparent").]

Behind the Scenes of the Laplacian Matrix
• Rayleigh Quotient Theorem solution:
  – λ1 = 0, the smallest eigenvalue of the symmetric matrix L;
  – λ1 corresponds to the trivial eigenvector v1 = e = [1, 1, …, 1].
• Courant-Fischer Theorem:
  – also based on a symmetric matrix L; it searches for the eigenvector v2 that is furthest away from e.

Using the Laplacian Matrix

v2 gives relational information about the nodes. This relation is usually decided by separating the values across zero. A theoretical justification was given by Miroslav Fiedler; hence v2 is called the Fiedler vector.

Using the Fiedler Vector

v2 is used to recursively partition the graph by separating the components into negative and positive values.

Entire graph: sign(v2) = [−, −, −, +, +, +, −, −, −, +]
Reds (the negative-signed subgraph): sign(v2) = [−, +, +, +, −, −]

[Figure: the 10-node graph after the first cut, and the red subgraph partitioned again.]

Problems with the Laplacian Method
• The Laplacian method requires:
  – an undirected graph,
  – a structurally symmetric matrix,
  – a square matrix.
• Zero may not always be the best choice for partitioning the eigenvector values of v2 (Gleich).
• Recursive algorithms are expensive.

Current Clustering Method

In 2003, Monika Henzinger, then Director of Google Research, cited generalizing clustering methods to directed graphs as one of the top six algorithmic challenges in web search engines.

How Are These Problems Currently Being Solved?
• Forcing symmetry for non-square matrices:
  – Suppose A is an (ad × term) non-square matrix.
  – B imposes symmetry on the information:

        B = [ 0    A
              Aᵀ   0 ]

[Example: a small 0/1 matrix A, its transpose Aᵀ, and the resulting symmetric B.]

How Are These Problems Currently Being Solved?
• Forcing symmetry in square matrices:
  – Suppose C represents a directed graph.
  – D imposes bidirectional information by finding the nearest symmetric matrix: D = C + Cᵀ.
• Example:

        C = [ 0 0 1      Cᵀ = [ 0 1 1      D = C + Cᵀ = [ 0 1 2
              1 0 0             0 0 1                     1 0 1
              1 1 0 ]           1 0 0 ]                   2 1 0 ]

How Are These Problems Currently Being Solved?
• Graphically adding data:

[Figure: a 3-node directed graph shown before and after an edge is added.]
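As a concrete companion to the Laplacian slides above, here is a minimal MATLAB sketch of the method. The 6-node edge list is a hypothetical toy graph, not one of the examples from the slides.

```matlab
% Minimal sketch of the Laplacian / Fiedler-vector partition described above.
% The edge list is a hypothetical toy graph (two triangles joined by one edge),
% not the 10-node example from the slides.
edges = [1 2; 1 3; 2 3; 3 4; 4 5; 4 6; 5 6];
n = 6;
A = zeros(n);
for k = 1:size(edges, 1)
    A(edges(k,1), edges(k,2)) = 1;   % undirected graph, so A is
    A(edges(k,2), edges(k,1)) = 1;   % structurally (and numerically) symmetric
end

D = diag(sum(A, 2));                 % degree matrix
L = D - A;                           % Laplacian: every row sums to zero

[V, E] = eig(L);                     % L is symmetric, so eigenpairs are real
[~, order] = sort(diag(E));          % ascending eigenvalues
v2 = V(:, order(2));                 % Fiedler vector (2nd smallest eigenvalue)

cluster1 = find(v2 < 0)              % split across zero, as on the slides;
cluster2 = find(v2 >= 0)             % recurse on each piece to partition further
```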
How Are These Problems Currently Being Solved?
• Graphically deleting data:

[Figure: the same 3-node directed graph shown before and after an edge is deleted.]

Our Wish

Use Markov chains and the subdominant right-hand eigenvector (the Ball-Rodgers vector) to cluster asymmetric matrices, i.e., directed graphs.

Where Did We Get the Idea?

Stewart, in Introduction to the Numerical Solution of Markov Chains, suggests that the subdominant right-hand eigenvector (the Ball-Rodgers vector) may indicate clustering.

The Markov Method

[Figure: the 10-node adjacency matrix A and the corresponding row-stochastic matrix P, whose nonzero entries in row i are 1/deg(i), e.g. .5, .3333, .25.]

Different Matrices and Eigenvectors
• A: connectivity matrix → 2nd largest (or 2nd smallest) eigenvector of A
• L = D − A: Laplacian matrix; rows sum to 0 → 2nd smallest eigenvector of L (the Fiedler vector)
• P: probability (Markov) matrix; rows sum to 1 → 2nd largest eigenvector of P (the Ball-Rodgers vector)
• Q = I − P: transition-rate matrix; rows sum to 0 → 2nd smallest eigenvector of Q

Graph 1 Eigenvector Value Plots

[Figure: Graph 1, a 10-node undirected graph, and four plots of sorted eigenvector values with node labels on the x-axis: Second Largest of A, Fiedler Vector, Ball-Rodgers Vector, and Second Smallest of Q.]

Graph 1 Banding Using Eigenvector Values

This reorders the matrix just by using the indices of the sorted eigenvector: no recursion.

[Figure: spy plots (nz = 30) of Banded A (Second Largest of A), Banded L (Fiedler Vector), Banded P (Second Largest of P), and Banded Q (Second Smallest of Q).]

Graph 1 Reordering Using Laplacian Method

[Figure: spy plots of A and of Reordered L from the recursive Laplacian method.]

Graph 1 Reordering Using Markov Method

[Figure: spy plots of A, Reordered P, and Reordered Q.]

Graph 2 Eigenvector Value Plots

[Figure: Graph 2, a 23-node undirected graph, and four plots of sorted eigenvector values: Second Largest of A, Fiedler Vector, Ball-Rodgers Vector, and Second Smallest of Q.]

Graph 2 Banding Using Eigenvector Values

[Figure: spy plots (nz = 71) of Banded A, Banded L, Banded P, and Banded Q, each under its eigenvector ordering.]

Nicely banded, but no apparent blocks.
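The banding step above amounts to a single sort. Here is a minimal MATLAB sketch, assuming A is the adjacency matrix of a (possibly directed) graph with no all-zero rows; taking real parts of the eigenpairs is our hedge for the directed case.

```matlab
% Sketch of banding by the Markov method described above: sort the
% subdominant right-hand eigenvector of P and permute A by those indices.
% Assumes A is an n-by-n 0/1 adjacency matrix with no all-zero rows.
P = diag(1 ./ sum(A, 2)) * A;            % row-stochastic: rows sum to 1

[V, E] = eig(P);
[~, order] = sort(real(diag(E)), 'descend');
v = real(V(:, order(2)));                % Ball-Rodgers vector (2nd largest
                                         % eigenvalue); for a directed graph the
                                         % eigenpairs may be complex, so we keep
                                         % the real parts
[~, idx] = sort(v);                      % indices of the sorted eigenvector values
spy(A(idx, idx))                         % banded matrix: one sort, no recursion
```

Q = I − P would be handled the same way, using its 2nd smallest eigenvector instead.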
Graph 2 Reordering Using Laplacian Method

[Figure: spy plots (nz = 71) of A and of Reordered L from the recursive Laplacian method.]

Graph 2 Reordering Using Markov Method

[Figure: spy plots of A, Reordered P, and Reordered Q.]

Directed Graph 1

[Figure: Directed Graph 1, a 10-node directed graph.]

Although it is directed, the Fiedler vector still works.

Directed Graph 1

v2 = [−0.5783, −0.2312, −0.0388, 0.1140, 0.1255, 0.1099, −0.1513, −0.5783, −0.4536, 0.0821]

Directed Graph 1 Reordering Using Laplacian Method

[Figure: spy plots of A and Reordered L.]

Directed Graph 1 Reordering Using Markov Method

[Figure: spy plots of A and Reordered P.]

Directed Graph 1-B

[Figure: Directed Graph 1-B, the same graph with one edge, labeled "Was bi-directional," now made one-way.]

Directed Graph 1-B

The Laplacian method no longer works on this graph. Certain edges must be bi-directional in order to make the matrix irreducible. Currently, to deal with this problem, a small number (here .01) is added to each element of the matrix.
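A minimal MATLAB sketch of that fix; the row-renormalization step, which keeps the perturbed matrix stochastic, is our assumption about how the perturbation is applied.

```matlab
% One way to realize the fix above: add a small constant (.01, per the slide)
% to every element, which makes the matrix irreducible, then row-normalize
% so it is again a valid Markov matrix. The renormalization is our assumption.
epsilon = 0.01;
n = size(A, 1);
Apert = A + epsilon * ones(n);           % every entry is now positive
P = diag(1 ./ sum(Apert, 2)) * Apert;    % row-stochastic and irreducible
```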
Directed Graph 1-B Reordering Using Markov Method

[Figure: spy plots of A and Reordered P for Directed Graph 1-B.]

Directed Graph 2 Reordering Using Markov Method

[Figure: spy plots of the known block structure (Answer), A, and Reordered P.]

Directed Graph 3 Reordering Using Markov Method

[Figure: spy plots of the Answer, A, and Reordered P for a directed graph of several hundred nodes.]

Directed Graph 4 Reordering Using Markov Method

10% antiblock elements

[Figure: spy plots of the Answer, A, and Reordered P.]

Directed Graph 5 Reordering Using Markov Method

30% antiblock elements. Only the first partition is shown.

[Figure: spy plots of the Answer, A, and Reordered P.]

Hypotheses and Implications
• Plotting the eigenvector values gives better estimates of the number of clusters.
• A number other than zero may be used to partition the eigenvector values (one such rule is sketched after this list).
• Sometimes, sorting the eigenvector values clusters the matrix without any recursive process. Recursive methods are time-consuming; the eigenvector plot takes virtually no time at all and requires very little programming or storage!
• Using the stochastic matrix P to cluster asymmetric matrices or directed graphs means non-symmetric matrices (or directed graphs) can be clustered without altering the data!
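For the second hypothesis, a small MATLAB sketch of one possible rule: cut the sorted eigenvector at its largest gap rather than at zero. The largest-gap heuristic is our illustration; the slides only observe that zero is not always the best split point.

```matlab
% Partition at a value other than zero: cut the sorted eigenvector values
% at their largest gap. v is assumed to be a Fiedler or Ball-Rodgers vector
% from one of the earlier sketches; the gap rule is illustrative only.
[vs, idx] = sort(v);
[~, cut] = max(diff(vs));                % position of the largest jump
cluster1 = idx(1:cut);
cluster2 = idx(cut+1:end);
```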
Future Work
• Experiments on large non-symmetric matrices
• Non-square matrices
• Clustering eigenvector values to avoid recursive programming
• Proofs

Questions

References

Friedberg, S., A. Insel, and L. Spence. Linear Algebra, 4th ed. Prentice-Hall, Upper Saddle River, NJ, 2003.
Gleich, David. Spectral Graph Partitioning and the Laplacian with Matlab. January 16, 2006. http://www.stanford.edu/~dgleich/demos/matlab/spectral/spectral.html
Godsil, Chris, and Gordon Royle. Algebraic Graph Theory. Springer-Verlag, New York, 2001.
Karypis, George. http://glaros.dtc.umn.edu/gkhome/node
Langville, Amy. The Linear Algebra Behind Search Engines. The Mathematical Association of America – Online, http://www.joma.org, December 2005.
Aldenderfer, Mark S., and Roger K. Blashfield. Cluster Analysis. Sage University Paper Series: Quantitative Applications in the Social Sciences, 1984.
Moler, Cleve B. Numerical Computing with MATLAB. Society for Industrial and Applied Mathematics, Philadelphia, 2004.
Roiger, Richard J., and Michael W. Geatz. Data Mining: A Tutorial-Based Primer. Addison-Wesley, 2003.
Vascellaro, Jessica E. "The Next Big Thing in Searching." Wall Street Journal, January 24, 2006.
Zhukov, Leonid. Technical Report: Spectral Clustering of Large Advertiser Datasets, Part I. April 10, 2003.
Learning MATLAB 7. The MathWorks, 2005. www.mathworks.com
www.Mathworld.com
www.en.wikipedia.org
http://www.resample.com/xlminer/help/HClst/HClst_intro.htm
http://comp9.psych.cornell.edu/Darlington/factor.htm
www-groups.dcs.st-and.ac.uk/~history/Mathematicians/Markov.html
http://leto.cs.uiuc.edu/~spiros/publications/ACMSRC.pdf
http://www.lifl.fr/~iri-bn/talks/SIG/higham.pdf
http://www.epcc.ed.ac.uk/computing/training/document_archive/meshdecompslides/MeshDecomp-70.html
http://www.cs.berkeley.edu/~demmel/cs267/lecture20.html
http://www.maths.strath.ac.uk/~aas96106/rep02_2004.pdf

Eigenvector Example

[Figure.]

Structurally Symmetric

    A = [ 0  .5  .5      pattern(A) = [ 0 * *
          1   0   0                     * 0 0
          1   0   0 ]                   * 0 0 ]

A is structurally symmetric: its pattern of nonzeros is symmetric even though its values are not.

Theory Behind the Laplacian

Minimize the edges between the clusters.

Theory Behind the Laplacian

Minimizing edges between clusters is the same as minimizing off-diagonal elements in the Laplacian matrix:

    min pᵀLp, where pi ∈ {−1, 1} for each node i.

p represents the separation of the nodes into positives and negatives. Now

    pᵀLp = pᵀ(D − A)p = pᵀDp − pᵀAp.

However, pᵀDp is the sum across the diagonal, so it is a constant, and constants do not change the outcome of optimization problems.

Theory Behind the Laplacian

Dropping the constant, minimizing pᵀLp amounts to optimizing the pᵀAp term, so we continue with min pᵀLp. This is an integer nonlinear program. It can be changed to a continuous program by using Lagrange relaxation and allowing p to take any value from −1 to 1. We rename this vector x and fix its length by requiring xᵀx = N:

    min xᵀLx − λ(xᵀx − N).

This can be rewritten as the Rayleigh quotient:

    min xᵀLx / xᵀx = λ1.

Theory Behind the Laplacian

λ1 = 0 and corresponds to the trivial eigenvector v1 = e. The Courant-Fischer Theorem seeks the next best solution by adding the extra constraint x ⊥ e. This solution is the subdominant eigenvector v2, known as the Fiedler vector.

Theory Behind the Laplacian

Our questions:
– The symmetry requirement is needed for the matrix diagonalization of D. Why is D important, since it is irrelevant to the minimization problem?
– If diagonalization is important, could the SVD be used instead?
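A quick numerical check of these claims as a MATLAB sketch; L is assumed to be any symmetric graph Laplacian, for example the one built in the first sketch.

```matlab
% Numerical check of the theory above: lambda1 = 0 with v1 = e, and no
% vector orthogonal to e gives a Rayleigh quotient below lambda2, which is
% attained by the Fiedler vector. L is any symmetric graph Laplacian.
n = size(L, 1);
e = ones(n, 1);
fprintf('norm(L*e) = %g  (zero: e is the trivial eigenvector)\n', norm(L*e));

[V, E] = eig(L);
[lams, order] = sort(diag(E));           % ascending eigenvalues
v2 = V(:, order(2));                     % Fiedler vector
rq = (v2' * L * v2) / (v2' * v2);        % Rayleigh quotient at v2
fprintf('lambda2 = %g, Rayleigh quotient at v2 = %g\n', lams(2), rq);

x = randn(n, 1);
x = x - e * (e' * x) / n;                % project a random vector against e
assert((x' * L * x) / (x' * x) >= lams(2) - 1e-10)   % Courant-Fischer bound
```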