Amino acid partitioning using a Fiedler vector model

SJ Shepherd¹, CB Beggs¹ & S Jones²

1. Medical Biophysics Group, SEDT, University of Bradford, BD7 1DP, UK.
2. Division of Biomedical Sciences, School of Life Sciences, University of Bradford, BD7 1DP, UK.

Supplementary material

SPECTRAL GRAPH THEORY

A graph with vertex set V and edge set E is, in the most general sense, an abstract structure for modelling the relationships E between the entities V. As such, it is one of the most powerful of all mathematical techniques because of its enormous generality and hence its ability to model a vast range of systems. In the two problems we are concerned with here, the graph paradigm appears ideally suited to capturing the set of chemical and physical interactions between the amino acid residues that comprise a protein. Once we have modelled the protein in question as a graph, we can bring to bear a number of extremely powerful results on the relationship between the graph, its spectrum and its connectivity properties. These ideas can be formalised using the spectrum and eigenvectors of the graph G, its adjacency matrix A, its Laplacian L and its degree (or valence) matrix Δ.

DEFINITION. Let G = (V, E, w) be a weighted undirected graph without loops on n nodes. Let A be the adjacency matrix of G. Let Δ be the diagonal matrix of dimension n where Δii equals the sum of weights of all edges incident to i. The Laplacian of G is defined as L = Δ – A.

It is easy to show that all the eigenvalues of the Laplacian of G are non-negative and that the multiplicity of the eigenvalue zero equals the number of connected components of G. For each connected component of G, the characteristic vector of that component is a corresponding eigenvector of L with eigenvalue zero. The Laplacian has many important properties we can exploit. Among these are:

– L is real and symmetric, hence its n eigenvalues are real and its eigenvectors are orthogonal;
– L is positive semi-definite, hence all its eigenvalues are non-negative;
– 1n = (1, 1, …, 1)ᵀ ∈ ℝⁿ is an eigenvector of L with associated eigenvalue 0;
– the multiplicity of the eigenvalue zero is equal to the number of connected components of G. In particular, if G is connected, then 1n is the only eigenvector associated with eigenvalue 0.

There is an intimate relationship between the combinatorial and topological characteristics of a graph and the algebraic properties of its Laplacian. The idea at the heart of spectral graph theory is that there is a direct connection between the spectrum of the Laplacian and the isoperimetric number of the graph. Denote by λ0, λ1, …, λn−1 the eigenvalues of L in ascending order. λ0 is zero since L is singular (by virtue of the fact that all its rows sum to zero). For the same reason, the eigenvector associated with λ0 is a vector of constant values. Therefore, G is connected iff λ1 > 0. λ1 will be "close" to zero if G is "almost" disconnected, that is, contains a "weak" set of links. More precisely, the relationship between λ1 and the isoperimetric number hG of the graph is

hG²/2 ≤ λ1 ≤ 2hG.

The principal eigenvector of A contains information about the steady-state ranking of the elements of the matrix. This goes back to the work of Wei [6] and is the principle underlying the powerful PageRank algorithm which is responsible for the success of the Google search engine.
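To make these properties concrete, the following short numerical sketch (our own illustration, not part of the original text; the toy graph and its weights are invented for the example) builds the Laplacian of a small weighted graph and checks the properties listed above:

```python
import numpy as np

# Toy weighted, undirected graph on 6 nodes: two triangles joined
# by a single weak edge, so we expect lambda_1 to be small.
A = np.zeros((6, 6))
edges = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0),   # first cluster
         (3, 4, 1.0), (4, 5, 1.0), (3, 5, 1.0),   # second cluster
         (2, 3, 0.1)]                              # weak bridge
for i, j, w in edges:
    A[i, j] = A[j, i] = w

Delta = np.diag(A.sum(axis=1))   # degree (valence) matrix
L = Delta - A                    # Laplacian, L = Delta - A

lam, V = np.linalg.eigh(L)       # L is symmetric: real spectrum, ascending order
print(lam.min() >= -1e-12)       # positive semi-definite: True
print(np.isclose(lam[0], 0.0))   # lambda_0 = 0: True (G is connected)
print(lam[1])                    # lambda_1 is small: the graph is "almost" cut
```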
Furthermore, a connected graph having a large subdominant eigenvalue (relative to the dominant eigenvalue) can be separated into two sets of vertices such that the two induced sub-graphs each have a high degree of internal connectivity. In many cases this is the maximum such connectivity, that is, the graph has been "cut" optimally via its "weakest" links. In essence, we wish to find an optimal cut of G, that is, a balanced bipartition (S, Sᶜ) such that the weight δ(S, Sᶜ) of the edges crossing the cut is minimised.

This can be extended to higher dimensions. Denote by {v0, v1, …, vn−1} the eigenvectors of L corresponding to the eigenvalues {λ0, λ1, …, λn−1}. Embed G in ℝᵈ using the eigenvectors {v1, v2, …, vd} as coordinate vectors, that is, vertex i is positioned at the point (v1(i), v2(i), …, vd(i)) ∈ ℝᵈ. Now find the direction s of the largest spread of the vertices in ℝᵈ, and the (d–1)-dimensional hyperplane normal to s which partitions ℝᵈ into two half-spaces with roughly the same number of vertices in each. The graph edges that straddle the hyperplane form the optimal cut of G. When d = 1, Fiedler [7] showed¹ that this cut is given by the signum of the eigenvector associated with the second smallest eigenvalue of L, named the Fiedler vector in his honour. That is, the nodes corresponding to the positive elements of the Fiedler vector form one partition, and the nodes corresponding to the negative elements form the other.

We can go still further and use successively higher-order eigenvectors of the Laplacian for graph drawing, by means of an eigen-projection or embedding of the graph into a k-dimensional constrained vector (sub)space. We are justified in drawing orthogonal axes because the eigenvectors of L are orthogonal (L being real and symmetric). The basic ideas underlying this powerful technique were first described by Hall [8] but have been almost forgotten since. Spectral graph drawing algorithms are almost absent from the literature² but have recently been revived by Koren [9] and others.

As pointed out in [9], we can use either the eigenvectors of L or the eigenvectors of A for graph drawing. For regular graphs, the two sets of eigenvectors are equal but are associated with the eigenvalues in reverse order³. However, experiments have shown that much better visual results, in particular more pleasing aspect ratios, are usually obtained by using the degree-normalised eigenvectors instead of the raw vectors. That is, instead of seeking solutions of the equation Lx = λx, we prefer to find the solutions of the generalised problem Lx = λΔx (or equivalently Ax = μΔx, with the eigenvalues in reverse order). Again, it is easy to show that these are equivalent.

¹ This is nothing more than a constrained minimisation problem in disguise. Many of the useful properties of the Laplacian stem from the fact that its associated quadratic form is the weighted sum of all the pairwise squared distances: xᵀLx = Σ(i,j)∈E wij (xi – xj)². For the one-dimensional case, we have the well-known Rayleigh quotient constraint λ1 = min over x ⊥ 1n of (xᵀLx)/(xᵀx). That x is orthogonal to the 1n vector is equivalent to x having zero mean, so some of its elements are negative and some positive. Since the eigenvector v1 of L minimises the Rayleigh quotient, the quadratic form implies that embedding the vertices on the real line according to their values in v1 minimises the resulting sum of squared edge lengths. Hence, partitioning the graph at the origin will give the most balanced, optimal cut.
² The two books on graph drawing, Di Battista et al., "Graph Drawing: Algorithms for the Visualisation of Graphs", Prentice Hall (1999), and Kaufmann et al., "Drawing Graphs: Methods and Models", Springer LNCS 2025 (2001), do not mention spectral methods at all!
³ This is easy to see because, for a regular graph, L = Δ – A = deg·I – A, and adding or subtracting a multiple of the identity matrix does not change the eigenvectors, only the order of the eigenvalues.
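For the reader who wishes to experiment, the sign-based partition described by Fiedler's theorem takes only a few lines. The following is a minimal sketch of our own (not the authors' code), assuming a symmetric weighted adjacency matrix of a connected graph:

```python
import numpy as np

def fiedler_partition(A):
    """Bipartition a graph by the sign pattern of the Fiedler vector,
    the eigenvector of the second smallest Laplacian eigenvalue.
    Assumes A is the symmetric weighted adjacency matrix of a
    connected graph with no isolated vertices."""
    L = np.diag(A.sum(axis=1)) - A        # Laplacian, L = Delta - A
    _, V = np.linalg.eigh(L)              # eigenvalues in ascending order
    fiedler = V[:, 1]                     # v1: the Fiedler vector
    pos = np.flatnonzero(fiedler >= 0)    # one side of the cut
    neg = np.flatnonzero(fiedler < 0)     # the other side
    return pos, neg

# On the two-triangle graph of the previous sketch, this separates
# vertices {0, 1, 2} from {3, 4, 5}, severing only the weak bridge edge.
```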
Using the fact that L = Δ – A and taking u, a generalised eigenvector of (L, Δ), u satisfies (Δ – A)u = λΔu or, equivalently by simple rearrangement, Au = (1 – λ)Δu. Thus A and L have the same Δ-normalised eigenvectors, although the order of the eigenvalues is again reversed. In this way, when drawing with degree-normalised eigenvectors, we can take either the low generalised eigenvectors of the Laplacian or the high generalised eigenvectors of the adjacency matrix without changing the result. However, for non-regular graphs (that is, almost every graph encountered in practice!), the two eigensystems are not related so simply, and there are distinct visual advantages in using the Laplacian rather than the adjacency matrix, especially in terms of well-scaled aspect ratios.

However, for large matrices it is problematic to solve generalised eigensystems. It is also difficult to compute eigenvectors starting from the smallest; it is much easier to compute them starting from the largest. We therefore use a simple algebraic manipulation to get around both of these problems. In essence, we invert the eigensystem of the graph in such a way as to make the largest non-generalised eigenvalues and eigenvectors of the new system equivalent to the smallest generalised (degree-normalised) eigenvalues and eigenvectors of the original. To achieve this, we multiply the equation Ax = μΔx by Δ⁻¹ to obtain Δ⁻¹Ax = μx. We call T = Δ⁻¹A the transition matrix. The (non-generalised) eigenvectors of T are the degree-normalised eigenvectors of L which give the nice drawings we desire. Finding non-generalised eigenpairs starting from the largest is easy, since we can now use simple techniques such as the power method to extract the eigenvectors of interest with minimum computational effort.

We can write any symmetric matrix M with eigenvalues (λ0, …, λn−1) and eigenvectors (e0, …, en−1) as

M = λ0·e0e0ᵀ + λ1·e1e1ᵀ + … + λn−1·en−1en−1ᵀ.

However, by our construction of T, we do not actually need to compute the leading eigenpair of T: we can simply write it down. We know that all the eigenvalues of L are non-negative (since the matrix is positive semi-definite) and that the number of zero eigenvalues is equal to the number of connected components of the graph. In our current context of amino acid residue interaction there is only one zero eigenvalue, since all the nodes in the graph are connected, that is, there is interaction between each and every residue. Therefore, the smallest eigenvalue of L is zero and is simple. In the transformation to T, we essentially inverted the problem by subtracting the degree-normalised eigensystem (L, Δ) from the identity⁴. Therefore, the largest eigenvalue of T is 1 and its associated eigenvector is simply (1/n)·1n. We immediately remove their contribution from T by computing

T ← T – (1/n)·1n1nᵀ

and then use the power method to pull out the second and subsequent eigenvectors we require for partitioning and drawing, starting the iteration each time with an initial vector orthogonal to the previous ones⁵.

⁴ Since L = Δ – A, we have A = Δ – L, and so Ax = μΔx is equivalent to (Δ – L)x = μΔx. Multiplying both sides by Δ⁻¹ gives (I – Δ⁻¹L)x = μx.
⁵ This is easily done by setting ui ← ui – (ui·uj)uj for each of the 0 ≤ j < i vectors uj already computed.
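The scheme just described, forming T, deflating the known leading eigenpair, and then running the power method with re-orthogonalisation against previously found vectors, can be sketched as follows. This is our own illustration under the stated assumptions (a connected graph with no zero-degree vertices), not the authors' implementation:

```python
import numpy as np

def leading_eigvecs_T(A, k, iters=2000):
    """Power iteration on the transition matrix T = Delta^-1 A, with the
    known leading eigenpair (eigenvalue 1, constant eigenvector) deflated
    up front, as described in the text. Returns the next k eigenvectors
    of T, i.e. the degree-normalised eigenvectors used for partitioning
    and drawing."""
    n = A.shape[0]
    T = A / A.sum(axis=1, keepdims=True)      # Delta^{-1} A, rows sum to 1
    T = T - np.ones((n, n)) / n               # T <- T - (1/n) 1 1^T
    rng = np.random.default_rng(1)
    found = []
    for _ in range(k):
        u = rng.standard_normal(n)
        for _ in range(iters):
            for v in found:                   # keep u orthogonal to the
                u -= (u @ v) * v              # eigenvectors already found
            u = T @ u
            u /= np.linalg.norm(u)
        found.append(u)
    return found
```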
Although the computational advantage is minimal for a very small matrix such as the 20 × 20 MJ matrix, the benefits of this method become very apparent when dealing with larger matrices. We can find the 20 or so largest eigenvectors of a 10,000 × 10,000 matrix in a few seconds on a desktop PC using this technique.

There is also a very powerful insight to be gained by extending classical Fourier analysis to graphs. If we have a simple graph consisting of n vertices connected in a cycle, then the adjacency matrix of this graph is the circulant n × n matrix induced by the vector {1 0 1}, where the zero coincides with the diagonal. Its Laplacian is the analogue of the second spatial derivative. As is well known in matrix theory, the eigenvectors of L are the basis functions of the Fourier transform⁶. The associated eigenvalues are the squared frequencies. The zero eigenvalue of L (or correspondingly, the unit eigenvalue of T) corresponds to the eigenvector of constant values; the projection of any n-dimensional real vector onto it is just the DC component of that vector. Analogously, the eigenvectors of the Laplacian of a graph with the topology of a 2D grid are the Fourier basis functions for 2D signals. These are simply the 2D cosine transform bases used in the JPEG image compression algorithm.

Extending the argument to 3D, the eigenvectors of L form an orthogonal basis of ℝⁿ. The associated eigenvalues may be considered frequencies, and the projections of each of the three coordinate vectors of a graph onto the basis functions constitute the spectrum of the graph. The essential observation is that geometries which are smooth relative to the graph topology should yield spectra dominated by low-frequency components. Note that there is a separate spectrum for each of the x, y and z components of the geometry, and they behave independently according to the directional geometric properties, e.g. the curvature, of the graph topology.

The topic of spectral analysis of graphs is a fascinating area in its own right, with perhaps a wider range of applications than almost any other tool in mathematics; the reader is referred to [9] and the references therein for further details. We have given sufficient background to show the motivation for our approach, and we now move on to describe how these results can be applied to the problems of partitioning the amino acid residue set and of visualising in a dramatic way the energetics governing the folding pathways of proteins.

⁶ In the case of real circulants, such as those which represent graph adjacencies, the eigenvectors are real and simplify to the basis functions of the Hartley transform.
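As a final numerical aside, the cycle-graph analogy discussed above is easy to verify directly. The following is our own check (the choice n = 16 is arbitrary), not part of the original text:

```python
import numpy as np

# For the n-cycle, the Laplacian is the circulant second-difference
# matrix. Its eigenvalues are 2 - 2cos(2*pi*k/n), which behave like
# squared frequencies for small k, and its eigenvectors are the (real)
# Fourier/Hartley basis functions.
n = 16
A = np.roll(np.eye(n), 1, axis=1) + np.roll(np.eye(n), -1, axis=1)
L = 2 * np.eye(n) - A            # second spatial difference on the cycle

lam = np.sort(np.linalg.eigvalsh(L))
expected = np.sort(2 - 2 * np.cos(2 * np.pi * np.arange(n) / n))
print(np.allclose(lam, expected))   # True
```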