Amino acid partitioning using a Fiedler vector model
SJ Shepherd 1, CB Beggs 1 & S Jones 2
1. Medical Biophysics Group, SEDT, University of Bradford, BD7 1DP, UK.
2. Division of Biomedical Sciences, School of Life Sciences, University of Bradford, BD7 1DP, UK.
Supplementary material
SPECTRAL GRAPH THEORY
A graph with vertex set V and edge set E is, in the most general sense, an abstract structure for
modelling the relationships E between the entities V. As such, it is one of the most powerful of all
mathematical techniques because of its enormous generality and, hence, its ability to model a
vast range of systems. In the two problems we are concerned with here, it appears that the
graph paradigm is ideally suited to capturing the set of chemical and physical interactions
between the amino acid residues that comprise a protein. Once we have modelled the
protein in question as a graph, we can bring to bear a number of extremely powerful results
on the relationship between the graph, its spectrum and its connectivity properties. These
ideas can be formalised using the spectrum and eigenvectors of the graph G, its adjacency
matrix A, its Laplacian L and its degree (or valence) matrix Δ.
DEFINITION. Let G = (V, E, w) be a weighted undirected graph without loops on n nodes.
Let A be the adjacency matrix of G. Let Δ be the diagonal matrix of dimension n where Δii
equals the sum of the weights of all edges incident to i. The Laplacian of G is defined as
L = Δ – A. It is easy to show that all the eigenvalues of the Laplacian of G are non-negative
and that the multiplicity of the eigenvalue zero equals the number of connected components
of G. For each connected component of G, the characteristic vector of that component is a
corresponding eigenvector of L with eigenvalue zero.
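To make the definition concrete, here is a minimal numpy sketch (ours, for illustration only; the graph and its weights are invented) that builds L = Δ – A for a small weighted graph and confirms the spectral properties just stated:

```python
import numpy as np

# Invented weighted undirected graph on 4 nodes with two connected
# components: nodes {0, 1, 2} form a weighted triangle, node 3 is isolated.
A = np.array([[0.0, 2.0, 1.0, 0.0],
              [2.0, 0.0, 3.0, 0.0],
              [1.0, 3.0, 0.0, 0.0],
              [0.0, 0.0, 0.0, 0.0]])

D = np.diag(A.sum(axis=1))       # degree matrix: Delta_ii = sum of incident weights
L = D - A                        # the graph Laplacian

eigvals = np.linalg.eigvalsh(L)  # L is real symmetric, so eigvalsh applies
print(eigvals)                   # all non-negative, with exactly two zero
                                 # eigenvalues, one per connected component
```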
The Laplacian has many important properties we can exploit. Among these are:

- L is real symmetric and hence its n eigenvalues are real and its eigenvectors are orthogonal;

- L is positive semi-definite and hence all its eigenvalues are non-negative;

- 1n ≡ (1, 1, …, 1)^T ∈ ℝ^n is an eigenvector of L with associated eigenvalue 0;

- The multiplicity of the eigenvalue zero is equal to the number of connected components of G. In particular, if G is connected, then 1n is the only eigenvector associated with eigenvalue 0.
There is an intimate relationship between the combinatorial and topological characteristics of
a graph and the algebraic properties of its Laplacian. The idea at the heart of spectral graph
theory is that there is a direct connection between the spectrum of the Laplacian and the
isoperimetric number of the graph.
Denote by λ0, λ1, …, λn-1 the eigenvalues of L in
ascending order. λ0 is zero since L is singular (by virtue of the fact that all its rows sum to
zero). For the same reason, the eigenvector associated with λ0 is a vector of constant
values. Therefore, G is connected iff λ1 > 0. λ1 will be “close” to zero if G is “almost”
unconnected, that is, contains a “weak” set of links.
More precisely, the relationship
between λ1 and the isoperimetric number hG of the graph is:
$$\frac{h_G^2}{2} \;\leq\; \lambda_1 \;\leq\; 2 h_G$$
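For intuition, a brute-force sketch can compare λ1 with the two bounds on a tiny graph. The code below is ours and purely illustrative; note that it computes h_G under one common definition (minimum ratio of cut weight to the size of the smaller side), which is an assumption since the text does not spell out its normalisation.

```python
import numpy as np
from itertools import combinations

# Invented example: the unweighted path graph 0-1-2-3.
A = np.zeros((4, 4))
for i in range(3):
    A[i, i + 1] = A[i + 1, i] = 1.0
L = np.diag(A.sum(axis=1)) - A
lam1 = np.sort(np.linalg.eigvalsh(L))[1]   # second smallest eigenvalue

# Brute-force isoperimetric number over all subsets S with |S| <= n/2.
n = A.shape[0]
h = min(sum(A[i, j] for i in S for j in range(n) if j not in S) / len(S)
        for k in range(1, n // 2 + 1) for S in combinations(range(n), k))
print(h**2 / 2, lam1, 2 * h)               # compare lambda_1 with the two bounds
```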
The principal eigenvector of A contains information about the steady-state ranking of the
nodes of the graph. This goes back to the work of Wei [6] and is the principle underlying
the powerful PageRank algorithm which is responsible for the success of the Google search
engine. Furthermore, a connected graph having a large subdominant eigenvalue (relative to
the dominant eigenvalue) can be separated into two sets of vertices such that the two
the dominant eigenvalue) can be separated into two sets of vertices such that the two
induced sub-graphs have a high degree of connectivity. In many cases, this is the maximum
connectivity, that is, the graph has been “cut” optimally via its “weakest” links. In essence,
we wish to find an optimal cut of G, that is, a balanced bipartition (S, Sc) of the vertex set
such that the weight δ(S, Sc) of the edges crossing the cut is minimised. This can be
extended to higher dimensions. Denote by {v0, v1, …, vn-1} the eigenvectors of L
corresponding to the eigenvalues {λ0, λ1, …, λn-1}. Embed G in ℝ^d using the eigenvectors
{v1, v2, …, vd} as coordinate vectors, that is, vertex i is positioned at the point
(v1(i), v2(i), …, vd(i)) ∈ ℝ^d. Now find the direction s of the largest spread of the vertices
in ℝ^d, and the (d–1)-dimensional hyperplane in ℝ^d normal to s which partitions ℝ^d into two
half-spaces with roughly the same number of vertices in each.
The graph edges that
straddle the hyperplane are the optimal cut of G. When d = 1, Fiedler [7] showed^1 that this
cut is given by the signum of the eigenvector associated with the second smallest eigenvalue
of L, named after him the Fiedler vector. That is, the nodes whose corresponding Fiedler
vector elements are positive belong to one partition, and the nodes whose corresponding
elements are negative belong to the other.
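In practice the d = 1 case reduces to a few lines. The sketch below is our illustration (the two-cluster graph is invented), partitioning by the signs of the Fiedler vector:

```python
import numpy as np

def fiedler_partition(A):
    """Bipartition a connected weighted graph by the signs of its Fiedler vector."""
    L = np.diag(A.sum(axis=1)) - A
    eigvals, eigvecs = np.linalg.eigh(L)   # eigenvalues in ascending order
    fiedler = eigvecs[:, 1]                # eigenvector of the second smallest eigenvalue
    return fiedler >= 0                    # boolean labels for the two partitions

# Invented 6-node graph: two triangles joined by a single weak edge.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)]:
    A[i, j] = A[j, i] = 1.0                # strong intra-cluster edges
A[2, 3] = A[3, 2] = 0.1                    # weak bridge: the optimal cut
print(fiedler_partition(A))                # {0,1,2} and {3,4,5} get opposite labels
```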
We can go still further and use successive higher-order eigenvectors of the Laplacian for
graph drawing, by means of an eigen-projection or embedding of the graph into a
k-dimensional constrained vector (sub)space. We are justified in drawing orthogonal axes
because the eigenvectors of L are, by definition, orthogonal. The basic ideas underlying this
powerful technique were first described by Hall [8] but have been almost forgotten since.
^1 This is nothing more than a constrained minimisation problem in disguise. Many of the useful properties of the Laplacian stem from the fact that its associated quadratic form is the weighted sum of all the pairwise squared distances:

$$x^T L x = \sum_{(i,j) \in E} w_{ij} (x_i - x_j)^2$$

For the one-dimensional case, we have the well-known Rayleigh quotient constraint

$$\lambda_1 = \min_{x \perp \mathbf{1}_n} \frac{x^T L x}{x^T x}.$$

That x is orthogonal to the vector 1n is equivalent to x having zero mean, so some of its elements are negative and some positive. Since the eigenvector v1 of L minimises the Rayleigh quotient, the quadratic form implies that embedding the vertices on the real line according to their values in v1 minimises the resulting sum of squared edge lengths. Hence, partitioning the graph at the origin will give the most balanced, optimal cut.
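The quadratic-form identity in footnote 1 is easy to verify numerically; the following throwaway numpy sketch (ours, using an invented random weighted graph) does so:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 8
W = np.triu(rng.random((n, n)), k=1)   # random upper-triangular weights (hypothetical graph)
A = W + W.T                            # symmetric weighted adjacency matrix
L = np.diag(A.sum(axis=1)) - A
x = rng.standard_normal(n)

quad = x @ L @ x                       # x^T L x
pairwise = sum(A[i, j] * (x[i] - x[j])**2
               for i in range(n) for j in range(i + 1, n))
print(np.isclose(quad, pairwise))      # True: the identity above
```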
Spectral graph drawing algorithms are almost absent in the literature^2 but have been revived
recently by Koren [9] and others. As pointed out in [9], we can use either the eigenvectors of
L or the eigenvectors of A for graph drawing. For regular graphs, the two sets of
eigenvectors are equal but are associated with the eigenvalues in reverse order^3. However,
experiments have shown that much better visual results, in particular, more pleasing aspect
ratios, are usually obtained by using the degree-normalised eigenvectors instead of the raw
vectors. That is, instead of seeking solutions to the equation Lx = λx, we would prefer to find
the solutions of the generalised problem Lx = λΔx (or equivalently Ax = μΔx, with the order
of the eigenvalues reversed). Again, it is easy to show that these are equivalent. Using the
fact that L = Δ – A and taking u, a generalised eigenvector of (L, Δ), u satisfies
(Δ – A)u = λΔu or, by simple rearrangement, Au = (1 – λ)Δu. Thus, A and L have the same
Δ-normalised eigenvectors,
although the order of the eigenvalues is again reversed. In this way, when drawing with
degree-normalised eigenvectors, we can take either the low generalised eigenvectors of the
Laplacian or the high generalised eigenvectors of the adjacency matrix without changing the
result.
However, for non-regular graphs (that is, almost every graph encountered in
practice!), the two eigensystems are not related so simply and there are distinct visual
advantages in using the Laplacian as opposed to the adjacency matrix, especially in terms of
well-scaled aspect ratios.
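To see the equivalence numerically, a short sketch (ours; it assumes scipy is available and uses an invented non-regular weighted graph) checks that the generalised eigenvectors of (L, Δ) are ordinary eigenvectors of T = Δ⁻¹A with eigenvalues 1 – λ:

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(2)
n = 6
W = np.triu(rng.random((n, n)), k=1)
A = W + W.T                         # non-regular weighted graph
D = np.diag(A.sum(axis=1))
L = D - A

lam, U = eigh(L, D)                 # generalised problem: L u = lambda D u
T = np.linalg.inv(D) @ A            # transition matrix

# Each generalised eigenvector u of (L, D) satisfies T u = (1 - lambda) u.
for i in range(n):
    print(np.allclose(T @ U[:, i], (1 - lam[i]) * U[:, i]))
```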
However, for large matrices it is problematic to find solutions for generalised eigensystems.
It is also difficult to compute vectors starting from the smallest – it is much easier to compute
vectors starting from the largest. We therefore use a simple algebraic manipulation to get
around both these problems. In essence, we invert the eigensystem of the graph in such a
way as to make the largest, non-generalised eigenvalues and vectors of the new system
equivalent to the smallest, generalised (degree-normalised) eigenvalues and eigenvectors of
the original. To achieve this, we multiply the equation Ax = λΔx by Δ⁻¹ to obtain Δ⁻¹Ax = λx.
We call T = Δ⁻¹A the transition matrix. The (non-generalised) eigenvectors of T are the
degree-normalised eigenvectors of L which give the nice drawings we desire. Finding
non-generalised eigenpairs starting from the largest is easy since we can now use simple
techniques such as the power method to extract the eigenvectors of interest with minimum
computational effort.
We can write any real symmetric matrix M with eigenvalues (λ0, …, λn-1) and orthonormal
eigenvectors (e0, …, en-1) as:

$$M = \sum_{i=0}^{n-1} \lambda_i e_i e_i^T = \lambda_0 e_0 e_0^T + \lambda_1 e_1 e_1^T + \cdots + \lambda_{n-1} e_{n-1} e_{n-1}^T$$
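A one-liner confirms this rank-one decomposition for any real symmetric matrix (our illustrative check, not part of the original text):

```python
import numpy as np

rng = np.random.default_rng(3)
M = rng.standard_normal((5, 5))
M = (M + M.T) / 2                    # random real symmetric test matrix
lam, E = np.linalg.eigh(M)           # columns of E are orthonormal eigenvectors

# Rebuild M as the sum of rank-one terms lambda_i * e_i e_i^T.
M_rebuilt = sum(lam[i] * np.outer(E[:, i], E[:, i]) for i in range(5))
print(np.allclose(M, M_rebuilt))     # True
```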
^2 The two books on graph drawing, Di Battista et al., “Graph Drawing: Algorithms for the Visualisation of Graphs”, Prentice Hall (1999), and Kaufmann et al., “Drawing Graphs: Methods and Models”, Springer LNCS 2025 (2001), do not mention spectral methods at all!

^3 This is easy to see because L = Δ – A = deg·I – A, and adding or subtracting a multiple of the identity matrix does not change the eigenvectors, only the order of the eigenvalues.
However, by our construction of T, we don’t actually need to compute λ0 and e0, we can just
write them down – we know that all the eigenvalues of L are non-negative (since the matrix is
positive semi-definite by definition) and that the number of zero eigenvalues is equal to the
number of disjoint clusters of the graph. In our current context of amino acid residue
interaction, there is only one zero eigenvalue since all the nodes in the graph are connected,
that is, there is interaction between each and every residue. Therefore, the smallest
eigenvalue of L must be zero. In the transformation to T, we essentially inverted the problem
by subtracting the degree-normalised eigensystem (L, Δ) from the identity^4. Therefore, the
largest eigenvalue of T is 1 and its associated eigenvector is simply (1/n) 1n. We
immediately remove their contribution from T by computing:
T  T – ((1/n) 1 1T)
and then use the power method to pull out the second and subsequent eigenvectors we
require for partitioning and drawing, starting the iteration each time with an initial vector
orthogonal to the previous ones^5. Although the computational advantage is minimal for a
very small matrix such as the 20 × 20 MJ matrix, the benefits of this method become very
apparent when dealing with larger matrices. We can find the 20 or so largest eigenvectors
of a 10,000 × 10,000 matrix in a few seconds on a desktop PC using this technique.
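The following is a minimal sketch of this scheme (our own illustrative code, not the authors’ implementation): form T = Δ⁻¹A, deflate the known trivial eigenpair, and run power iterations with re-orthogonalisation to peel off the next eigenvectors.

```python
import numpy as np

def leading_eigvecs_of_T(A, k, iters=500):
    """Power-method sketch for the leading non-trivial eigenvectors of T = D^-1 A.

    Assumes a connected weighted graph (all row sums positive), so the
    trivial eigenpair of T is eigenvalue 1 with the constant vector.
    """
    n = A.shape[0]
    T = A / A.sum(axis=1, keepdims=True)   # T = D^{-1} A, rows sum to 1
    T = T - np.ones((n, n)) / n            # T <- T - (1/n) 1 1^T (deflation)
    rng = np.random.default_rng(0)
    vecs = []
    for _ in range(k):
        u = rng.standard_normal(n)
        for _ in range(iters):
            for v in vecs:                 # re-orthogonalise against earlier vectors
                u -= (u @ v) * v
            u = T @ u
            u /= np.linalg.norm(u)
        vecs.append(u)
    return np.array(vecs)                  # one (approximate) eigenvector per row
```

On a 20 × 20 matrix this is overkill, but for the large case each eigenvector costs only a modest number of matrix-vector products, which is the source of the speed quoted above.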
There is also a very powerful insight to be gained by extending classical Fourier analysis to
graphs. If we have a simple graph consisting of n vertices connected in a cycle, then the
adjacency matrix of this graph is the circulant n × n matrix induced by the vector {1 0 1},
where the zero coincides with the diagonal. Its Laplacian is the analogue of the second
spatial derivative. As is well known in matrix theory, the eigenvectors of L are the basis
functions of the Fourier transform^6. The associated eigenvalues are the squared
frequencies. The zero eigenvalue of L (or correspondingly, the unit eigenvalue of T)
corresponds to the eigenvector of constant values; the projection of any n-dimensional real
vector onto it is just the DC component of that vector. Analogously, the eigenvectors of
the Laplacian of a graph with the topology of a 2D grid are the Fourier basis functions for 2D
signals. These are simply the 2D cosine transform bases used in the JPEG image
compression algorithm.
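This correspondence is easy to check for the cycle. The sketch below (ours, illustrative only) confirms that the Laplacian of an n-cycle has the circulant eigenvalues 2 – 2cos(2πk/n), which behave like squared frequencies for small k:

```python
import numpy as np

n = 16
# Cycle adjacency: ones on the super- and sub-diagonals, wrapping around.
A = np.roll(np.eye(n), 1, axis=1) + np.roll(np.eye(n), -1, axis=1)
L = np.diag(A.sum(axis=1)) - A           # L = 2I - A, the discrete second derivative

eigvals = np.sort(np.linalg.eigvalsh(L))
k = np.arange(n)
analytic = np.sort(2 - 2 * np.cos(2 * np.pi * k / n))
print(np.allclose(eigvals, analytic))    # True: the circulant (Fourier) spectrum
```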
Extending the argument to 3D, the eigenvectors of L form an orthogonal basis in ℝ^n. The
associated eigenvalues may be considered frequencies, and the three projections of each of
the coordinate vectors of a graph on the basis functions are the spectrum of the graph. The
essential observation is that geometries that are smooth relative to the graph topology should
yield spectra dominated by low-frequency components. Note that there is a separate
spectrum for each of the x, y and z components of the geometry and they behave
independently depending on the directional geometric properties, e.g. the curvature, of the
graph topology.

^4 Since L = Δ – A, we have A = Δ – L and so Ax = λΔx is equivalent to (Δ – L)x = λΔx. Multiplying both sides by Δ⁻¹ gives (I – Δ⁻¹L)x = λx.

^5 This is easily done by setting ui ← ui – (ui · uj) uj for the 0 ≤ j < i vectors already computed.

^6 In the case of real circulants, such as those which represent graph adjacencies, the eigenvectors are real and simplify to the basis functions of the Hartley transform.
The topic of spectral analysis of graphs is a fascinating area in its own right, with perhaps a
wider range of applications than almost any other tool in mathematics, and the reader is
referred to [9] and the references therein for further details. We have given sufficient
background to show the motivation for our approach, and we now move on to describe how
these results can be applied to the problems of partitioning the amino acid residue set and
of visualising in a dramatic way the energetics governing the folding pathways of proteins.