NORMAL MODE ANALYSIS AND GAUSSIAN NETWORK MODEL

advertisement
NORMAL MODE ANALYSIS AND GAUSSIAN NETWORK MODEL
JUN KOO PARK
Department of Mathematics
Iowa State University, Ames, Iowa, 50010, USA
Email:jkpark@iastate.edu
Abstract. The functions of large biological molecules are related with various kinds
of large-amplitude molecular motions. With some assumptions, those motions can be
analyzed by Normal Mode Analysis and Gaussian Network Model. However, despite
their contributions to wide applications, the relationship between NMA and GNM requires
further discussion. In this review, the author compares the Normal Mode Analysis and
Gaussian Network Model and addresses their common applications in structural biology.
Key Words. Normal Mode Analysis, Gaussian Network Model, protein dynamics
1. Introduction
1.1. Historical development
In 1950, Goldstein introduced the theory of normal modes in classical mechanics [4,64,79,86].
Since then, normal mode analysis has been widely used in several fields such as mechanical
engineering, wave theory and optics. It was first applied in the early 1980’s to examine
mechanical aspects of small biological systems such as bovine pancreatic trypsin inhibitor
(BPTI) [34]. Afterwards, research was carried out on larger proteins [73,74], however, the
number of residues was still less than 300. With the progress of X-ray crystallography,
formations of larger biological systems have become usable and new methods for NMA
were needed to study fluctuations of these larger systems. Mouawad and Perahia proposed
the diagonalization in a mixed basis (DIMB) method in 1993 [37,84], which performs an
iterative diagonalization based on mixing of subsets taken from two different basis sets:
low-frequency NM and Cartesian coordinates. After they introduced the DIMB method,
the studies on proteins of more than 300 residues were possible. In 2000, Tama et al. introduced the rotation translation block (RTB) method [23,53], which yields low-frequency
normal modes that are sufficient to describe the global dynamics of macromolecules such
as RNA polymerase [47], the ribosome [5] and the F1 -ATPase [51]. Furthermore, Coarsegrained elastic network model combined with diagonalization techniques such as DIMB
and RTB has reduced the computational cost on large biological machines. Nowadays,
because of advancement in computational techniques as well as processing power, normal
1
2
JUN KOO PARK
mode analysis has become as a very powerful tool to study dynamics of large biological
molecules that are critically important in our bodies.
In Figure 1.1 [91], history of NMA applied to biological molecules is described. The figure
illuestrates the progress that has been made during the past few years.
Figure 1.1
Bahar et al. introduced Gaussian network model (GNM) in 1997 [29]. The model was
based on the previous work by Tirion and Go et al. Go et al. showed that the dynamic
structure of the globular protein can be described as a linear combination of harmonic
high-frequency motions and coupled anharmonic low-frequency motions of collective variables in 1983 [60]. Tirion showed that a single-parameter potential is sufficient to reproduce the slow dynamics in good detail in 1996 [89].
In the GNM, the fluctuation mechanics of large biological molecules can be modeled as
those of elastic networks composed of nodes and springs; the nodes are the residues, and
the springs are the inter-residue forces. Residues are assumed to have a Gaussian distributed fluctuations around their mean positions. That is, the GNM method applies the
theory of elastic network and Gaussian(Normal) distribution to understand macromolecular dynamics with the assumption that the structure is near its equilibrium state, and
only considers the residue contacts such as Cα contacts. The model uses the harmonic
approximation to model inter-residue interactions. In other words, the spatial interactions
between nodes are modeled with a uniform harmonic spring.
NORMAL MODE ANALYSIS AND GAUSSIAN NETWORK MODEL
3
One of the purpose of the GNM is to understand dynamical properties of long-scale
mechanics. The model can be used to study not only small proteins such as enzymes but
also large biological molecules such as a ribosome or a viral capsid.
Especially, GNM method involves only one single parameter from the work of Tirion [89].
From a computational point of view, The GNM method only requires solving a singular
value decomposition (SVD) problem of Hessian matrix and therefore it makes the calculations computationally inexpensive than molecular dynamic simulation.
1.2. Motivation
To understand the biological functions of protein, it is essential to know the internal
motions. Especially, large amplitude motions are important to apprehend the conformational change. To study those dynamical properties of biological molecules using computational techniques several approaches can be considered. One of the method is using the
molecular dynamics simulations. Alder et al. proposed the general method of studying in
molecular dynamics in 1959 [59]. In the molecular dynamics simulations, the system can
be expressed as a function of time. The application for large-scale molecular assemblies is
still restricted to relatively short time scales(picoseconds) because of the computational
complexity of all-atom simulation methods, even though computational techniques and
processing power have been improved significantly. In other words, the utilization of standard molecular dynamics simulation is limited, as the timescales of these conformational
changes are on the order of microseconds. Moreover, due to the limitation of X-ray crystallography, it is usually the case that a high resolution structure is not available and only
low-resolution structural information is available. In that situation, molecular dynamics
simulations can not be applied. Furthermore, because of the complex nature of the energy
surface and the existence of a lot of energy minima, this method was not well-adapted.
One of the way to lengthen the time scale is to use more coarse-grained models, which
enable microsecond time scale to be reached for small proteins [78]. Because the coarsegrained models are sufficient to reproduce the low frequency normal modes on structure,
one can assume that there is no need for an atomic description of the molecule. Note that
Cα refers to the first carbon after the carbon that attaches to the functional group. It
is the backbone carbon next to the carbonyl carbon. When describing a protein (which
is a chain of amino acids), one often approximates the location of each amino acid as
the location of its Cα . Thus, in these simulations, typically the Cα atoms are considered, which considerably reduce the number of atoms necessary for simulation. However,
these calculations are still computationally expensive for large biological assemblies. To
simulate large and slow conformational rearrangements of large macromolecular assemblies, we need to employ alternative methods that do not depend on the framework of
molecular dynamics simulations. One of these methods is the Normal Mode Analysis
(NMA). In NMA, a complete analytical solution of the equations of motion can be obtained with the assumption that the potential energy of the system can be approximated
as a quadratic function around the potential energy minimum, and molecular motions
are decomposed into vibrational modes. Despite this assumption, low-frequency modes
obtained by this method appeared to describe enough the conformational changes that
are observed experimentally [37,62,28,76,90]
4
JUN KOO PARK
2. Normal Mode Analysis
2.1. Theory
In normal mode analysis, the dynamics of a molecule can be approximately represented
as a set of simple harmonic oscillators. In this sense, because the motion of a harmonic
oscillator can be analytically described, this expression is very useful [65]. From the classical mechanical model, for a simple harmonic oscillator of a mass m connected to a spring
with the spring constant k, the equation of motion is given by:
d2 x
dt2
where F represents the total force on the particle and x represents its coordinate.
F = −kx = m
Therefore, the solution of the above equation is able to describe the dynamics of the
particle. The solutions of above system can be obtained by the form of
x = αcos(ωt + φ)
where α and φ are the amplitude and the phase at time t=0, respectively and ω =
the angular frequency associated with the vibrational mode [81].
q
k
m
Recall. (Euler-Lagrange equation)
Let L be a continuously differentiable functional. Then the Euler-Lagrange equation is
given by
∂L(x, x0 , t)
d ∂L(x, x0 , t)
− [
]=0
∂x
dt
∂x0
For the general case, based on the theory of classical mechanics, we have following:
Lemma 2.1.1 Suppose that a biological molecule has n atoms. Let r(t) be the collection of the position vectors given molecule at time t such that r(t)={ri (t) : ri (t) =
(xi (t), yi (t), zi (t))T , i = 1, · · · , n} where ri is the position vector of atom i. Then the
molecular motion can be described as a collection of movements of the atoms in the molecule, as given in the following equations of motions:
M r00 = −∇Ep (r)
where M is a diagonal matrix with mii , the mass of atom i and Ep is the potential energy
[56].
Proof. Let
1
L = (r0 )T M (r0 ) − Ep (r)
2
Then
∂L
1
=
(M + M T )r0 = M r0
∂r0
2
NORMAL MODE ANALYSIS AND GAUSSIAN NETWORK MODEL
5
d ∂L
( 0 ) = M r00
dt ∂r
By the Euler-Lagrange equation,
M r00 =
And
d ∂L
∂L
( 0) =
dt ∂r
∂r
∂L
= −∇Ep (r)
∂r
Therefore,
M r00 = −∇Ep (r)
Because a large biological molecules is composed of a lot of molecules, the potential energy
of the biological molecules is not simple, and thus the equation of motion of the molecules
cannot be solved analytically. However, if one consider the motion in the surrounding of
equilibrium position, the potential energy can be approximated by a simple form. Suppose
that we have a molecule with n atoms, and let (xi , yi , zi ) be the coordinate of atom i. Then
the coordinates of atoms are given by r = (x1 , y1 , z1 , · · · , zn ). Let r0 be the equilibrium
configuration of the molecule which means that at r0 the potential energy is minimized.
Then a Taylor expansion of the potential energy function Ep (r) around a minimum r0
gives:
X ∂ 2 Ep X ∂Ep 1
0
0
0
Ep (r) = Ep (r0 ) +
0 (ri − ri ) + 2!
0 (ri − ri )(rj − rj ) + · · ·
∂r
∂r
∂r
i
i
j
r=r
r=r
i
ij
Since the r0 is a minimum of the energy function,
∂Ep 0
(r ) = 0
∂ri
In addition, without loss of generality, the potential energy can be defined in connection
with this structure as
Ep (r0 ) = 0
Finally, if one considers sufficiently small displacements, terms greater than or equal to
the second order may be neglected. Therefore, the approximate potential energy function
is given by the harmonic approximation such as following:
1 X ∂ 2 Ep Ep (r) ≈
(ri − ri0 )(rj − rj0 )
2 ij ∂ri ∂rj r=r0
or
1
Ep (r) ≈ (∆r)T [∇2 Ep (r0 )](∆r)
2
where ∆r = r − r0
Then,
1
M ∆r00 = M r00 = −∇Ep (r) = −∇[ (∆r)T [∇2 Ep (r0 )](∆r)] = −∇2 Ep (r0 )∆r.
2
From this equation of motion, we know that the protein is fluctuating as if the protein is
governed by a harmonic potential energy around the equilibrium position, and the force
6
JUN KOO PARK
field becomes approximately linear. Therefore, the equation can be solved analytically
through the singular value decomposition of the Hessian matrix, ∇2 Ep (r0 ). Suppose that
we have a protein with n atoms. Then the dimension of the Hessian matrix, H is 3n and
we are able to find 3n sets of eigenpair solutions. Among those solutions, 3n-6 normal
modes are indeed meaningful. Because the first six smallest normal modes have an eigenvalue equal to 0 and which correspond to 3 translational and 3 rotational motions of the
whole system [14].
Fact. Based on classical dynamics, each atom will have a time-averaged potential energy
of 23 kB T , where kB is Boltzmann constant and T absolute temperature, i.e.
3
Epta (ri ) = kB T
2
[86]
Fact. For some constant ω > 0 and β,
Z
1 n
1
lim
cos2 (ωt + β)dt = hcos(ωt + β), cos(ωt + β)i =
n→∞ n 0
2
Lemma 2.1.2 Suppose the singular value decomposition of the Hessian matrix, H =
∇2 Ep (r0 ) is given by U ΛU T . Then, the solution of the system of equations can be obtained by
∆ri (t) =
3n
X
Uij αj cos(ωj t + βj ), ωj2 = Λjj
j=1
where αj and βj can be determined by the given condition of the system and in particular, [52,56]
αj2 = 6kB T [Λ−1 ]jj
P
2
0
Proof. Let ∆ri (t) = 3n
j=1 Uij αj cos(ωj t + βj ) and H = −∇ Ep (r ). Then,
From the equation of motion, M ∆r00 = −∇2 Ep (r0 )∆r = H∆r
3n
X
j=1
Mij ∆rj00
=−
3n
X
Hij ∆rj
j=1
Substituting for ∆r and ∆r00 in above equation gives:
−
3n
X
j=1
Mij
3n
X
k=1
Ujk αk ωk2
cos(ωk t + βk ) = −
3n
X
j=1
Hij
3n
X
k=1
Uij αk cos(ωk t + βk )
NORMAL MODE ANALYSIS AND GAUSSIAN NETWORK MODEL
7
Simplifying
3n
3n
X
X
k=1
!
Mij Ujk
αk ωk2 cos(ωk t + βk ) =
j=1
3n
3n
X
X
k=1
!
Hij Ujk
αk cos(ωk t + βk )
j=1
Because above equation must hold for all t:
!
3n
3n
X
X
2
Mij Ujk ωk =
Hij Ujk
j=1
j=1
for each k = 1, · · · , 3n
If we rewrite the above equation in matrix notation, it gives the equations of motion as:
M U Λ = HU
Because the singular value decomposition of H is given by U ΛU T ,
M U Λ = U ΛU T U = U Λ
Thus, M = I, which means that coordinate U are mass-weighted. Therefore, the solution
is given by
3n
X
∆ri (t) =
Uij αj cos(ωj t + βj ), ωj2 = Λjj
j=1
where αk is the amplitude, ωj the angular frequency and βj , the phase of the jth normalmode of motion.
For the amplitude αk , let Qj = αj cos(ωj t + βj ) Then,
∆ri (t) =
3n
X
Uij Qj
j=1
∆r = U Q
Also,
1
1
1
Ep (r) = (∆r)T [∇2 Ep (r0 )](∆r) = (∆r)T H(∆r) = (U Q)T U ΛU T (U Q)
2
2
2
Therefore,
3n
3n
1X
1X 2 2
1 T
2
Λjj Qj =
ω Q
Ep (r) = Q ΛQ =
2
2 j=1
2 j=1 j j
Then, the time-averaged potential energy is:
1 2
1
1
3
ωj hQj , Qj i = ωj2 αj2 hcos(ωj t + βj ), cos(ωj t + βj )i = ωj2 αj2 = kB T
2
2
4
2
Therefore,
6kB T
αj2 =
= 6kB T [Λ−1 ]jj
2
ωj
8
JUN KOO PARK
Here, ωj2 = Λjj is called the jth mode of the motion. Based on the above solution, the
larger value the ωj has, the faster the corresponding mode is [73].
If the system is in the thermal equilibrium, from the proof of lemma,
kB T
hQj , Qj i = 2
ωj
Fact. If ωi 6= ωj , then
hcos(ωi t + βi ), cos(ωj t + βj )i = 0
Theorem 2.1.3 Suppose that r1 , · · · , rn is the equilibrium component positions of the
residues in a protein,
and the solution of the system of equations, for each atom i, is given
P3n
kB T
2
−1
by ∆ri (t) =
j=1 Uij αj cos(ωj t + βj ), where αj = 6 γ [Λ ]jj and the singular value
decomposition of the Hessian matrix, H = ∇2 Ep (r0 ) is given by U ΛU T . Then, the mean
square fluctuation of atom i is
h∆ri , ∆ri i = 3kB T
3n
X
Uij [Λ−1 ]jj Uij
j=1
[52]
Proof.
h∆ri , ∆ri i =
=
=
* n
X
Uij αj cos(ωj t + βj ),
j=1
n
X
n
X
+
Uij αj cos(ωj t + βj )
j=1
hUij αj cos(ωj t + βj ), Uij αj cos(ωj t + βj )i
j=1
n
X
Uij αj2 Uij hcos(ωj t + βj ), cos(ωj t + βj )i
j=1
n
1X
=
Uij αj2 Uij
2 j=1
n
=
1X
Uij 6kB T [Λ−1 ]jj Uij
2 j=1
= 3kB T
n
X
Uij [Λ−1 ]jj Uij
j=1
Therefore,P
based on the theorem above, because the residue fluctuation of a protein,
3n
∆ri (t) =
j=1 Uij αj cos(ωj t + βj ) can be expressed as a sum of simple oscillator, the
dynamics of the system can be described as a linear combination of independent normal
mode oscillators. The calculated fluctuations show the structural flexibilities by adding
NORMAL MODE ANALYSIS AND GAUSSIAN NETWORK MODEL
9
over the significant modes. They give the beneficial dynamical properties of the given
protein.
Notice that
h∆ri , ∆ri i = 3kB T
3n
X
j=1
−1
Uij [Λ ]jj Uij = kB T
3n
X
Uij2
j=1
ωj2
If the singular value, λjj is small, then the frequency ωj is also small, therefore the mode
is slow. Thus, the smaller singular value is, the slower mode it corresponds to, and also
the larger fluctuation it makes.
The mean square fluctuation has usually connection with the experimentally detected
average atomic fluctuation so called B-factor in X-ray crystallography and the order parameter in NMR. Especially, using the above relations, the B-factor, Bi (also called x-ray
crystallographic temperature factor or Debye-Waller factor) of each atom is given by
Bi =
3n
X
8π 2
8π 2 8π 2
h∆ri , ∆ri i =
kB T
Uij [Λ−1 ]jj Uij
(ri − ri0 )2 =
3
3
3
j=1
3n
X
Uij2
8π 2
kB T
=
3
ωj2
j=1
From this equation, the smaller ωj has, the larger Bi has. Thus it is obvious that the
largest contributions to the atomic displacement comes from the lowest frequency normal
modes(small ω). In addition, the eigenvector corresponding to the lowest frequency represents the most globally distributed or collective motions. For these reasons, studies on
the NMA generally focus on a few low frequency normal modes, and in real, only several
terms corresponding to the smallest singular values are used for the evaluation of the
residue fluctuation [90].
Figure 2.1 [65] illustrates the difference between the adenylate kinase and cytochrome
c. For the lowest frequency normal mode of the adenylate kinase, almost all the atoms
are moving together with a large amplitude. On the other hand, a high frequency normal
mode of the cytochrome c shows only a few atoms are experiencing a motion with a very
small amplitude. From the latter case, this normal mode is much more localized than the
one for adenylate kinase. The direction and amplitude of arrows represents the those of
motions of atoms, respectively [1].
10
JUN KOO PARK
Figure 2.1
Normal Mode Computations
When the theory of NMA is applied to the biological molecules, energy minimization
is required first to ensure that the structure is indeed at a minimum of the potential energy function. This minimization is followed by the diagonalization of the Hessian matrix,
the second dervatives of the potential energy with respect to the Cartesian coordinates.
The size of this matrix increases as the square of the number of atoms. Generally, the digonalization process has been computationally expensive. Since the 1980, a lot of studies
performed to reduce the number of degrees of freedom. The first attempt was based on the
use of dihedral angle space, ignoring the other degrees of freedom [52,60]. This approach
reduces the degrees of freedom more than ten times. But the size of Hessian matrix still
NORMAL MODE ANALYSIS AND GAUSSIAN NETWORK MODEL
11
increases as the square of the dihedral angle’s number. Elastic Network Model(ENM)
methods neglect the coupling between the retained degrees of freedom and those that
were ignored, although such a coupling may be essential to describe the conformational
change. Thus, it is important to consider other methods that take into account all the
degrees of freedom. One way is using iterative methods to diagonalize the Hessian matrix
[70,84,10]. There are three main iterative methods. One is the method based on Rayleigh
quotient, and another is perturbation method and the other is DIMB method. Mouawad
and Perahia introduced an iterative method of diagonalization in a mixed basis(DIMB)
in 1993 [84,61,1]. The method was implemented in the CHARMM [30]. Moreover, Tama
et. al. proposed the rotation translation block(RTB) methods in which set of residues
was regarded as a rigid block having six translation-rotation degrees of freedom [23]. This
approach reduces drastically the size of the Hessian matrix and thus the computational
time. In conclusion, computational techniques such as the diagonalization in a mixed
basis(DIMB) or the rotation translation block(RTB) approach enable NMA calculations
in a very short amount of time for large biological molecules.
Potential energy functions used for NMA
When NMA is applied to the biological molecules, the quality of the normal modes usually
depends on reaching a true minimum on the potential energy. Thus, for large biological
molecules, the minimizaiton procedure discussed above is not trivial. Generally the minimization process needs to be performed until it is sufficient to ensure that real frequencies
for all modes are not connected with translation or rotation of the entire molecule.
Tirion proposed a simplified model in 1996 which can be used to overcome the major
issues regarding the minimization procedure [89]. In this simplified model, the biological
molecule can be described as a three-dimensional elastic network on which the probability
distribution of atomic fluctuation is normal in an experimentally determined structure.
One of the model which satisfied those conditions is Gaussian Network Model [1]. In
this model, because the energy function is at the minimum, it is possible to eliminate
the need for minimization prior to NMA. As a result, NMA can be applied directly on
crystal coordinates of NMR structures. Hence, it is now possible to determine whether
two crystal forms of a protein, open form and closed form, are interconvertible using the
slow modes as coordinates [89]. Those procedures reduces a lot of the computational
expense and makes NMA an ideal candidate for study of conformational change pathways
between two states [6,21,40,48,68,69,72,74]. Here, the direction of low frequency normal
modes can be used to deform the structure from its original conformation to other known
conformations. Due to the nonlinearity of the conformational change, one needs to deform the structure iteratively so that it fits optimally. This concept was introduced in
the normal mode flexible fitting(NMFF) method [46,75]. Those iteration procedure may
require hundreds of NMA. Such deformations would cause the structure to move from
its minimun, and therefore a minimization process would be necessary to perform NMA
on the deformed structure. However, with elastic network, one can directly apply NMA
without additional minimization.
More importantly, the elastic network model can be used for protein models without
atomic detail. Using the coarse-grained representation of a biological molecule reduces
the size of the Hessian matrix. Thus, the model improves the computational efficiency
12
JUN KOO PARK
dramatically, which is essential for the study of large biological molecules [13,25,26].
Normal modes are properties of the shape of Biomolecules
As advanced models and computational techniques developed based on lots of studies,
the role of shape and form has became clear. From all-atom model to multi-resolution
elastic network models, those advances has shown the connection between the shape and
governing the dynamics of biological molecule. In fact, this elastic network model can
be used to express the slow dynamical properties of proteins with high degree of accord
[24]. Around high frequencies this accord tends to disagree ,however, there have been
many studies showing that collective motions found in the low frequency normal modes
effectively describe biologically relevant conformational changes. The good agreement between experimental data and the modes constructed from these methods suggests that low
frequency normal modes are a predominant property of the shape of the molecular system
[5,39,91]. In other words, the character of these low frequency normal modes is mainly
due to the properties of the shape of the biological molecules [65]. Thus, it means that
an atomic representation of the biological molecules is not needed to obtain its dynamical
properties, rather only its shape would be necessary. These observations indicates that
larger biological assemblies may have been designed in a way that they adopt a specific
form from which specific dynamics can arise [49,82]. Furthermore, those suggests that
these machines then use these motions that are the property of the shape. Hence, the
shape corresponding to dynamical properties of the structure is the key to understand
the function of biological system.
2.2 Applications
Conformational changes
Theory of normal mode can be used to investigate the dynamics of molecules of biological
object. Such area includes NMR order parameters, the study of large-scale conformational changes of proteins [6,21,40,48,68,69,72,74], B-factor refinement in crystallography
[71,77,78], and so on. Moreover, NMA has also been used to investigate DNA and RNA
dynamics [42]. Nowadays, because of advanced computational techniques, many studies
indicates the utilization of NMA to cope with problems in connection with dynamics of
large biological systems. In fact, with the help of NMA, dynamical properties of the
molecules can be obtained in a very short amount of time compared to molecular dynamics simulations [18,54]. From the discussion above, the low frequency normal modes
often correlate well with experimentally observed conformational changes [8,24]. A few
low-frequency normal modes are able to describe the most visible motions occurring in the
system and those are expected to be related with function of given biological molecules.
For these reasons, NMA has been used in many studies and the study focused only on a
few large-amplitude/low-frequency normal modes. The fact that studies on the normal
modes of a biological system are able to give insights to dynamics of large-scale conformational change of protein is quite interesting. Such studies have helped our understanding
of the functioning of large molecular biological machines.
NORMAL MODE ANALYSIS AND GAUSSIAN NETWORK MODEL
13
Furthermore, based on the theory of NMA, feasible pathways between two distinct structurally determined conformations of a biological molecule could be constructed by using a
linear combinations of the low frequency normal modes. The explanation of the pathways
for conformational changes in biomolecular systems may be helpful in the analysis of functional rearrangements. To find those pathways, models of intermediate structures between
the two known conformational states of the molucule needs to be generated. This can
be approached in various ways using NMA [93]. Figure 2.2 illustrates the conformational
change pathway of the maltose binding protein [65]. The protein starts with the open
form and a linear combination of the low frequency normal modes yield the intermediate
conformation iteratively. Finally, the protein reaches the closed structure.
Figure 2.2
The basic idea is to follow the lowest frequency nomral modes directions of the system
between the two end point states. One way is using the linear interpolation between
endpoints using normal mode directions. And the other way is using nonliner/iterative
approach [19]. Figure 2.3 shows the nonlinear conformational change between the open
and closed forms of the adenylate kinase [1]. The lowest frequency normal mode of
the open structure illustrates the preferential direction of the conformational change.
Although the conformational change pathway is nonlinear, the normal modes provide only
linear motions. Thus, the mode itself is not enough to lead the structure to the closed form.
From the figure 2.3, the direction of the conformational change varies as the structure is
changing from the initial to the final structure. Furthermore, if the molecule is moving
along the lowest normal mode direction too far, the local structure may be distorted.
Because of the those problems, performing the NMA and conformational deformations in
an iterative manner is essential to find the intermediate structure [46,75].
14
JUN KOO PARK
Figure 2.3
Interpretation of experimental data
NMA has been used to refine crystallographic B-factors obtained from X-ray crystallography since 1990s. Nowadays, due to the development of computational techniques for
NMA, new ways began to open to complement structural experimental data from experimental tools. For the most part, normal mode vectors can be used as search directions
to mimic protein dynamics to achieve a better fit to the experimental data. [46,75]
More recently, NMA was also introduced to improve the molecular replacement technique
used in X-ray crystallography. Indeed, structures of unknown large biological molecules
could be modeled using data from previously known structural fragments. However, it
might be the case that the conformation would not exactly match the known templates.
By exploring the dynamics of these templates using NMA, one can generate several conformations that can be examined as possible solutions for molecular replacements.
Compared to a full-atomic normal mode approach, one of the advantage of elastic network model is that the techniques of NMA can be applied to low resolution structural
data. Recently, new methodologies based on NMA have also appeared to use NMA in
connection with lower resolution structural information to predict the structure or the
conformational change of biological molecules. As a matter of fact, Chacon et al. showed
functionally important rearrangements of several large biological molecules with the applications of NMA to low resolution data [55]. The low resolution information can be
obtained from density maps measured by cryogenic electron microscopy (cryo-EM). In
the case of cryo-electron microscopy experiments [31], it is often the case that a low resolution structure of the biological molecule is obtained but its conformation differs from a
known X-ray structure. To interpret the low resolution data, one would like to deform the
known X-ray structure so that it would fit into the low resolution data. As low frequency
normal modes often represent functional motions, those modes can be used to deform the
structure. In other words, atomic motions corresponding to low-frequency normal modes
NORMAL MODE ANALYSIS AND GAUSSIAN NETWORK MODEL
15
are not localized but globalized. Moreover, a description of the dynamics can be applied
directly to 3D image reconstructions from EM, as it is independent of the resolution of
the underlying data. Therefore, the structural information at low resolution can be used
to construct discrete multi-resolution models using the techniques of vector quantization
[94]. This technique can be used to represent a 3D density map in real space with a finite
number of so-called landmark points.
Figure 2.4
Figure 2.4 illustrates how NMA can be applied to a low-resolution electron microscopy
map of E. coli RNA polymerase [63,66]. The figure shows that how the lowest-frequency
normal modes can be extracted from a low-resolution electron microscopy map in the
absence of a known atomic structure. The lowest frequency normal modes can describe
the essential motions. From the left figure, the EM map is represented by 1500 landmarks (green) distributed by vector quantization according to the map density. Those
landmarks were computed with vector quantization algorithm implemented in the Situs
package [66]. The elastic network is formed by close-neighbor interactions (red). From the
right, the lowest-frequency mode is obtained by NMA. The arrow indicates the direction
and relative amplitude of the displacements. Thus large conformational changes can be
elucidated directly from low-resolution structural data through the application of elastic
normal mode theory and vector quantization.
Database for the NMA
As the information of protein structure has been accumulated, the necessity to use given
data has been raised. As a result, a lot of databases were made to investigate the dynamical properties of protein more deeply and accurately and computer simulations has played
an important role as a tool for understanding functions of protein [38,41]. Although NMA
is based on the harmonic approximation, the results from NMA are less time-consuming
16
JUN KOO PARK
and reasonable than Monte Carlo simulation and molecular dynamics [32], and hence
databases of NMA has been constructed such as ProMode, ElNemo, WEBnm, NOMADRef and so on [36,50,67]. Promode is based on full-atom model [45], and it can provide
more realistic molecular motions in full-atom details [35]. Moreover, comparisons between homologous proteins can be possible [3] and search can be manipulated by a new
knowledge information [83]. ElNemo provides not only a fast and simple tool to analyze
low-frequency modes but also generate a large number of different starting templates for
molecular replacement [33]. One of the strengh of ElNemo is that there is no upper limit
to the size of the proteins that can be treated because rotation-translation block (RTB)
approximation is implemented
3. Gaussian Network Model
3.1. Theory
Let’s consider the representation of nodes(residues of protein) in elastic network of GNM.
Assume that every node is connected to its spatial neighbors with uniform spring constant. For ith and j th nodes, Let Ri0 and Rj0 denote the distance from the origin to the
ith and j th residue of protein, respectively. So Ri0 and Rj0 are equilibrium position vec0
denote the equilibrium distance vector between Ri0 and
tors, respectively. Similarly, Rij
Rj0 . Also, ∆Ri and ∆Rj means the instantaneous fluctuation vectors of each ith , j th
node. Instantaneous position vectors of these nodes are defined by Ri and Rj , and Rij is
instantaneous distance vector between ∆Ri and ∆Rj . Therefore,
Rj − Rj0 = ∆Rj , Ri − Ri0 = ∆Ri
0
Rij − Rij
= (Rj − Ri ) − (Rj0 − Ri0 ) = (∆Rj − ∆Ri ) = ∆Rij
Potential energy of the Gaussian network
In GNM method, a three dimensional protein structure is usually described as an elastic network connected by harmonic springs with a certain cutoff distance. Generally, in
the reduced model, the residue of a protein is considered as a location of Cα atoms of a
protein, and a sequence of such residues connected with strings forms an elastic network.
In the mathematical field of graph theory, the Laplacian matrix, sometimes called admittance matrix or Kirchhoff matrix, is a matrix representation of a graph. This matrix can
be used to express the connectivity of the residue of the protein. Contact matrix of Cα
atoms of a protein is constructed by using the Kirchhoff matrix Γ, defined by

if i 6= j and Rij < rc
 −1
0
if i 6= j and Rij > rc
Γij =
 PN
− j,j6=i Γij if i = j
where rc is a cutoff distance for spatial interactions, usually 7Å for proteins [80].
The energy function can be used to describe the interaction among the residues. When a
protein changes from an arbitrary position to an equilibrium state, the connected residues
NORMAL MODE ANALYSIS AND GAUSSIAN NETWORK MODEL
17
can be considered as a set of masses contacting with springs. Thus around their equilibrium positions, using the harmonic potential approximation, potential energy of the
network can be defined in terms of ∆Ri and ∆Rj , instantaneous fluctuation vectors of
each ith , j th node respectively.
From the integral of Hooke’s Law, Ep can be expressed in terms of ∆Xi , ∆Yi and ∆Zi
components of ∆Ri as follows
N
γ X
2
2
2
Ep =
Γij [(∆Xj − ∆Xi ) + (∆Yj − ∆Yi ) + (∆Zj − ∆Zi ) ]
2 i,j
for some force constant γ.
If we express the X, Y and Z components of the fluctuation vectors ∆Ri as
∆X = [∆X1 , ∆X2 , · · · , ∆XN ]T , ∆Y = [∆Y1 , ∆Y2 , · · · , ∆YN ]T and ∆Z = [∆Z1 , ∆Z2 , · · · , ∆ZN ]T ,
then above equation simplifies to
Ep =
γ
[∆X T Γ∆X + ∆Y T Γ∆Y + ∆Z T Γ∆Z]
2
[1]
Furthermore, in the field of statistical mechanics, each fluctuation vector ∆X, ∆Y and
∆Z has to follow the Maxwell-Boltzmann distribution, respectively
p(∆X) =
−Ep (∆X)
1
γ
1
exp(
)=
exp(−
∆X T Γ∆X)
N
kB T
N
2kB T
where kB is the Boltzmann constant, T is the absolute temperature and N is the normalization factor [58].
Similarly, p(∆Y ) and p(∆Z) can be expressed.
Note that because p(∆X) is the probability density function(PDF),
Z
Z
1
γ
p(∆X)dX =
exp(−
∆X T Γ∆X) = 1
N
2k
T
B
Rn
Rn
Thus,
Z
N=
exp(−
Rn
γ
∆X T Γ∆X)
2kB T
This model, which also referred to as the Gaussian Network Model [88], can be used
to understand the movement of each residue in the protein mechanically. Especially,
we are interested in calculating expectation values of a particular residue fluctuations socalled mean-square fluctuation or the correlations between the fluctuations of two different
residues. Based on the statistical physics foundations of GNM, mean-square fluctuation
for each residue h∆Ri , ∆Ri i, and cross-correlations, h∆Ri , ∆Rj i can be evaluated.
Because the mean square fluctuations for particular residue i, is the sum of component
fluctuations,
h∆Ri , ∆Ri i = h∆Xi , ∆Xi i + h∆Yi , ∆Yi i + h∆Zi , ∆Zi i
18
JUN KOO PARK
Because cross-correlations between the fluctuation of two different residues, i and j, is the
sum of component cross-correlations,
h∆Ri , ∆Rj i = h∆Xi , ∆Xj i + h∆Yi , ∆Yj i + h∆Zi , ∆Zj i
Definition 3.1.1 Suppose that X1 , · · · , Xn has a normal distribution N(µ1 , σ12 ), · · · ,
N(µn , σn2 ), respectively. Then an n-dimensional random variable X with mean vector µ
and covariance matrix Σ is said to have a nonsingular multivariate normal distribution,
if (i)Σ is positive definite, and (ii) the probability density function (PDF) of X is of the
form
1
1
T −1
fX (x1 , · · · , xn ) =
(x − µ))
1 exp(− (x − µ) Σ
n
2
2
2
(2π) |Σ|
where |Σ| is the determinant of the covariance matrix [87,9].
If we combine the nonsingular and singular cases together, we have following definition.
Definition 3.1.2 An n-dimensional random variable with mean vector µ and covariance matrix Γ is said to have a multivariate normal distribution ( in symbols Nn (µ,Σ)) if
either
X ∼ Nn (µ, Σ), Σ > 0,
or
X ∼ Nn (µ, Σ), |Σ| = 0
Under certain conditions, an n-dimensional random variable with mean vector µ and covariance matrix Γ can have a singular multivariate normal distribution [9].
From the above discussion,
p(∆X) =
1
γ
1
1
kB T −1 −1
exp(−
∆X T Γ∆X) =
exp(− (∆X T (
Γ ) ∆X))
N
2kB T
N
2
γ
By the Definition 3.1, we know that ∆X has a n-dimensional multivariate normal distribution with mean zero and the covariance matrix kBγ T Γ−1
∆X ∼ N (0,
kB T −1
Γ )
γ
Therefore, the probability density function (PDF) for ∆X around the equilibrium positions is given by
p(∆X) =
1
(2π)
N
2
1
1
( kBγ T ) 2 |Γ−1 | 2
1
kB T −1 −1
exp(− (∆X T (
Γ ) ∆X))
2
γ
Because the smallest eigenvalue of contact matrix Γ is zero, the determinant of Γ is zero.
Thus, Γ−1 does not exist. Instead, pseudo inverse can be computed by using singular-value
decomposition of Γ. Γ−1 is constructed using the N-1 non-zero eigenvalues and associated
eigenvectors.
NORMAL MODE ANALYSIS AND GAUSSIAN NETWORK MODEL
19
Recall. From calculus, for some constant a>0,
r
Z
π
2
exp(−ax )dx =
a
R
r
Z
π
x2 exp(−ax2 )dx =
4a3
R
Theorem 3.1.3 Suppose that X1 , · · · , Xn is the equilibrium component positions of
the residues in a protein, Γ is the corresponding contact matrix, and ∆X1 , · · · , ∆Xn
is the corresponding component residue fluctuations. Suppose that the singular value
decomposition of Γ is given by UΛUT where U is orthogonal matrix and Λ is diagonal
matrix with singular values. Then,
n
kB T X
h∆Xi , ∆Xi i =
Uij [Λ−1 ]jj Uij
γ j=1
[56]
Proof.
Z
h∆Xi , ∆Xi i =
(∆Xi )2 p(∆Xi )d∆Xi
R
1
−Ep (∆Xi )
(∆Xi )2 exp(
)d∆Xi
Ni
kB T
R
Z
1
−γΓii (∆Xi )2
(∆Xi )2 exp(
)d∆Xi
Ni R
2kB T
R
2
ii (∆Xi )
(∆Xi )2 exp( −γΓ2k
)d∆Xi
R
BT
R
2
ii (∆Xi )
exp( −γΓ2k
)d∆Xi
R
BT
r
r
π
c
−γΓii
,c =
3
4c
π
2kB T
kB T −1
[Γ ]ii
γ
kB T
[U Λ−1 U T ]ii
γ
n
kB T X
Uij [Λ−1 ]jj Uij
γ j=1
Z
=
=
=
=
=
=
=
[29,57,88] The above theorem shows that the mean-square fluctuation of residue can be
evaluated with only simple calculations referred to as singular-value decomposition.
Because
h∆Xi , ∆Xi i = h∆Yi , ∆Yi i = h∆Zi , ∆Zi i =
1
h∆Ri , ∆Ri i
3
20
JUN KOO PARK
the mean square fluctuation of the residue i is given by
h∆Ri , ∆Ri i =
3kB T −1
[Γ ]ii
γ
and the cross-correlations between residue i and j is given by
h∆Ri , ∆Rj i =
3kB T −1
[Γ ]ij
γ
Mode decomposition
Based on the above discussion, the cross-correlations between residue i and j is
h∆Ri , ∆Rj i =
3kB T −1
3kB T
3kB T
[Γ ]ij =
[(U ΛU T )−1 ]ij =
[U Λ−1 U T ]ij
γ
γ
γ
3kB T X 1
[ uk uTk ]ij
=
γ
λk
k
The frequency and shape of a mode is represented by its eigenvalue and eigenvector, respectively. Since the Kirchhoff matrix is positive semi-definite, the first eigenvalue, λ1 ,
is zero and the corresponding eigenvector have all its elements equal to √1N . This shows
that the network model is translation invariant. Here, the summation is performed over
all non-zero eigenvalues of Γ. It is clear that the above equations allow us to calculate
the amplitudes of fluctuations for individual residues without providing information regarding their absolute orientations or directions. However, in reality the fluctuations are
anisotropic in general. And it is important to access the directions of collective motions,
as these can be directly relevant to biological function and mechanisms. It is not indeed
possible to acquire an understanding of the mechanism of motion unless the fluctuation
vectors, in addition to their magnitudes, are elucidated. An extension of the GNM, called
the anisotropic network model (ANM) address this issue [1,2]. The main difference between the GNM and ANM is that all residue fluctuations are isotropic in the GNM, while
ANM does have directional preferences [32].
In the GNM, for the X, Y and Z directions, fluctuations of X, Y and Z are indepdent
and identically distributed each other, therefore the joint probability distribution function
(PDF) of all fluctuation can be expressed by
P (∆R) = P (∆X, ∆Y, ∆Z) = p(∆X)p(∆Y )p(∆Z) = [p(∆X)]3
where P denotes the joint PDF of X, Y and Z and p denotes the PDF.
This formula requires inverse of the Kirchhoff matrix. In the GNM, the determinant
of Kirchhoff matrix is zero, hence calculation of its inverse requires singular value decomposition. Therefore, the probability distribution of all fluctuations in GNM becomes
P (∆R) = p(∆X)p(∆Y )p(∆Z) =
1
(2π)
3N
2
3
T kB T −1 −1
exp(−
(∆X
(
Γ ) ∆X))
2
γ
( kBγ T ) |Γ−1 |
3
2
3
2
NORMAL MODE ANALYSIS AND GAUSSIAN NETWORK MODEL
21
The relationship between NMA and GNM
Theorem 3.1.4 The Gaussian Network Model is equivalent to the Normal Mode Analysis
for evaluating the mean square residue fluctuations of a protein around equilibrium state,
with the energy function defined for the residues instead of the atoms by [52,56]
γ
Ep ≈ [∆X T Γ∆X]
2
Proof. In the proof of Lemma 2.1.2,
1
Ep (r) = QT ΛQ
2
Thus, suppose that Ep (r) is defined by
γ
Ep (r) = QT ΛQ
2
Then,
6kB T −1
αj2 =
[Λ ]jj
γ
Therefore, by Theorem 2.1.3
3n
3kB T X
h∆ri , ∆ri i =
Uij [Λ−1 ]jj Uij
γ j=1
and Theorem 3.1.3,
n
kB T X
Uij [Λ−1 ]jj Uij
h∆Xi , ∆Xi i =
γ j=1
Therefore,
h∆Ri , ∆Ri i =
n
3kB T X
Uij [Λ−1 ]jj Uij
γ j=1
Therefore, for both NMA and GNM, the mean square fluctuation for residue i can be
evaluated by using the singular value decomposition. Hence the Gaussian Network Model
is equivalent to the Normal Mode Analysis for evaluating the mean square residue fluctuations of a protein around equilibrium state given potential energy.
3.2 Applications
As we know, in order to know a variety of biological functions of proteins, the knowledge
of three dimensional structures of proteins and their dynamic behaviors are indispensable. Based on the GNM, to understand the dynamic behaviors of proteins, fluctuations
of residue can be calculated around the equilibrium state. On the other hand, equilibrium fluctuations of biological molecules can be measured experimentally, too. One of
the way to measure the fluctuations is based on the X-ray crystallography. In X-ray crystallography, B-factor (or temperature factor) of each atom is a measure of mean-squared
22
JUN KOO PARK
fluctuation of the native structure. The other way to obtain the fluctuations is established
on the work of nuclear magnetic resonance (NMR). In NMR experiments, this measure
can be obtained by calculating root-mean-squared differences between different models.
In many applications and publications [17,22,27], including the original articles [29], it
has been shown that expected residue fluctuations obtained from GNM is in good agreement with the experimentally measured native state fluctuations. The relation between
B-factors, for example, and expected residue fluctuations obtained from GNM is as follows
Bi =
8π 2
8π 2 kB T −1
h∆Ri , ∆Ri i =
(Γ )ii
3
γ
There is a single parameter in the theory, the force constant γ. This parameter is a measure of the strength of inter-molecular potential energy that stabilize the native folding.
In the GNM theory, γ has been estimated for each protein by comparing above equation.
Figure 3.1
NORMAL MODE ANALYSIS AND GAUSSIAN NETWORK MODEL
23
Figure 3.1 [1](a) compares the B-factors evaluated by experimental data(open circles) and
those predicted from GNM for ribonuslease T1(RNase T1) in Protein Data Bank. (b)
shows the agreement between the mean square fluctuations predicted by the GNM and
those obtained across the 20 NMR models in PDB for reduced disulphide-bond formation
facilitator (DsbA) [92]. The correlation coefficient between GNM results and experimental data for these two example protein are 0.769 and 0.823, respectively.
Physical meanings of slow and fast modes
Recall. The cross-correlations between residue i and j is
h∆Ri , ∆Rj i =
3kB T X 1
[ uk uTk ]ij
γ
λk
k
Then, consider the correlation h∆Ri , ∆Rj ik , contributed by the kth mode
h∆Ri , ∆Rj ik =
3kB T 1
[uk ]i [uk ]j
γ λk
where [uk ]i is the ith element of uk .
Similarly, we can consider mean square fluctuation for residue i, contributed by the kth
mode
h∆Ri , ∆Ri ik =
3kB T 1
[uk ]2i
γ λk
Because U is unitary, uTk uk =1, uk is normalized. Thus, the plot of [uk ]2i for each residue
i=1,· · · ,n generalizes the normalized distribution of mean square fluctuations in the kth
mode. This plot is called the kth mode shape. Because the residue fluctuations are related
with the B factors, [uk ]i represents the mobilities of residue i in the kth mode. Due to
the factor λ1k , the slowest mode has the largest contribution. Moreover, in general, the
slowest motions have the highest degree of collectivity. In other words, many residues
are involved in the slow motion [1]. Shapes of the slowest modes reveal the mechanisms
of global motions, and the most constrained residues (minima of residue fluctuation) in
these modes, play a important role that govern the correlated movements of entire domains [17,85]. Although these motions are slow, they involve substantial conformational
change over several residues. On the other hand, the fastest modes involve the most
tightly packed and constrained residues in the molecule.
Figure 3.2 [1] compares the degree of collectivity for the slowest and fastest modes for
a protein, RNase T1. Figure (a) illustrates the broad and delocalized peaks over the
residues in the slow mode. In contrast, Figure (b) shows that the fastest modes are highly
localized with only a few peaks.
24
JUN KOO PARK
Figure 3.2
Diagonalization of the Kirchhoff matrix decomposes the normal modes of collective motions of the Gaussian network model of a biomolecule. The expected values of fluctuations
and cross-correlations are obtained from linear combinations of fluctuations along these
normal modes. The contribution of each mode is scaled with the inverse of that modes
frequency. Hence, slow (low frequency) modes contribute most to the expected fluctuations. Along the few slowest modes, motions are shown to be collective and global and
potentially relevant to functionality of the biomolecules. Fast (high frequency) modes,
on the other hand, describe uncorrelated motions not inducing notable changes in the
structure.
Database based on the GNM
To control and understand the function of protein, not only the knowledge of protein
structure but also the dynamics of protein is required because a function is a dynamic
property. In this sense, characterizing the dynamics of proteins is essential. Yang et
al. computed protein dynamics from protein data bank based on the GNM(PDB). The
singular value decomposition of contact matrix Γ is the most expensive part in GNM
calculations [7]. They also generated information at the residue level around equilibrium
position and the results are saved on iGNM, a web-based system. Static and dynamic
images of protein are accessible. Moreover, from oGNM, users are able to evaluate the
shape and dispersion of normal modes. Computations with the new engine are faster
than those using conventional normal mode analyses [11]. Based on the many studies,
NORMAL MODE ANALYSIS AND GAUSSIAN NETWORK MODEL
25
usefulness of the GNM with EN methods are demonstrated. Thus, effective calculation
tool, such as oGNM is expected to be a useful resource in their research field.
4. Concluding Remarks
In this paper, the original and historical background of Normal Mode Analysis was reviewed. Since the establishment of normal mode theory, it has been used in a variety of
fields such as B-factor refinement in crystallography, NMR order parameters, the study
of large-scale conformational changes of proteins [48,68,69,72,74].
Until the development of diagonalization in mixed basis(DIMB) method, normal mode
studies remained limited to proteins of less than 300 residues. By using DIMB method,
studying on large molecular assembly comprising 2760 residues was successful [1] contrary
to methods based on Rayleigh quotient [70] or on the perturbation approach [10]. Evolution in diagonalizaiton techniques such as DIMB or RTB with the coarse-grained elastic
network model has reduced computational expense on large biological system.
In particular, as techniques in X-ray crystallography have progressed, structures of larger
biological molecules have become available and new techniques for NMA were needed to
address dynamics of these larger molecules. A few lowest frequency normal modes are
related to the motions of structural domains [20]. Thus, NMA is useful for studying the
mechanical details of the motions if the motion has a collective character and amplitude is
large enough [23,47]. Moreover, perturbing the structure along its lowset-frequency modes
seems promising tool for molecular replacement problems [12], structure refinement [44]
and fitting atomic structures into low resolution electron desity maps [19].
Regarding theory of normal mode, mean square fluctuation was reviewed mathematically.
In other words, residue fluctuation can be expressed as a sum of simple oscillator provided
that singular value decomposition of the Hessian matrix is given.
For the larger biological molecules, classical molecular dynamics simulation has been
limited due to their complexity. Based on the normal mode analysis approach, analysis
for conformational rearrangement and low resolution experimental data of protein can
be possible. Furthermore, a key to understand the dynamics of protein is the shape
dependent on the structure.
The other method to understand dynamics of protein is Gaussian Network Model based
on the theory of elastic network and Gaussian distribution combined with the assumption
that the structure is near its equilibrium state [1]. Based on the statistical physics foundations, mean-square fluctuation of residue can be evaluated with only simple calculations
referred to as singular-value decomposition.
More importantly, the Gaussian Network Model is equivalent to the Normal Mode Analysis for evaluating the mean square residue fluctuations of a protein around equilibrium
state combined with same energy function. Therefore, the GNM can be used in several
areas that the NMA applied such as B-factor refinement in crystallography, NMR order
parameters and so on [17,22,27,29].
Furthermore, GNM can be used to understand the relationship between the degree of
collectivity and the modes. Shapes of slow modes reveal the dynamics of global motions
and fastest modes involve the most constrained residues [1].
26
JUN KOO PARK
References
[1] Cui, Q., Bahar, I. (2006). Normal Mode Analysis: Theory and Application to biological and chemical
systems. Chapman & Hall/CRC
[2] Eran Eyal, Lee-Wei Yang and Ivet Bahar (2006). Anisotropic Network Model: Systematic Evaluation
and a New Web Interface
[3] Wako, H., Otsuka, M., Tomizawa, Y., Kato, M., Endo, S. (2005). Comparisons between Dynamic
Properties of Homologous Protein Structures in ProMode.
[4] (2003). Classical Normal Mode Analysis:
Harmonic Approximation. (lecturenote)
http://www.colby.edu/chemistry/PChem/notes/NormalModesText.pdf
[5] Tama, F., Valle, M., Frank, J., Brooks C.L. (2003). Dynamic reorganization of the functionally
active ribosome explored by normal mode analysis and cryo-electron microscopy. PNAS, 100(16),
9319-9323
[6] Valadie, H., Lacapc?re, J.J., Sanejouand, Y.-H., Etchebest, C. (2003). Dynamical Properties of the
MscL of Escherichia coli: A Normal Mode Analysis, Journal of Molecular Biology, 332, 657-674
[7] Yang,L.W., Liu,X., Jursa,C.J., Holliman,M., Rader,A.J., Karimi,H.A. and Bahar,I. (2005). iGNM:
a database of protein functional motions based on Gaussian Network Model. Bioinformatics, 21,
2978-2987.
[8] Van Wynsberghe, A. W., Cui, Q. (2006). Interpreting Correlated Motions Using Normal Mode
Analysis. Structure, 14, 1647-1653
[9] Tong, Y. (1990). The Multivariate Normal Distribution. Srpinger-Verlag
[10] Durand, P., Trinquier, G., Sanejouand, Y. (1994). A new approach for determining low frequency
normal modes in macromolecules, Biopolymers, 34, 759
[11] Yang, L.W., Rader, A. J., Liu, X., Jursa, C.J., Chen, S.C., Karimi, H.A., Bahar, I. (2006). oGNM:
online computation of structural dynamics using the Gaussian Network Model. Nucleic Acids Research, 34, W24-W31
[12] Suhre,K. and Sanejouand,Y.-H. (2004) On the potential of normal-mode analysis for solving difficult
molecular-replacement problems. Acta Crystallogr., D60, 796-799.
[13] Kondrashov, D.A., Cui, C., Phillips, G.N. (2006). Optimization and Evaluation of a Coarse-Grained
Model of Protein Motion Using X-Ray Crystal Data. Biophysical Journal, 91, 2760-2767
[14] Schlick, T.(2003). Molecular Modeling and Simulation, An Interdisciplinary Guide, Springer
[15] Eman, B. (2006). The Gaussian Network Model: Precise Prediction of Residue Fluctuations and
Application to Binding Problems. Biophysical Journal, 91, 3589-3599
[16] Li, G., and Q. Cui. (2002). A coarse-grained normal mode approach for macromolecules: an efficient
implementation and application to Ca21-ATPase. Biophys. J. 83:2457-2474.
[17] Bahar, I., Atilgan, A., Demirel, C., Erman, B. (1998). Vibrational Dynamics of Folded Proteins:
Significance of Slow and Fast Motions in Relation to Function and Stability. Physical Review Letters,
80(12), 2733-2736.
[18] Micheletti, C., Carloni, P., Maritan, A. (2004). Accurate and efficient description of protein vibrational dynamics: comparing molecular dynamics and Gaussian models. Proteins, 55, 635-645
[19] Tama, F., Miyashita, O., Brooks III, C. (2004). Flexible multi-scale fitting of atomic structures into
low-resolution electron density maps with elastic network normal mode analysis. J. Mol. Biol, 337,
985-999
[20] Hinsen, K. (1998). Analysis of Domain Motions by Approximate Normal Mode Calculations, Proteins, 33:417-429
[21] Hinsen, K., Thomas, A, Field, M.J. (1999). Analysis of Domain Motions in Large Proteins, Proteins,
34:369-382
[22] Atilgan, A. R. , Durell, S. R., Jernigan, R. L., Demirel, M. C., Keskin, O. and Bahar I. (2001).
Anisotropy of Fluctuation Dynamics of Proteins with an Elastic Network Model, Biophysical Journal,
80:505-515
[23] Tama, F., F. X. Gadea, O. Marques, and Y. Sanejouand. (2000). Building block approach for determining low-frequency normal modes of macromolecules. Prot. Struct. Funct. Gen. 41:1-7.
[24] Petrone, P., Pande, V.S. (2006). Can Conformational Change Be Described by Only a Few Normal
Modes? Biophysical Journal, 90, 1583-1593
NORMAL MODE ANALYSIS AND GAUSSIAN NETWORK MODEL
27
[25] Bahar,I. and Rader,A.J. (2005) Coarse-grained normal mode analysis in structural biology. Curr.
Op. Struct. Biol., 15, 586-592.
[26] Eom, K., Na, S. (2008). Coarse-Grained Structural Model of Protein Molecules. Preprint to the
chapter of the book ”Computational Biology: New Research”
[27] Keskin, O., Jernigan, R., Bahar, I. (2000). Proteins with Similar Architecture Exhibit Similar LargeScale Dynamic Behavior. Biophysical Journal, 78, 20932106
[28] Frauenfelder, H., Parak, F., Young, R. (1988). Conformational substates in proteins. Ann. Rev.
Biophys. Chem, 17, 451-479
[29] Bahar, I., Atilgan, A.R., Erman, B. (1997). Direct evaluation of thermal fluctuations in proteins
using a single-parameter harmonic potential, Folding & Design, 2, 173-181
[30] Brooks, B., Bruccoleri, R. Olafson, B., States, D., Swaminathan, S., Karplus, M. (1983). CHARMM:
a program for macromolecular energy, minimization, and dynamics calculations. J. Comp. Chem, 4,
187
[31] Tama, F., Valle, M., Frank, J., Brooks, C.L. (2003). Dynamic reorganization of the functionally
active ribosome explored by normal mode analysis and cryo-electron microscopy. PNAS, 100(16),
9319-9323
[32] Doruker, P., A. R. Atilgan, and I. Bahar. (2000). Dynamics of proteins predicted by molecular
dynamics simulations and analytical approaches: application to a-Amylase inhibitor. Prot. Struct.
Funct. Gen. 40:512-524.
[33] Suhre,K. and Sanejouand,Y.H. (2004) ElNemo: a normal mode web server for protein movement
analysis and the generation of templates for molecular replacement. Nucleic Acids Res., 32, W610W614.
[34] Brooks BR and Karplus M (1983) Harmonic dynamics of proteins: normal mode and fluctuations
in bovine pancreatic trypsin inhibitor. Proceedings of the National Academy of Sciences of the USA
80: 6571-6575.
[35] Wako, H., Kato, M., Endo, S. (2003). Improvements in ProMode (a Database of Normal Mode
Analyses of Proteins). Genome Informatics, 14, 663-664
[36] Lindahl, E., Azuara, C., Koehl, P., Delarue, M. (2006). NOMAD-Ref: visualization, deformation
and refinement of macromolecular structures based on all-atom normal mode analysis. Nucleic Acids
Research, 34, W52-W56
[37] Perahia, D., Mouawad, L. (1995). Computation of low-frequency normal modes in macromolecules:
improvements to the method of diagonalization in a mixed basis and application to hemoglobin.
Comput. Chem, 19, 241
[38] Krebs,W.G., Alexandrov,V., Wilson,C.A., Echols,N., Yu,H. and Gerstein,M. (2002) Normal mode
analysis of macromolecular motions in a database framework: developing mode concentration as a
useful classifying statistic. Proteins: Struct. Funct. and Genet., 48, 682-695.
[39] Lu, M., Ma, J. (2005). The role of shape in determining molecular motions. Biophys. J., 89, 2395-2401
[40] Tama F. (2003). Normal mode analysis with simplified models to investigate the global dynamics of
biological systems. Protein and Peptide Letters, 10(2), 119-132
[41] Alexandrov,V., Lehnert,U., Echols,N., Milburn,D., Engelman,D. and Gerstein,M. (2005) Normal
modes for predicting protein motions: a comprehensive database assessment and associated Web
tool. Prot.Sci., 14, 633-643.
[42] Van Wynsberghe, A., Li, G., Cui, Q. (2004). Normal-Mode Analysis Suggests Protein Flexibility
Modulation throughout RNA Polymerase’s Functional Cycle. Biochemistry, 43, 13083-13096
[43] Harary, F. (1971) Graph theory. Addison-Wesley.
[44] Delarue,M. and Dumas,P. (2004) On the use of low-frequency normal modes to enforce collective
movements in refining macromolecular structural models. Proc. Natl. Acad. Sci., 101, 6957-6962.
[45] Wako, H., Endo, S. (2002). ProMode: A Database of Normal Mode Analysis of Proteins. Genome
Informatics, 13, 519-520
[46] Gorba, C., Miyashita, O., Tama, F. (2008). Normal mode flexible fitting of high resolution structure
of biological molecules toward one dimensional low resolution data. Biophysical Journal, 94, 15891599
[47] Delarue,M. and Sanejouand,Y.-H. (2002) Simplified normal mode analysis of conformational transitions in DNA-dependent polymerases: the elastic network model. J. Mol. Biol., 320, 1011-1024.
28
JUN KOO PARK
[48] Thomas, A., K. Hinsen, M. J. Field, and D. Perahia. (1999). Tertiary and quaternary conformational
changes in aspartate transcarbamylase: a normal mode study. Prot. Struct. Funct. Gen. 34:96-112.
[49] Ma JP (2005) Usefulness and limitations of normal mode analysis in modeling dynamics of biomolecular complexes. Structure 13(3): 373-380.
[50] Hollup,S.M., Salensminde,G. and Reuter,N. (2005) WEBnm@: a web application for normal mode
analysis of proteins. BMC Bioinformatics, 6, 52.
[51] Cui, Q., Li, G., Ma, J., Karplus, M. (2004). A Normal Mode Analysis of Structural Plasticity in the
Biomolecular Motor F1-ATPase. Journal of Molecular Biology, 340, 345-372
[52] Levitt, M., Sander, C., Stern, P.S. (1985). Protein Normal-mode Dynamics: Trypsin Inhibitor,
Crambin, Ribonuclease and Lysozyme. Journal of Molecular Biology, 181, 423-447
[53] Li, G., Cui, Q. (2004). Analysis of Functional Motions in Brownian Molecular Machines with an
Efficient Block Normal Mode Approach: Myosin-II and Ca21-ATPase. Biophysical Journal, 86, 743763
[54] Yu, X., Park, J., Leitner, D. (2003). Thermodynamics of Protein Hydration Computed by Molecular
Dynamics and Normal Modes. Journal of physical chemistry, 107, 12820-12828
[55] Chacon, P., Tama, F., Wriggers, W. (2003). Mega-Dalton biomolecular motion captured from electron microscopy reconstructions. J. Mol. Biol, 326, 485-492
[56] Wu, Z. (2006). Linear Algebra in Biomolecular Modeling. Handbook of Linear Algebra, Chapman/Hall CRC Press
[57] Kloczkowaski, A., Mark, J., Erman, B. (1989). Chain Dimensions and Fluctuations in Random
Elastomeric Networks. 1. Phantom Gaussian Networks in the Undeformed State. Macromolecules,
22, 1423-1432
[58] Huang, Kerson (1990). Statistical Mechanics. Wiley, John & Sons, Inc. ISBN 0-471-81518-7.
[59] Alder, B. J.; Wainwright, T. E. (1959). Studies in Molecular Dynamics. I. General Method. Journal
of Chem. Phys. 31(2), 459
[60] Go, N., T. Noguti, and T. Nishkawa. (1983). Dynamics of a small globular protein in terms of
low-frequency vibrational modes. Proc. Natl. Acad. Sci. USA. 80:3696-3700.
[61] Perahia, D., Mouawad, L. (1995). Computation of low-frequency normal in macromolecules: improvements to method of diagonalization in a mixed and application to hemoglobin. Computers
Chem., 19(3), 241-246
[62] Marques, O., and Y. Sanejouand. (1995). Hinge-bending motion in citrate synthase arising from
normal mode calculations. Prot. Struct. Funct. Gen. 23:557-560.
[63] Darst, S., Opalka, N., Chacon, P., Polyakov, A., Richter, C., Zhang, G. (2002). Conformational
flexibility of bacterial RNA polymerase. PNAS. 99(7), 4296-4301
[64] Hayward S. (2001). Normal mode analysis of biological molecules. In Computational biochemistry
and biophysics. New-York, Marcel Dekker, Inc., 153-168.
[65] Miyashita, O., Tama, F. (2007).Normal Mode Analysis Techniques in Structural Biology. Encyclopedia of life Sciences, John Wiley & Sons.
[66] http://emotion.biomachina.org/about.html
[67] Delarue M, Lindahl E. (2004). Normal mode calculation and visualization using Pymol.
[http://lorentz.immstr.pasteur.fr/nma/].
[68] Reuter N, Hinsen K, Lacapere JJ. (2003). Transconformations of the SERCA1 Ca-ATPase: a normal
mode study. Biophys J, 85:2186-2197.
[69] Brink J, Ludtke SJ, Kong YF,Wakil SJ, Ma JP, Chiu W. (2004). Experimental verification of conformational variation of human fatty acid synthase as predicted by normal mode analysis. Structure
12:185-91
[70] Brooks, B., and Karplus, M. (1985). Normal Modes for Specific Motions of Macromolecules: Application to the Hinge-Bending Mode of Lysozyme. Proc. Natl. Acad. Sci. USA 82, 4995-4999.
[71] Kidera, A., Inaka, K., Matsushima, M., Go, N. (1991). Normal mode refinement: Crystallographic
refinement of protein dynamic structure applied to human lysozyme. Biopolymers, 32(4), 315-319.
[72] Ma, J., and Karplus, M. (1997). Ligand-induced conformational changes in ras p21: a normal mode
and energy minimization analysis. J. Mol. Biol. 274, 114-131.
[73] Wu, Z. Lecture notes on computational structural biology.
NORMAL MODE ANALYSIS AND GAUSSIAN NETWORK MODEL
29
[74] Seno, Y., and Go, N. (1990b). Deoxymyoglobin studied by the conformational normal mode analysis.
I. Dynamics of globin and the heme-globin interaction. J. Mol. Biol. 216, 95-109.
[75] Tama, F., Miyashita, O., Brooks, C.L. (2004). Normal mode based flexible fitting of high-resolution
structure into low-resolution experimental data from cryo-EM. Journal of Structural Biology, 147,
315-326
[76] Thomas, A., Field, M.J., and Perahia, D. (1996b). Analysis of the Low-frequency Normal Modes of
the R State of Aspartate Transcarbamylase and a Comparison with the T State Modes. J. Mol. Biol.
261, 490-506.
[77] Tirion, M.M., ben-Avraham, D., Lorenz, M., and Holmes, K.C. (1995). Normal modes as refinement
parameters for the F-actin model. BiAcad. ophys. J. 68, 5-12.
[78] Tozinni, V., Mccammon JA. (2005). A coarse grained model for the dynamics of ap opening in HIV-1
protease. Chem. Phys. Lett. 413,123-128.
[79] Moore, W. J.(1972). Physical Chemistry, 4th Ed., Prentice-Hall: Englewood Cliffs, NJ, , Chapter
17, Sec. 14., 775-776.
[80] Godsil,C. (2001). Algebraic graph theory. New York : Springer ISBN:0387952209
[81] Serway, R., Jewett, J. (2003). Physics for Scientists and Engineers. Brooks/Cole. ISBN 0-534-408427.
[82] Ma, J. (2004). New Advances in Normal Mode Analysis of Supermolecular Complexes and pplications
to Structural Refinement. Current Protein and Peptide Science, 5(2), 119-123
[83] Wako, H., Otsuka, M., Tomizawa, Y., Kato, M., Endo, S. (2005). Reflection of Knowledge Information in ProMode.
[84] Mouawad, L., Perahia, D. (1993). Diagonalization in a mixed basis: a method to compute lowfrequency normal modes for large macromolecules. Biopolymers, 33, 599-611.
[85] Bahar, I., Jernigan, R. L. (1998). Vibrational dynamics of transfer RNAs: comparison of the free
and synthetase bound forms. J. Mol. Biol. 281, 871-884
[86] Goldstein, H. (2002). Classical mechanics. Addison Wesley
[87] Casella, G., Berger, R. (2002). Statistical inference. Duxbury.
[88] Haliloglu, T., Bahar, I., Erman, B. (1997). Gaussian Dynamics of Folded Proteins. Physical Review
Letters, 79(16), 3090-3093.
[89] Tirion, M. (1996). Large Amplitude Elastic Motions in Proteins from a Single-Parameter, Atomic
Analysis. Physical Review Letters, 77(9), 1905-1908.
[90] Tama, F., Sanejouand, Y. (2001). Conformational change of proteins arising from normal mode
calculations. Protein Engineering, 14(1), 1-6
[91] Tama, F., Brooks, C. (2006). Symmetry, Form, and Shape: Guiding Principles for Robustness in
macromolecular Machines. Annu. Rev. Biophys. Biomo. Struct, 35: 115-133
[92] Wilson, K., Malcolm, B., Matthews, B. (1992). Structural and thermodynamic analysis of compensating mutations within the core of chicken egg white lysozyme. J. Biol. Chem, 267, 10842
[93] Xu, C., Tobi, D., Bahar, I. (2003). Allosteric Changes in Protein Structure Computed by a Simple
Mechanical Model: Hemoglobin T ↔ R2 Transition, J. Mol. Biol, 333, 153-168
[94] Tama, F., Wriggers, W., Brooks III, C.(2002). Exploring Global Distortions of Biological Macromolecules and Assemblies from Low-resolution Structural Information and Elastic Network Theory.
J. Mol. Biol, 321, 297305
Download