Constructing Chained Molecular Dynamics Simulations of HIV-1
Protease Using the Application Hosting Environment
P. V. Coveney, S. K. Sadiq, R. S. Saksena, and S. J. Zasada
Centre for Computational Science, Department of Chemistry,
University College London, Christopher Ingold Laboratories,
20 Gordon Street, London, WC1H 0AJ
Abstract
Many crystal structures of HIV-1 protease exist, but the number of clinically interesting
drug resistant mutational patterns is far larger than the available crystal structures. Mutational protocols convert one protease sequence with available crystal structure into another
that diverges by a small number of mutations. It is important that such mutational algorithms
are followed by suitable multi-step equilibration protocols, e.g. using chained molecular dynamics simulations, to ensure that the desired mutant structure is an accurate representation.
Previously these have been difficult to perform on computational grids due to the need to keep
track of large numbers of simulations. Here we present a simple way to construct a chained
MD simulation using the Application Hosting Environment.
I Introduction
Computational grids [1, 2] provide an ideal
environment to perform compute-intensive tasks
such as molecular dynamics simulations, but
many scientists have been deterred from using
them due to the perceived difficulty of using
the grid middleware [3]. The Application Hosting Environment [4] is a lightweight, WSRF [5]
compliant middleware system designed to allow
a scientist to easily run applications on remote
grid resources. We have successfully used it to
host the NAMD [6] molecular dynamics code,
and run jobs on both the UK National Grid Service and the US TeraGrid. Here we present a
case study using the AHE to construct chained
application workflows in an investigation into
the HIV-1 protease.
II Molecular Dynamics of HIV-1 Protease
Our case study is on the use of the AHE to
manage molecular dynamics simulations of the
HIV-1 protease. The protease encoded by HIV
is responsible for the cleavage of viral polyprotein precursors and subsequent maturation of
the virus. The protease is a symmetric dimer
(each monomer has 99 amino acids) that encloses a pair of catalytic aspartic acid residues in
the active site. The active site is bound by a pair
of highly flexible flaps that allow the substrate
access to the aspartic acid dyad [9, 10].
The enzyme has been a key target for antiretroviral inhibitors and an example of structure assisted drug design [11]. Unfortunately,
therapy is limited by the emergence and proliferation of drug resistant mutations in various
enzymes of HIV [12]. HIV-1 protease also exhibits tolerance to a significant number of non-drug resistant mutations as part of its natural
variability [13]. Comparisons of resolved crystal
structures of HIV-1 protease support the stability of tertiary structure to many mutations
[14].
Although many such crystal structures of
HIV-1 protease exist, the scope and extent of
both clinically interesting drug resistant mutational patterns [15] and non-drug resistant mutations is far larger than the set of structures available from crystallographic methods. It is therefore necessary
when modeling HIV-1 protease mutants to employ mutational protocols that convert one protease sequence with available crystal structure
into another that diverges by a small number
of mutations, but which has no crystal structure. It is also important that such mutational
algorithms are followed by suitable multi-step
equilibration protocols to ensure that the desired
mutant structure is an accurate representation
of the actual structure. Whilst standard protocols exist that employ several steps including
gentle annealing to physiologically relevant temperatures, removal of force constraints on the
protease and establishing a relevant thermodynamic ensemble [10], more extensive protocols
are required to cope with the implementation of
divergent mutations from a crystal structure.
Here we present an equilibration protocol
composed of a chained sequence of molecular
dynamics simulations that implements standard
protocol requirements as well as including steps
that allow for conformational sampling and relaxation
of the incorporated mutations within
the framework of their surrounding protease
structure. Furthermore, we show how use of the
AHE both automates such a chained protocol
and facilitates deployment of such simulations
across distributed grid resources.
Eq Step   Procedure                Sim Duration (ps)   Force Constant (kcal/mol)   Constrained Atoms
eq0       minimization             2000 iterations     1                           A
eq1       annealing: 50K - 100K    10                  1                           A
eq2       annealing: 100K - 300K   20                  1                           A
eq3*      NVT**                    200                 1                           A
eq4       NVT                      50                  0.8                         A
eq5       NVT                      50                  0.6                         A
eq6       NVT                      50                  0.4                         A
eq7       NVT                      50                  0.2                         A
eq8       NVT                      50                  0.2                         B
eq9       NVT                      50                  0.2                         C
eq10      NVT                      470                 0                           -
eq11      NPT***                   1000                0
A = all non-hydrogen protease atoms
B = class ‘A’ except atoms of all amino acids within 5 Å of and including N25D mutations
C = class ‘A’ except atoms of all amino acids within 5 Å of and including I84V mutations
* This step prevents premature flap collapse [7]
** NVT ensemble temperature maintained using Langevin thermostat with coupling coefficient of 5 /ps
*** NPT ensemble maintained using Berendsen Barostat [8] at 1 bar and with pressure coupling of 0.1 ps
Table 1: Equilibration protocol for molecular dynamics simulation of HIV-1 protease incorporating
relaxation of mutated amino acid residues.
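As an illustration only, the protocol of Table 1 can be written down as plain data and the elapsed simulation time after each step accumulated from it. The sketch below is in Python rather than the Perl used in this work, and the names EqStep, PROTOCOL and cumulative_time_ps are hypothetical. Minimization is counted as zero simulated time because Table 1 gives it in iterations, and the constrained atom class of eq11 is left unset because the table does not list one.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class EqStep:
    name: str
    procedure: str
    duration_ps: Optional[float]   # None for minimization (given as 2000 iterations)
    force_constant: float          # kcal/mol, as listed in Table 1
    constrained: Optional[str]     # atom class A, B or C; None if none or not listed


# Values transcribed from Table 1.
PROTOCOL: List[EqStep] = [
    EqStep("eq0", "minimization", None, 1.0, "A"),
    EqStep("eq1", "annealing 50K-100K", 10, 1.0, "A"),
    EqStep("eq2", "annealing 100K-300K", 20, 1.0, "A"),
    EqStep("eq3", "NVT", 200, 1.0, "A"),
    EqStep("eq4", "NVT", 50, 0.8, "A"),
    EqStep("eq5", "NVT", 50, 0.6, "A"),
    EqStep("eq6", "NVT", 50, 0.4, "A"),
    EqStep("eq7", "NVT", 50, 0.2, "A"),
    EqStep("eq8", "NVT", 50, 0.2, "B"),
    EqStep("eq9", "NVT", 50, 0.2, "C"),
    EqStep("eq10", "NVT", 470, 0.0, None),
    EqStep("eq11", "NPT", 1000, 0.0, None),
]


def cumulative_time_ps(protocol: List[EqStep]) -> Dict[str, float]:
    """Return the elapsed simulation time (ps) at the end of each step."""
    elapsed = 0.0
    totals: Dict[str, float] = {}
    for step in protocol:
        elapsed += step.duration_ps or 0.0   # minimization adds no simulated time
        totals[step.name] = elapsed
    return totals
```

With these durations the force constant reaches 0.2 kcal/mol at the end of eq7 after 430 ps, all constraints are removed from 530 ps onwards, and the full chain covers 2000 ps.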
III The Application Hosting Environment
The Application Hosting Environment
(AHE) is a lightweight, WSRF [5] compliant,
web services based environment for hosting unmodified scientific applications on the grid. The
AHE is designed to allow scientists to quickly
and easily run unmodified, legacy applications
on grid resources, manage the transfer of files
to and from the grid resource and monitor the
status of the application. The philosophy of the
AHE is based on the fact that very often a group
of researchers will all want to access the same
application, but not all of them will possess the
skill or inclination to install the application on
a remote grid resource. In the AHE, an expert
user installs the application and configures the
AHE server, so that all participating users can
share the same application. For a discussion
of the architecture and implementation of the
AHE, see [4].
The AHE provides users with both GUI and
command line clients to interact with a hosted
application. In order to run an application using
the AHE command line clients, firstly the user
must issue the ahe-listapps command to find the
end point of the application factory of the application she wants to run. Next she issues the ahe-prepare command to create a new WS-Resource
to manage the state of her application instance.
Finally she issues the ahe-start command, which
will parse her application configuration file to
find any input files that need to be staged to the
remote grid resource, stage the necessary files,
and launch the application. The user can then
use the ahe-monitor command to check on the
progress of her application and, once complete,
the ahe-getoutput command to stage the output
files back to her local machine. By calling these
simple commands from a shell or Perl script the
user is able to create complex application workflows, starting one application execution using
the output files from a previous run.
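The order of client commands described above can be sketched as follows. This is a minimal Python illustration, not the script used in this work: only the five command names are taken from the text, while the argument lists (for example passing the configuration file to ahe-start) are hypothetical, and the repeated polling with ahe-monitor is reduced to a single call. The runner is injectable so the sequence can be exercised without an AHE installation.

```python
import subprocess
from typing import Callable, List, Sequence

Runner = Callable[[Sequence[str]], str]


def shell_runner(argv: Sequence[str]) -> str:
    """Run one client command and return its standard output."""
    done = subprocess.run(list(argv), check=True, capture_output=True, text=True)
    return done.stdout


def run_once(config_file: str, run: Runner = shell_runner) -> List[str]:
    """Issue the AHE client commands for a single application run, in order."""
    sequence = [
        ("ahe-listapps", []),           # find the application factory end point
        ("ahe-prepare", []),            # create a WS-Resource (real arguments not shown)
        ("ahe-start", [config_file]),   # hypothetical: config file as the only argument
        ("ahe-monitor", []),            # check progress (polled repeatedly in practice)
        ("ahe-getoutput", []),          # stage output files back to the local machine
    ]
    issued: List[str] = []
    for command, args in sequence:
        run([command, *args])
        issued.append(command)
    return issued
```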
IV Implementation
The 1TSU crystal structure was used as the
starting point for the molecular dynamics equilibration protocol. This structure contains inactive wildtype protease complexed to a substrate.
VMD [16] was used for the initial preparation of
the system prior to simulation. The coordinates
of the substrate were removed from the structure, all missing hydrogen atoms were inserted
and the structure was solvated and neutralized.
The N25D mutation was incorporated to restore
catalytic activity to the protease, and the I84V mutation was incorporated because
it is a primary drug resistant mutation for several inhibitors. The molecular dynamics package
NAMD2 [6] was used for all equilibration simulations.
Figure 1: Root-mean-squared-deviation (RMSD) of protease amino acid backbone atoms excluding
hydrogen, with respect to the initial X-ray structure (a) and of the dimeric pair of mutated amino
acid atoms excluding hydrogen, with respect to the initial X-ray structure (b).
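For reference, the RMSD quantity plotted in Figure 1 can be sketched as below. This is an assumption-laden Python illustration: it computes the plain RMSD between a frame and the reference coordinates over the same ordered set of atoms, and it performs no structural superposition, since the text does not say whether frames were aligned to the X-ray structure first. The function name rmsd is hypothetical.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsd(coords: Sequence[Coord], reference: Sequence[Coord]) -> float:
    """Root-mean-square deviation between two equally ordered atom sets."""
    if not coords or len(coords) != len(reference):
        raise ValueError("coordinate sets must be non-empty and of equal length")
    total = 0.0
    for (x, y, z), (rx, ry, rz) in zip(coords, reference):
        # squared displacement of one atom from its reference position
        total += (x - rx) ** 2 + (y - ry) ** 2 + (z - rz) ** 2
    return math.sqrt(total / len(coords))
```

Applying such a function to the non-hydrogen backbone atoms, or to the non-hydrogen atoms of the mutated residues, at each saved frame would give curves of the kind shown in panels (a) and (b).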
The equilibration protocol was adapted from
Perryman et al. [10] with several important modifications and is presented in Table 1.
NAMD configuration files corresponding to each
step of the equilibration protocol were set up in
a way such that the output files of each step
would serve as the input files of the next step.
The files were generated automatically using a
Perl script designed to set up such systems, and
a naming convention was used for the NAMD
configuration file at each step of the equilibration protocol to ease scripting of the workflow.
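A naming convention of this kind might look like the following Python sketch, which is an illustration and not the Perl script used here. The file names (eqN.conf, and output prefix eqN) are hypothetical. The NAMD keywords outputName, bincoordinates, binvelocities and extendedSystem are standard configuration parameters, but whether velocities were carried over at every step is not stated in the text, so that choice is an assumption.

```python
from typing import List


def step_name(index: int) -> str:
    """Hypothetical naming convention: eq0, eq1, ..."""
    return f"eq{index}"


def config_filename(index: int) -> str:
    """Hypothetical configuration file name for a step."""
    return f"{step_name(index)}.conf"


def chaining_lines(index: int) -> List[str]:
    """NAMD configuration lines that link a step to the one before it."""
    lines = [f"outputName {step_name(index)}"]
    if index > 0:
        prev = step_name(index - 1)
        lines += [
            f"bincoordinates {prev}.coor",   # final coordinates of the previous step
            f"binvelocities {prev}.vel",     # assumption: velocities are carried over
            f"extendedSystem {prev}.xsc",    # periodic cell of the previous step
        ]
    return lines
```

Because each step reads only files named after its predecessor, the same loop can generate every configuration file and later tell the workflow script which outputs to stage in next.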
A Perl script was created to execute the desired equilibration chain on a remote grid resource, using the AHE middleware to manage
the state of each of the steps in the chain. The
script executed the ahe-prepare command followed by the ahe-start command sequentially for
each step of the equilibration protocol. This had
the effect of preparing a WS-Resource to manage the step, staging input files necessary for the
step, and executing the application. The script
then polled the AHE server at regular intervals
using the ahe-monitor command until the simulation step had completed. Once complete, the
script staged the files back to the local machine
and used them to initiate the next step of the equilibration protocol. The script terminated after
sequentially executing all desired steps in the
chained protocol.
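The control flow of such a script can be sketched as below, in Python rather than Perl and with all names hypothetical. The three callables stand in for the AHE client commands (submit for ahe-prepare followed by ahe-start, is_complete for ahe-monitor, fetch_output for ahe-getoutput); how they invoke the clients is deliberately left out, since the command arguments are not given in the text. The polling interval is likewise an assumed value.

```python
import time
from typing import Callable, Iterable, List


def run_chain(
    steps: Iterable[str],
    submit: Callable[[str], str],
    is_complete: Callable[[str], bool],
    fetch_output: Callable[[str], None],
    poll_seconds: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Run each step in order, starting the next only after outputs are staged back."""
    completed: List[str] = []
    for step in steps:
        job = submit(step)              # prepare a WS-Resource, stage inputs, launch
        while not is_complete(job):     # poll the server at regular intervals
            sleep(poll_seconds)
        fetch_output(job)               # outputs become the inputs of the next step
        completed.append(step)
    return completed
```

Passing sleep as a parameter keeps the loop testable without waiting, and a failed step simply raises from the callable and stops the chain before later steps are submitted.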
V Results & Conclusion
Analysis of the root-mean-square-deviation (RMSD) of the backbone atoms of HIV-1 protease
was performed (Figure 1 (a)). A slow relaxation of the backbone occurs across the first
0.4 ns of equilibration due to the gradual reduction of the force constant from 1 kcal/mol
to 0.2 kcal/mol. The RMSD plateau of 0.5 Å between 0.4 ns and 0.6 ns is a signature of that
part of the equilibration protocol where the force constant is maintained at 0.2 kcal/mol
for most of the protease and stepwise relaxation of the mutated amino acid positions and
local environment is allowed. As soon as the force constant is removed there is a rapid rise
in the RMSD to approximately 1 Å away from X-ray structure as all amino acid positions along
the backbone move towards optimal conformations. The mean RMSD across the last 1 ns of
equilibration is 1.11±0.11 Å and describes equilibration of the protease at a relatively low
distance from initial X-ray structure with small fluctuations.
The RMSD of the backbone and sidechain atoms (excluding hydrogen) of the mutated amino acids
was also calculated (Figure 1 (b)). The positions of both residues change relatively little
during the period in which their force constants are set to zero. Once the whole protease
is free from constraints, residue 25 describes an abrupt change in RMSD to approximately 1 Å
whilst residue 84 changes more gradually to the same value. This may be due to the fact that
although both sets of residues are in the active site, the D25 dyad is more exposed to water
than V84 and thus more prone to moving once constraints have been lifted. The mean RMSD
during the last 1 ns of simulation is 1.02±0.15 Å and 0.92±0.10 Å for D25 and V84 residues
respectively, which is smaller than the backbone RMSD of the whole protease.
The simulation has shown that in this case, the change in RMSD of mutated amino acids
is similar to that of the protease backbone as a whole. Whilst this is indicative of a good
initial mutational protocol, such as that used in VMD, differences in the RMSD of mutated
residues during minimization and force relaxation show that an equilibration protocol that
allows conformational change of mutated amino acids assists in the achievement of
equilibration. Furthermore, as a significant degree of simulation using a multi-step protocol
is necessary to achieve equilibration, the ability to automate implementation of such a
protocol using the AHE is greatly beneficial when considering the need to do such
equilibrations for a large number of protease mutations.
We have also shown that due to the flexible nature of the AHE, a complex workflow can
be orchestrated by scripting the AHE command line clients; in this case we have conducted a
chained molecular dynamics simulation using fewer than forty lines of Perl code.
References
[1] P. V. Coveney, editor. Scientific Grid Computing. Phil. Trans. R. Soc. A, 2005.
[2] I. Foster, C. Kesselman, and S. Tuecke. The anatomy of the grid: Enabling scalable
virtual organizations. Intl J. Supercomputer Applications, 15:3–23, 2001.
[3] J. Chin and P. V. Coveney. Towards tractable toolkits for the grid: a plea for
lightweight, useable middleware. Technical report, UK e-Science Technical Report
UKeS-2004-01, 2004. http://nesc.ac.uk/technical_papers/UKeS-2004-01.pdf.
[4] P. V. Coveney, S. K. Sadiq, R. S. Saksena, M. Thyveetil, S. J. Zasada, M. Mc Keown,
and S. Pickles. A lightweight application hosting environment for grid computing.
5th UK e-Science All Hands Meeting, 2006.
[5] S. Graham, A. Karmarkar, J. Mischkinsky, I. Robinson, and I. Sedukin. Web Services
Resource Framework. Technical report, OASIS Technical Report, 2006.
http://docs.oasis-open.org/wsrf/wsrf-ws_resource-1.2-spec-os.pdf.
[6] L. Kale, R. Skeel, M. Bhandarkar, R. Brunner, A. Gursoy, N. Krawetz, J. Phillips,
A. Shinozaki, K. Varadarajan, and K. Schulten. NAMD2: Greater scalability for parallel
molecular dynamics. J. Comp. Phys., 151:283–312, 1999.
[7] K. L. Meagher and H. A. Carlson. Solvation Influences Flap Collapse in HIV-1
Protease. Proteins: Struct. Funct. Bioinf., 58:119–125, 2005.
[8] H. J. C. Berendsen, J. P. M. Postma, W. F. van Gunsteren, A. DiNola, and J. R. Haak.
Molecular dynamics with coupling to an external bath. J. Chem. Phys., 81:3684–3690, 1984.
[9] W. R. P. Scott and C. A. Schiffer. Curling of Flap Tips in HIV-1 Protease as a
Mechanism for Substrate Entry and Tolerance of Drug Resistance. Structure, 8:1259–1265, 2000.
[10] A. L. Perryman, J. Lin, and J. A. McCammon. HIV-1 protease molecular dynamics
of a wild-type and of the V82F/I84V mutant: Possible contributions to drug resistance
and a potential new target site for drugs. Protein Sci., 13:1108–1123, 2004.
[11] A. Wlodawer and J. Vondrasek. Inhibitors of HIV-1 Protease: A Major Success of
Structure-Assisted Drug Design. Annu. Rev. Biophys. Biomol. Struct., 27:249–284, 1998.
[12] V. A. Johnson, F. Brun-Vezinet, B. Clotet, B. Conway, D. R. Kuritzkes, D. Pillay,
J. Schapiro, A. Telenti, and D. Richman. Update of the Drug Resistance Mutations
in HIV-1: 2005. Int. AIDS Soc. - USA, 13:51–57, 2005.
[13] N. G. Hoffman, C. A. Schiffer, and R. Swanstrom. Covariation of amino acid
positions in HIV-1 protease. Virology, 314:536–548, 2003.
[14] V. Zoete, O. Michielin, and M. Karplus. Relation between Sequence and Structure
of HIV-1 Protease Inhibitor Complexes: A Model System for the Analysis of Protein
Flexibility. J. Mol. Biol., 315:21–52, 2002.
[15] T. D. Wu, C. A. Schiffer, M. Gonzales, J. Taylor, R. Kantor, S. Chou, D. Israelski,
A. R. Zolopa, W. J. Fessel, and R. W. Shafer. Mutation Patterns and Structural Correlates
in Human Immunodeficiency Virus Type 1 Protease following Different Protease Inhibitor
Treatments. J. Virol., 77:4836–4847, 2003.
[16] W. Humphrey, A. Dalke, and K. Schulten. VMD - Visual Molecular Dynamics. J. Mol.
Graph., 14:33–38, 1996.