Applying e-Science to Computational Chemistry

D.S. Coombes (a), B. Butchart (b), C.R.A. Catlow (a), S.A. French (a), H. Nowell (c), S.L. Price (c) and W. Emmerich (b)
(a) Davy Faraday Research Laboratory, The Royal Institution of Great Britain, 21 Albemarle Street, London, W1S 4BS, UK
(b) Dept. of Computing Science, University College London, Gower Street, London, WC1E 6BT, UK
(c) Dept. of Chemistry, University College London, 20 Gordon Street, London, WC1H 0AJ, UK
Abstract
We summarise our work on the EPSRC e-Science project ‘E-Science Technologies in the Simulation of Complex Materials’. The aim of this project has been to apply e-Science/Grid technologies to two important areas in computational chemistry, namely combinatorial catalytic chemistry and crystal structure prediction.
Introduction
The EPSRC funded e-Science project ‘E-Science Technologies in the Simulation of Complex Materials’ has set far-reaching goals both in application and development work. Emphasis has been placed both on improving the computational environment available to researchers and on achieving maximum efficiency from computational hardware, whether in-house or available within the national Grid network, which will greatly increase the resources available to all institutions. The project combines researchers specialised in applications, code development and computer science to produce the design and innovation that will assist the standardisation of practices within modern computational chemistry. We describe two important areas of computational chemistry which have greatly benefited from the tools and techniques being developed within the e-Science framework.
Applications
I Combinatorial Chemistry
Combinatorial techniques are important in many areas of chemistry and will have a major role to play in the development of new materials with specific properties. One of the most important areas is catalyst development, where modelling allows a large number of zeolites with varying constituents to be screened and optimised. One application is the optimisation of metal modified zeolite catalysts for selective oxidation catalysis (e.g. alkenes to oxides (1)). Here, the aim is to select the optimum metal/framework combination for activity and selectivity. The ‘e-science challenge’ to be addressed is to manage large distributed computational tasks effectively. Applications that require many calculations of a similar timescale are well suited to a grid based infrastructure, a prime example being Monte Carlo simulations.
We have used and developed a combined Monte Carlo and energy minimisation approach to model zeolitic materials with low and medium Si/Al ratios. Firstly, Al is inserted into a siliceous unit cell and then a charge compensating cation, such as Na, is added between two of the oxygen atoms coordinated to Al. Once the lowest energy positions of Al have been confirmed, the cation is exchanged for a proton and the structure is energy minimised again, as shown in Figure 1. This method, together with the exploitation of low specification computational resources, will allow us to construct realistic models of low and medium Si/Al zeolites. Such structures can be used for further simulations and to aid the interpretation of experimental data.
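The sampling step described above can be illustrated with a minimal sketch. It is not the code used in the project: the function names, the number of tetrahedral sites and the energy function are all hypothetical placeholders, and a real calculation would replace `placeholder_energy` with an energy minimisation. The only structural fact assumed is that a tetrahedral site has four coordinated oxygen atoms, giving six possible oxygen pairs for the cation.

```python
import itertools
import random


def placeholder_energy(t_site, oxygen_pair):
    """Hypothetical stand-in for an energy minimisation.

    Returns an arbitrary but deterministic number so the sketch can run.
    """
    a, b = oxygen_pair
    return float((7 * t_site + 3 * a + 5 * b) % 11) - 5.0


def sample_substitutions(n_t_sites, n_samples, energy_fn, seed=0):
    """Randomly sample (Al site, oxygen pair) choices and keep the lowest energy."""
    rng = random.Random(seed)
    # Four oxygens around a tetrahedral site give six possible pairs.
    oxygen_pairs = list(itertools.combinations(range(4), 2))
    best = None
    for _ in range(n_samples):
        t_site = rng.randrange(n_t_sites)   # where Al replaces Si
        pair = rng.choice(oxygen_pairs)     # where the cation sits
        energy = energy_fn(t_site, pair)
        if best is None or energy < best[0]:
            best = (energy, t_site, pair)
    return best


best_energy, best_site, best_pair = sample_substitutions(
    n_t_sites=8, n_samples=200, energy_fn=placeholder_energy
)
```

Each sample is evaluated independently of the others, which is why calculations of this kind map naturally onto a pool of loosely connected machines.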
All the calculations have been performed on the UCL Condor pool. The pool consists of approximately 1000 low specification desktop teaching PCs running Windows 2000, which act solely as clients for a Windows Terminal Server. Their processors are therefore virtually unused and can be made available for calculations, as shown by the statistics given in Table 1.
Figure 1 An example of Al substitution and proton charge compensation

Number of nodes          950
Number of simulations    150,000
Number of particles      75,000,000
Total CPU time           150 years
Table 1 Condor statistics
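To put the figures in Table 1 into perspective, the short calculation below derives some averages from them. It is only a rough illustration: it assumes that the particle count is a total over all simulations, that the simulations are of similar size, and that all nodes are fully used throughout, none of which is stated in the table.

```python
# Values taken from Table 1.
nodes = 950
simulations = 150_000
particles = 75_000_000
cpu_years = 150

HOURS_PER_YEAR = 365.25 * 24
DAYS_PER_YEAR = 365.25

# Average size of one simulation (assumes the particle count is a total).
particles_per_simulation = particles / simulations

# Average CPU time consumed by one simulation, in hours.
cpu_hours_per_simulation = cpu_years * HOURS_PER_YEAR / simulations

# Idealised elapsed time if the work were spread evenly over every node.
ideal_elapsed_days = cpu_years * DAYS_PER_YEAR / nodes

print(f"{particles_per_simulation:.0f} particles per simulation")
print(f"{cpu_hours_per_simulation:.1f} CPU hours per simulation")
print(f"{ideal_elapsed_days:.0f} days of elapsed time on {nodes} nodes")
```

Under these assumptions, 150 years of CPU time corresponds to roughly two months of elapsed time on the pool, with each simulation averaging 500 particles and a little under nine CPU hours.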
II Crystal Structure Prediction
Crystal structure prediction is of great importance in the development of pigments and dyes as well as of energetic materials. However, it is of most benefit in the pharmaceutical industry, where a drug can only be marketed in the licensed crystal form. The appearance of new crystalline forms (polymorphs) of a pharmaceutical compound can cause considerable problems during development, scale-up, production and storage. Discovery of a new polymorph is also important in the area of patent protection, where it can prolong a drug company's exclusive right to manufacture a particular drug.
Pharmaceutical molecules are generally flexible; the complexity of polymorph prediction (2) is thus increased because of the need not only to search through the huge range of possible crystal packings but also to consider the range of energetically plausible conformers (different shapes of the molecule). Our method for predicting polymorphs involves a number of programs that have traditionally been run sequentially with manual editing of input and output files. Using this manual method, a polymorph prediction study typically takes several months of work for each flexible molecule studied.
The search for possible crystal structures using crystallographic relationships is implemented in the program MOLPAK (3). This currently searches 13 space groups represented by 29 of the most common packing types. For a flexible molecule it is necessary to perform a thorough conformational analysis to produce a series of energetically plausible conformers. Up to 200 densely packed structures are found for each packing type and each is input to DMAREL (4) for lattice energy minimisation. The next stage of the process involves removing duplicate crystal structures, as many of those found in the search will represent the same minimum. The remaining structures are then sorted in terms of energy. Finally, property calculations are performed to see which structures are more likely to be observed experimentally. We can calculate properties such as elastic constants and phonon frequencies using DMAREL. Subsequent processes allow us to calculate the morphology (shape) of the hypothetical crystal structures as well as powder X-ray diffraction patterns.
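The duplicate removal and sorting stages can be sketched as follows. This is a simplified illustration rather than the procedure used in the project: the `MinimisedStructure` record, the tolerance values and the use of lattice energy and density as the only comparison criteria are assumptions made for the example, and a real comparison would examine the crystal structures themselves. The first line simply records the upper bound implied by the numbers above, 29 packing types with up to 200 structures each.

```python
from dataclasses import dataclass

# Upper bound on minimisations per conformer: 29 packing types x 200 structures.
MAX_MINIMISATIONS_PER_CONFORMER = 29 * 200


@dataclass(frozen=True)
class MinimisedStructure:
    """Hypothetical record of one lattice energy minimisation result."""
    label: str
    lattice_energy: float  # kJ/mol
    density: float         # g/cm^3


def remove_duplicates(structures, energy_tol=0.1, density_tol=0.01):
    """Return unique structures sorted by lattice energy, lowest first.

    Two structures are treated as the same minimum when both their
    energies and densities agree within the given (assumed) tolerances.
    """
    unique = []
    for s in sorted(structures, key=lambda s: s.lattice_energy):
        is_duplicate = any(
            abs(s.lattice_energy - u.lattice_energy) <= energy_tol
            and abs(s.density - u.density) <= density_tol
            for u in unique
        )
        if not is_duplicate:
            unique.append(s)
    return unique


# Illustrative values only, not results from the paper.
found = [
    MinimisedStructure("A", -95.00, 1.400),
    MinimisedStructure("B", -95.05, 1.401),  # same minimum as A
    MinimisedStructure("C", -92.30, 1.350),
    MinimisedStructure("D", -95.02, 1.300),  # similar energy, different density
]
ranked = remove_duplicates(found)
```

Sorting before comparing means that, of any group of duplicates, the lowest energy representative is the one retained.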
The ‘e-science challenge’ to be addressed here involves developing and optimising the workflow so that the MOLPAK and DMAREL codes are linked as a set of loosely coupled web services. The property prediction codes are at present run as a separate process, but could easily be integrated into the same package.
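The shape of such a workflow, a search stage that fans out into many independent minimisations whose results are then collected, can be sketched in a few lines. The sketch uses a local Python thread pool purely as a stand-in for web service calls; it is not the orchestration used in the project. The functions `search_packing_type` and `minimise`, the packing type labels and the returned energies are hypothetical placeholders and do not call MOLPAK or DMAREL.

```python
from concurrent.futures import ThreadPoolExecutor


def search_packing_type(packing_type, max_structures=3):
    """Placeholder for a packing search: returns candidate structure ids."""
    return [f"{packing_type}-{i}" for i in range(max_structures)]


def minimise(structure_id):
    """Placeholder for a lattice energy minimisation: returns (id, energy)."""
    energy = -float(sum(ord(c) for c in structure_id) % 50)
    return structure_id, energy


def run_workflow(packing_types, max_workers=4):
    """Run the searches, then the minimisations, each stage in parallel."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Stage 1: one independent search per packing type.
        batches = pool.map(search_packing_type, packing_types)
        candidates = [s for batch in batches for s in batch]
        # Stage 2: one independent minimisation per candidate structure.
        results = list(pool.map(minimise, candidates))
    # Collect the results and rank by energy, lowest first.
    return sorted(results, key=lambda r: r[1])


results = run_workflow(["AA", "AB"])
```

Because the two stages only exchange structure identifiers and results, either placeholder could be replaced by a remote call without changing the overall flow, which is the sense in which the services are loosely coupled.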
We have investigated the polymorphism of the nootropic drug piracetam, whose molecular structure is shown in Figure 2. Prior to this study there were three known polymorphs of piracetam (5-7), which we refer to as form I, form II and form III. The predicted morphologies are shown in Figure 3 and powder patterns of these three forms are shown in Figure 4.
Figure 2 The piracetam molecule. Arrows indicate flexible torsion angles
Figure 3 Left to right: predicted morphologies for piracetam forms I, II and III
Figure 4 Left to right: predicted powder patterns for piracetam forms I, II and III
We have previously reported how our preliminary calculations suggested that it is unlikely that the known polymorphs would be located during a search using a gas phase optimised molecular structure. Our ‘interactive searching’ methodology allows us to search a large area of conformational space quickly, thus increasing the chances of finding a different polymorph which could result in a more thermodynamically stable crystal structure than the known one. Firstly, we search for low energy crystal structures using a large number of rigid conformers to systematically explore which regions of conformational space could give rise to low energy hydrogen bonded crystal structures. The search is then refined using crystallographic insight to optimise particular intermolecular interactions. Currently a search on one conformation takes about one hour.
Using this method we were able to easily locate forms I, II and III. All of these crystal structures contain molecules whose conformations are very different from the gas phase optimised molecule. During the course of this work, a new experimental polymorph (form IV) was obtained via recrystallisation and data collection at high pressure (8). Six computed crystal structures from the low energy region (within 5 kJ mol⁻¹ of the global minimum) were sent to the experimental team. The lowest energy structure proved to be a good approximation to form IV.
Use of e-Science Tools
In this project we have made extensive use of various e-Science tools. A Condor pool consisting of around 1000 desktop PCs has been utilised for calculations on zeolite systems. The ‘interactive search’ system for crystal structure prediction uses a BPEL web service system to orchestrate the complex multi-program workflows as a grid application. We are also using the CCLRC dataportal and the SRB to store low energy structures and properties, and are developing a database to allow data mining of the results.
Conclusions
We have shown how existing programs and processes for both combinatorial chemistry and crystal structure prediction have been grid-enabled. Using a Condor pool has made it possible to study important processes in catalytic chemistry whose calculations could not otherwise have been carried out in a reasonable amount of time. The crystal structure prediction methodology has also benefited from being grid-enabled, in that the time and manpower required to perform a study on a particular molecule have been reduced. We are currently developing a database of known and calculated crystal structures and properties for eventual data mining, both to develop new techniques for crystal structure prediction and to increase our knowledge of it.
Acknowledgements
This work was funded by EPSRC through the e-Science project ‘E-Science Technologies in the Simulation of Complex Materials’.
References
(1) Notari, B. Microporous crystalline titanium silicates. In Advances in Catalysis; Academic Press: San Diego, 1996; Vol. 41, pp 253-334.
(2) Ouvrard, C.; Price, S. L. Cryst. Growth Des. 2004, 4, 1119-1127.
(3) Holden, J. R.; Du, Z. Y.; Ammon, H. L. J. Comput. Chem. 1993, 14, 422-437.
(4) Willock, D. J.; Price, S. L.; Leslie, M.; Catlow, C. R. A. J. Comput. Chem. 1995, 16, 628-647.
(5) Céolin, R.; Agafonov, V.; Louër, D.; Dzyabchenko, V. A.; Toscani, S.; Cense, J. M. J. Solid State Chem. 1996, 122, 186-194.
(6) Louër, D.; Louër, M.; Dzyabchenko, V. A.; Agafonov, V.; Céolin, R. Acta Crystallogr. Sect. B-Struct. Sci. 1995, 51, 182-187.
(7) Admiraal, G.; Eikelenboom, J. C.; Vos, A. Acta Crystallogr. Sect. B-Struct. Sci. 1982, 38, 2600-2605.
(8) Fabbiani, F. P. A.; Allan, D. R.; Parsons, S.; Pulham, C. R. CrystEngComm 2005, 7, 179-186.