e-Science Technologies in the Simulation of Complex Materials

L. Blanshard, R. Tyer, K. Kleese
S. A. French, D. S. Coombes, C. R. A. Catlow
eMaterials
B. Butchart, W. Emmerich – CS
H. Nowell, S. L. Price – Chem
[Figure: molecular structure diagram]
Polymorphism: prediction of polymorphs. A drug substance may exist as two or more crystalline phases in which the molecules are packed differently.
Combinatorial Computational Catalysis: explore which sites are involved in catalysis, which is used in diverse industries including petroleum, chemical, polymers, agrochemicals, and environmental.
Acid Sites in Zeolites
Polymorph Prediction
Different crystal structures of a molecule are called polymorphs.
Polymorphs may have considerably different properties (e.g. bioavailability, solubility, morphology).
Polymorph prediction is of great importance to the pharmaceutical industry, where the discovery of a new polymorph during production or storage of a drug may be disastrous.
Drug molecules are often flexible, and this makes the polymorph prediction process more challenging…
Polymorph Prediction Workflow
1. For flexible molecules: conformational optimisation, giving n feasible rigid molecular probes representing energetically plausible conformers (n = number of conformers).
2. MOLPAK: generation of ~6000 densely packed crystal structures using a rigid molecular probe (repeated n times).
3. DMAREL: lattice energy optimisation. Data: unit cell volume, density, lattice energy.
4. Morphology: restricted number of structures selected.
5. Crystal structures and properties stored in database.
Blind Test 2004
The Challenge: predict the crystal structure of 2-methyl-4,5-dinitro-phenyl-acetamide.
[Figure: molecular structure, with flexibility indicated with arrows]
Potential energy surface scan about the CCNC torsion angle: a wide range of conformers lies within the plausible energy range.
8 conformers chosen and used in subsequent searches.
[Figure: energy difference (kJ mol⁻¹, axis 0 to 40) against CCNC torsion angle (°, axis 0 to 400)]
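Choosing conformers from a torsion scan can be sketched as below. This is a minimal illustration, assuming the scan is an evenly spaced, periodic list of (angle, relative energy) points; the function name `select_conformers`, the scan values and the 10 kJ mol⁻¹ cutoff are hypothetical and are not taken from the blind test.

```python
from typing import List, Tuple

def select_conformers(scan: List[Tuple[float, float]],
                      cutoff: float) -> List[Tuple[float, float]]:
    """Return local minima of a periodic torsion scan within `cutoff` of the lowest point.

    `scan` holds (torsion angle in degrees, relative energy in kJ/mol) points
    on an evenly spaced grid covering one full rotation.
    """
    n = len(scan)
    e_min = min(energy for _, energy in scan)
    picked = []
    for i, (angle, energy) in enumerate(scan):
        prev_e = scan[(i - 1) % n][1]  # periodic neighbours
        next_e = scan[(i + 1) % n][1]
        is_minimum = energy < prev_e and energy <= next_e
        if is_minimum and energy - e_min <= cutoff:
            picked.append((angle, energy))
    return picked

# Made-up scan values, for illustration only.
scan = [(0, 0.0), (30, 4.0), (60, 9.0), (90, 6.0), (120, 12.0), (150, 20.0),
        (180, 35.0), (210, 22.0), (240, 11.0), (270, 5.5), (300, 8.0), (330, 3.0)]
conformers = select_conformers(scan, cutoff=10.0)
print(conformers)
```

Each selected conformer would then serve as one rigid probe in the crystal structure search. The cutoff is a judgment call: a tight value keeps the search cheap but may exclude conformers that only become favourable once packed in a lattice.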
Blind Test 2004
Minima in the Lattice Energy for Different Conformations
[Figure: lattice energy + intramolecular energy (kJ mol⁻¹, axis -30 to -130) against volume / Z (Å³ molecule⁻¹, axis 250 to 410) for conformers a, b, c, 10, 20, -10, -20 and -5]
[Figure: enlarged view of the same plot, volume / Z 260 to 290 Å³ molecule⁻¹ and energy -116 to -126 kJ mol⁻¹, with the best 10 kJ mol⁻¹ marked]
Necessary to consider properties of best crystal structures, such as
growth rates, to decide which are more likely to be observed
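Selecting the "best 10 kJ mol⁻¹" of minima for further study can be sketched as follows. This is a minimal example with made-up numbers; the dictionary keys, the function name `within_window` and the values are assumptions for illustration, not data from the blind test.

```python
from typing import Dict, List

def within_window(structures: List[Dict], window: float = 10.0) -> List[Dict]:
    """Keep structures whose total energy lies within `window` kJ/mol of the lowest."""
    e_best = min(s["energy"] for s in structures)
    kept = [s for s in structures if s["energy"] - e_best <= window]
    return sorted(kept, key=lambda s: s["energy"])

# Made-up minima; "energy" = lattice energy + intramolecular energy in kJ/mol.
minima = [
    {"conformer": "a", "volume_per_z": 272.1, "energy": -125.3},
    {"conformer": "b", "volume_per_z": 268.4, "energy": -121.0},
    {"conformer": "10", "volume_per_z": 281.7, "energy": -114.9},
    {"conformer": "-5", "volume_per_z": 275.0, "energy": -118.2},
]
best = within_window(minima)
print([s["conformer"] for s in best])
```

The structures that survive this filter are the ones for which further properties, such as growth rates, would be worth calculating.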
Results
The observed crystal structure (revealed upon completion of the blind test) contains a
higher energy conformer than those considered!
When just the observed conformer is used as the rigid probe
in the search, the observed structure is found as the global
minimum in lattice energy.
[Figure: predicted and observed crystal structures]
Summary
High energy gas phase conformers may be stabilised by
packing within a lattice in the solid state
As many conformers as possible need to be considered to
maximise the chance of predicting crystal structures correctly
and exploring the range of structures that are energetically
feasible as polymorphs
A fast, distributed e-Science application is being developed to
enable routine crystal structure prediction for large numbers of
conformers. This is essential to develop computational methods
of predicting possible polymorphs of pharmaceutical molecules
Predicting Morphologies
The shape, or morphology, of a crystal plays an important
role in the manufacturing process as there are considerable
problems if the morphology changes due to impurities or
changes of solvent or when the process is scaled up for high
volume manufacture.
An understanding of the factors influencing crystal
morphology will help us to understand how the
crystallisation process can be controlled through, for
example, the use of solvents or additives.
• BFDH Theory – based on geometrical factors
• AE Model – based on energetic factors
Scheme for Morphology Calculations
1. Minimised structure (from DMAREL minimised structure).
2. BFDH calculation in GDIS.
3. Choose faces to study (~15-20).
4. For each face calculate AE: calculate valid shifts, converge regions (exclude polar).
5. Draw morphology for each crystal's set of faces (Wulff plot).
6. Calculate relative volume growth rates (new property).
Morphologies
The calculated morphology can be visualised using a Wulff plot, in which
the ratio of the surface normal distances of all planes from the centre of the
crystal is determined by either the interplanar spacings, the attachment
energies or the surface energies.
[Figure: observed and predicted morphology of form 1 of paracetamol, with its molecular structure]
Growth Volume
New property ‘growth volume’: obtained by numerical integration
to find the volume within the Wulff shape, it gives an indication of
whether one face dominates.
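The numerical integration behind a growth volume can be illustrated with a simple midpoint grid. The slides do not say which integration scheme was used, so this is only a sketch under stated assumptions: the Wulff shape is taken as the intersection of half-spaces, one per face, each defined by a unit normal and a centre-to-face distance, and the caller supplies a bounding cube that encloses the shape. The name `wulff_volume` and the example faces are hypothetical.

```python
from typing import List, Sequence, Tuple

Face = Tuple[Sequence[float], float]  # (unit normal, centre-to-face distance)

def wulff_volume(faces: List[Face], half_width: float, n: int = 40) -> float:
    """Midpoint-rule estimate of the volume of {x : normal . x <= distance for every face}.

    The cube [-half_width, half_width]^3 must enclose the shape. It is split
    into n^3 cells, and a cell is counted when its centre satisfies every face.
    """
    step = 2.0 * half_width / n
    centres = [-half_width + (i + 0.5) * step for i in range(n)]
    inside = 0
    for x in centres:
        for y in centres:
            for z in centres:
                if all(nx * x + ny * y + nz * z <= d for (nx, ny, nz), d in faces):
                    inside += 1
    return inside * step ** 3

# Hypothetical box-shaped crystal: x and y faces at distance 1, z faces at distance 2.
faces = [((1, 0, 0), 1.0), ((-1, 0, 0), 1.0),
         ((0, 1, 0), 1.0), ((0, -1, 0), 1.0),
         ((0, 0, 1), 2.0), ((0, 0, -1), 2.0)]
volume = wulff_volume(faces, half_width=2.0)
print(round(volume, 6))
```

For this box the exact volume is 2 × 2 × 4 = 16, which gives a quick check on the grid estimate. In a real calculation the distances would come from the attachment energies of the chosen faces.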
Pyridine
[Figure: relative volume and AE (kJ mol⁻¹ per molecule) for each predicted structure, ordered by decreasing polymorph stability, with forms I and II marked; annotation: Form 1 Z′ = 4]
Prompted experimental search for more polymorphs.
Many low energy structures; the newly observed form 2 is predicted to grow fast.
e-Science Issues to Address
• simulations take too long to run
• data are distributed across many sites and systems
• no catalogue system
• output in legacy text files, different for each program
• few tools to access, manage and transfer data
• workflow management is manual
• licensing within distributed environment
Fortran Web Services
1. Expose Fortran binary as distributed Web Service
Define an XML interface to the computation, described in WSDL (Web Service Description Language).
[Figure: XML is mapped to Fortran input for the Fortran binary, and Fortran output is mapped back to XML; diagram labels: WSDL, XSL, FO]
To get the binary to “talk” in XML: either change the Fortran code so that
input and output use XML, or use parsers and XSLT conversion
documents to map fixed format input/output files to and
from XML.
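The parser route can be sketched in a few lines. This is an illustration only: the record layout, the property names and the `<result>`/`<property>` element names are invented for the example and are not the actual MOLPAK or DMAREL formats. `fixed_record` stands in for a Fortran WRITE with a format like (A10, F12.3, 1X, A).

```python
import xml.etree.ElementTree as ET

def fixed_record(name: str, value: float, units: str) -> str:
    """Stand-in for a Fortran WRITE with a format like (A10, F12.3, 1X, A)."""
    return f"{name:<10}{value:>12.3f} {units}"

# Hypothetical fixed-format output file contents.
FIXED_OUTPUT = "\n".join([
    fixed_record("VOLUME", 272.1, "A3"),
    fixed_record("DENSITY", 1.52, "g/cm3"),
    fixed_record("LATTICE_E", -125.3, "kJ/mol"),
])

def fixed_to_xml(text: str) -> ET.Element:
    """Map each fixed-width record to a <property> element."""
    root = ET.Element("result")
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        prop = ET.SubElement(root, "property")
        prop.set("name", line[0:10].strip())   # columns 0-9: label
        prop.set("units", line[22:].strip())   # column 22 onwards: units
        prop.text = line[10:22].strip()        # columns 10-21: value
    return root

xml_root = fixed_to_xml(FIXED_OUTPUT)
xml_text = ET.tostring(xml_root, encoding="unicode")
print(xml_text)
```

The reverse mapping, from XML to a fixed-format input deck, follows the same pattern with the column widths applied on output. Keeping the column layout in one place makes it easier to track changes in the legacy format.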
Distributed Workflow
2. Orchestrate Web Services with workflow service
[Figure: a BPEL (Business Process Execution Language) script orchestrating the WS wrapped Fortran binaries]
The workflow service is exposed to the outside world as a web service.
Data Representation
[Figure: four different representations of CH4]
Fortran programs use lots of different
formats to represent the same thing.
[Figure: a single CML representation, <CH4…/>]
Since we provide new WSDL interfaces for each application, we
have a perfect opportunity to employ a standard representation
for chemical structures. The XML standard in chemistry is CML
(Chemical Markup Language).
Development of chemical markup language (CML) as a system for handling complex
chemical content. P. Murray-Rust, New Journal of Chemistry, 2001, 25, 618-634.
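As an illustration of what a standard representation looks like, the sketch below builds a small CML document for methane with the Python standard library. It uses only a minimal subset of CML (`molecule`, `atomArray`, `atom`, `bondArray`, `bond`) and should be checked against the CML schema before use. The coordinates are idealised tetrahedral values assumed for the example, and the function name `methane_cml` is hypothetical.

```python
import xml.etree.ElementTree as ET

CML_NS = "http://www.xml-cml.org/schema"
ET.register_namespace("", CML_NS)  # serialise CML as the default namespace

def q(tag: str) -> str:
    """Qualify a tag name with the CML namespace."""
    return f"{{{CML_NS}}}{tag}"

def methane_cml() -> ET.Element:
    a = 0.629  # assumed: about 1.09 Å C-H bond along a cube diagonal
    atoms = [
        ("a1", "C", (0.0, 0.0, 0.0)),
        ("a2", "H", (a, a, a)),
        ("a3", "H", (-a, -a, a)),
        ("a4", "H", (-a, a, -a)),
        ("a5", "H", (a, -a, -a)),
    ]
    molecule = ET.Element(q("molecule"), {"id": "methane"})
    atom_array = ET.SubElement(molecule, q("atomArray"))
    for atom_id, element, (x, y, z) in atoms:
        ET.SubElement(atom_array, q("atom"), {
            "id": atom_id, "elementType": element,
            "x3": f"{x:.3f}", "y3": f"{y:.3f}", "z3": f"{z:.3f}",
        })
    bond_array = ET.SubElement(molecule, q("bondArray"))
    for atom_id, _, _ in atoms[1:]:
        # one single bond from the carbon to each hydrogen
        ET.SubElement(bond_array, q("bond"),
                      {"atomRefs2": f"a1 {atom_id}", "order": "1"})
    return molecule

cml = methane_cml()
cml_text = ET.tostring(cml, encoding="unicode")
print(cml_text)
```

Once every wrapped application reads and writes a structure in this one form, the format conversions between programs collapse into a single mapping per program.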
Integration with Existing Infrastructure
[Figure: (BPEL) workflow]
Prototype has been successfully deployed.
[Figure: Sun Grid Engine alongside the (BPEL) workflow]
Existing grid infrastructure does not integrate easily with web services.
Policy on compute clusters enforced by Sun Grid Engine batch system
Other users of clusters submit jobs via this control software
Building a WSDL binding over the Sun Grid Engine protocol is difficult
Smooth transition from existing infrastructure to WS riskier than thought.
Data Management at CCLRC
• file storage at CCLRC
• distributed file access via Storage Resource Broker
(SDSC)
• catalogue of files using metadata in relational database
• web interface to metadata and files via Data Portal
• metadata editor through browser
Storage Resource Broker
Store data files from simulations in the
Storage Resource Broker
Data Portal
Search for studies in material sciences and download
associated data using the CCLRC Data Portal
Ongoing and Future Work
• upload files as part of workflow to SRB
• generate metadata
• upload extracted data from files
Acid Sites in Zeolites
• Determine the extra-framework cation position within the zeolite framework.
• Explore which proton sites are involved in catalysis and then characterise the active sites.
• Produce a database with structural models and associated vibrational modes for different Si/Al ratios.
• Improve understanding of the role of the Si/Al ratio in zeolite chemistry.
MC/EM
A combined Monte Carlo (MC) and energy minimisation (EM) approach has
been developed to model zeolitic materials with low and medium Si/Al ratios.
First Al is inserted into a siliceous unit cell, followed by a
charge-compensating cation.
The zeolite Mordenite, which has a 1-dimensional channel
system, has been studied with a simulation cell containing
two unit cells, which means 296 atoms, with 96 Si centres
(referred to as T sites).
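The Monte Carlo part of such an approach can be sketched as random selection of T sites for Al substitution. This is one plausible reading, not the authors' code: the number of Al atoms (`n_al = 8`) is an assumed value, no chemical constraints on neighbouring Al sites are applied, and the cation placement and energy minimisation of each configuration are left to an external simulation code.

```python
import random
from typing import List

N_T_SITES = 96  # T sites in the two-unit-cell Mordenite simulation cell (from the text)

def random_al_configuration(n_al: int, rng: random.Random) -> List[int]:
    """Pick n_al distinct T-site indices (0..95) at which Si is replaced by Al."""
    return sorted(rng.sample(range(N_T_SITES), n_al))

def si_al_ratio(n_al: int) -> float:
    """Si/Al ratio of the cell once n_al T sites hold Al."""
    return (N_T_SITES - n_al) / n_al

rng = random.Random(2004)  # fixed seed so that runs are repeatable
n_al = 8                   # assumed value, for illustration only
configs = [random_al_configuration(n_al, rng) for _ in range(100)]
print(si_al_ratio(n_al), configs[0])
```

Each configuration in `configs` would then be turned into an input structure, given a charge-compensating cation and energy minimised, so that total energy and cell volume can be compared across configurations.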
100 Configurations
[Figure: total energy (eV) and cell volume for 100 configurations, with a 5-period moving average of each series (full_TE, full_Vol)]
It can be seen that there are two distinct regions, -12079 eV
to -12076 eV and -12075 eV to -12073 eV, but there is no
obvious correlation between total energy and cell volume.
10000 Configurations
[Figure: total energy (TE) and cell volume (VOL) for 10,000 configurations, with a 200-period moving average of each series]
However, when 10,000 structures are considered it is clear that
the most stable structures correspond to cation placements that
do not cause the cell to expand. This requires that the cations
sit in the large channel.
Comparison of Regions
[Figure: structures from the two regions, labelled -12079.5 eV and -12075.04 eV]
What Next
Once the lowest energy positions of Al
are confirmed, the cation is exchanged
for a proton and the structure is energy
minimised again.
This method will allow us to construct
realistic models of low and medium
Si/Al zeolites. Such structures can be
used for further simulations and aid the
interpretation of experimental data.
Condor
Extensive use of Condor pools (UCL – 950 nodes in teaching
pools). 48 cpu-years of previously unused compute resource
have been utilised in this study. Close collaboration with the
NERC e-minerals project has allowed access to this resource.
50,000 calculations have been performed, each with 488
particles per simulation box, which means a total of roughly 24 million
particles (50,000 × 488 = 24,400,000) have been included in our simulations to date.
Achievements To Date
1. First use of CML schema for defining Web Service port types.
2. Calculation of 50,000 configurations of zeolite Mordenite
(roughly 24 million particles) to gain insight into structure when a realistic
ratio of Al substitution is included in the model.
3. Successfully exposed Fortran codes as OGSI Web Services; prototype application deployed on 80 nodes. The prototype
computational polymorph application is being ported to a larger
production machine.
4. First use of BPEL standard for orchestrating web services in a Grid
application.
5. Open Source BPEL implementation in development enabling late
binding and dynamic deployment of large computational processes.
6. Integration of OGSI and BPEL with Sun Grid Engine.
7. Development of Graphic User Interface for polymorph application, which connects to relational database via EJB interface.
8. Infrastructure for metadata and data management
9. SRB and dataportal are already being used to hold datasets and
being used for transferring the data between different scientists and
computer applications.
10. Implementation of Condor pool at Ri.
Key Achievement
We are now doing science that was not possible
before the advancements made within e-Science.