I. Introduction to Conformation Searching

Chemistry 380.37
Fall 2015
Dr. Jean M. Standard
September 16, 2015
Conformer Distributions
It is worth asking why it is often necessary to consider not just the global minimum on a potential energy
surface but all of the low-energy stable conformers. The fractional population p_i of the ith conformer in a
distribution at temperature T is determined by its Boltzmann factor,
$$p_i = \frac{g_i \, e^{-E_i/RT}}{\sum_j g_j \, e^{-E_j/RT}} . \qquad (1)$$
Here, g_i is the degeneracy of the ith conformer, E_i is its energy (the relative energy is normally used, but absolute
energies will give the same results), R is the gas constant, and T is the temperature. An example is given in Table 1,
illustrating the population fractions for a range of conformers within 5 kcal/mol of the global minimum at room
temperature.
Table 1. Populations of conformers within 5 kcal/mol of the global minimum.

Conformer   Relative Energy   Fractional pop. p_i   Population at 300 K
            (kcal/mol)        at 300 K              (conformer 1 = 1000)
1           0.0               0.422                 1000
2           0.2               0.302                 715
3           0.5               0.182                 432
4           1.0               0.079                 187
5           2.0               0.015                 35
6           5.0               0.0001                0.2

Notice that at 300 K, conformers that are within 1 kcal/mol of the global minimum have significant population, but
those that are 2 kcal/mol or more above the global minimum are not significantly populated.
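
The populations in Table 1 can be checked with a short calculation. The following is a minimal sketch of Eq. (1), assuming that every conformer has a degeneracy of 1 (the table does not list degeneracies) and taking R = 1.987204×10^-3 kcal/(mol K). The function name boltzmann_populations and its parameters are illustrative choices, not part of the original notes.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)

def boltzmann_populations(energies, temperature=300.0, degeneracies=None):
    """Return the fractional populations p_i of Eq. (1).

    energies     -- conformer energies in kcal/mol (relative or absolute)
    degeneracies -- g_i for each conformer; assumed to be 1 if not given
    """
    if degeneracies is None:
        degeneracies = [1] * len(energies)
    # Subtracting the lowest energy leaves the ratios unchanged and
    # avoids overflow when absolute energies are supplied.
    e_min = min(energies)
    rt = R_KCAL * temperature
    weights = [g * math.exp(-(e - e_min) / rt)
               for g, e in zip(degeneracies, energies)]
    total = sum(weights)
    return [w / total for w in weights]

# Relative energies (kcal/mol) of the six conformers in Table 1.
relative_energies = [0.0, 0.2, 0.5, 1.0, 2.0, 5.0]
populations = boltzmann_populations(relative_energies)

for energy, p in zip(relative_energies, populations):
    print(f"{energy:4.1f} kcal/mol   p_i = {p:.4f}")
```

Because only energy differences enter the ratio in Eq. (1), shifting every energy by the same constant leaves the populations unchanged, which is why relative and absolute energies give the same result.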
Potential Energy Surfaces and Conformation Searching
Consider carrying out a systematic conformation search to determine the low energy conformers of n-pentane (its
carbon backbone is shown in Figure 1). The angles φ1 and φ2 are the C-C-C-C dihedral angles that can be varied to
yield different stable conformers of n-pentane.
Figure 1. Carbon backbone of n-pentane, showing the two dihedral
angles (φ1 and φ2) responsible for generation of different stable conformers.
Even with only two flexible torsional angles, there are several stable low-energy conformers of n-pentane. The
energy minima for these conformers, as a function of the angles φ1 and φ2 defined in Figure 1, can be observed in
Figure 2. Note the presence of at least nine energy minima, though some of the energy minima are degenerate due
to symmetry.
Figure 2. Potential energy surface of n-pentane as a function of its two flexible torsional
angles in (a) a contour format and (b) a surface format (from Molecular Modelling: Principles
and Applications, A. R. Leach, Addison Wesley Longman Limited, Essex, England, 1996.)
A systematic conformation search involves varying the flexible geometrical parameters of a molecule (in this case,
the two torsional angles C1-C2-C3-C4 and C2-C3-C4-C5) in a regular way and calculating the single point energy
for each configuration. To generate the potential energy surfaces for n-butane or n-pentane, for example, the
molecular mechanics energy was evaluated from 0 to 360 degrees in increments of 20 degrees. For n-butane, this
corresponds to 18 points, while for n-pentane, 18^2 = 324 points are required. Then, energy minimizations are carried
out for each set of initial torsional angles to determine the unique stable conformations of the molecule.
For larger molecules with many flexible torsional angles (such as proteins), systematic determination of the potential
energy surface quickly becomes computationally prohibitive. For a molecule with just 6 torsional angles, 18^6 (or
3.4×10^7) points are required to generate the torsional potential energy surface using 20-degree increments! Even if
the angular increment is increased from 20 degrees to 60 degrees, 6^6 or 46,656 points must still be determined.
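
The growth in the number of grid points can be illustrated with a short sketch. It assumes that each torsion is sampled at 0, 20, ..., 340 degrees (360 degrees being equivalent to 0), which gives the 18 points per angle quoted above. The function names torsion_grid and grid_size are hypothetical helpers introduced only for this illustration.

```python
import itertools

def torsion_grid(n_torsions, increment_deg):
    """Return an iterator over every combination of torsional angles
    on a regular grid with the given increment (in degrees)."""
    if 360 % increment_deg != 0:
        raise ValueError("increment must divide 360 evenly")
    angles = range(0, 360, increment_deg)  # 360 is equivalent to 0
    return itertools.product(angles, repeat=n_torsions)

def grid_size(n_torsions, increment_deg):
    """Number of grid points, without generating the grid itself."""
    return (360 // increment_deg) ** n_torsions

# n-pentane: two torsions sampled every 20 degrees.
pentane_grid = list(torsion_grid(2, 20))
print(len(pentane_grid))   # 324 starting structures

# Six torsions: counted only, since the full grid is very large.
print(grid_size(6, 20))    # 34012224, about 3.4 x 10^7
print(grid_size(6, 60))    # 46656
```

In a real search, each tuple of angles would be used to build a starting geometry that is then passed to an energy minimizer; that step depends on the modelling package and is not shown here.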
Conformation Searching of Polypeptides
Consider the fragment of a polypeptide backbone, of the kind found in a protein, shown in Figure 3.
Figure 3. A backbone of a polypeptide, with the torsional angles φ and ψ labeled.
The O-C-N-H torsional angles of the peptide bond are rigid due to resonance. The torsional angles of the
polypeptide backbone, denoted Φ and Ψ, are flexible, however. These torsional angles can be treated much like the
two torsional angles of n-pentane, and a potential energy surface can be mapped. For example, for alanine
dipeptide, a plot of the potential energy surface contours is shown in Figure 4.
Figure 4. Contours of alanine dipeptide potential energy surface (from Molecular Modelling:
Principles and Applications, A. R. Leach, Addison Wesley Longman Limited, Essex,
England, 1996.)
In Figure 4, the angle Φ is plotted on the x-axis and the angle Ψ is shown on the y-axis. This plot shows the
different energy minima of alanine dipeptide. For a polypeptide with a geometry described by many more torsional
angles it is computationally difficult to map the full potential energy surface, and a systematic conformer search is
not feasible. For larger systems, other conformational searching techniques will be necessary.
For a wide variety of known systems, it has been shown that the different Φ and Ψ angles in each protein fall in
specific ranges. When the different Φ and Ψ angle pairs found in many proteins are plotted, the graph produced is
called a Ramachandran plot (Figure 5). Note that there are two major basins in which most of the Φ and Ψ angle
pairs fall: one centered around Φ = –120°, Ψ = 135° and a second basin centered at about Φ = –90°, Ψ = –30°.
There is also a much smaller minor basin centered at about Φ = 70°, Ψ = 45°.
Figure 5. Ramachandran plot of Φ and Ψ angles from crystal structures of a variety of
proteins (from Molecular Modeling and Simulation, T. Schlick, Springer, New York, 2002.).
II. Conformation Searching Methods
Systematic Methods
Systematic conformation searching is a method used to find stable conformers of relatively small molecules. In this
method, all the flexible torsional angles in the molecule are varied in a systematic fashion in order to generate a set
of initial structures. Each initial structure is then energy minimized and the stable conformations are enumerated.
The advantage of a systematic search is that the global minimum and all local minima will be found as long as the
step size in the torsional angle is not too large. However, systematic methods are very difficult to apply to molecules
with more than 7 or 8 flexible torsional angles because the number of initial structures generated becomes enormous
and the CPU time needed to energy minimize each one becomes prohibitive.
The "difficulty" of a systematic search can be related to the number of structures to be energy minimized. For linear
acyclic systems, the difficulty is 6^(N_t), where N_t is the number of flexible torsions present. A list of calculated
difficulties is presented in Table 2.
Table 2. Difficulties of systematic searches involving acyclic systems.

Length of Chain   Number of Flexible   Difficulty
(Atoms)           Torsions, N_t
5                 2                    36
6                 3                    216
7                 4                    1296
8                 5                    7776
9                 6                    46,656
10                7                    280,000
11                8                    1,680,000
12                9                    1.0×10^7
:                 :                    :
17                14                   7.8×10^10

For cyclic systems, the difficulty is 6^(N_t – 5N_r), where N_t is the number of flexible torsions present and N_r is the
number of rings. A list of calculated difficulties for cyclic systems is presented in Table 3 for N_r = 1.
Table 3. Difficulties of systematic searches involving cyclic systems with one ring.

Size of Ring   Number of Flexible   Difficulty
(Atoms)        Torsions, N_t
5              5                    1
6              6                    6
7              7                    36
8              8                    216
9              9                    1296
10             10                   7776
11             11                   46,656
12             12                   280,000
:              :                    :
17             17                   2.2×10^9

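
The entries in Tables 2 and 3 follow directly from the two difficulty formulas. A minimal sketch is given below; the function name search_difficulty and the parameter steps_per_torsion are illustrative assumptions, with the default base of 6 taken from the formulas in the text.

```python
def search_difficulty(n_torsions, n_rings=0, steps_per_torsion=6):
    """Number of starting structures for a systematic search.

    Acyclic systems: steps_per_torsion ** n_torsions
    Cyclic systems:  steps_per_torsion ** (n_torsions - 5 * n_rings)
    """
    exponent = n_torsions - 5 * n_rings
    if exponent < 0:
        raise ValueError("formula is not defined for fewer than 5 torsions per ring")
    return steps_per_torsion ** exponent

# Acyclic chains (Table 2): a chain of n atoms has n - 3 flexible torsions.
for chain_length in (5, 9, 12, 17):
    n_t = chain_length - 3
    print(f"chain of {chain_length:2d} atoms: {search_difficulty(n_t):.3g}")

# Single rings (Table 3): a ring of n atoms has n flexible torsions.
for ring_size in (5, 9, 12, 17):
    print(f"ring of {ring_size:2d} atoms:  {search_difficulty(ring_size, n_rings=1):.3g}")
```

The larger table entries are rounded; for example, the exact value for a 10-atom chain is 6^7 = 279,936, which Table 2 reports as 280,000.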
Monte Carlo Methods
Monte Carlo conformation searching methods are used to find stable conformers of large molecules for which
systematic searches are not feasible. In these methods, each new structure is generated from an existing one either
by shifting the Cartesian coordinates of the atoms by a random amount or by varying selected dihedral angles by a
random amount. A flow chart for these procedures is shown in Figure 6.
Figure 6. Flow chart for Monte Carlo conformation search (from Molecular Modelling:
Principles and Applications, 2nd edition. A. R. Leach, Prentice Hall, Harlow, England, 2001.)
Monte Carlo methods that involve randomly varying the Cartesian coordinates of a molecule are sometimes referred
to as Cartesian stochastic searches. An example of one step in such a procedure is shown in Figure 7.
Figure 7. The process for a Cartesian stochastic search.
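
The random "kick" at the heart of a Cartesian stochastic search can be sketched in a few lines. The code below is only an illustration under stated assumptions: the names random_kick, stochastic_search, and toy_minimize are hypothetical, the maximum shift is an arbitrary parameter, and the toy minimizer (which snaps coordinates to the nearest integer) stands in for a real force-field energy minimization. Continuing each step from the most recently minimized structure is one possible choice, not necessarily the procedure shown in the figures.

```python
import random

def random_kick(coords, max_shift, rng):
    """Shift every Cartesian coordinate of every atom by a random
    amount drawn uniformly from [-max_shift, max_shift]."""
    return [tuple(c + rng.uniform(-max_shift, max_shift) for c in atom)
            for atom in coords]

def stochastic_search(start, minimize, n_steps, max_shift=1.0, seed=0):
    """Repeatedly kick and minimize, collecting distinct minima.

    `minimize` is a user-supplied callable returning a hashable key
    identifying the minimum together with the minimized coordinates.
    """
    rng = random.Random(seed)
    found = {}
    current = start
    for _ in range(n_steps):
        trial = random_kick(current, max_shift, rng)
        key, minimized = minimize(trial)
        found.setdefault(key, minimized)  # keep only new, unique minima
        current = minimized               # continue from the latest minimum
    return found

def toy_minimize(coords):
    """Toy stand-in for an energy minimizer: snap to the nearest lattice point."""
    snapped = tuple(tuple(float(round(c)) for c in atom) for atom in coords)
    return snapped, list(snapped)

start = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
minima = stochastic_search(start, toy_minimize, n_steps=50)
print(len(minima), "distinct toy minima found")
```

In practice the duplicate check is the delicate part: minimized structures must be compared in a way that recognizes the same conformer regardless of overall rotation, translation, or atom renumbering.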
Example: Cycloheptadecane
The objective of the cycloheptadecane paper [M. Saunders, K. N. Houk, Y.-D. Wu, W. C. Still, M. Lipton, G.
Chang, and W. C. Guida, J. Am. Chem. Soc. 1990, 112, 1419-1427] was to benchmark various conformation searching
methods. The authors wanted to determine which methods could find most of the low-energy conformers of
cycloheptadecane (those within 3 kcal/mol of the global minimum). For each method, the authors recorded how many
conformers were located and how much computer (CPU) time the calculation took.
A typical cycloheptadecane conformer is shown in Figure 8. Table 4 summarizes the numbers of low-energy
conformers of cycloheptadecane found using the MM2 force field.
Figure 8. A typical cycloheptadecane conformer.
Table 4. Numbers of conformations of cycloheptadecane found relative to the global minimum.

within 1 kcal/mol   within 2 kcal/mol   within 3 kcal/mol
11                  69                  262

Five different types of methods were employed for the conformation search: stochastic (random) Cartesian
coordinate searches, systematic torsional tree searches, Monte Carlo torsional searches, distance geometry methods,
and molecular dynamics. A summary of the results is given in Table 5.
Table 5. Summary of conformation searching methods for cycloheptadecane.

Method                                       # Low Energy        CPU time   CPU time/
                                             Conformers Found    (days)     conformer
                                             (% of total)
Stochastic Cartesian Coordinate              222 (85%)           90         0.41
Stochastic Cartesian Coordinate - improved   178 (68%)           15         0.10
Torsional Tree Search I                      138 (53%)           9          0.065
Torsional Tree Search II                     211 (81%)           30         0.14
Monte Carlo I                                237 (90%)           30         0.13
Monte Carlo II                               249 (95%)           30         0.12
Distance Geometry                            176 (67%)           30         0.17
Molecular Dynamics                           169 (65%)           30         0.18

Note that a conformation search must not only locate low-energy conformers quickly but also find a significant
proportion of the total. Therefore, while the distance geometry and molecular dynamics methods are not much
slower per conformer than some of the other methods, the fact that they find fewer than 70% of the low-energy
conformers is problematic. For this system, Monte Carlo methods provide a good combination of efficiency and
completeness, locating 90% or more of the conformers at a cost of 0.12-0.13 CPU days per conformer.
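
The two figures of merit used in this comparison, completeness and cost per conformer, can be recomputed from the raw counts. The sketch below copies the conformer counts and CPU times from Table 5 and the total of 262 conformers from Table 4; the function name summarize is an illustrative choice. Because the ratios are recomputed from the tabulated counts and times, small differences from the rounded values printed in Table 5 are possible.

```python
TOTAL_LOW_ENERGY = 262  # conformers within 3 kcal/mol (Table 4)

# (method, conformers found, CPU time) as listed in Table 5
results = [
    ("Stochastic Cartesian Coordinate", 222, 90),
    ("Stochastic Cartesian Coordinate - improved", 178, 15),
    ("Torsional Tree Search I", 138, 9),
    ("Torsional Tree Search II", 211, 30),
    ("Monte Carlo I", 237, 30),
    ("Monte Carlo II", 249, 30),
    ("Distance Geometry", 176, 30),
    ("Molecular Dynamics", 169, 30),
]

def summarize(found, cpu_time, total=TOTAL_LOW_ENERGY):
    """Return (percent of low-energy conformers found, CPU time per conformer)."""
    return 100.0 * found / total, cpu_time / found

for name, found, cpu_time in results:
    percent, per_conformer = summarize(found, cpu_time)
    print(f"{name:44s} {percent:5.1f}%   {per_conformer:.3f} per conformer")
```

Listing both numbers side by side makes the trade-off explicit: a method can be cheap per conformer yet still miss a large share of the low-energy structures.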