Chemistry 380.37 Fall 2015
Dr. Jean M. Standard
September 16, 2015

I. Introduction to Conformation Searching

Conformer Distributions

It is often necessary to consider not just the global minimum on a potential energy surface but all of the low-energy stable conformers. The fractional population $p_i$ of the ith conformer in a distribution at temperature T is related to the Boltzmann factor,

$$ p_i = \frac{g_i \, e^{-E_i/RT}}{\sum_j g_j \, e^{-E_j/RT}} . \qquad (1) $$

Here, $g_i$ is the degeneracy of the ith conformer, $E_i$ is its energy (the relative energy is normally used, but absolute energies give the same results), R is the gas constant, and T is the temperature. An example is given in Table 1, which lists the population fractions for a range of conformers within 5 kcal/mol of the global minimum at room temperature.

Table 1. Populations of conformers within 5 kcal/mol of the global minimum.

Conformer   Relative Energy   Fractional pop. p_i   Population at 300 K
            (kcal/mol)        at 300 K              (conformer 1 = 1000)
1           0.0               0.422                 1000
2           0.2               0.302                 715
3           0.5               0.182                 432
4           1.0               0.079                 187
5           2.0               0.015                 35
6           5.0               0.0001                0.2

Notice that at 300 K, conformers that are within 1 kcal/mol of the global minimum have significant population, but those that are 2 kcal/mol or more above the global minimum are not significantly populated.

Potential Energy Surfaces and Conformation Searching

Consider carrying out a systematic conformation search to determine the low-energy conformers of n-pentane (its carbon backbone is shown in Figure 1). The angles φ1 and φ2 are the C-C-C-C dihedral angles that can be varied to yield different stable conformers of n-pentane.

Figure 1. Carbon backbone of n-pentane, showing the two dihedral angles responsible for generation of different stable conformers.

Even with only two flexible torsional angles, there are several stable low-energy conformers of n-pentane. The angles φ1 and φ2 are shown in the diagram below.
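
Before turning to the n-pentane energy surface, the populations in Table 1 can be checked directly from Eq. (1). The Python sketch below is only an illustration: the function name `boltzmann_populations`, the value used for R, and the assumption that every conformer has degeneracy 1 are choices made here, not details given in the notes.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol K); value assumed for this sketch


def boltzmann_populations(energies, temperature=300.0, degeneracies=None):
    """Return the fractional populations p_i of Eq. (1).

    energies      -- conformer energies in kcal/mol (relative or absolute)
    temperature   -- temperature in K
    degeneracies  -- g_i values; assumed to be 1 for every conformer if omitted
    """
    if degeneracies is None:
        degeneracies = [1] * len(energies)
    # Shifting all energies by the lowest one leaves the ratios unchanged
    # and avoids overflow when absolute energies are supplied.
    e_min = min(energies)
    weights = [
        g * math.exp(-(e - e_min) / (R_KCAL * temperature))
        for g, e in zip(degeneracies, energies)
    ]
    total = sum(weights)
    return [w / total for w in weights]


# Relative energies (kcal/mol) taken from Table 1.
relative_energies = [0.0, 0.2, 0.5, 1.0, 2.0, 5.0]
populations = boltzmann_populations(relative_energies)

for energy, p in zip(relative_energies, populations):
    print(f"{energy:4.1f} kcal/mol   p = {p:.4f}")
```

Because only energy differences enter the ratio, the same fractions are obtained whether relative or absolute energies are passed in, as noted below Eq. (1).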
The energy minima for the n-pentane conformers can be observed in Figure 2. Note the presence of at least nine energy minima, though some of the energy minima are degenerate due to symmetry.

Figure 2. Potential energy surface of n-pentane as a function of its two flexible torsional angles in (a) a contour format and (b) a surface format (from Molecular Modelling: Principles and Applications, A. R. Leach, Addison Wesley Longman Limited, Essex, England, 1996.)

A systematic conformation search involves varying the flexible geometrical parameters of a molecule (in this case, the two torsional angles C1-C2-C3-C4 and C2-C3-C4-C5) in a regular way and calculating the single point energy for each configuration. To generate the potential energy surfaces for n-butane or n-pentane, for example, the molecular mechanics energy was evaluated from 0 to 360 degrees in increments of 20 degrees. For n-butane, this corresponds to 18 points, while for n-pentane, $18^2 = 324$ points are required. Then, energy minimizations are carried out for each set of initial torsional angles to determine the unique stable conformations of the molecule.

For larger molecules with many flexible torsional angles (such as proteins), systematic determination of the potential energy surface quickly becomes computationally prohibitive. For a molecule with just 6 torsional angles, $18^6$ (or $3.4 \times 10^7$) points are required to generate the torsional potential energy surface using 20-degree increments! Even if the angular increment is increased from 20 degrees to 60 degrees, $6^6$ or 46,656 points must still be determined.

Conformation Searching of Polypeptides

Consider a fragment of a polypeptide backbone that might be found in a protein, shown in Figure 3.

Figure 3. A backbone of a polypeptide.

The O-C-N-H torsional angles of the peptide bond are rigid due to resonance. The torsional angles of the polypeptide backbone, denoted Φ and Ψ, are flexible, however.
These torsional angles can be treated much like the two torsional angles of n-pentane, and a potential energy surface can be mapped. For example, for alanine dipeptide, a plot of the potential energy surface contours is shown in Figure 4.

Figure 4. Contours of alanine dipeptide potential energy surface (from Molecular Modelling: Principles and Applications, A. R. Leach, Addison Wesley Longman Limited, Essex, England, 1996.)

In Figure 4, the angle Φ is plotted on the x-axis and the angle Ψ is shown on the y-axis. This plot shows the different energy minima of alanine dipeptide. For a polypeptide with a geometry described by many more torsional angles, it is computationally difficult to map the full potential energy surface, and a systematic conformer search is not feasible. For larger systems, other conformational searching techniques will be necessary.

For a wide variety of known systems, it has been shown that the different Φ and Ψ angles in each protein fall in specific ranges. When the different Φ and Ψ angle pairs found in many proteins are plotted, the graph produced is called a Ramachandran plot (Figure 5). Note that there are two major basins in which most of the Φ and Ψ angle pairs fall: one centered around Φ = -120°, Ψ = 135° and a second basin centered at about Φ = -90°, Ψ = -30°. There is also a much smaller minor basin centered at about Φ = 70°, Ψ = 45°.

Figure 5. Ramachandran plot of Φ and Ψ angles from crystal structures of a variety of proteins (from Molecular Modeling and Simulation, T. Schlick, Springer, New York, 2002.)

II. Conformation Searching Methods

Systematic Methods

Systematic conformation searching is a method used to find stable conformers of relatively small molecules. In this method, all the flexible torsional angles in the molecule are varied in a systematic fashion in order to generate a set of initial structures. Each initial structure is then energy minimized and the stable conformations are enumerated.
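
The systematic procedure just described (generate a grid of starting torsions, minimize each one, collect the unique minima) can be sketched in Python. Everything specific in this sketch is an assumption made for illustration: the one-torsion model potential `torsion_energy`, the treatment of the torsions as independent, and the simple pattern-search routine `minimize`, which stands in for a real force-field minimizer.

```python
import itertools
import math


def torsion_energy(phi_deg):
    """Hypothetical one-torsion potential in kcal/mol.

    It has an anti minimum at 180 degrees and two gauche minima near
    60 and 300 degrees; it is not a fitted force-field term.
    """
    phi = math.radians(phi_deg)
    return 1.5 * (1.0 + math.cos(3.0 * phi)) + 0.45 * (1.0 + math.cos(phi))


def total_energy(angles):
    """Model energy: torsions are assumed independent (no coupling terms)."""
    return sum(torsion_energy(a) for a in angles)


def minimize(angles, step=10.0, tol=1e-3):
    """Derivative-free pattern search used as a stand-in minimizer.

    Each torsion is moved by +/- step; a move is kept only if it lowers
    the energy, and the step is halved when no move helps.
    """
    angles = list(angles)
    best = total_energy(angles)
    while step > tol:
        improved = False
        for i in range(len(angles)):
            for delta in (step, -step):
                trial = angles.copy()
                trial[i] = (trial[i] + delta) % 360.0
                energy = total_energy(trial)
                if energy < best - 1e-12:
                    angles, best, improved = trial, energy, True
        if not improved:
            step /= 2.0
    return angles, best


def systematic_search(n_torsions, increment=20):
    """Minimize from every grid point and return the unique minima found."""
    grid = range(0, 360, increment)
    minima = {}
    n_starts = 0
    for start in itertools.product(grid, repeat=n_torsions):
        n_starts += 1
        angles, energy = minimize(start)
        # Minima are identified by their torsions rounded to whole degrees.
        key = tuple(round(a) % 360 for a in angles)
        minima.setdefault(key, energy)
    return n_starts, sorted(minima.items(), key=lambda item: item[1])


n_starts, minima = systematic_search(n_torsions=2)
print(f"{n_starts} starting structures, {len(minima)} unique minima")
for key, energy in minima:
    print(key, f"{energy:.3f} kcal/mol")
```

With two torsions and a 20-degree increment the loop visits the same 18 x 18 grid of starting points used for n-pentane earlier in these notes; the number of starting structures grows as a power of the number of torsions, which is what limits the method for larger molecules.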
The advantage of a systematic search is that all the global and local minima will be found as long as the step size in the torsional angle is not too large. However, systematic methods are very difficult to apply to molecules with more than 7 or 8 flexible torsional angles because the number of initial structures generated becomes enormous and the amount of CPU time to energy minimize each becomes prohibitive.

The "difficulty" of a systematic search can be related to the number of structures to be energy minimized. For linear acyclic systems, the difficulty is $6^{N_t}$, where $N_t$ is the number of flexible torsions present. A list of calculated difficulties is presented in Table 2.

Table 2. Difficulties of systematic searches involving acyclic systems.

Length of Chain   Number of Flexible   Difficulty
(Atoms)           Torsions, N_t
5                 2                    36
6                 3                    216
7                 4                    1296
8                 5                    7776
9                 6                    46,656
10                7                    280,000
11                8                    1,680,000
12                9                    1.0×10^7
:                 :                    :
17                14                   7.8×10^10

For cyclic systems, the difficulty is $6^{N_t - 5N_r}$, where $N_t$ is the number of flexible torsions present and $N_r$ is the number of rings. A list of calculated difficulties for cyclic systems is presented in Table 3 for $N_r = 1$.

Table 3. Difficulties of systematic searches involving cyclic systems with one ring.

Size of Ring   Number of Flexible   Difficulty
(Atoms)        Torsions, N_t
5              5                    1
6              6                    6
7              7                    36
8              8                    216
9              9                    1296
10             10                   7776
11             11                   46,656
12             12                   280,000
:              :                    :
17             17                   2.2×10^9

Monte Carlo Methods

Monte Carlo conformation searching methods are used to find stable conformers of large molecules for which systematic searches are not feasible. A Monte Carlo search can be performed by randomly varying the Cartesian coordinates of a molecule or by randomly varying selected dihedral angles. In these methods, a new structure is generated by either shifting the Cartesian coordinates of the atoms by a random amount or by varying the dihedral angles by a random amount. A flow chart for these procedures is shown in Figure 6.

Figure 6.
Flow chart for Monte Carlo conformation search (from Molecular Modelling: Principles and Applications, 2nd edition. A. R. Leach, Prentice Hall, Harlow, England, 2001.)

Monte Carlo methods that involve randomly varying the Cartesian coordinates of a molecule are sometimes referred to as Cartesian stochastic searches. An example of one step in such a procedure is shown in Figure 7.

Figure 7. The process for a Cartesian stochastic search.

Example: Cycloheptadecane

The objective of the cycloheptadecane paper [M. Saunders, K. N. Houk, Y.-D. Wu, W. C. Still, M. Lipton, G. Chang, and W. C. Guida, J. Am. Chem. Soc. 1990, 112, 1419-1427] is to benchmark various conformation searching methods. The authors wanted to find out which methods could find most of the low-energy conformers of cycloheptadecane (within 3 kcal/mol of the global minimum). For each method, the authors compiled how many conformers were located and how much computer (CPU) time the calculation took. A typical cycloheptadecane conformer is shown in Figure 8. Table 4 summarizes the numbers of low-energy conformers of cycloheptadecane found using the MM2 force field.

Figure 8. A typical cycloheptadecane conformer.

Table 4. Numbers of conformations of cycloheptadecane found relative to the global minimum.

within 1 kcal/mol   within 2 kcal/mol   within 3 kcal/mol
11                  69                  262

Five different types of methods were employed for the conformation search: stochastic (random) Cartesian coordinate searches, systematic torsional tree searches, Monte Carlo torsional searches, distance geometry methods, and molecular dynamics. A summary of the results is given in Table 5.

Table 5. Summary of conformation searching methods for cycloheptadecane.

Method                                       # Low Energy Conformers   CPU time   CPU time/
                                             Found (% of total)        (days)     conformer
Stochastic Cartesian Coordinate              222 (85%)                 90         0.41
Stochastic Cartesian Coordinate - improved   178 (68%)                 15         0.10
Torsional Tree Search I                      138 (53%)                 9          0.065
Torsional Tree Search II                     211 (81%)                 30         0.14
Monte Carlo I                                237 (90%)                 30         0.13
Monte Carlo II                               249 (95%)                 30         0.12
Distance Geometry                            176 (67%)                 30         0.17
Molecular Dynamics                           169 (65%)                 30         0.18

Note that it is important not only for a conformation search to locate low-energy conformers quickly, but also for it to find a significant proportion of the total. Therefore, while the distance geometry and molecular dynamics methods are not much slower than some of the other methods, the fact that they find less than 70% of the total conformers is problematic. For this system, Monte Carlo methods provide a good combination of efficiency and completeness, locating over 90% of the conformers with a CPU time per conformer of 0.12-0.13 days.
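
A Monte Carlo torsional search of the general kind discussed in Section II can be sketched in Python as follows. This is a generic random-torsion loop under stated assumptions, not the procedure used in the cycloheptadecane paper and not necessarily identical to the flow chart in Figure 6: the model potential `torsion_energy`, the pattern-search `minimize` routine, the choice to always restart from the most recent minimum, and the parameter names `max_kick` and `seed` are all hypothetical.

```python
import math
import random


def torsion_energy(phi_deg):
    """Hypothetical one-torsion potential in kcal/mol (anti at 180 degrees,
    gauche minima near 60 and 300 degrees); not a fitted force-field term."""
    phi = math.radians(phi_deg)
    return 1.5 * (1.0 + math.cos(3.0 * phi)) + 0.45 * (1.0 + math.cos(phi))


def total_energy(angles):
    """Model energy: torsions are assumed independent (no coupling terms)."""
    return sum(torsion_energy(a) for a in angles)


def minimize(angles, step=10.0, tol=1e-3):
    """Derivative-free pattern search standing in for a real minimizer."""
    angles = list(angles)
    best = total_energy(angles)
    while step > tol:
        improved = False
        for i in range(len(angles)):
            for delta in (step, -step):
                trial = angles.copy()
                trial[i] = (trial[i] + delta) % 360.0
                energy = total_energy(trial)
                if energy < best - 1e-12:
                    angles, best, improved = trial, energy, True
        if not improved:
            step /= 2.0
    return angles, best


def monte_carlo_search(n_torsions, n_steps, max_kick=180.0, seed=0):
    """Random-torsion search: perturb, minimize, record any new minimum."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    current, _ = minimize([180.0] * n_torsions)
    found = {}
    for _ in range(n_steps):
        # Change every torsion by a random amount in [-max_kick, +max_kick].
        trial = [(a + rng.uniform(-max_kick, max_kick)) % 360.0 for a in current]
        angles, energy = minimize(trial)
        # Minima are identified by their torsions rounded to whole degrees.
        key = tuple(round(a) % 360 for a in angles)
        found.setdefault(key, energy)
        # Assumption: the next step always starts from the newest minimum.
        current = angles
    return found


n_torsions = 4
found = monte_carlo_search(n_torsions, n_steps=500)
# Each model torsion has three minima, so 3**n_torsions minima exist in total.
print(f"{len(found)} distinct minima located out of {3 ** n_torsions} possible")
```

Unlike the systematic search, the number of minimizations here is set by `n_steps` rather than by the size of the torsional grid, so the cost can be capped in advance, at the price of no guarantee that every minimum is found.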