Molecular Modeling Short Course
Wavefunction, Inc.

Goals

To formulate molecular mechanics and quantum chemical models and to assess these models for the calculation of equilibrium conformations and geometries, reaction energies, and infrared and NMR spectra.

To formulate, assess and illustrate graphical models to anticipate molecular properties and chemical reactivity/selectivity.

To show the use of databases of calculated geometries, energies, properties and spectra.

Overall … what can be calculated, how it relates to what is measured and what to expect of the results of the calculations.

Energy Surfaces and Geometries

What is an Equilibrium Geometry?

In one dimension, a minimum on a curve of energy as a function of coordinate corresponds to an equilibrium geometry (or equilibrium structure). This does not necessarily mean that a molecule with this structure can be isolated and characterized but only that it is possible for it to exist.

Mathematical Description of Energy Surfaces

It is not possible to actually visualize a multidimensional energy surface. It is possible to provide a unique mathematical definition of those few points on such a surface that correspond to stable molecules. Note, however, that it is not possible to specify a unique pathway (“reaction coordinate”) connecting two molecules.

Stationary Points

Stable molecules are all stationary points on an energy surface, that is, points for which the first derivative of the energy with respect to each geometrical coordinate is zero. In words, this means that the energy surface is “flat”.

in one dimension: $dE/dR = 0$
in many dimensions: $\partial E/\partial R_i = 0$, $i = 1, 2, 3, \ldots, 3N-6$

Energy Minima in One Dimension

It is easy to pick out a stable molecule on a one-dimensional energy surface. It is a flat point where the energy curve “goes up” on both sides. In mathematical terms, this means that the second derivative of the energy is positive.
$d^2E/dR^2 > 0$

Energy Minima in Many Dimensions

For a molecule with N atoms, 3N-6 second derivatives are associated with each first derivative. This makes it impossible to say directly whether a particular stationary point is an energy minimum or maximum. However, the original coordinates may be combined into a new set, referred to as normal coordinates, ξ, that lead to a second derivative matrix that is diagonal.

$\partial^2 E/\partial R_i \partial R_j \;\rightarrow\; \partial^2 E/\partial \xi_i \partial \xi_j = \delta_{ij}\,\partial^2 E/\partial \xi_i \partial \xi_j$

δij is 1 for i=j and 0 otherwise. Each normal coordinate is now associated with a single second derivative, making it possible to say whether the energy is at a minimum: a stationary point is an energy minimum when all of these second derivatives are positive.

How to Determine an Equilibrium Geometry?

Finding an equilibrium structure involves an iterative process that is terminated when all first derivatives fall below a preset tolerance, assuming that neither the energy nor the coordinates have changed significantly from their values in the previous iteration. The number of iterations required will typically be of the same order as the number of independent variables. Thus, finding an equilibrium structure is likely to be one to two orders of magnitude more costly in terms of computer time than calculating an energy at a single geometry.

How to Confirm an Equilibrium Geometry?

The infrared frequency for a diatomic molecule is proportional to the square root of the second derivative of the energy with regard to change in coordinate (the force constant) divided by a mass (the reduced mass). Similarly, each of the 3N-6 frequencies for a molecule with N atoms (3N-5 in the case of a linear molecule) is proportional to the square root of the corresponding force constant. An energy minimum will give rise to a “normal” infrared spectrum, with frequencies that are all real numbers.

Collections of Experimental Structures

Cambridge Structural Database

The Cambridge Structural Database (CSD) is a collection of nearly 500,000 experimental X-ray crystal structures for organic and organometallic molecules.
It is a virtual gold mine of experimental structures and also serves to identify molecules that can be (and have been) synthesized and purified. There are >100 derivatives of camphor in CSD formed by substitution on the methyl group at C1.

[Figure: structure of camphor]

Molecular Mechanics Models

Molecular Mechanics

Molecular mechanics represents a molecule in terms of a Lewis structure and assumes that bond lengths and angles depend predictably on atom types and hybrids. Strain energy (the price to distort from ideal bond distances, angles and torsion angles) needs to be added to non-bonded energy (to account for van der Waals and Coulomb interactions).

[Figure: distance distortion, angle distortion, dihedral distortion, steric interaction and Coulombic interaction]

Stretch and Bend Strain Energy

$E^{strain} = \sum_{A}^{bonds} E_A^{stretch} + \sum_{A}^{bond\ angles} E_A^{bend} + \sum_{A}^{torsion\ angles} E_A^{torsion} + \sum_{A}\sum_{B}^{non\text{-}bonded\ atoms} E_{AB}^{non\text{-}bonded}$

Bond stretching and angle bending terms are quadratic forms.

$E^{stretch}(r) = \tfrac{1}{2}\, k^{stretch} (r - r^{eq})^2$
$E^{bend}(\theta) = \tfrac{1}{2}\, k^{bend} (\theta - \theta^{eq})^2$

r and θ are the bond distance and angle, r^eq and θ^eq are the ideal bond length and bond angle, and k^stretch and k^bend are parameters. Higher-order contributions and cross terms are included in real molecular mechanics models.

Torsional Strain Energy

The torsional energy term must reflect the inherent periodicity of rotation about a single bond.

$E^{torsion}(\phi) = k^{torsion1}[1 - \cos(\phi - \phi^{eq})] + k^{torsion2}[1 - \cos 2(\phi - \phi^{eq})] + k^{torsion3}[1 - \cos 3(\phi - \phi^{eq})]$

φ is the torsion angle, φ^eq is the ideal torsion angle and k^torsion1, k^torsion2 and k^torsion3 are parameters. Higher-order contributions are included in real molecular mechanics models.
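The three strain terms above translate directly into code. The following is a minimal sketch, not the MMFF implementation: the function names and every numerical parameter are hypothetical, chosen only to show how the quadratic stretch and bend terms and the periodic torsion term behave.

```python
import math


def stretch_energy(r, r_eq, k_stretch):
    """Quadratic bond-stretch term: 1/2 * k * (r - r_eq)^2."""
    return 0.5 * k_stretch * (r - r_eq) ** 2


def bend_energy(theta, theta_eq, k_bend):
    """Quadratic angle-bend term: 1/2 * k * (theta - theta_eq)^2."""
    return 0.5 * k_bend * (theta - theta_eq) ** 2


def torsion_energy(phi, phi_eq, k1, k2, k3):
    """Three-term periodic torsion energy (angles in radians)."""
    d = phi - phi_eq
    return (k1 * (1.0 - math.cos(d))
            + k2 * (1.0 - math.cos(2.0 * d))
            + k3 * (1.0 - math.cos(3.0 * d)))


# Hypothetical parameters (illustrative only, not taken from any force field).
e_stretch = stretch_energy(r=1.60, r_eq=1.50, k_stretch=100.0)

# A pure threefold term vanishes at 0 degrees and peaks at 60 degrees.
e_torsion_0 = torsion_energy(0.0, 0.0, 0.0, 0.0, 5.0)
e_torsion_60 = torsion_energy(math.radians(60.0), 0.0, 0.0, 0.0, 5.0)
```

Because the threefold term repeats every 120 degrees, `torsion_energy(phi, ...)` and `torsion_energy(phi + 2*pi/3, ...)` agree when only `k3` is non-zero, which is the periodicity the slide refers to.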
Non-Bonded Energy

$E^{non\text{-}bonded}(r) = E^{VDW}(r) + E^{Coulombic}(r)$

Non-bonded interactions involve a sum of van der Waals (VDW) interactions and Coulombic interactions.

$E^{VDW}(r) = \varepsilon \left[ \left( \frac{r^{o}}{r} \right)^{12} - 2 \left( \frac{r^{o}}{r} \right)^{6} \right]$

$E^{Coulombic}(r) = \frac{q q'}{r}$

r is the non-bonded distance, ε and r^o are parameters, and q and q′ are atomic charges.

Force Fields

Molecular mechanics models need to be parameterized either to experimental data or to the results of “good” calculations. The combination of functional form and parameters, known as a Force Field, defines a particular molecular mechanics model. Spartan uses the MMFF (Merck Molecular Force Field) model, specifically parameterized to describe conformational preferences of organic molecules.

Limitations of Molecular Mechanics

Because the strain energy is referenced to an individual Lewis structure, molecular mechanics may not be used to compare energies of molecules that are represented by different Lewis structures. It may be used to compare energies of isomers that share the same Lewis structure, for example, stereoisomers and, most important, different conformers of a molecule.

Molecular mechanics may be thought of as an elaborate interpolation scheme. It cannot be expected to perform well as it ventures outside the range of its experience (parameterization).

Range of Molecular Mechanics

Molecular mechanics calculations are dominated by calculation of the non-bonded energy, which scales as the square of the total number of atoms. Molecular mechanics calculations are practical on molecules containing thousands of atoms, for example, proteins, and (as will be described later) on smaller molecules that may have thousands of different shapes (conformers).
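The quadratic scaling of the non-bonded energy comes from the double sum over atom pairs. The sketch below shows that sum using the 12-6 van der Waals form and the simple Coulomb term given earlier. It is an illustration under stated assumptions: a single hypothetical `eps` and `r0` are used for every pair, units are left arbitrary, and no pairs are excluded (real force fields typically skip or scale directly bonded neighbours).

```python
import math
from itertools import combinations


def vdw_energy(r, eps, r0):
    """12-6 van der Waals term: eps * [(r0/r)^12 - 2 (r0/r)^6]."""
    ratio6 = (r0 / r) ** 6
    return eps * (ratio6 ** 2 - 2.0 * ratio6)


def coulomb_energy(r, q1, q2):
    """Coulomb term q1*q2/r (unit conversion factor omitted)."""
    return q1 * q2 / r


def nonbonded_energy(atoms, eps, r0):
    """Sum non-bonded terms over all atom pairs.

    atoms: list of (x, y, z, charge) tuples.
    Returns (total energy, number of pairs visited).
    """
    total = 0.0
    n_pairs = 0
    for a, b in combinations(atoms, 2):  # N*(N-1)/2 pairs
        r = math.dist(a[:3], b[:3])
        total += vdw_energy(r, eps, r0) + coulomb_energy(r, a[3], b[3])
        n_pairs += 1
    return total, n_pairs


# Hypothetical four-atom system with made-up charges and coordinates.
atoms = [
    (0.0, 0.0, 0.0, 0.2),
    (3.5, 0.0, 0.0, -0.2),
    (0.0, 3.5, 0.0, 0.1),
    (0.0, 0.0, 3.5, -0.1),
]
energy, n_pairs = nonbonded_energy(atoms, eps=0.3, r0=3.5)
```

Four atoms give six pairs; a thousand atoms give 499,500, which is why this term dominates the cost for large molecules.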
Molecular Structure in 3D

The combination of a builder and molecular mechanics provides a powerful tool for the examination and comparison of molecular structures and of properties derived from these structures, for example, volumes, surface areas and polar surface areas.

PSA and Absorption through a Membrane

Polar surface area (PSA), usually defined as the area of a space-filling model due to nitrogen, oxygen and attached hydrogens, anticipates the ability of a molecule to move across a membrane. High PSA signifies a hydrophilic molecule and little incentive to move into a non-polar membrane, while low PSA signifies a hydrophobic molecule and greater incentive.

How Do Chemists Use Molecular Structure?

Chemists use molecular structure (“sterics”) to make predictions about reactivity and selectivity. For example, a chemist might assign the product of LiAlH4 reduction of the bicyclic ketone by assuming that the hydride attacks the less-crowded carbonyl face.

[Scheme: LiAlH4 reduction of a bicyclic ketone to the corresponding alcohol (Paquette, 1979)]

This is reasonable for a rigid molecule, but is problematic for a flexible molecule where the 3-dimensional shape may not be known.

Very Few Molecules are Rigid!

The acyclic ketone shown below may exist in a variety of shapes, and different shapes lead to different products based on which carbonyl face is likely to be less crowded.

[Scheme: L-selectride reduction of an acyclic ketone to the corresponding alcohol (Tsuchihashi, 1984)]

How Many Conformers Are There?

A good rule of thumb is that each additional single bond multiplies the number of conformers by three. Start with butane, which has three conformers. Add another carbon to give pentane with nine conformers, another to give hexane with 27 conformers, and so on. Note that not all conformers are distinct; for example, butane has only two distinct conformers.

Three-member rings are rigid, four and five-member rings may be assumed to be rigid, and six-member rings comprising only sp3 centers typically exist only as (two) “chair” conformers.
Seven-member and larger rings generally exhibit several conformers.

Properties of Flexible Molecules

To obtain the value of a property of a flexible molecule, it is necessary to average over all possible conformers, weighting each based on its energy according to the Boltzmann equation. Relative to the lowest-energy conformer, an energy difference of 4 kJ/mol leads to a Boltzmann factor of ~0.2 at room temperature, an energy difference of 8 kJ/mol to a factor of ~0.04 and a difference of 12 kJ/mol to a factor of ~0.01. While only a few conformers are likely to contribute significantly to the average, it may be necessary to examine all conformers to identify these few.

Systematic Searching

Systematically “walking through” all combinations of bond and ring conformations is the only procedure that actually guarantees that the lowest-energy conformer will be located, and that a “correct” Boltzmann distribution will be obtained. “Real molecules” may exhibit hundreds or thousands of conformers, and systematic searching rapidly becomes impractical.

Monte-Carlo Searching

An alternative approach generates conformers by random conformational changes, with the decision to keep or discard a conformer (as a starting point for the next random move) based on its energy relative to the best conformer yet found. A so-called Monte-Carlo search nearly always locates the lowest-energy conformer (or a conformer that is very close in energy), even though it examines only a tiny fraction of the possible number of conformers. It also produces a correct Boltzmann distribution in the limit of a large number of moves.

Systematic vs. Monte-Carlo Searches

Lovastatin is representative of an important class of drugs that increase the synthesis of LDL receptor proteins and are widely used to reduce cholesterol levels. A Monte-Carlo search limited to 200 steps yields the same lowest-energy conformer as a full systematic search (>4000 conformers).
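Whichever search is used, the resulting conformer energies feed into the Boltzmann weighting described under Properties of Flexible Molecules. A minimal sketch of that weighting follows. The function names, the conformer energies and the per-conformer property values are hypothetical; the gas constant is the standard value in kJ/(mol K).

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # gas constant in kJ/(mol K)


def boltzmann_weights(energies_kj_mol, temperature=298.15):
    """Normalized Boltzmann weights for a list of conformer energies."""
    e_min = min(energies_kj_mol)
    rt = R_KJ_PER_MOL_K * temperature
    factors = [math.exp(-(e - e_min) / rt) for e in energies_kj_mol]
    total = sum(factors)
    return [f / total for f in factors]


def boltzmann_average(values, energies_kj_mol, temperature=298.15):
    """Boltzmann-weighted average of a per-conformer property."""
    weights = boltzmann_weights(energies_kj_mol, temperature)
    return sum(w * v for w, v in zip(weights, values))


# Hypothetical relative conformer energies (kJ/mol) and property values.
energies = [0.0, 4.0, 8.0, 12.0]
dipoles = [1.0, 2.0, 3.0, 4.0]

weights = boltzmann_weights(energies)
average_dipole = boltzmann_average(dipoles, energies)
```

The weights are normalized over the conformers supplied, so a conformer missed by the search silently shifts weight onto the others; this is one reason the completeness of the search matters.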
Spanning all Possible Molecular Shapes

In some applications it is more important to span the range of possible shapes that a molecule might assume than to provide a correct Boltzmann distribution. The most common example of this is “similarity analysis”, attempting to establish if a molecule with a particular shape is “similar” to any conformer of a flexible molecule.

Where the flexible molecule has too many conformers to be examined systematically, an alternative is to randomly sample all (systematically-generated) conformers. We refer to this as a conformer distribution.

Performance of the MMFF Model

At first glance, conformer energy differences obtained from the MMFF model appear to be quite good … in fact as good as we can reasonably expect from the best practical quantum chemical models to be introduced shortly.

Conformations of Acyclic Molecules (energy differences in kJ/mol)

molecule              low-energy/high-energy conformer   MMFF   expt.
n-butane              trans/gauche                        3.3    2.8
1-butene              skew/cis                            1.3    0.9
1,3-butadiene         trans/gauche                       10.5   12.1
acrolein              trans/cis                           8.0    7.1
N-methylformamide     trans/cis                           5.4    5.9
N-methylacetamide     trans/cis                          10.9    9.6
formic acid           cis/trans                          20.5   16.3
methyl formate        cis/trans                          22.1   19.9
methyl acetate        cis/trans                          34.7   35.6
propanal              eclipsed/anti                       2.1    2.8
2-methylpropanal      eclipsed/anti                       2.5    3.3
ethanol               anti/gauche                         0.8    0.5
methyl ethyl ether    anti/gauche                         6.3    6.3
methyl vinyl ether    cis/skew                            9.2    7.1
mean absolute error                                       1.1    –

A Closer Look at MMFF

A closer look reveals that these same molecules were used to parameterize MMFF. Comparisons that include molecules that were not employed in the parameterization reveal a more accurate (and discouraging) picture. Note, in particular, the poor result for 2-fluorotetrahydropyran (a model for carbohydrates).
Conformations of Cyclic Molecules (energy differences in kJ/mol)

molecule                      low-energy/high-energy conformer   MMFF   expt.
methylcyclohexane             equatorial/axial                    5.9    7.3
tert-butylcyclohexane         equatorial/axial                   26.4   22.6
cis-1,3-dimethylcyclohexane   equatorial/axial                   21.3   23.0
fluorocyclohexane             equatorial/axial                   -1.7    0.7
chlorocyclohexane             equatorial/axial                   -1.3    2.1
piperidine                    equatorial/axial                    3.8    2.2
N-methylpiperidine            equatorial/axial                   13.8   13.2
2-chlorotetrahydropyran       axial/equatorial                   -0.4    7.5
2-methylcyclohexanone         equatorial/axial                    5.4    8.8
3-methylcyclohexanone         equatorial/axial                    2.1    5.7, 6.5
4-methylcyclohexanone         equatorial/axial                    5.4    7.3, 8.8
mean absolute error                                               3.3    –

Aside … Interpreting Energy Profiles

It is possible not only to identify stable conformers and furnish conformer energy differences, but also to draw and interpret conformational energy profiles. The functional form of the torsional energy E(φ) is actually a truncated Fourier series, the individual terms of which are independent and may be interpreted independently.

$V(\phi) = \tfrac{1}{2} V_1 (1 - \cos\phi) + \tfrac{1}{2} V_2 (1 - \cos 2\phi) + \tfrac{1}{2} V_3 (1 - \cos 3\phi) = V_1(\phi) + V_2(\phi) + V_3(\phi)$

One, Two and Threefold Terms

The V1 term in n-butane gives the difference between syn and anti conformers and reflects the crowding of methyl groups. In 1,2-difluoroethane V1 reflects interactions of bond dipoles.

[Figure: syn vs. anti n-butane (“crowded” vs. “not crowded”) and syn vs. anti 1,2-difluoroethane (bond dipoles add vs. bond dipoles cancel)]

The V2 term in 1,3-butadiene reflects the desire to keep π systems coplanar. In hydrazine, V2 reflects the desire to keep the lone pairs perpendicular.

The V3 term reflects the need for single bonds to stagger.

Dimethyl Peroxide

The energy profile for rotation about the OO bond in dimethyl peroxide shows a single broad minimum. According to the Fourier analysis, this arises from a combination of the V1 term (keeping the methyl groups apart) and the V2 term (keeping the lone pairs perpendicular). The V3 term is much less important.
$E(\phi) = 26\,(1 - \cos\phi) + 11\,(1 - \cos 2\phi) + 2\,(1 - \cos 3\phi)$

Larger Molecules

Is the MMFF model good enough to distinguish conformers that are “reasonable” (and therefore need further consideration) from those that are “unreasonable” (and may be discarded)?

Lack of reliable experimental data for any but very simple molecules means that it is necessary to use calculated conformer energy differences as a standard. A high-level quantum chemical model will be used for this purpose. We shall see later that this model performs well for simple molecules where experimental conformational energy differences are known.

2-Benzylamino-1-propanol

4,6-Dimethyl-1-phenyl-5-hepten-3-one

Utility of the MMFF Model

The MMFF model appears to be good enough to distinguish conformers that are “reasonable” (and therefore need further consideration) from those that are “unreasonable” (and may be discarded).

Conformational analysis (at least the first steps of conformational analysis) is the most important application of molecular mechanics to “small-molecule” chemistry.

Identifying the “Important” Conformer

If the quantity of interest pertains to a system at equilibrium or to the product of a reaction under thermodynamic control, then the “important” conformer is the lowest-energy conformer. Where there are several low-energy conformers, these need to be weighted according to the Boltzmann equation.

Note that quantum chemical and molecular mechanics models apply to the gas phase, and the lowest-energy conformer may not be the conformer found in the solid state or in solution or for a molecule that is bound to a protein. Conformational changes may occur to allow effective crystal packing or to maximize interactions with a solvent or a protein host.

Gas vs. Aqueous Phase Conformation

The lowest-energy conformation of an isolated molecule is not necessarily that in solution. The most conspicuous difference is that (intramolecular) hydrogen bonding is likely to be less important.
For example, the lowest-energy conformer of the modified nucleoside acyclovir exhibits a hydrogen bond in the gas phase but not in aqueous solution.

[Figure: structure of acyclovir]

Solid vs. Gas-Phase Conformations

Are the conformations of molecules in the gas phase mirrored by conformations observed in the solid? If they are, then the ~500,000 experimental X-ray crystal structures are a very rich resource of conformational preferences for isolated molecules.

Limited comparisons suggest that the factors responsible for crystal packing do not necessarily lead to large changes in conformation. The conformer found in the crystal is typically either the same as the best gas-phase conformer or a conformer that is only a few kJ/mol higher in energy. There will be exceptions, the most obvious being for cases that are able to form intramolecular hydrogen bonds.

Solid vs. Gas-Phase Conformations for Common Drugs

molecule               number of conformers   E(best) – E(crystal) (kJ/mol)
clozapine                4                    0 (same)
loratadine              16                    0 (same)
dextromoramide          25                    0 (same)
chlorpromazine          35                    2
tamoxifen               43                    6
thioridazine            56                    0 (same)
risperidone             71                    2
quinidine               73                    8
loperamide (imodium)   157                    3
haloperidol            161                    5
astemizole             210                    7
cimetidine             229                    0 (same)

Conformations of Free vs. Protein-Bound Molecules

The most obvious exceptions occur where hydrogen bonding is possible, for example, molecules bound to proteins. The important question is whether knowledge of the conformation of an isolated molecule offers any insight as to whether it will “fit” inside a protein.

Limited comparisons do not lead to a clear picture. While the bound conformers of some compounds are the same as or very similar to those for the free molecule, the conformations of others are quite different. This suggests that specific binding offers greater “rewards” than crystal packing forces.

Free vs. Protein-Bound Conformations of Small Molecules

molecule             protein PDB ID(s)   # conf   E(best) – E(crystal) (kJ/mol)
podophyllotoxin      1sa1                17        2
indomethacin         1s2a                17        4
ibuprofen            1eqg                18        0 (same)
zopolrestat          1mar, 1frb          19        0 (same)
penicillin G         1fxv                21        1
mesopram             1xm6                30        3
piclamilast          1xm4, 1xon          31        3
loracarbef           1fcn                34       11
cilomilast           1xlx                35        4
diphenhydramine      2aot                39        6
ampicillin           1nx9                45       10
gleevec              1opj                60        9
trifluoroperazine    1a29                87       16

Quantum-Chemical Models

Schrödinger Equation to Molecular Orbital Theory

The Schrödinger Equation . . . Hydrogen Atom

The motions and interactions of nuclei and electrons are described by the Schrödinger equation, which may be solved exactly for the hydrogen atom.

$\left[ -\tfrac{1}{2} \nabla^2 - \frac{Z}{r} \right] \psi(r) = E\,\psi(r)$

The quantity in brackets represents the energy (kinetic plus potential) of an electron at a distance r from a nucleus of charge Z, and E is the resulting energy. The wavefunctions (ψ) depend on the electron coordinates and are the familiar s, p, d atomic orbitals. ψ² times a small volume gives the probability of finding the electron inside this volume (the electron density). This is the quantity measured in an X-ray diffraction experiment.

The Generalized Schrödinger Equation

The Schrödinger equation may be generalized.

$\hat{H} \Psi = E \Psi$

Ĥ is the Hamiltonian and describes both the kinetic and potential energies of nuclei and electrons. In atomic units, it is given by:

$\hat{H} = -\frac{1}{2} \sum_{i}^{electrons} \nabla_i^2 - \frac{1}{2} \sum_{A}^{nuclei} \frac{1}{M_A} \nabla_A^2 - \sum_{i}^{electrons} \sum_{A}^{nuclei} \frac{Z_A}{r_{iA}} + \sum_{i<j}^{electrons} \frac{1}{r_{ij}} + \sum_{A<B}^{nuclei} \frac{Z_A Z_B}{R_{AB}}$

Z is the nuclear charge, M_A is the ratio of the mass of nucleus A to the mass of an electron, and R_AB, r_iA and r_ij are distances involving nuclei A, B and electrons i, j.

Atomic Units

The atomic unit for length is the bohr and the atomic unit for energy is the hartree.

1 bohr = 0.52917 Å = 52.917 pm
1 hartree = 2625 kJ/mol = 627.5 kcal/mol

The energy of the proton is 0 hartrees (there is no electron). The energy of the hydrogen atom is -0.5 hartrees.
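Because calculated energies are reported in hartrees and distances in bohrs, converting to everyday units is a routine first step. The sketch below uses the conversion factors quoted on the Atomic Units slide; the constant and function names are hypothetical.

```python
# Conversion factors as quoted on the Atomic Units slide.
BOHR_TO_ANGSTROM = 0.52917
HARTREE_TO_KJ_PER_MOL = 2625.0
HARTREE_TO_KCAL_PER_MOL = 627.5


def bohr_to_angstrom(length_bohr):
    """Convert a length from bohrs to angstroms."""
    return length_bohr * BOHR_TO_ANGSTROM


def bohr_to_picometer(length_bohr):
    """Convert a length from bohrs to picometers (1 angstrom = 100 pm)."""
    return bohr_to_angstrom(length_bohr) * 100.0


def hartree_to_kj_per_mol(energy_hartree):
    """Convert an energy from hartrees to kJ/mol."""
    return energy_hartree * HARTREE_TO_KJ_PER_MOL


def hartree_to_kcal_per_mol(energy_hartree):
    """Convert an energy from hartrees to kcal/mol."""
    return energy_hartree * HARTREE_TO_KCAL_PER_MOL


# Energy of the hydrogen atom (-0.5 hartree) in kJ/mol and kcal/mol.
h_atom_kj = hartree_to_kj_per_mol(-0.5)
h_atom_kcal = hartree_to_kcal_per_mol(-0.5)
```

Energy differences of chemical interest are tiny on the hartree scale: 4 kJ/mol, a typical conformer energy difference, is only about 0.0015 hartree.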
Born-Oppenheimer Approximation

The Born-Oppenheimer approximation assumes that, because nuclei are much more massive than electrons, they do not move. This leads to the electronic Schrödinger equation.

$\hat{H}^{el} \Psi^{el} = E^{el} \Psi^{el}$

$\hat{H}^{el} = -\frac{1}{2} \sum_{i}^{electrons} \nabla_i^2 - \sum_{i}^{electrons} \sum_{A}^{nuclei} \frac{Z_A}{r_{iA}} + \sum_{i<j}^{electrons} \frac{1}{r_{ij}}$

The nuclear kinetic energy is zero, and the Coulomb repulsion energy between nuclei is a constant (added to the electronic energy). Nuclear mass does not appear in the electronic Schrödinger equation, and isotope effects have a different origin.

Hartree-Fock Approximation

The Hartree-Fock approximation insists that the electrons move independently of each other. In practice, each electron is confined to a spin orbital, χ, made up of a spatial function or molecular orbital, ψ, and a spin function, α or β. The latter can be thought of as an “accounting” device, ensuring that no more than two electrons occupy each molecular orbital. The many-electron wavefunction needs to be written as a sum of products of spin orbitals in the form of a determinant.

$\Psi = \frac{1}{\sqrt{N!}} \begin{vmatrix} \chi_1(1) & \chi_2(1) & \cdots & \chi_n(1) \\ \chi_1(2) & \chi_2(2) & \cdots & \chi_n(2) \\ \vdots & \vdots & & \vdots \\ \chi_1(N) & \chi_2(N) & \cdots & \chi_n(N) \end{vmatrix}$

Solving the Hartree-Fock Equations

The Hartree-Fock equations need to be solved using an iterative procedure. An electron is selected and a one-electron “Schrödinger equation” is solved in the presence of a field that is made up of all the remaining electrons. The solution is then folded back into the field, another electron is selected and the process repeated.

After all electrons have been considered, the resulting field is compared with that at the start. If they are the same within some preset tolerance, then the process is judged to have converged and is terminated. If they are different, the entire process is repeated. Such a procedure is commonly referred to as a self-consistent-field (SCF) procedure.

LCAO Approximation

The molecular orbitals are written as linear combinations of a finite set (a basis set) of prescribed functions known as basis functions, φ.
$\psi_i = \sum_{\mu}^{basis\ functions} c_{\mu i}\, \phi_\mu$

c are the (unknown) molecular orbital coefficients. Because the φ are centered on the atoms, they are commonly referred to as atomic orbitals. This is known as the Linear Combinations of Atomic Orbitals or LCAO approximation.

Roothaan-Hall Equations

Taken together, the Hartree-Fock and LCAO approximations lead to the Roothaan-Hall equations, the form of which is nearly the same as that of the Schrödinger equation.

$F c = \varepsilon S c$

ε are orbital energies, S is the overlap matrix (a measure of the extent that basis functions “see each other”), and F is the Fock matrix (analogous to Ĥ in the Schrödinger equation). Solution leads to the set of molecular orbital coefficients, c, and their associated energies, ε.

Origin of Hartree-Fock Models

Schrödinger equation
  → (nuclei don't move) electronic Schrödinger equation
  → (1. electrons move independently; 2. molecular “solutions” written in terms of atomic solutions) Hartree-Fock “Molecular Orbital” Models

Graphical Models

Chemistry in Pictures

As we shall see shortly, quantum chemical models are able to provide a variety of quantitative data, in the form of structures, energies and spectra. Before we get into these, it may be useful to talk about a more qualitative aspect, and in particular about information that is better conveyed as “pictures” than as tables of numbers. These include the electron density, which relates to molecular size and shape, and the electrostatic potential, which relates to molecular charge distribution.

Isosurfaces

There are two obvious ways to display a function that depends on three coordinates, in this case the x,y,z Cartesian coordinates of a molecule, on a two-dimensional screen (or printed page). One is to draw a two-dimensional cut (“slice”) through the function and display contour lines. The other is to define a surface of constant value, a so-called isovalue surface or isosurface.
$f(x,y,z) = \text{constant}$

The constant is chosen to reflect a particular physical observable of interest, for example, the shape of a molecule in the case of the electron density.

Molecular Orbitals

Molecular orbitals are commonly related to bonds and lone pairs. Their shapes may be able to suggest why a particular chemical reaction occurs as it does or does not occur as expected. For example, the fact that the HOMO (Highest-Occupied Molecular Orbital) in cyanide anion is more concentrated on carbon (on the right) than on nitrogen shows that cyanide will act as a carbon nucleophile.

N≡C⁻ + CH3–I → N≡C–CH3 + I⁻

Unoccupied Molecular Orbitals

Unoccupied molecular orbitals may also be informative. For example, the shape of the LUMO (Lowest-Unoccupied Molecular Orbital) of methyl iodide shows why iodide leaves following nucleophilic attack by cyanide. The LUMO is antibonding between carbon and iodine, meaning that donation of the electron pair from cyanide will cause the carbon-iodine bond to weaken and eventually break.

Hammond Postulate

The reason that molecular orbitals are able to anticipate chemical reactivity and selectivity follows from the Hammond Postulate. This says that for exothermic reactions (all useful reactions are exothermic) the transition state will resemble the reactants. Thus, the properties of reactants are expected to mirror those of transition states.

Fukui-Woodward-Hoffmann Rules

The Fukui-Woodward-Hoffmann Orbital Symmetry Rules are a direct consequence of the Hammond Postulate. They use the shapes (symmetries) of the HOMO and LUMO (Frontier Molecular Orbitals) to understand why some chemical reactions proceed whereas other reactions do not. For example, the fact that the HOMO in 1,3-butadiene may interact constructively with the LUMO in ethylene suggests that the two molecules will undergo Diels-Alder cycloaddition to form cyclohexene.

[Scheme: 1,3-butadiene + ethylene → cyclohexene]

HOMO-LUMO Gap and Diels-Alder Rates

Frontier orbitals may be used to anticipate the rates of reactions.
For example, the rates of Diels-Alder reactions are known to increase with π donors on the diene and π acceptors on the dienophile. Donors raise the HOMO energy and acceptors lower the LUMO energy. A decrease in the HOMO-LUMO gap leads to stronger interaction of the diene and dienophile and a decrease in activation energy.

[Figure: orbital energy diagram for the diene (HOMO) and dienophile (LUMO)]

Electron Density

The sum of products of coefficients over all occupied molecular orbitals gives rise to the density matrix, P, the elements of which are given by:

$P_{\mu\nu} = 2 \sum_{i}^{occupied\ molecular\ orbitals} c_{\mu i}\, c_{\nu i}$

The product of an element of the density matrix and its associated pair of atomic orbitals at a point in space, summed over all pairs of orbitals, gives the number of electrons at that point. This is termed the electron density, and is the quantity obtained in an X-ray diffraction experiment.

Different Regions of Electron Density

The electron density is largest around non-hydrogen atoms. This is the basis of the X-ray diffraction experiment. Regions of lower density reveal hydrogen atoms as well as “bonds” between atoms. An even lower value of the density provides overall molecular size and shape. The last is analogous to a conventional space-filling model and corresponds to 98-99% enclosure of the total number of electrons.

Electrostatic Potential

The electrostatic potential is the energy of interaction of a positive point charge with the nuclei and electrons of the molecule. A surface of constant negative electrostatic potential delineates regions in a molecule that are subject to electrophilic attack, for example, above and below the plane of the ring in benzene, and in the ring plane at nitrogen in pyridine.
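The density matrix and the electron density at a point, as defined under Electron Density, can be evaluated in a few lines. The sketch below is illustrative only: it assumes a hypothetical two-function basis of normalized s-type Gaussians and made-up molecular orbital coefficients, where a real calculation would take both from a converged SCF procedure.

```python
import math


def s_gaussian(alpha, center, point):
    """Normalized s-type Gaussian basis function evaluated at a point."""
    r2 = sum((p - c) ** 2 for p, c in zip(point, center))
    norm = (2.0 * alpha / math.pi) ** 0.75
    return norm * math.exp(-alpha * r2)


def density_matrix(c, n_occ):
    """P[mu][nu] = 2 * sum over occupied MOs i of c[mu][i] * c[nu][i]."""
    n = len(c)
    return [[2.0 * sum(c[mu][i] * c[nu][i] for i in range(n_occ))
             for nu in range(n)]
            for mu in range(n)]


def electron_density(p, basis, point):
    """rho(r) = sum over mu, nu of P[mu][nu] * phi_mu(r) * phi_nu(r)."""
    phi = [s_gaussian(alpha, center, point) for alpha, center in basis]
    n = len(phi)
    return sum(p[mu][nu] * phi[mu] * phi[nu]
               for mu in range(n) for nu in range(n))


# Hypothetical basis: (exponent, center) pairs for two s functions.
basis = [(0.4, (0.0, 0.0, -0.7)), (0.4, (0.0, 0.0, 0.7))]

# Hypothetical MO coefficients c[mu][i]; column 0 is the one occupied MO.
coefficients = [[0.5, 0.9],
                [0.5, -0.9]]

P = density_matrix(coefficients, n_occ=1)
rho_mid = electron_density(P, basis, (0.0, 0.0, 0.0))
rho_far = electron_density(P, basis, (0.0, 0.0, 5.0))
```

Evaluating `electron_density` on a grid of points and connecting the points where it equals a chosen constant is what produces the density isosurfaces described in these slides.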
Property Map

Because a density surface that encloses ~98-99% of the total number of electrons corresponds to the overall size and shape of a molecule, it can be used as a “canvas” on which to “paint” information about how a molecule presents itself to the world, for example, whether it is hydrophilic or hydrophobic.

A property map is made by coloring each point on such a surface according to the value of some “property”, for example, the electrostatic potential. By convention, colors toward red are used to designate small property values while colors toward blue are used to designate large property values.

Electrostatic Potential Map

An electrostatic potential map presents the value of the electrostatic potential at locations on a density surface. Red regions show excess negative charge and blue regions excess positive charge.

The map for benzene clearly shows that the π system attracts the positive charge while the σ system repels the charge.

Benzene Dimer

The electrostatic potential map for benzene shows opposite charge distribution for the σ and π systems. This explains why benzene dimer prefers to adopt a perpendicular instead of a parallel geometry.

Benzene Crystal

… and it explains why benzene does not crystallize in a stack, but instead prefers a perpendicular arrangement.

Potential Area and Absorption

We saw previously that polar surface area provided a semiquantitative account of the rates of transport across a biological membrane. An even better correlation is found with the polar area, the area on an electrostatic potential map where the absolute value of the electrostatic potential is >100 kJ/mol. This signals hydrophilic behavior.
Electrostatic Potential Maps and pKa’s

acid                     pKa     acid                      pKa
Cl3CCO2H                 0.7     HCO2H                     3.75
HO2CCO2H                 1.23    trans-ClCH=CHCO2H         3.79
Cl2CHCO2H                1.48    C6H5CO2H                  4.19
NCCH2CO2H                2.45    p-ClC6H4CH=CHCO2H         4.41
ClCH2CO2H                2.85    trans-CH3CH=CHCO2H        4.70
trans-HO2CCH=CHCO2H      3.10    CH3CO2H                   4.75
p-HO2CC6H4CO2H           3.51    (CH3)3CCO2H               5.03

Visual comparison shows that the electrostatic potential at the acidic hydrogen anticipates acid strength. A reasonable correlation is found between the maximum value of the electrostatic potential and experimental pKa.

Chromium Tricarbonyl as a Substituent

In the absence of electron-withdrawing groups, benzene resists nucleophilic aromatic substitution. For example, anisole is nonreactive while 4-cyanoanisole is reactive. Chromium tricarbonyl benzene complexes are also highly reactive.

[Scheme: reaction of anisole, 4-cyanoanisole and anisole chromium tricarbonyl with a nucleophile (Nu:); labels: no reaction, substitution]

Electrostatic potential maps for anisole, 4-cyanoanisole and anisole chromium tricarbonyl show that the effect of the chromium tricarbonyl group is similar to that of a para cyano group in promoting nucleophilic reactivity.

Vitamin E

Radicals can damage cells through reaction with unsaturated fatty acids found in cellular membranes. Vitamin E plays a role in defending cells by transferring hydrogen to radicals to give stable products that can then be excreted.

In order to be effective, vitamin E must be soluble in the cellular membrane. An electrostatic potential map shows a hydrophobic “hydrocarbon tail”, allowing it to collect in cellular membranes.

Spin Density Map

The spin density is the difference between the number of “spin up” and “spin down” electrons. A spin density map paints the value of the spin density onto the electron density surface. It provides a measure of “radical character”.

Back to Vitamin E

The spin density map for the radical formed upon hydrogen atom removal from vitamin E shows extensive delocalization of the unpaired electron. Vitamin E should form a stable radical.
Beyond Hartree-Fock Models

Electron Correlation

The Hartree-Fock approximation replaces instantaneous interactions between individual electrons by interactions between each electron and the field created by all the other electrons. As a result, electrons “get in each other's way” more than they should, leading to overestimation of the electron-electron repulsion energy and to too high a total energy.

Electron correlation accounts for coupling or correlation of electron motions and lowers the electron-electron repulsion energy (and the total energy). The correlation energy is defined as the difference between the Hartree-Fock energy and the experimental energy.

Configuration Interaction Models

One way to calculate the correlation energy is to combine the Hartree-Fock wavefunction with wavefunctions formed by promoting electrons from molecular orbitals that are occupied to molecular orbitals that are unoccupied. You can think of this as combining “ground-state” and “excited-state” wavefunctions.

[Figure: electron promotion from occupied molecular orbitals to unoccupied molecular orbitals]

Full Configuration Interaction

It can be shown that the energy corresponding to a wavefunction formed by combining the Hartree-Fock wavefunction Ψ0 and wavefunctions resulting from all possible electron promotions Ψs is identical to that obtained from exact solution of the Schrödinger equation.

$\Psi = a_0 \Psi_0 + \sum_{s>0} a_s \Psi_s$

To do this, referred to as full configuration interaction, is not possible because the number of excited states is infinite.

Limited Configuration Interaction Models

To reduce the number of electron promotions (excited states), specify the number of electrons involved, for example, limit to promotions involving only one electron, promotions involving two electrons, etc.

Considering single electron promotions only (Configuration Interaction Singles or CIS) does not lower the Hartree-Fock energy.
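The case for limiting the promotions is easiest to see by counting them. The sketch below assumes a finite set of spin orbitals (in a finite basis the count is finite but grows combinatorially) and ignores spin and spatial symmetry, which reduce the numbers in practice. The function names and the 10-occupied / 20-unoccupied example are hypothetical.

```python
from math import comb


def n_excitations(n_occ, n_virt, level):
    """Number of ways to promote `level` electrons from n_occ occupied
    spin orbitals into n_virt unoccupied spin orbitals."""
    return comb(n_occ, level) * comb(n_virt, level)


def n_full_ci(n_occ, n_virt):
    """All promotion levels, including the reference (level 0)."""
    return sum(n_excitations(n_occ, n_virt, k) for k in range(n_occ + 1))


# Hypothetical example: 10 occupied and 20 unoccupied spin orbitals.
singles = n_excitations(10, 20, 1)
doubles = n_excitations(10, 20, 2)
everything = n_full_ci(10, 20)
```

Even for this small example the count rises from a few hundred single promotions to several thousand doubles and to tens of millions once every promotion level is included, which is why truncated models are used.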
CID and CISD Configuration Interaction Models
CID (Configuration Interaction Doubles), involving two-electron promotions, is the simplest procedure that actually leads to lowering of the Hartree-Fock energy. CISD (Configuration Interaction Singles and Doubles) is a slightly better model that involves both one- and two-electron promotions.

Møller-Plesset Models
Another approach, leading to what are commonly known as Møller-Plesset models, is to assume that the Hartree-Fock energy E0 and wavefunction Ψ0 are solutions to an equation involving a Hamiltonian, Ĥ0, that is very close to the exact Schrödinger Hamiltonian, Ĥ. This being the case, Ĥ can be written as a sum of Ĥ0 and a small correction, V. λ is a dimensionless parameter.

$$\hat{H} = \hat{H}_0 + \lambda V$$

Succession of Møller-Plesset Models
Expanding the exact energy and wavefunction in terms of a power series of the Hartree-Fock energy and wavefunction yields:

$$E = E^{(0)} + \lambda E^{(1)} + \lambda^2 E^{(2)} + \lambda^3 E^{(3)} + \dots$$
$$\Psi = \Psi_0 + \lambda \Psi^{(1)} + \lambda^2 \Psi^{(2)} + \lambda^3 \Psi^{(3)} + \dots$$

Substituting these expansions into the Schrödinger equation and collecting terms in powers of λ leads to explicit expressions for the energy and wavefunction corrections. The sum of E(0) and E(1) is the Hartree-Fock energy.

The MP2 Model
E(2) (the first correction to the Hartree-Fock energy) may be written as a sum over occupied and unoccupied molecular orbitals from the Hartree-Fock wavefunction.

$$E^{(2)} = \sum_{i<j}^{\text{occ}} \sum_{a<b}^{\text{unocc}} \frac{\left[(ij\|ab)\right]^2}{\varepsilon_i + \varepsilon_j - \varepsilon_a - \varepsilon_b}$$

εi and εj are energies of occupied molecular orbitals, εa and εb are energies of unoccupied molecular orbitals, and (ij||ab) are integrals that account for changes in electron-electron interactions as a result of electron promotion. A correction can also be made to the wavefunction. The resulting model is termed MP2 (second-order Møller-Plesset).

Origin of the MP2 Model

Properties of “Limiting” Hartree-Fock and MP2 Models

Assessing Limiting Hartree-Fock and MP2 Models
Two classes of models are now defined: Hartree-Fock models represent the simplest possible treatment resulting from the Schrödinger equation and MP2 models represent the simplest possible treatment that accounts for electron correlation.
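The second-order energy described above, a sum over pairs of occupied and pairs of unoccupied orbitals, can be illustrated with a toy calculation. This is a minimal sketch, not a working MP2 program. It assumes the orbital energies and the integrals (ij||ab) are already available, assumes real integrals, and uses made-up numbers purely to exercise the loop. The function name and data layout are hypothetical.

```python
from itertools import combinations


def mp2_correction(eps_occ, eps_virt, integrals):
    """Second-order energy E(2) as a sum over occupied pairs i<j and
    unoccupied pairs a<b:  (ij||ab)^2 / (e_i + e_j - e_a - e_b).

    `integrals` maps (i, j, a, b) to the value of (ij||ab); missing
    entries are treated as zero (an assumption made for this sketch).
    """
    e2 = 0.0
    for i, j in combinations(range(len(eps_occ)), 2):
        for a, b in combinations(range(len(eps_virt)), 2):
            g = integrals.get((i, j, a, b), 0.0)
            # Denominator is negative because occupied orbitals lie
            # below unoccupied ones, so each term lowers the energy.
            denom = eps_occ[i] + eps_occ[j] - eps_virt[a] - eps_virt[b]
            e2 += g * g / denom
    return e2


if __name__ == "__main__":
    # Hypothetical orbital energies and a single hypothetical integral.
    eps_occ = [-1.0, -0.5]
    eps_virt = [0.5, 1.0]
    integrals = {(0, 1, 0, 1): 0.1}
    print(mp2_correction(eps_occ, eps_virt, integrals))
```

Because every denominator is negative, the correction can only lower the Hartree-Fock energy, consistent with the role of electron correlation described earlier.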
Up to this point we have been ignoring the LCAO approximation and will continue to do so for a bit longer. The geometries and reactions presented on the following slides make use of this approximation, but use a collection of atomic functions that is large enough to allow us not to worry about the details.

Equilibrium Geometries

How Good Are Experimental Geometries?
Gas-phase structures for ~1000 small molecules have been determined by microwave spectroscopy, but this field is not very active and very few additional structures can be expected. Bond lengths are typically accurate to better than 0.01 Å.
Structures of ~800,000 crystalline solids have been determined by X-ray diffraction and are available in several well-maintained collections. Except for bonds to hydrogen, which are poorly described, bond lengths from solid-phase structures are generally accurate to within 0.02-0.04 Å.
Calculations within 0.02 Å of experiment are a reasonable target.

Bond Lengths in Small Molecules

molecule                HF/6-311+G**   MP2/6-311+G**   expt.
diborane                1.779          1.768           1.763
ethane                  1.527          1.529           1.531
methylamine             1.454          1.465           1.471
methanol                1.399          1.422           1.421
methyl fluoride         1.364          1.389           1.383
methyl silane           1.878          1.877           1.867
methyl phosphine        1.856          1.856           1.862
methane thiol           1.819          1.813           1.819
methyl chloride         1.792          1.776           1.781
hydrazine               1.412          1.430           1.449
hydrogen peroxide       1.388          1.450           1.452
fluorine                1.330          1.417           1.412
disilane                2.373          2.342           2.327
hydrogen disulfide      2.075          2.083           2.055
chlorine                2.000          2.025           1.988
mean absolute error     0.024          0.010           –

Rationalizing Changes in Geometry
Bond lengths increase in moving from Hartree-Fock to MP2 models. This may be rationalized by noting that MP2 involves electron promotion from occupied molecular orbitals in the Hartree-Fock description to unoccupied orbitals. Where there are sufficient or excess electrons, occupied orbitals are either net bonding or non-bonding and unoccupied orbitals are net antibonding. Electron promotion therefore leads to bond lengthening.
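The "mean absolute error" row in the small-molecule table can be reproduced directly from the tabulated bond lengths. The sketch below does this with the values from that table. The function and variable names are illustrative only.

```python
def mean_absolute_error(calculated, reference):
    """Average of |calculated - reference| over paired values."""
    if len(calculated) != len(reference):
        raise ValueError("lists must have the same length")
    return sum(abs(c - r) for c, r in zip(calculated, reference)) / len(reference)


# Bond lengths (in the table's row order, diborane to chlorine).
hf = [1.779, 1.527, 1.454, 1.399, 1.364, 1.878, 1.856, 1.819,
      1.792, 1.412, 1.388, 1.330, 2.373, 2.075, 2.000]
mp2 = [1.768, 1.529, 1.465, 1.422, 1.389, 1.877, 1.856, 1.813,
       1.776, 1.430, 1.450, 1.417, 2.342, 2.083, 2.025]
expt = [1.763, 1.531, 1.471, 1.421, 1.383, 1.867, 1.862, 1.819,
        1.781, 1.449, 1.452, 1.412, 2.327, 2.055, 1.988]

if __name__ == "__main__":
    print("HF/6-311+G**  MAE:", round(mean_absolute_error(hf, expt), 3))
    print("MP2/6-311+G** MAE:", round(mean_absolute_error(mp2, expt), 3))
```

The same function can be applied to any of the other comparison tables in this section by swapping in the relevant columns.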
Bond Lengths in Hydrocarbons

bond   hydrocarbon         HF/6-311+G**   MP2/6-311+G**   expt.
C–C    but-1-yne-3-ene     1.438          1.430           1.431
       propyne             1.466          1.464           1.459
       1,3-butadiene       1.467          1.460           1.483
       propene             1.502          1.502           1.501
       cyclopropane        1.500          1.511           1.510
       propane             1.528          1.529           1.526
       cyclobutane         1.548          1.550           1.548
C=C    cyclopropene        1.276          1.305           1.300
       allene              1.295          1.314           1.308
       propene             1.320          1.341           1.318
       cyclobutene         1.323          1.352           1.332
       but-1-yne-3-ene     1.322          1.347           1.341
       1,3-butadiene       1.324          1.347           1.345
       cyclopentadiene     1.330          1.359           1.345
mean absolute error        0.010          0.006           –

Overall …
Both Hartree-Fock and MP2 models provide solid accounts of equilibrium geometry. Specifically, both give bond lengths to within 0.02 Å, the error commonly associated with molecular structures obtained from X-ray crystallography.
Bond lengths from the Hartree-Fock model are almost always shorter than experimental values (and than bond lengths obtained from the MP2 model). This may be rationalized by recognizing that improvement of the Hartree-Fock model involves mixing of excited-state wavefunctions into the ground-state wavefunction.

Reaction Energies

Total Energy vs. Heat of Formation
The heat of formation is the enthalpy at 298 K of a reaction in which a molecule is converted to a set of standard products, one for each different element. For example, the heat of formation of ethylene is given by the reaction:

C2H4 → 2C (graphite) + 2H2 (gas)

The total energy is the energy at 0 K of a reaction that splits a molecule into its isolated nuclei and electrons. For example, the total energy of ethylene is given by the reaction:

C2H4 → 2C6+ + 4H+ + 16e–

Either is OK for thermochemical calculations.

Potential Energy Surfaces . . . Thermodynamics
The relative energy of reactants and products (minima on a potential energy surface) is related to the thermodynamic heat or enthalpy (∆H) of reaction. The ratio of products to reactants depends on temperature as given by the Boltzmann equation.
$$\frac{[\text{products}]}{[\text{reactants}]} = \exp\left[-\frac{E_{\text{products}} - E_{\text{reactants}}}{kT}\right]$$

Eproducts and Ereactants are the energies of products and reactants on the potential energy diagram, T is the temperature (in K) and k is the Boltzmann constant.

Potential Energy Surfaces . . . Thermodynamic Product Ratios
Product ratios follow directly from energy differences. At room temperature, approximate ratios are:

ΔE (kJ/mol)    major:minor
2              80:20
4              90:10
8              95:5
12             99:1

Potential Energy Surfaces . . . Thermodynamic Product
The thermodynamic product for a reaction where two or more different products are possible is that with the lowest energy, irrespective of pathway.

[Figure: energy versus reaction coordinate.]

Thermodynamic product ratios depend only on the difference in reactant and product energies and not on the pathway connecting the two. They also depend on temperature.

Relating Calculations to Experimental Thermochemical Data
Two corrections are needed to allow calculated energies to be compared with experimental enthalpies. The first accounts for the change in temperature from 0 K to some value T:

$$\Delta H(T) = H_{tr}(T) + H_{rot}(T) + \Delta H_{vib}(T) + RT$$
$$H_{tr}(T) = \tfrac{3}{2}RT$$
$$H_{rot}(T) = \tfrac{3}{2}RT \quad (RT \text{ for a linear molecule})$$
$$\Delta H_{vib}(T) = H_{vib}(T) - H_{vib}(0) = Nh \sum_{i}^{\text{normal modes}} \frac{\nu_i}{e^{h\nu_i/kT} - 1}$$

νi are vibrational frequencies, R, k and h are the gas, Boltzmann’s and Planck’s constants and N is Avogadro’s number.

Relating Calculations to Experimental Thermochemical Data
The second correction accounts for the fact that the calculation refers to a stationary molecule whereas the experiment refers to a molecule in its lowest vibrational state. The so-called zero-point energy is given by:

$$H_{vib}(0) = \varepsilon_{\text{zero-point}} = \frac{1}{2} h \sum_{i}^{\text{normal modes}} \nu_i$$

νi are vibrational frequencies and h is Planck’s constant. As written, this is the value per molecule; multiply by Avogadro’s number N for a molar quantity.

Entropies and Gibbs Energies
Calculation of the Gibbs energy (G = H – TS) requires the entropy. n is the number of moles, M is the mass, I are the moments of inertia, ν are the frequencies and s is the symmetry number.
$$S = S_{tr} + S_{rot} + S_{vib}$$
$$S_{tr} = nR\left\{\frac{3}{2} + \ln\left[\left(\frac{2\pi M k T}{h^2}\right)^{3/2} \frac{nRT}{P}\right]\right\}$$
$$S_{rot} = nR\left\{\frac{3}{2} + \ln\left[\frac{1}{s}\left(\frac{\pi}{v_A v_B v_C}\right)^{1/2}\right]\right\}$$
$$S_{vib} = nR \sum_i \left\{u_i\left(e^{u_i} - 1\right)^{-1} - \ln\left(1 - e^{-u_i}\right)\right\}$$
$$v_A = \frac{h^2}{8\pi^2 I_A kT}, \quad v_B = \frac{h^2}{8\pi^2 I_B kT}, \quad v_C = \frac{h^2}{8\pi^2 I_C kT}, \quad u_i = \frac{h\nu_i}{kT}$$

Be Careful!
These expressions make use of the harmonic approximation, which, while valid for large frequencies, is not appropriate for very low frequencies. In particular, the vibrational components of the temperature correction to the enthalpy and of the entropy are both dominated by low-frequency vibrations and are subject to considerable uncertainty. In practice, these are set to ½ RT and ½ R, respectively, for each mode with a frequency below 300 cm-1.

Reaction Types
Chemical reactions may be divided into three categories depending on the extent to which overall bonding is maintained.
The first category comprises reactions that lead to a change in the total number of electron pairs, for example, homolytic bond dissociation reactions.

H–H → H• + H•     homolytic bond dissociation

You can immediately see the problem. The energy of the product (separated hydrogen atoms, each with a single electron) is exact within the Hartree-Fock approximation, while the energy of the reactant is too high. The Hartree-Fock bond energy will be too small.

Homolytic Bond Dissociation Energies

bond dissociation reaction        HF/6-311+G**   MP2/6-311+G**   expt.
CH3–CH3 → CH3• + CH3•             276            406             406
CH3–NH2 → CH3• + NH2•             238            389             389
CH3–OH → CH3• + OH•               243            410             410
CH3–F → CH3• + F•                 289            469             477
NH2–NH2 → NH2• + NH2•             138            310             305
HO–OH → OH• + OH•                 -8             218             230
F–F → F• + F•                     -163           121             159
mean absolute error               168            9               –

Reaction Types
The second category comprises reactions that conserve the total number of bonds and the total number of lone pairs, for example, structural isomerization.

CH2=C=CH2 → CH3C≡CH     structural isomerization

These and related reactions are among the most common.

Energies of Structural Isomers

formula (reference)       isomer                    HF/6-311+G**   MP2/6-311+G**   expt.
C2H3N (acetonitrile)      methyl isocyanide         88             112             88
C2H4O (acetaldehyde)      oxirane                   134            117             113
C2H4O2 (acetic acid)      methyl formate            71             75              75
C2H6O (ethanol)           dimethyl ether            46             59              50
C3H4 (propyne)            allene                    8              21              4
                          cyclopropene              117            100             92
C3H6 (propene)            cyclopropane              42             21              29
C4H6 (1,3-butadiene)      2-butyne                  29             21              38
                          cyclobutene               63             38              46
                          bicyclo[1.1.0]butane      138            92              109
mean absolute error                                 12             11              –

Reaction Types
The third category comprises reactions that conserve the numbers of each kind of chemical bond, for example, basicity comparisons.

(CH3)3NH+ + NH3 → (CH3)3N + NH4+     relative basicity

In this category are many important reactions, including those that compare regioisomers and stereoisomers. We use basicity comparisons to take advantage of the availability of high-quality experimental thermochemical data (gas-phase proton affinities).

Relative Base Strengths

base                      HF/6-311+G**   MP2/6-311+G**   expt.
aniline                   25             21              29
methylamine               49             46              45
aziridine                 66             46              52
ethylamine                61             54              58
dimethylamine             81             75              76
pyridine                  77             63              76
tert-butylamine           83             71              81
cyclohexylamine           83             75              81
azetidine                 95             79              90
pyrrolidine               103            92              95
trimethylamine            102            92              95
piperidine                106            92              100
diazabicyclooctane        124            105             110
N-methylpyrrolidine       117            105             112
N-methylpiperidine        117            109             118
quinuclidine              141            121             130
mean absolute error       6              6               –

Overall …
Homolytic bond dissociation energies from Hartree-Fock models are always significantly smaller than experimental enthalpies. This can be traced back to the difference in the number of electron pairs. Bond energies from MP2 models are in good accord with experimental values.
The energies of reactions that maintain overall bond count are reasonably well described with both Hartree-Fock and MP2 models.
Energies of reactions that maintain individual bond counts are reasonably well described with both Hartree-Fock and MP2 models.

Conformational Energy Differences

Conformations of Acyclic Molecules

molecule                       low-energy/high-energy conformer   HF/6-311+G**   MP2/6-311+G**   expt.
n-butane                       trans/gauche                       4.2            2.1             2.8
1-butene                       skew/cis                           2.9            2.1             0.9
1,3-butadiene                  trans/gauche                       13.4           10.5            12.1
acrolein                       trans/cis                          8.4            9.2             7.1
N-methylformamide              trans/cis                          4.6            5.0             5.9
N-methylacetamide              trans/cis                          12.1           10.0            9.6
formic acid                    cis/trans                          22.6           19.2            16.3
methyl formate                 cis/trans                          25.1           23.8            19.9
methyl acetate                 cis/trans                          45.2           41.0            35.6
propanal                       eclipsed/anti                      3.3            3.8             2.8
2-methylpropanal               eclipsed/anti                      1.7            1.7             3.3
ethanol                        anti/gauche                        0.8            0.0             0.5
methyl ethyl ether             anti/gauche                        7.5            5.9             6.3
methyl vinyl ether             cis/skew                           7.5            10.9            7.1
mean absolute error                                               2.5            1.9             –

Conformations of Cyclic Molecules

molecule                       low-energy/high-energy conformer   HF/6-311+G**   MP2/6-311+G**   expt.
methylcyclohexane              equatorial/axial                   9.6            7.1             7.3
tert-butylcyclohexane          equatorial/axial                   25.9           21.3            22.6
cis-1,3-dimethylcyclohexane    equatorial/axial                   27.6           22.2            23.0
fluorocyclohexane              equatorial/axial                   0.4            0.4             0.7
chlorocyclohexane              equatorial/axial                   3.8            2.1             2.1
piperidine                     equatorial/axial                   3.8            3.3             2.2
N-methylpiperidine             equatorial/axial                   16.3           15.5            13.2
2-chlorotetrahydropyran        axial/equatorial                   11.3           12.1            7.5
2-methylcyclohexanone          equatorial/axial                   8.4            6.3             8.8
3-methylcyclohexanone          equatorial/axial                   7.1            6.7             5.7,6.5
4-methylcyclohexanone          equatorial/axial                   8.8            5.4             7.3,8.8
mean absolute error                                               2.0            2.1             –

Overall …
Both Hartree-Fock and MP2 models appear to provide a reasonable account of conformational energy differences. The two sets of results are not the same but, given the paucity of experimental data and significant error bounds on some of these data, it is difficult to make generalizations.

Density Functional Theory
While MP2 models offer only modest improvements over Hartree-Fock models for geometries and for some types of energy comparisons, they are required to describe the energies of reactions where bonds are made or broken, including activation energies (barriers to chemical reactions). MP2 models are significantly more costly than Hartree-Fock models and are more limited in their range of application.
Density functional models provide an alternative for estimating the correlation energy, by replacing the need to mix ground and excited-state wavefunctions with an explicit term in the Hamiltonian. This results in significantly lower computation cost and consequently in a significantly larger range of application.

Formulation of Density Functional Theory
The hydrogen atom is not the only problem for which the Schrödinger equation can be solved exactly. Another is an electron gas, solution of which leads to a functional form for the exchange/correlation energy, EXC, in terms of the electron density and the gradient of the density. EXC may be combined with Hartree-Fock terms for the kinetic energy, ET, electron-nuclear interaction energy, EV, and Coulomb energy, EJ.

E = ET + EV + EJ + EXC

Minimizing E with respect to the unknown orbital coefficients yields the Kohn-Sham equations, which are analogous to the Roothaan-Hall equations in Hartree-Fock theory.

Origin of Density Functional Models

Problems with Density Functional Models
The magnitude of the error in the energy obtained from density functional models does not scale with the size of the molecule. Size consistency (as this scaling behavior is commonly termed) is important, and its absence is potentially a serious problem.
Even more important is the absence of a systematic way to improve functionals in order to achieve an arbitrary level of accuracy.

Comparing B3LYP Density Functional and MP2 Models

Bond Lengths in Small Molecules

molecule                B3LYP/6-311+G**   MP2/6-311+G**   expt.
diborane                1.765             1.768           1.763
ethane                  1.530             1.529           1.531
methylamine             1.465             1.465           1.471
methanol                1.424             1.422           1.421
methyl fluoride         1.395             1.389           1.383
methyl silane           1.886             1.877           1.867
methyl phosphine        1.873             1.856           1.862
methane thiol           1.836             1.813           1.819
methyl chloride         1.806             1.776           1.781
hydrazine               1.432             1.430           1.449
hydrogen peroxide       1.454             1.450           1.452
fluorine                1.408             1.417           1.412
disilane                2.356             2.342           2.327
hydrogen disulfide      2.114             2.083           2.055
chlorine                2.054             2.025           1.988
mean absolute error     0.018             0.010           –

Bond Lengths in Hydrocarbons

bond   hydrocarbon         B3LYP/6-311+G**   MP2/6-311+G**   expt.
C–C    but-1-yne-3-ene     1.423             1.430           1.431
       propyne             1.457             1.464           1.459
       1,3-butadiene       1.456             1.460           1.483
       propene             1.500             1.502           1.501
       cyclopropane        1.509             1.511           1.510
       propane             1.532             1.529           1.526
       cyclobutane         1.554             1.550           1.548
C=C    cyclopropene        1.291             1.305           1.300
       allene              1.304             1.314           1.308
       propene             1.331             1.341           1.318
       cyclobutene         1.339             1.352           1.332
       but-1-yne-3-ene     1.338             1.347           1.341
       1,3-butadiene       1.338             1.347           1.345
       cyclopentadiene     1.348             1.359           1.345
mean absolute error        0.008             0.006           –

Homolytic Bond Dissociation Energies

bond dissociation reaction        B3LYP/6-311+G**   MP2/6-311+G**   expt.
CH3–CH3 → CH3• + CH3•             384               406             406
CH3–NH2 → CH3• + NH2•             364               389             389
CH3–OH → CH3• + OH•               389               410             410
CH3–F → CH3• + F•                 460               469             477
NH2–NH2 → NH2• + NH2•             289               310             305
HO–OH → OH• + OH•                 205               218             230
F–F → F• + F•                     134               121             159
mean absolute error               21                9               –

Energies of Structural Isomers

formula (reference)       isomer                    B3LYP/6-311+G**   MP2/6-311+G**   expt.
C2H3N (acetonitrile)      methyl isocyanide         100               112             88
C2H4O (acetaldehyde)      oxirane                   121               117             113
C2H4O2 (acetic acid)      methyl formate            67                75              75
C2H6O (ethanol)           dimethyl ether            46                59              50
C3H4 (propyne)            allene                    -8                21              4
                          cyclopropene              100               100             92
C3H6 (propene)            cyclopropane              38                21              29
C4H6 (1,3-butadiene)      2-butyne                  38                21              38
                          cyclobutene               63                38              46
                          bicyclo[1.1.0]butane      130               92              109
mean absolute error                                 11                11              –

Relative Base Strengths

Me3NH+ + NH3 → Me3N + NH4+

base                      B3LYP/6-311+G**   MP2/6-311+G**   expt.
aniline                   24                21              29
methylamine               46                46              45
aziridine                 56                46              52
ethylamine                60                54              58
dimethylamine             76                75              76
pyridine                  79                63              76
tert-butylamine           82                71              81
cyclohexylamine           83                75              81
azetidine                 89                79              90
pyrrolidine               99                92              95
trimethylamine            94                92              95
piperidine                102               92              100
diazabicyclooctane        115               105             110
N-methylpyrrolidine       110               105             112
N-methylpiperidine        117               109             118
quinuclidine              132               121             130
mean absolute error       2                 6               –

Conformations of Acyclic Molecules

molecule                       low-energy/high-energy conformer   B3LYP/6-311+G**   MP2/6-311+G**   expt.
n-butane                       trans/gauche                       3.8               2.1             2.8
1-butene                       skew/cis                           2.9               2.1             0.9
1,3-butadiene                  trans/gauche                       14.6              10.5            12.1
acrolein                       trans/cis                          9.2               9.2             7.1
N-methylformamide              trans/cis                          4.2               5.0             5.9
N-methylacetamide              trans/cis                          10.0              10.0            9.6
formic acid                    cis/trans                          18.8              19.2            16.3
methyl formate                 cis/trans                          22.1              23.8            19.9
methyl acetate                 cis/trans                          38.0              41.0            35.6
propanal                       eclipsed/anti                      3.8               3.8             2.8
2-methylpropanal               eclipsed/anti                      1.7               1.7             3.3
ethanol                        anti/gauche                        0.0               0.0             0.5
methyl ethyl ether             anti/gauche                        6.3               5.9             6.3
methyl vinyl ether             cis/skew                           8.4               10.9            7.1
mean absolute error                                               1.4               1.9             –

Conformations of Cyclic Molecules

molecule                       low-energy/high-energy conformer   B3LYP/6-311+G**   MP2/6-311+G**   expt.
methylcyclohexane              equatorial/axial                   10.0              7.1             7.3
tert-butylcyclohexane          equatorial/axial                   22.2              21.3            22.6
cis-1,3-dimethylcyclohexane    equatorial/axial                   25.1              22.2            23.0
fluorocyclohexane              equatorial/axial                   0.8               0.4             0.7
chlorocyclohexane              equatorial/axial                   2.9               2.1             2.1
piperidine                     equatorial/axial                   2.9               3.3             2.2
N-methylpiperidine             equatorial/axial                   16.7              15.5            13.2
2-chlorotetrahydropyran        axial/equatorial                   15.5              12.1            7.5
2-methylcyclohexanone          equatorial/axial                   8.4               6.3             8.8
3-methylcyclohexanone          equatorial/axial                   6.7               6.7             5.7,6.5
4-methylcyclohexanone          equatorial/axial                   8.4               5.4             7.3,8.8
mean absolute error                                               2.9               2.1             –

Performance of B3LYP and MP2 Models
Geometries obtained from B3LYP and MP2 models are very similar and generally improved over geometries obtained from Hartree-Fock models. There are exceptions.
For example, the geometries of molecules incorporating second-row elements are better described with Hartree-Fock models.
Bond dissociation energies from the B3LYP model are slightly inferior to those from the MP2 model, although both are greatly improved relative to Hartree-Fock models.
B3LYP and MP2 models provide comparable results for reactions where bonding is partially or fully maintained.
Conformational energy differences obtained from B3LYP and MP2 models are comparable in quality.

Practical Models … LCAO Approximation
The third approximation connecting the Schrödinger equation to Hartree-Fock and post-Hartree-Fock models is the LCAO approximation, which writes the molecular orbitals ψ as linear combinations of a finite set (a basis set) of prescribed functions φ (basis functions).

$$\psi_i = \sum_{\mu}^{\text{basis functions}} c_{\mu i}\, \phi_\mu$$

c are the (unknown) molecular orbital coefficients. There are two issues: the kind of functions and the number of functions.

Gaussian Basis Sets
The obvious choice for the individual functions is a polynomial in the Cartesian coordinates times an exponential function, that is, the form of the exact solutions for the hydrogen atom. However, exponential functions give rise to expressions that are difficult to solve analytically. Gaussian functions (an exponential in the square of the distance from the origin rather than the distance itself) lead to simpler mathematics and are used instead.
All practical quantum chemical calculations now use Gaussian functions, although in the past there was interest in “thinking” about using exponential functions.

Minimal Basis Sets … STO-3G
A minimal basis set comprises the smallest set of functions required to accommodate all of the atom’s electrons and still maintain its overall spherical symmetry. This is a single (1s) function for hydrogen and helium, a set of five functions (1s, 2s, 2px, 2py, 2pz) for lithium to neon and a set of nine functions (1s, 2s, 2px, 2py, 2pz, 3s, 3px, 3py, 3pz) for sodium to argon.
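The per-atom counts just given (one function for hydrogen and helium, five for lithium to neon, nine for sodium to argon) make it easy to work out the size of a minimal basis set for a whole molecule. The sketch below does so. The function names and the dictionary representation of a molecular formula are illustrative choices, not part of any particular program.

```python
# Element symbols in order of atomic number, hydrogen (1) to argon (18).
SYMBOLS = ["H", "He", "Li", "Be", "B", "C", "N", "O", "F", "Ne",
           "Na", "Mg", "Al", "Si", "P", "S", "Cl", "Ar"]


def minimal_basis_functions(symbol: str) -> int:
    """Number of functions on one atom in a minimal basis set."""
    z = SYMBOLS.index(symbol) + 1  # raises ValueError beyond argon
    if z <= 2:
        return 1   # 1s
    if z <= 10:
        return 5   # 1s, 2s, 2px, 2py, 2pz
    return 9       # adds 3s, 3px, 3py, 3pz


def count_for_molecule(formula: dict) -> int:
    """Total functions for a formula given as {symbol: number of atoms}."""
    return sum(minimal_basis_functions(s) * n for s, n in formula.items())


if __name__ == "__main__":
    print("ethylene, C2H4:", count_for_molecule({"C": 2, "H": 4}))
    print("methyl chloride, CH3Cl:", count_for_molecule({"C": 1, "H": 3, "Cl": 1}))
```

For ethylene this gives 2 × 5 + 4 × 1 = 14 functions, which is the dimension of the LCAO expansion for each molecular orbital.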
Each of the functions in the STO-3G minimal basis set is expanded in terms of three Gaussian functions. Gaussian exponents and linear coefficients have been determined by least squares as best fits to so-called Slater-type (exponential) functions, that is, hydrogen atom solutions. This is a “have your cake and eat it too” approach.

Shortcomings of Minimal Basis Sets
Minimal basis sets suffer from two shortcomings. The first is that the basis functions are either themselves spherical or come in sets that taken together describe a sphere. This implies that atoms with spherical or nearly spherical molecular environments will be better described than atoms with aspherical environments.

Split-Valence Basis Sets … 3-21G
A split-valence basis set provides “inner” and “outer” sets of valence basis functions that may be combined to account for different environments. For example, a p orbital from the inner set may be emphasized to construct a σ bond, while a p orbital from the outer set may be emphasized to construct a π bond.

[Figure: pσ = inner + outer, with the inner component emphasized; pπ = inner + outer, with the outer component emphasized.]

The 3-21G basis set uses three Gaussians for each of the core orbitals and two and one Gaussians for each of the orbitals in the inner and outer valence sets.

Basis Set Nomenclature
Basis sets may be thought of as divided into core and valence regions. “Chemistry” is a function of the valence, and to the maximum extent possible this is where emphasis (functions) needs to be placed. Typically each core atomic orbital is represented by a single set of functions and each valence atomic orbital by two (or more) sets of functions.
The number in the basis set designation to the left of the “–” indicates the number of Gaussian functions used to construct core atomic orbitals, and the numbers to the right indicate the numbers of Gaussians used to construct valence atomic orbitals.
For example, 3-21G uses three Gaussians to describe each of the core orbitals and two and one Gaussians to describe each of the orbitals in the inner and outer valence sets.

Shortcomings of Minimal Basis Sets
The second shortcoming of a minimal basis set is that the basis functions are atom centered. This restricts their flexibility to describe off-center electron distributions. The “obvious” solution is to move the functions away from the nuclei. This is not a viable option, as it raises the question of “where to put the functions”.

Polarization Basis Sets … 6-31G* and 6-31G**
A polarization basis set provides d-type functions on main-group elements (*) and (optionally) p-type functions on hydrogen (**) to allow displacement of electron distributions from the nuclei. These are available to combine (hybridize) with the valence orbitals.

[Figure: valence orbitals combined with polarization functions.]

6-31G* and 6-31G** basis sets are examples of the two types of polarization basis sets. Here, each polarization function is made up of a single Gaussian function.

Larger Basis Sets … 6-311+G**
Larger basis sets have been formulated to provide more extensive splitting of the valence shell and to incorporate functions that extend far from the nuclei, so-called diffuse or “+” functions. The latter may be needed for calculations on anions and on excited states, where electrons may drift far from nuclear centers.
The 6-311+G** basis set splits the valence into three parts, adds polarization functions to all atoms and adds a diffuse function (made up of a single Gaussian function) to each non-hydrogen atom.

Theoretical Models
Specification of approximations to the Schrödinger equation and a basis set leads to a Theoretical Model.
Most important, a theoretical model needs to be well defined, meaning that it depends only on the locations and identities of the nuclei, on the total number of electrons and the number of unpaired electrons, and it needs to be practical for molecules and problems of interest.
Only slightly less important, it should be unbiased, meaning that little or no “chemical intuition” is used in its formulation, and it should be size consistent, meaning that the error in the energy scales with the size of the molecule.
If possible, a theoretical model should be variational, meaning that the energy is higher than the energy from exact solution of the Schrödinger equation.

Properties of Theoretical Models
Hartree-Fock models are well defined, practical for molecules with up to 50 heavy (non-hydrogen) atoms, unbiased, size consistent and variational.
MP2 models (and all MPN models) are well defined, applicable to molecules with fewer than 20 heavy atoms, unbiased and size consistent, but they are not variational.
Density functional models are applicable to molecules with fewer than 50 heavy atoms and are unbiased but are neither size consistent nor variational. While density functional models are well defined given the form of the term added to the Hartree-Fock Hamiltonian, there is no obvious way to “improve” on this term.

Simplifying Hartree-Fock Models

Are Simpler Models Justified?
Is there a need for quantum chemical models that are simpler (“less costly”) than any that have been presented thus far? Certainly much less than there was only a decade ago and much more than there will be in another decade.
Present generation personal computers can easily perform Hartree-Fock and density functional calculations on molecules comprising 50 heavy atoms and more (getting close to the maximum size of molecules that can actually be made).
Missing are quantum chemical models that are applicable to biopolymers (proteins), with thousands to tens of thousands of atoms. While these can already be handled using molecular mechanics, methods based on quantum mechanics might be of interest.

Semi-Empirical Models

The NDDO Approximation
Semi-empirical models follow from the Hartree-Fock model by introducing a rather draconian approximation.
Known as the NDDO (Neglect of Diatomic Differential Overlap) approximation, this insists that atomic orbitals residing on different atomic centers do not overlap (“see each other”). This reduces the size dependence of the limiting step in Hartree-Fock models from O(N^4) to O(N^2), where N is the total number of functions in the atomic basis set.
A minor step in Hartree-Fock models (matrix diagonalization) is O(N^3) and dominates the computation for semi-empirical models, limiting practical calculations to 200-300 atoms at most. Proteins are not in the cards.

Form and Parameters for Semi-Empirical Models
The functional form of all present generation semi-empirical models is nearly the same. What differs are the values of atomic parameters introduced in order to overcome the “damage” incurred by the NDDO approximation.
In practice, upwards of 20 parameters per atom are used in an attempt to reproduce geometries, heats of formation, dipole moments and ionization potentials. Experience suggests that this is too ambitious a goal.

Basis Sets for Semi-Empirical Models
Semi-empirical models use a minimal valence basis set of exponential functions. Hydrogen is represented by a single (1s) function.
Main-group elements are represented by a single s-type function and a set of three p-type functions.

2s, 2px, 2py, 2pz     first-row element
3s, 3px, 3py, 3pz     second-row element

Transition metals are represented by a set of five d-type functions, one s-type function and a set of three p-type functions.

3dx2-y2, 3dz2, 3dxy, 3dxz, 3dyz, 4s, 4px, 4py, 4pz     first-row
4dx2-y2, 4dz2, 4dxy, 4dxz, 4dyz, 5s, 5px, 5py, 5pz     second-row

Origin of Semi-Empirical Models

Choosing a Theoretical Model
The choice of theoretical model ultimately rests with a balance of how well it performs for the quantity of interest (geometry, reaction energy or conformation as already discussed, or other chemically important quantities such as activation energies and spectra) and how easily (if at all) it can be applied to the molecules of interest.
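The "how easily it can be applied" side of this balance is governed largely by how cost grows with the number of basis functions N, as in the O(N^4), O(N^3) and O(N^2) steps mentioned for Hartree-Fock and semi-empirical models above. The sketch below only illustrates what those formal exponents imply. It ignores prefactors and any savings from integral screening, so it is not a timing model. The function name is hypothetical.

```python
def relative_cost(size_ratio: float, order: int) -> float:
    """Factor by which the cost of an O(N^order) step grows when the
    number of basis functions N is multiplied by `size_ratio`."""
    return size_ratio ** order


if __name__ == "__main__":
    for order, label in [(4, "O(N^4) step"),
                         (3, "O(N^3) step (matrix diagonalization)"),
                         (2, "O(N^2) step")]:
        print(f"{label}: doubling N costs {relative_cost(2, order):.0f}x, "
              f"ten times N costs {relative_cost(10, order):.0f}x")
```

Doubling the size of the problem multiplies the cost of the three kinds of step by 16, 8 and 4, respectively, which is why removing the highest-order step extends the range of application so markedly.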
Theoretical Model Chemistry A theoretical model gives rise to a “chemistry” (a Theoretical Model Chemistry), that is, a set of results realized from its application. This chemistry is distinct from that provided by any other theoretical model, and from experiment. As the severity of approximations used to construct the model is reduced, results should approach experimental results. Performance of Practical Theoretical Models Practical Theoretical Models Among the quantum chemical models that have proven to be reliable and practical for routine use are Hartree-Fock models with 3-21G and 6-31G* basis sets, the B3LYP/6-31G* density functional model and the MP2/6-31G* model. Less reliable but more widely applicable is the PM3 semi-empirical model. The utility of these five models rests on their ability to provide accurate molecular properties. We start with equilibrium geometries, reaction energies and conformational energy differences. Geometries of Small Molecules molecule Hartree-Fock 3-21G 6-31G* B3LYP 6-31G* MP2 6-31G* PM3 expt. 
molecule                HF/3-21G   HF/6-31G*   B3LYP/6-31G*   MP2/6-31G*   PM3     expt.
diborane                1.786      1.778       1.769          1.754        1.773   1.763
ethane                  1.542      1.527       1.531          1.527        1.504   1.531
methylamine             1.471      1.453       1.465          1.465        1.469   1.471
methanol                1.441      1.400       1.419          1.424        1.395   1.421
methyl fluoride         1.404      1.365       1.383          1.392        1.351   1.383
methyl silane           1.883      1.888       1.889          1.884        1.863   1.867
methyl phosphine        1.855      1.861       1.876          1.860        1.866   1.862
methane thiol           1.823      1.817       1.836          1.817        1.801   1.819
methyl chloride         1.806      1.785       1.804          1.778        1.764   1.781
hydrazine               1.451      1.413       1.437          1.439        1.440   1.449
hydrogen peroxide       1.473      1.393       1.456          1.467        1.482   1.452
fluorine                1.402      1.345       1.403          1.421        1.350   1.412
disilane                2.342      2.353       2.351          2.338        2.396   2.327
hydrogen disulfide      2.057      2.064       2.098          2.069        2.034   2.055
chlorine                1.996      1.990       2.042          2.015        2.035   1.988
mean absolute error     0.012      0.020       0.016          0.009        0.025   –

Geometries of Hydrocarbons

bond   hydrocarbon         HF/3-21G   HF/6-31G*   B3LYP/6-31G*   MP2/6-31G*   PM3     expt.
C–C    but-1-yne-3-ene     1.432      1.439       1.424          1.429        1.414   1.431
       propyne             1.466      1.468       1.461          1.463        1.433   1.459
       1,3-butadiene       1.479      1.467       1.458          1.458        1.456   1.483
       propene             1.510      1.503       1.502          1.499        1.480   1.501
       cyclopropane        1.513      1.497       1.509          1.504        1.499   1.510
       propane             1.541      1.528       1.532          1.526        1.512   1.526
       cyclobutane         1.543      1.548       1.553          1.545        1.542   1.548
C=C    cyclopropene        1.282      1.276       1.295          1.303        1.314   1.300
       allene              1.292      1.296       1.307          1.313        1.297   1.308
       propene             1.316      1.318       1.333          1.338        1.328   1.318
       cyclobutene         1.326      1.322       1.341          1.347        1.349   1.332
       but-1-yne-3-ene     1.320      1.322       1.341          1.344        1.332   1.341
       1,3-butadiene       1.320      1.323       1.340          1.344        1.331   1.345
       cyclopentadiene     1.329      1.329       1.349          1.354        1.352   1.345
mean absolute error        0.011      0.011       0.006          0.007        0.015   –

[18] Annulene
There are cases where the HF/6-31G* and B3LYP/6-31G* models disagree on structure. For example, the Hartree-Fock model finds a geometry for [18] annulene with alternating single and double bonds, while the density functional model shows that carbon-carbon bond lengths vary only slightly.
The latter is in much better accord with the experimental (X-ray) structure, which shows that the bond lengths vary only slightly, from 1.38 to 1.41 Å.

[Figure: [18] annulene.]

Geometries of Organometallics

organometallic               bond   B3LYP/6-31G*   PM3    expt.
(CO)3Cr(benzene)                    1.85           1.90   1.84
(CO)4Cr(Dewar benzene)       ax     1.90           1.96   1.86
                             eq     1.86           1.92   1.83
(CO)5Cr=C(Me)NH(Me)          ax     1.89           1.91   1.86-1.88
                             eq     1.90           1.91   1.88-1.91
(CO)3Fe(cyclobutadiene)             1.78           1.74   1.79
(CO)3Fe(butadiene)                  1.78           1.75   1.76
(CO)4Fe(acetylene)           ax     1.85           1.82   1.77
                             eq     1.79           1.75   1.76
(CO)4Fe(ethylene)            ax     1.81           1.81   1.78
                             eq     1.79           1.75   1.81
(CO)3Co(allyl)                      1.78           1.81   1.77
mean absolute error                 0.02           0.04   –

Dichloro[ethane-1,2-diyl-bis(cyclopentadienyl)]zirconium
Dichloro[ethane-1,2-diyl-bis(cyclopentadienyl)]zirconium acts as a catalyst in homogeneous olefin polymerization.
The PM3 equilibrium geometry of this complex is nearly identical to the X-ray structure in the Cambridge Structural Database.

Overall …
All five models generally provide a solid account of equilibrium geometries. The B3LYP/6-31G* and MP2/6-31G* models are best and the PM3 model is worst, but the differences are not large.
Hartree-Fock models and the MP2/6-31G* model provide a poor account of the geometries of transition-metal inorganic and organometallic compounds. The B3LYP/6-31G* model and, to a lesser extent, the PM3 model provide reasonable equilibrium geometries.

Bond Dissociation Energies

bond dissociation reaction        HF/3-21G   HF/6-31G*   B3LYP/6-31G*   MP2/6-31G*   PM3     expt.
CH3–CH3 → CH3• + CH3•             285        293         406            414          310     406
CH3–NH2 → CH3• + NH2•             247        243         372            385          285     389
CH3–OH → CH3• + OH•               222        247         402            410          347     410
CH3–F → CH3• + F•                 247        289         473            473          423     477
NH2–NH2 → NH2• + NH2•             155        142         293            305          209     305
HO–OH → OH• + OH•                 13         0           226            230          192     230
F–F → F• + F•                     -121       -138        176            159          247     159
mean absolute error               190        171         9              2            77      –

Energies of Structural Isomers

formula (reference)       isomer                    HF/3-21G   HF/6-31G*   B3LYP/6-31G*   MP2/6-31G*   PM3     expt.
C2H3N (acetonitrile)    methyl isocyanide      88        100        113           121         130  88
C2H4O (acetaldehyde)    oxirane                142       130        117           112         151  113
C2H4O2 (acetic acid)    methyl formate         54        54         50            59          63   75
C2H6O (ethanol)         dimethyl ether         25        29         21            38          38   50
C3H4 (propyne)          allene                 13        8          -8            21          29   4
                        cyclopropene           167       109        92            96          117  92
C3H6 (propene)          cyclopropane           59        33         33            17          42   29
C4H6 (1,3-butadiene)    2-butyne               17        29         33            17          -4   38
                        cyclobutene            75        54         50            33          29   46
                        bicyclo[1.1.0]butane   192       126        117           88          160  109
mean absolute error                            32        13         11            15          29   –

Nitrogen Base Strengths

base                  HF/3-21G  HF/6-31G*  B3LYP/6-31G*  MP2/6-31G*  PM3  expt.
aniline               4         29         21            29          3    29
methylamine           42        46         42            42          -8   45
aziridine             67        59         46            38          -25  52
ethylamine            54        59         54            50          0    58
dimethylamine         71        75         67            67          -17  76
pyridine              59        75         67            54          0    76
tert-butylamine       75        79         79            71          25   81
cyclohexylamine       75        84         79            75          21   81
azetidine             92        92         79            75          0    90
pyrrolidine           96        96         92            88          0    95
trimethylamine        88        92         79            79          -25  95
piperidine            92        100        92            88          13   100
diazabicyclooctane    109       117        100           96          -21  110
N-methylpyrrolidine   105       109        100           96          -8   112
N-methylpiperidine    109       117        105           100         4    118
quinuclidine          121       130        117           113         4    130
mean absolute error   8         2          8             12          86   –

Overall …
Only B3LYP/6-31G* and MP2/6-31G* models properly account for homolytic bond dissociation energies.
Except for PM3, all models provide acceptable descriptions of the energetics of reactions in which total electron-pair count is maintained and good descriptions of reactions that maintain individual bond counts.
The PM3 model appears to be unreliable for all reaction energy calculations.

Aside … Heats of Formation
Were calculations able to reliably provide accurate heats of formation, we would not have to worry about carefully choosing reactions. There have been numerous attempts, the simplest of which (known as the G3(MP2) recipe) is able to reproduce experimental heats to within 8 kJ/mol (mean absolute error).
However, G3(MP2) requires an MP2/6-31G* geometry, an HF/6-31G* frequency, a very large basis set MP2 energy calculation and a QCISD(T)/6-31G* energy calculation. The last of these is the most problematic, as its cost scales with the seventh power of the size of the molecule. In practice, G3(MP2) is applicable only to very small molecules (fewer than 10 heavy atoms).

The T1 Recipe
The T1 recipe has recently been formulated with the objective of closely reproducing G3(MP2) heats of formation while requiring two to three orders of magnitude less computation. It replaces the MP2/6-31G* geometry with an HF/6-31G* geometry, and eliminates both the HF/6-31G* frequency calculation and, most importantly, the QCISD(T)/6-31G* energy calculation. It replaces the large basis set MP2 energy calculation with an RI-MP2 calculation using a dual basis set. Parameters based on bond orders are introduced.
T1 is easily applicable to molecules with 25-30 heavy atoms. At present, it is restricted to uncharged, closed-shell molecules comprising H, C, N, O, Si, P, S, F, Cl and Br only.

T1 vs. G3(MP2)
Heats of formation obtained from the T1 and G3(MP2) recipes differ by less than 2 kJ/mol (mean absolute error). In effect, the two procedures yield identical heats.

T1 vs. Experimental Heats of Formation
The mean absolute error between T1 and experimental heats of formation for the molecules in the NIST database is <9 kJ/mol.

T1 Reaction Energies
T1 reproduces experimental reaction energies better than any other practical theoretical model surveyed, and typically leads to mean absolute errors that are <4 kJ/mol (comparable to errors in the experimental data). It may offer a viable alternative to experimental data, which are both limited (combustion requires large amounts of material, which is in turn destroyed) and prone to error.

Conformational Energy Differences

How Well do Quantum Chemical Models Reproduce Conformational Differences?
Experimental data with which to assess the theoretical models are limited to systems with a single degree of freedom.
Because PM3 and HF/3-21G models yield poor results, comparisons are restricted to HF/6-31G*, B3LYP/6-31G* and MP2/6-31G* models.
Experimental data on conformer energy differences derive primarily from equilibrium measurements, and may become less and less accurate as the conformer energy differences increase.

Conformations of Acyclic Molecules

molecule             low-energy/high-energy conformer  HF/6-31G*  B3LYP/6-31G*  MP2/6-31G*  expt.
n-butane             trans/gauche                      4.2        3.3           2.9         2.8
1-butene             skew/cis                          2.9        1.7           2.1         0.9
1,3-butadiene        trans/gauche                      13.0       15.1          10.9        12.1
acrolein             trans/cis                         7.1        7.1           6.3         7.1
N-methylformamide    trans/cis                         4.6        2.9           4.2         5.9
N-methylacetamide    trans/cis                         12.6       11.3          11.7        9.6
formic acid          cis/trans                         25.5       21.8          25.5        16.3
methyl formate       cis/trans                         25.9       22.1          25.9        19.9
methyl acetate       cis/trans                         39.3       35.6          46.0        35.6
propanal             eclipsed/anti                     4.6        5.0           4.6         2.8
2-methylpropanal     eclipsed/anti                     3.3        3.3           4.2         3.3
ethanol              anti/gauche                       0.4        -1.3          0.4         0.5
methyl ethyl ether   anti/gauche                       7.1        5.9           7.1         6.3
methyl vinyl ether   cis/skew                          8.4        9.6           7.9         7.1
mean absolute error 2.3 1.7 2.7 1.9 –

Conformations of Cyclic Molecules

molecule                      low-energy/high-energy conformer  HF/6-31G*  B3LYP/6-31G*  MP2/6-31G*  expt.
methylcyclohexane             equatorial/axial                  9.6        8.8           7.9         7.3
tert-butylcyclohexane         equatorial/axial                  25.5       22.2          23.4        22.6
cis-1,3-dimethylcyclohexane   equatorial/axial                  27.2       25.1          23.8        23.0
fluorocyclohexane             equatorial/axial                  -1.3       -0.8          -2.9        0.7
chlorocyclohexane             equatorial/axial                  4.2        3.8           2.9         2.1
piperidine                    equatorial/axial                  3.3        1.3           2.5         2.2
N-methylpiperidine            equatorial/axial                  15.1       14.2          15.1        13.2
2-chlorotetrahydropyran       axial/equatorial                  10.5       15.5          11.7        7.5
2-methylcyclohexanone         equatorial/axial                  9.6        10.0          9.2         8.8
3-methylcyclohexanone         equatorial/axial                  7.1        6.7           3.8         5.7,6.5
4-methylcyclohexanone         equatorial/axial                  8.8        8.4           6.3         7.3,8.8
mean absolute error 2.1 2.1 1.7 2.1 –

T1 Conformer Energy Differences
As measured by mean absolute error (kJ/mol), T1 is the most successful practical model surveyed with regard to reproducing conformer energy differences.

                   acyclic molecules  cyclic molecules
T1                 1.1                1.1
HF/6-31G*          2.3                2.1
B3LYP/6-31G*       1.7                2.1
B3LYP/6-311+G**    1.4                2.9
MP2/6-31G*         2.7                1.7
MP2/6-311+G**      1.9                2.1
MMFF               1.1                3.3

Standards for Conformational Analysis
The lack of reliable experimental data for any but very simple molecules means that it is necessary to use calculated conformer energy differences as a standard with which to judge the performance of practical theoretical models. The standard needs to accurately reproduce existing experimental data and be applicable to more complex molecules.
The B3LYP/6-311+G**//6-31G* model (B3LYP/6-311+G** energy based on HF/6-31G* geometry) is used here as the standard. In the future, this standard will probably be replaced by T1.
[Figures: 2-benzylamino-1-propanol; 4,6-dimethyl-1-phenyl-5-hepten-3-one]

Performance of Practical Theoretical Models for Conformational Analysis
From comparisons with B3LYP/6-311+G** results:
Both HF/6-31G* and B3LYP/6-31G* models properly identify the lowest-energy conformer (or suggest a very similar “best” conformer), and provide a reasonable account of conformer energy differences.
HF/3-21G and PM3 models commonly fail to properly assign the lowest-energy conformer, and neither provides a good account of conformer energy differences.

What Does it Cost to Fit a Molecule into a Protein?
The anticancer drug Gleevec has been crystallized inside a protein, and this complex is available in the Protein Data Bank (PDB), accessible from Spartan. According to the HF/6-31G* model, the protein-bound conformer is 9 kJ/mol less stable than the lowest-energy conformer for the isolated molecule.

Molecular Charge Distributions

Dipole Moments
In addition to geometry (“sterics”), organic chemists often refer to charge-charge interactions (“electrostatics”) to judge whether a molecule is likely to be favorable. The dipole moment reflects opposing contributions of positively-charged nuclei and negatively-charged electrons and accounts for overall polarity.

$\boldsymbol{\mu}\ (\text{debyes}) = 2.5416\left[\sum_A Z_A \mathbf{r}_A - \sum_{\mu\nu} P_{\mu\nu}\,\mathbf{r}_{\mu\nu}\right]$

Summations are over atoms (A) and pairs of atomic basis functions ($\phi_\mu$, $\phi_\nu$). $Z_A$ is the atomic number of atom A, $\mathbf{r}_A$ is the position of atom A (relative to an arbitrary origin), $P_{\mu\nu}$ is an element of the density matrix and $\mathbf{r}_{\mu\nu}$ is an integral.

$\mathbf{r}_{\mu\nu} = \int \phi_\mu\,\mathbf{r}\,\phi_\nu\,d\tau$

Dipole Moments for Small Molecules

molecule             HF/3-21G  HF/6-31G*  B3LYP/6-31G*  MP2/6-31G*  PM3  expt.
CO                   0.4       0.3        0.1           0.2         0.2  0.11
PH3                  0.9       0.9        1.0           1.0         1.2  0.58
H2S                  1.4       1.4        1.4           1.5         1.8  0.97
HCl                  1.5       1.5        1.5           1.5         1.4  1.08
NH3                  1.8       1.9        1.9           2.0         1.6  1.47
HF                   2.2       2.0        1.9           1.9         1.4  1.82
H2O                  2.4       2.2        2.1           2.2         1.7  1.85
CH3F                 2.3       2.0        1.7           1.9         1.4  1.85
CH3Cl                2.3       2.3        2.1           2.0         1.4  1.87
CS                   1.4       1.3        1.5           2.0         1.4  1.98
H2CO                 2.7       2.7        2.2           2.3         2.2  2.34
HCN                  3.0       3.2        2.9           3.0         2.7  2.99
LiH                  6.0       6.0        5.6           5.8         5.7  5.83
LiF                  5.8       6.2        5.6           5.9         5.3  6.28
LiCl                 7.8       7.7        7.1           7.3         6.5  7.12
mean absolute error  0.4       0.3        0.4           0.2         0.4  –

Dipole Moments for Hydrocarbons

                                HF/3-21G  HF/6-31G*  B3LYP/6-31G*  MP2/6-31G*  PM3  expt.
formula  hydrocarbon
C3H4     propyne                0.7       0.6        0.7           0.6         0.4  0.75
         cyclopropene           0.5       0.6        0.5           0.5         0.4  0.45
C3H6     propene                0.3       0.3        0.4           0.3         0.2  0.36
C4H6     cyclobutene            0.1       0.0        0.1           0.1         0.2  0.13
         1,2-butadiene          0.4       0.4        0.4           0.3         0.2  0.40
         1-butyne               0.7       0.7        0.7           0.6         0.3  0.80
         methylenecyclopropane  0.3       0.4        0.4           0.3         0.2  0.40
         bicyclo[1.1.0]butane   0.8       0.7        0.8           0.8         0.4  0.68
         1-methylcyclopropene   0.9       0.9        0.9           0.8         0.6  0.80
C4H8     isobutene              0.5       0.5        0.5           0.4         0.4  0.50
         cis-2-butene           0.1       0.1        0.2           0.2         0.3  0.26
         cis-1-butene           0.4       0.4        0.4           0.3         0.2  0.44
         methylcyclopropane     0.1       0.1        0.1           0.1         0.1  0.14
C5H6     cyclopentadiene        0.4       0.3        0.4           0.4         0.5  0.42
mean absolute error             0.1       0.1        0.0           0.1         0.2  –

Atomic Charges
The “charge” on an atom in a molecule (the sum of its nuclear charge and the charges of “associated” electrons) may neither be measured nor calculated unambiguously. The problem is how to associate electrons with a particular atom.
Consider an electron density surface for hydrogen fluoride that encloses a large fraction of the electrons.
[Figure: electron density surface for hydrogen fluoride, H-F]
While it shows that a large fraction of the electrons “belong” to fluorine, there is no “correct” way to actually know this fraction.

Atomic Charges from Fits to Electrostatic Potentials
Define atomic charges such that they reproduce as closely as possible the electrostatic potential:
i) Select points located outside the van der Waals surface. The number and location of the points are not unique, meaning that the resulting charges are not unique.
ii) Calculate the electrostatic potential at these points.
iii) Fit the points to a potential in which atomic charges have replaced nuclei and electrons, subject to the requirement that the charges sum to the total charge on the molecule.

Spectroscopy

Infrared Spectra
As mentioned earlier, the frequency of each of the lines in the infrared spectrum is proportional to the square root of the second derivative of the energy with respect to change in coordinate (the force constant).
The force constant is the first finite term in a power series expansion of the energy as a function of the coordinate.

$E(R) = E_0 + \frac{dE}{dR}(R - R_0) + \frac{1}{2}\frac{d^2E}{dR^2}(R - R_0)^2 + \text{higher-order terms}$

$E_0$ is a constant. $dE/dR$ is zero only at a stationary point, which means that infrared spectra calculations need to be carried out using the correct equilibrium geometry.
Higher-order (anharmonic) terms are ignored, so the energy goes to infinity with increasing distance rather than to the limit of separated atoms. This means that the potential will be too steep and the calculated frequency will be too large.

Calculation of Infrared Spectra
Evaluation of infrared frequencies involves calculation of the second derivatives of the energy with respect to displacements of the Cartesian coordinates.
For Hartree-Fock and density functional models, the “computational cost” is roughly that of four to six steps of geometry optimization, and molecules with molecular weights <400 amu are in range. Infrared spectra calculation with MP2 models is significantly more difficult and presently limited to small molecules.
The intensity of an infrared absorption is proportional to the change in the dipole moment in response to motion along the coordinate.

Performance of Theoretical Models for Infrared Spectra
Hartree-Fock models overestimate frequencies associated with bond-stretching motions by ~12%, for example, the CO bond stretch in cyclohexanone (left). The direction of the error is consistent with the tendency to underestimate bond lengths. It is greatly reduced for B3LYP models (right).

NIST Infrared Database
Spartan provides on-line access to the NIST database of infrared spectra (~6,000 compounds). Measured spectra may be displayed on top of calculated spectra.

Greenhouse Gases
Blackbody radiation from the earth exhibits “holes” due to absorption in the infrared by “greenhouse gases”, CO2 most important among them.

MTBE is a Greenhouse Gas
The fuel additive MTBE (methyl tert-butyl ether) is also likely to be a greenhouse gas.
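Returning to the harmonic relation used throughout this part of the course (frequency proportional to the square root of the force constant divided by the reduced mass), the following is a minimal sketch for a diatomic molecule. The function names, the force constant of 1900 N/m used for carbon monoxide and the helper that removes a uniform percentage overestimate are illustrative assumptions, not values or procedures taken from the course.

```python
import math

AMU_TO_KG = 1.66053906660e-27   # atomic mass unit in kg
C_CM_PER_S = 2.99792458e10      # speed of light in cm/s


def reduced_mass_kg(m1_amu, m2_amu):
    """Reduced mass of a diatomic molecule, converted from amu to kg."""
    return (m1_amu * m2_amu) / (m1_amu + m2_amu) * AMU_TO_KG


def harmonic_wavenumber(force_constant, m1_amu, m2_amu):
    """Harmonic stretching frequency in cm^-1 for a force constant in N/m."""
    mu = reduced_mass_kg(m1_amu, m2_amu)
    angular_frequency = math.sqrt(force_constant / mu)  # rad/s
    return angular_frequency / (2.0 * math.pi * C_CM_PER_S)


def remove_overestimate(wavenumber, percent):
    """Undo a uniform percentage overestimate (hypothetical helper)."""
    return wavenumber / (1.0 + percent / 100.0)


if __name__ == "__main__":
    # Illustrative force constant for CO (an assumption, not a course value).
    w = harmonic_wavenumber(1900.0, 12.0, 15.995)
    print(f"harmonic frequency: {w:.0f} cm^-1")
    # A frequency that is ~12% too large, brought back down.
    print(f"after removing a 12% overestimate: {remove_overestimate(w, 12.0):.0f} cm^-1")
```

Because the frequency depends on the square root of the force constant, doubling the force constant raises the frequency by only a factor of about 1.41.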
Matching Calculated Infrared Spectra
Scaled to account for systematic errors and broadened to simulate finite temperatures, calculated infrared spectra closely fit experimental spectra. This suggests that a database of calculated infrared spectra could be used in lieu of experimental spectra to identify unknown molecules.

Spartan Infrared Spectral Database
The Spartan Infrared Database contains ~50,000 spectra for organic molecules obtained from the EDF2/6-31G* density functional model.
The database may be searched by “pattern matching” to an input (experimental) spectrum, where overall scaling and peak width at half height are individually optimized.
Because each spectrum in the database needs to be optimized for best fit to the unknown, searching will require significantly more computer time than searching a database of measured spectra.

NMR Spectroscopy
Nuclear spins align either parallel or antiparallel to an applied magnetic field, giving rise to nuclear spin states. The difference in energy (ΔE) between these states depends on the type of nucleus and on the strength of the magnetic field ($B_0$) at the nucleus.

$\Delta E = \gamma \hbar B_0$

γ is the gyromagnetic ratio, which depends on the nucleus, and ħ is Planck’s constant/2π.
The applied magnetic field is weakened by electrons around the nucleus. Nuclei that are well shielded by the electron cloud experience a lesser field than those that are poorly shielded, and show a smaller energy splitting. The splittings, relative to a standard, are termed chemical shifts.

Calculation of Chemical Shifts
Chemical shifts require calculation of the second derivatives of the energy with respect to an external magnetic field at each nuclear position.
The “computational cost” of chemical shift calculations follows the cube of the total number of basis functions, and in practice is comparable to two or three steps of geometry optimization. Molecules with molecular weights <500 amu are within range.
NMR shift calculations are available for both Hartree-Fock and density functional models.

Presentation of Spectra
Only chemical shifts are calculated. Intensities for both proton and 13C spectra are proportional to the number of equivalent protons/carbons.
HH coupling constants are obtained from an empirical fit to the geometry.
13C DEPT spectra may be drawn.

Chemical Shift Database
Spartan provides on-line access to a database of ~15,000 compounds from the University of Cologne. Measured spectra may be displayed on top of calculated spectra.

Direct Calculation of 13C Chemical Shifts
The simplest approach is to fit the calculated shifts to the experimental shifts by least squares. B3LYP/6-31G* calculations show an (rms) error of 4.4 ppm and significant outliers. The HF/6-31G* model gives a poorer correlation.
[Figure: experimental vs. B3LYP 13C chemical shifts, both axes running from 0 to 250]

Chemical Shifts in Organometallics

Cyclopentenebromonium Ion
The 13C spectrum of cyclopentenebromonium ion contains lines at 18.8, 31.8 and 114.6 ppm. These might arise either from a structure with bromine bonded to both sp2 carbons (bridged) or from an equilibrium between a pair of equivalent structures with bromine attached to only one carbon (open).
[Figure: bridged and open structures of cyclopentenebromonium ion]
13C chemical shifts for the bridged structure fit the observed NMR spectrum much more closely than those for the open form.

9,10-Dihydroxytetrahydrodicyclopentadiene
Calculated 13C shifts (in red) are sufficiently accurate to allow the exo and endo stereoisomers of 9,10-dihydroxytetrahydrodicyclopentadiene to be distinguished.
[Figure: exo and endo stereoisomers with positions 1, 3, 7, 9 and 10 labeled]

position(s)  exo          endo         Δ
1,3          30.3 (33.1)  25.0 (25.8)  5.3 (7.3)
9,10         71.5 (73.5)  67.7 (68.9)  3.8 (4.6)

What Works?
The two forms of cyclopentenebromonium ion are very different and their NMR spectra are easily distinguished. Stereoisomers of 9,10-dihydroxytetrahydrodicyclopentadiene are very similar and the calculations benefit from error cancellation.
A good example of what happens where the molecules are not different enough for their NMR spectra to be easily distinguished, and not similar enough to benefit from error cancellation, is provided by attempts to assign the product of biosynthesis related to the known pathway for lambertellol (Masaru Hashimoto, Hirosaki University, Japan).

Biosynthetic Pathway
[Figure: biosynthetic pathway showing neolambertellin, lambertellin and the isolated product]

Identification of Biosynthesis Products
The data are not sufficiently accurate to allow definitive structure assignment. None of the possibilities (including the correct one shown below) reproduces the overall “pattern” of the experimental spectrum.

Restricted Calculation of 13C Chemical Shifts
Chemical shift predictions can be improved by restricting comparisons to carbons that are closely related, for example, to carbons in alkenes (left) and methyl group carbons (right).

Regression Fits to 13C Chemical Shifts
This suggests that calculated shifts can be “improved” simply by taking the local environment into account, for example, by “counting” the number of each kind of bond attached to a particular carbon. In effect, this corresponds to using a series of “standards” instead of just a single standard (tetramethylsilane).
The scheme implemented in Spartan’08 is based on a linear regression with the calculated chemical shift and bond counts as variables. An improved scheme that uses bond orders instead of bond counts will be available in Spartan’10.

Regression Fits to 13C Chemical Shifts
Regression fits to experimental 13C shifts for sp2 (left) and sp3 (right) carbons show a factor of two reduction in (rms) error over simple least-squares fits. There are no significant outliers.

Identification of Biosynthesis Products
Corrected shifts for one of the isomers match the experimental 13C spectrum, while those for the remaining structures do not match.
This is the structure supported by labeling experiments and by comparison of calculated and experimental UV/vis spectra.

Honest Assessment
13C chemical shifts that have been “corrected” by taking the local environment about the carbon into account offer significant improvement over “uncorrected” shifts, and may now be adequate for the purpose of distinguishing among molecules that are structurally close (the “difficult” cases). This situation will continue to improve with the development of more accurate correction schemes.
[Figures: strychnine, hesperidin, δ-3,4-trans-tetrahydrocannabinol, onocerin, nicotine, galanthamine, caulophylline, chamazulene, prednisone, cnicin, morphine]

NMR Timescale
The time required for a transition between nuclear spin states is comparable to that required for some chemical processes, for example, protonation/deprotonation, and several orders of magnitude longer than that required to reach equilibrium among conformers.
This means that NMR spectra may depend on temperature, and that a “high-temperature” spectrum represents an average of the spectra of individual chemical species or individual conformers of one species.

NMR Spectra of Flexible Molecules
Because the time for relaxation of nuclear spin states is much longer than the time required to reach conformational equilibrium, the NMR spectrum of a flexible molecule needs to be calculated as a Boltzmann-weighted sum of spectra of the individual conformers.
While only a few conformers are likely to contribute significantly to the average, it is necessary to consider all conformers to identify these few.

Boltzmann Distributions
The difficulty is in calculating accurate Boltzmann weights. The T1 recipe (described earlier) appears to provide reliable results, and shows that ~94% of atropine molecules exist in one of five conformers. Averaging over these allows calculation of the 13C spectrum for atropine.
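The Boltzmann weighting described above can be sketched in a few lines. This is a minimal illustration, assuming conformer energies are given in kJ/mol relative to any common reference; the function names and the numbers in the usage example are hypothetical and are not data for atropine or any other molecule.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # gas constant in kJ/(mol K)


def boltzmann_weights(energies_kj, temperature=298.15):
    """Fractional populations of conformers from their energies (kJ/mol)."""
    lowest = min(energies_kj)
    # Shift by the lowest energy so the largest exponential is exactly 1.
    factors = [
        math.exp(-(e - lowest) / (R_KJ_PER_MOL_K * temperature))
        for e in energies_kj
    ]
    total = sum(factors)
    return [f / total for f in factors]


def averaged_shift(shifts_ppm, weights):
    """Boltzmann-weighted average of one nucleus's shift over all conformers."""
    return sum(w * s for w, s in zip(weights, shifts_ppm))


if __name__ == "__main__":
    # Hypothetical conformer energies (kJ/mol) and shifts (ppm) for one carbon.
    energies = [0.0, 2.5, 6.0, 12.0]
    shifts = [31.2, 29.8, 33.5, 30.1]
    weights = boltzmann_weights(energies)
    for e, w in zip(energies, weights):
        print(f"{e:5.1f} kJ/mol  weight {w:.3f}")
    print(f"averaged shift: {averaged_shift(shifts, weights):.2f} ppm")
```

Because the weights fall off exponentially with energy, small errors in conformer energy differences translate into large errors in populations, which is why the accuracy of the energy model matters here.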
Proton Chemical Shifts
Proton chemical shifts correlate reasonably well with experimental shifts for both HF/6-31G* and B3LYP/6-31G* models, for example, for methyl group hydrogens. A correction scheme to take account of local environment is under development.

Application … Magnetic Anisotropy
In response to an external magnetic field, the π electrons in benzene generate a local field that subtracts from the external field directly above and below the ring plane (“shielding”), and adds to the external field in the periphery (“deshielding”).
[Figure: benzene with shielding regions above and below the ring plane and deshielding regions in the periphery]
Protons that are shielded will exhibit smaller chemical shifts than expected, while protons that are deshielded will exhibit larger shifts.

m-Cyclophane
The proton spectrum of m-cyclophane shows resonances assigned to the two benzene rings at 4.27, 6.97 and 7.24 ppm in a ratio of 1:2:1. The spectrum obtained from the HF/6-31G* model is in agreement.

2D Spectra
In addition to proton and 13C spectra, Spartan can display COSY, NOESY, HSQC and HMBC “2D” spectra. Note that Spartan only calculates chemical shifts. Coupling constants are not calculated, but instead are assigned from knowledge of the actual structure (the opposite of what is done experimentally). The objective is to provide connections between what is actually observed in an NMR experiment and the quantities that are actually calculated.
[Figures: HMBC spectrum of cnicin; HSQC spectrum of capsanthin]

UV/Visible Spectra
UV/visible spectra (λmax) calculations require consideration of both the ground state and a series of excited states. B3LYP models (so-called time dependent density functional models for excited states) generally provide an acceptable account, although basis sets that incorporate diffuse functions may be needed.
In practice, UV/visible spectra calculations are limited to molecules with 20-30 heavy atoms.
Display of UV/Visible Spectra
The resulting data may be presented either in terms of a series of ground to excited state energy transitions or, more conventionally, as a spectrum.
The experimental spectrum displayed with the calculated spectrum is from the NIST database (comprising ~1500 compounds) accessible on-line from Spartan.

λmax Calculations for Related Compounds
Absorption maxima obtained from the B3LYP/6-31+G* model (6-31G* with diffuse functions on heavy atoms) for a series of coumarin derivatives correlate with experimental λmax values.

Emission Spectra of Coumarin Derivatives
Some molecules emit light following absorption. While it is not practical to directly calculate emission spectra, they may be related to quantities that can be calculated, specifically, λmax and absorption intensity.

Chemical Reactivity and Selectivity

Revisiting the Hammond Postulate
As stated earlier, the justification for using key molecular orbitals on the reactants to anticipate chemical reactivity and selectivity follows from the Hammond Postulate: the transition state for an exothermic reaction will resemble the reactants.
This same justification allows application of other graphical models. We will consider two of these: local ionization potential maps to anticipate reactivity/selectivity of electrophilic additions, and LUMO maps to do the same for nucleophilic additions.

Local Ionization Potential Map
The local ionization potential indicates the ease of electron removal (ionization) at a location in the vicinity of a molecule. As such, it is an indicator of electrophilic reactivity. The lower the local ionization potential, the more loosely bound the electron and the greater the likelihood for electrophilic attack.
A local ionization potential map paints the value of the local ionization potential onto a density surface. Easily ionized regions are colored red and regions that are difficult to ionize are colored blue.
[Figure: local ionization potential painted onto an "electron density" surface]

Local Ionization Potential Map
Local ionization potential maps for benzene, aniline and nitrobenzene show both positional selectivity in electrophilic aromatic substitution (NH2 directs ortho/para, NO2 directs meta), and the fact that π-donors such as NH2 activate benzene while π-acceptors such as NO2 deactivate benzene.

Stereospecific Alkylation of Enolates
Alkylation of each of the enolates formed from the substituted cyclodecanones shown below gives rise to a single product.
[Scheme: two substituted cyclodecanones, each treated with LiH and then CH3I]
Local ionization potential maps properly show the observed alkylation product.

LUMO Map
The lowest-unoccupied molecular orbital or LUMO indicates where a pair of electrons (a nucleophile) will be most likely to add to a molecule.
A LUMO map paints the absolute value of the LUMO onto a density surface, and provides a model for nucleophilic reactivity. Regions colored blue have high LUMO concentration and regions colored red have low LUMO concentration.
[Figure: LUMO, non-bonded electron pair of the nucleophile and "electron density" surface]

LUMO Map for Cyclohexenone
The map for cyclohexenone exhibits two regions of high LUMO concentration. One is over the carbonyl carbon, consistent with nucleophilic addition, while the other is over the β carbon, consistent with conjugate or Michael addition.
[Scheme: cyclohexenone with CH3Li (carbonyl addition) and with (CH3)2CuLi (Michael addition)]

Nucleophilic Addition to Camphor
Addition of LiAlH4 to 2-norbornone occurs from the equatorial face of the carbonyl group, while addition to camphor occurs from the axial face.
Examination of LUMO maps for 2-methyl-2-norbornene and 7,7-dimethyl-2-norbornene clearly shows that this change is due to the pair of methyl groups at the 7 position.

Silaolefins
With the exception of phosphorus ylides, compounds incorporating a double bond between carbon and a second-row element are rare. A search of CSD turns up only a very few compounds incorporating a carbon-silicon double bond.
Reactivity of Silaolefins
What all the known compounds have in common is a crowded environment around the double bond. This suggests that it is necessary to keep reagents away.
Local ionization potential and LUMO maps for tetramethylsilaethylene, Me2Si=CMe2, and its carbon analog 2,3-dimethyl-2-butene, Me2C=CMe2, show that the silaolefin is more reactive toward both electrophiles and nucleophiles than the olefin.

Back to Basics … Energy Surfaces

What is a Transition State?
In one dimension, a maximum on an energy curve corresponds to a transition state. This does not necessarily mean that it corresponds to a transition state for a “useful” chemical reaction, but merely that it connects two minima (stable molecules).

What is a Reaction Coordinate?
Chemists represent multi-dimensional systems in terms of reaction coordinate diagrams, where focus is drawn to the “important” coordinate (the reaction coordinate). For example, interconversion of equivalent chair conformers of cyclohexane through a twist-boat intermediate is thought of in terms of a continuous motion.
[Figure: energy vs. reaction coordinate, chair to twist boat to chair, passing through two transition states]

Reaction Coordinate for Cyclohexane Interconversion
In reality, the motion is probably much more complex than portrayed in such a diagram. The important point is that the pathway followed from reactants to products is ill defined, and a reaction coordinate diagram is nothing more than an expression of preconceived ideas.
The real world analogy (albeit only in 2D) is climbing a mountain. The starting point (“reactants”) and the summit (“transition state”) are well defined, but there are many possible paths (“reaction coordinates”) connecting the two.

Diels-Alder Cycloaddition of 1,3-Butadiene and Acrylonitrile
A further example is provided by the Diels-Alder cycloaddition of 1,3-butadiene and acrylonitrile to form 4-cyanocyclohexene.
As the transition state involves hybridization and bond length changes (relative to reactants), it seems unlikely that any single simple coordinate will be able to provide an adequate description.
If we give up the idea of actually “seeing” a multi-dimensional energy surface, we can make progress, as it is possible to provide a mathematical definition of those few points on such a surface that correspond to transition states.

Back to Basics …
As stated earlier, the “important” points are all stationary points, that is, points for which the first derivative of the energy with respect to each geometrical coordinate is zero.

in one dimension: $dE/dR = 0$
in many dimensions: $\partial E/\partial R_i = 0, \quad i = 1, 2, 3, \ldots, 3N-6$

Energy Minimum vs. Energy Maximum
In one dimension, an energy maximum (transition state) is a point where the second derivative is negative.

$d^2E/dR^2 < 0$

Recall that we can generalize this to many dimensions simply by replacing the original coordinates, R, by normal coordinates, ξ, that lead to a second derivative matrix that is diagonal.

$\partial^2 E/\partial R_i \partial R_j \;\rightarrow\; \partial^2 E/\partial \xi_i \partial \xi_j = \delta_{ij}\,\partial^2 E/\partial \xi_i \partial \xi_j$

$\delta_{ij}$ is 1 for i=j and 0 otherwise.
Each stationary point may now be assigned as either an energy minimum or an energy maximum.

Defining a Transition State
We have already stated that points at which the energy is a minimum in all dimensions correspond to “stable” molecules. Points at which the energy is a minimum for all but one dimension and a maximum in one dimension correspond to “transition states”.
Liken the latter to a mountain pass. One does not go over a summit (an energy maximum) to cross a mountain range but rather through a pass.
To repeat an earlier comment, while the energy minima and the transition state are well defined, there are many pathways (“reaction coordinates”), in the same way that there can be many roads leading up to and away from a mountain pass.

How to Guess a Transition State?
We start at what appears to be a disadvantage, as there are no experimental transition-state structures. Transition states cannot even be detected, let alone isolated and characterized, simply because they “do not exist”.
However, there is a significant body of information from quantum chemical calculations about the geometries of transition states. This can be searched by substructure for a “best match” to the reaction of interest. Alternatively, Spartan will do this automatically.

How to Find a Transition State?
The procedure used to locate a transition state is identical to that used to find an equilibrium structure, except that the algorithm is instructed to search out a geometry that is an energy maximum in one dimension. The procedure terminates only when all energy first derivatives closely approach zero and all geometrical variables reach constant values.
Success requires a good guess, but even if you have one, locating a transition state will generally require two or three times the number of steps required to find an equilibrium geometry.
Finally, don’t be surprised if you turn up a transition state for an “unexpected” reaction.

How to Verify a Transition State?
The “infrared spectrum” of a transition state needs to contain a single imaginary frequency. This follows from the fact that frequency is proportional to the square root of the second derivative (divided by a mass). One of the second derivatives for a transition state is negative (the energy curves downward), meaning that the square root is an imaginary number.
Existence of a single imaginary frequency is a necessary, but not sufficient, requirement for a transition state. In addition, an acceptable transition state needs to be on a pathway that actually connects the reactants and products of the chemical reaction of interest.
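The verification criterion above can be illustrated with a short sketch. It assumes the second derivatives have already been expressed along normal coordinates, so that each coordinate carries a single curvature; the function names, the tolerance and the example values are hypothetical, and the mass factor is set to one for simplicity. A point that passes this test is only a candidate, since the connection to reactants and products still has to be checked separately.

```python
import cmath


def classify_stationary_point(curvatures, tol=1e-8):
    """Classify a stationary point from its normal-coordinate second derivatives.

    Curvatures within tol of zero are not counted as negative.
    """
    negative = sum(1 for c in curvatures if c < -tol)
    if negative == 0:
        return "minimum"
    if negative == 1:
        return "transition state candidate"
    return "higher-order saddle point"


def relative_frequencies(curvatures):
    """Square roots of the curvatures (mass factor taken as 1).

    A negative curvature gives a purely imaginary number, which is the
    "imaginary frequency" expected once for a transition state.
    """
    return [cmath.sqrt(c) for c in curvatures]


if __name__ == "__main__":
    # Hypothetical curvatures for two stationary points.
    for label, curvatures in [
        ("point A", [0.30, 0.85, 1.20]),
        ("point B", [-0.45, 0.85, 1.20]),
    ]:
        kind = classify_stationary_point(curvatures)
        imaginary = [f for f in relative_frequencies(curvatures) if f.imag != 0.0]
        print(f"{label}: {kind}, imaginary frequencies: {len(imaginary)}")
```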
Performance of Practical Theoretical Models for Transition-State Geometries
There are no experimental geometries for transition states, and the only way to assess the performance of practical models is to compare them with results obtained from a model that has been previously established to properly describe the structures of stable molecules. The MP2/6-311+G** model has been selected as the standard.
According to this measure, all five models are satisfactory (with the B3LYP and MP2 models the best), although large individual deviations are seen for single bonds that are breaking or forming.
While not shown here, the PM3 model occasionally fails to provide a reasonable transition state.

Transition State Geometries
[Figure: three reactions (products shown include C2H4 and CO2) and their transition states, with bonds labeled a to f; the three blocks below follow the order of the figure]

bond length          HF/3-21G  HF/6-31G*  B3LYP/6-31G*  MP2/6-31G*  PM3   MP2/6-311+G**
a                    1.88      1.92       1.90          1.80        1.68  1.80
b                    1.29      1.26       1.29          1.31        1.30  1.30
c                    1.37      1.37       1.38          1.38        1.40  1.39
d                    2.14      2.27       2.31          2.20        1.94  2.22
e                    1.38      1.38       1.38          1.39        1.40  1.39
f                    1.39      1.39       1.40          1.41        1.42  1.41

a                    1.40      1.40       1.42          1.43        1.41  1.43
b                    1.37      1.38       1.39          1.39        1.39  1.39
c                    2.11      2.12       2.12          2.03        1.97  2.07
d                    1.40      1.40       1.41          1.41        1.40  1.41
e                    1.45      1.45       1.48          1.55        1.51  1.53
f                    1.35      1.36       1.32          1.25        1.29  1.25

a                    1.39      1.38       1.40          1.40        1.40  1.40
b                    1.37      1.37       1.38          1.38        1.38  1.38
c                    2.12      2.26       2.19          2.08        2.02  2.06
d                    1.23      1.22       1.24          1.25        1.24  1.24
e                    1.88      1.74       1.78          1.83        1.93  1.83
f                    1.40      1.43       1.42          1.41        1.40  1.41
mean absolute error  0.05      0.05       0.03          0.01        0.05  –

Ene Reaction
The ene reaction involves addition of an electron-poor double bond to an alkene with an allylic hydrogen. The hydrogen is transferred and a new carbon-carbon bond formed, for example, in the addition of maleic anhydride and propene.
[Reaction scheme: ene reaction of maleic anhydride with propene]
An animation of the motion associated with the imaginary frequency shows that bond making and bond breaking occur simultaneously, consistent with this being a concerted reaction.

Transition States for Derivative Reactions
Transition states for reactions that differ only by remote changes in structure or in substitution closely resemble each other. This means that transition states for simplified reactions may be used to guess transition states for complex reactions. A comparison of transition states for pyrolysis of ethyl formate (leading to ethylene and formic acid) and cyclohexyl formate (leading to cyclohexene and formic acid) shows that the parts in common are nearly identical.

Stereoselective Claisen Rearrangements
In some cases, simply looking at transition-state geometries leading to different products may suggest why one is favored over the other. For example, the observed product in the Claisen rearrangement shown below arises from a chair-like transition state whereas the product that is not observed would require a boat-like transition state.

Absolute Reaction Rate
Absolute reaction rate depends on the product of a rate constant, k, and the concentrations of the reactants, each raised to a power: [A]^a, [B]^b, ...
$$\text{rate} = k\,[\mathrm{A}]^{a}\,[\mathrm{B}]^{b}\,[\mathrm{C}]^{c} \cdots$$

Arrhenius Equation
The rate constant provides a “molecular interpretation” of reaction rate, and is given by the Arrhenius equation.
$$k = A\,e^{-E^{\ddagger}/RT}$$
E‡ is the activation energy (the energy difference between the transition state and the reactants), T is the temperature and R is the gas constant. A accounts for the efficiency of molecular collisions. The underlying assumption behind the Arrhenius equation is that all molecules pass through the transition state. This allows us to use thermodynamic arguments. The assumption is not entirely reasonable, as some molecules will have excess energy and be able to “fly over” the transition state as they move from reactants to products.
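To get a feel for how strongly the rate constant depends on the activation energy, the Arrhenius equation can be evaluated directly. The sketch below is a minimal illustration, not part of the course material: it assumes a first-order (unimolecular) process, so that k has units of 1/s, and a hypothetical pre-exponential factor A = 10^13 1/s. The function names are likewise made up for the example.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ/(mol K)

def arrhenius_rate_constant(activation_energy_kj, temperature_k=298.15,
                            prefactor=1.0e13):
    """k = A exp(-E/RT); prefactor is an assumed value in 1/s."""
    return prefactor * math.exp(-activation_energy_kj / (R_KJ * temperature_k))

def half_life_seconds(rate_constant):
    """Half-life of a first-order process with the given rate constant."""
    return math.log(2.0) / rate_constant

# Sample activation energies (kJ/mol) chosen only for illustration.
for activation_energy in (50.0, 100.0, 150.0, 200.0):
    k = arrhenius_rate_constant(activation_energy)
    print(f"E = {activation_energy:5.0f} kJ/mol   k = {k:9.3e} 1/s   "
          f"half-life = {half_life_seconds(k):9.3e} s")
```

Changing the assumed prefactor scales every rate constant by the same factor. The spread between the rows comes entirely from the exponential term.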
Too Slow and Too Fast
A good rule of thumb is that reactions with activation energies >200 kJ/mol will not occur at normal temperatures, while reactions with activation energies <100 kJ/mol will be unstoppable under the same conditions.

Nexium
The anti-ulcer drug esomeprazole (Nexium) is actually the S enantiomer of an older unresolved drug no longer on patent. While both enantiomers are active, the R enantiomer is metabolized faster than the S enantiomer. This means that the pure S compound is longer lasting than the racemic mixture. In order for esomeprazole to qualify as a “new drug” subject to patent protection it must not racemize.

Will Esomeprazole Racemize?
Chirality in esomeprazole is due to the sulfoxide group. There are two distinct racemization pathways. The obvious one involves inversion at sulfur through a planar transition state, and the less obvious one involves a pair of [2,3] sigmatropic rearrangements and an achiral intermediate.

Performance of Practical Models for Absolute Activation Energy Comparisons
The Arrhenius equation is a simplified model of “reality”, and absolute activation energies derived from experimental rates based on it may not accurately reflect calculated values. It is probably more appropriate to compare calculated activation energies with results obtained from a model that has been previously established to properly describe the structures of stable molecules. We will use the MP2/6-311+G** model as the standard.
Absolute Activation Energies (kJ/mol)

reaction                               HF/3-21G  HF/6-31G*  B3LYP/6-31G*  MP2/6-31G*  MP2/6-311+G**  PM3
1   CH3NC → CH3CN                      238       192        172           180         172            243
2   HCO2CH2CH3 → HCO2H + C2H4          259       293        222           251         234            251
3   [structure drawing]                192       238        142           117         109            -
4   [structure drawing]                176       205        121           109         109            146
5   [structure drawing]                126       167        84            50          38             134
6   [structure drawing]                314       356        243           251         230            255
7   HCNO + C2H2 → [structure]          105       146        50            33          38             427
8   [structure drawing]                230       247        163           159         142            -
9   [structure drawing]                276       197        151           155         142            267
10  [structure] → [structure] + CO2    247       251        167           184         172            276
11  [structure] → [structure] + SO2    205       205        92            105         92             234
mean absolute error                    81        93         15            11          -              91

Performance of Practical Theoretical Models for Absolute Activation Energies
Absolute activation energies from Hartree-Fock models are larger than those obtained from the MP2/6-311+G** model. This parallels the fact that Hartree-Fock models underestimate bond dissociation energies. Transition states are likely to be more compact than reactants, meaning that electron motions are likely to be more tightly coupled. Therefore, Hartree-Fock models will do better for reactants than for transition states. B3LYP/6-31G* and MP2/6-31G* models provide a better account of absolute activation energies, although both show sizable errors in some cases. The PM3 model does not provide a satisfactory account.

Combining Different Theoretical Models
While geometry is well described by simple models, accurate description of reaction and activation energies typically requires “better” and more costly (in terms of computation) models. It may be advantageous to combine different models rather than to use a single model, for example, to use the HF/6-31G* model to furnish geometry and to use the B3LYP/6-311+G** model to provide energies.

Ireland Rearrangement
The Ireland rearrangement provides a route to allyl vinyl ethers from allyl esters. The second step in this reaction is a Claisen rearrangement.
Activation energies from the B3LYP/6-31G* model for the Claisen step (for R=Me) change only slightly with use of HF/3-21G or HF/6-31G* structures instead of the “exact” (B3LYP/6-31G*) structures. A larger change is seen with use of PM3 structures.

Kinetic Product
The kinetic product is that resulting from the lowest-energy transition state, irrespective of whether or not it is the lowest-energy product.
[Figure: energy vs. reaction coordinate]
Kinetic product ratios depend on activation energy differences in the same way that thermodynamic product ratios depend on the difference in reactant and product energies.

Relative Activation Energies
Establishing the kinetic product of a reaction typically does not require knowledge of absolute activation energy, but rather only of the difference in activation energies for closely-related reactions. For example, the kinetic preference for a particular regio- or stereoisomer formed in a reaction requires only differences in transition-state energies leading from a common set of reactants to each of the products. Analogy with reaction energies (thermodynamics) suggests that such comparisons conserve bonding and should benefit from error cancellation. Even simple quantum chemical models would be expected to yield acceptable results.

Chiral Hydroboration
Hydroboration may occur from either the “top” or “bottom” of the alkene shown below. However, only one diastereomer results. (Oxidation follows hydroboration.)
[Reaction scheme: alkene bearing a CH2OCH2Ph substituent and the alcohol (OH) product of hydroboration/oxidation]
Relative transition state energies show a strong preference for formation of the observed diastereomer.

Thermodynamic vs. Kinetic Control of Chemical Reactions
If thermodynamic and kinetic products differ, it may be possible to control the overall product distribution by changing reaction conditions, in particular, by changing the temperature.
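The dependence of kinetic product ratios on activation energy differences, and on temperature, can be sketched with a simple Boltzmann factor. This is a minimal illustration under stated assumptions: two competing pathways start from the same reactants, have equal pre-exponential factors, and are irreversible. The function names and the sample energy differences are hypothetical.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ/(mol K)

def kinetic_product_ratio(delta_activation_kj, temperature_k):
    """Ratio of favored to disfavored product.

    delta_activation_kj is the activation energy of the disfavored pathway
    minus that of the favored pathway (kJ/mol).
    """
    return math.exp(delta_activation_kj / (R_KJ * temperature_k))

def favored_fraction(delta_activation_kj, temperature_k):
    """Fraction of the product mixture formed through the favored pathway."""
    ratio = kinetic_product_ratio(delta_activation_kj, temperature_k)
    return ratio / (1.0 + ratio)

# Sample temperatures (K) and energy differences (kJ/mol), for illustration only.
for temperature in (195.0, 298.15, 373.15):
    for delta in (2.0, 5.0, 10.0):
        fraction = favored_fraction(delta, temperature)
        print(f"T = {temperature:6.2f} K   difference = {delta:4.1f} kJ/mol   "
              f"favored product = {100.0 * fraction:5.1f}%")
```

Under these assumptions a given difference in transition-state energies translates into a smaller preference at higher temperature, which is one reason temperature is a useful handle on product distribution.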
Calculations provide the means to say whether thermodynamic and kinetic preferences differ and, if they do, which better fits the experimental data. This knowledge can then be used to suggest reaction conditions that yield the desired products.

Radical Ring Closure Reactions
Loss of bromine from 6-bromohexene yields products derived primarily from cyclopentylmethyl radical, rather than from cyclohexyl radical.
[Reaction scheme: 6-bromohexene with Bu3SnH/AIBN gives hex-5-enyl radical (17% of products), which rearranges to cyclopentylmethyl radical (81%) and cyclohexyl radical (2%)]

Thermodynamic vs. Kinetic Product
Cyclohexyl radical is more stable than cyclopentylmethyl radical, which is not unexpected given that secondary radicals are more stable than primary radicals and six-membered rings are more stable than five-membered rings. The fact that cyclohexane is not the observed ring-closure product means that the reaction is not under thermodynamic control. The transition state for closure of hex-5-enyl radical to cyclopentylmethyl radical is lower in energy than that for closure to cyclohexyl radical. Methylcyclopentane is the kinetic product. The fact that it is the observed product suggests that the reaction is under kinetic control.

Polymerization of Cyclopentadiene
Cyclopentadiene undergoes a Diels-Alder reaction with itself. The resulting dimer adds cyclopentadiene yielding a trimer, and so forth. However, the reaction stops around the 20-mer. Addition may occur with either exo or endo stereochemistry. The former is thermodynamically favored while the latter is kinetically favored.
[Reaction schemes: exo addition and endo addition]

Thermodynamic vs. Kinetic Product
The exo polymer (the thermodynamic product) shows a helical structure. The endo polymer (the kinetic product) closes on itself around the 20-mer. The reaction that prematurely terminates appears to be under kinetic control, and raising the temperature might cause a longer (and different) polymer to form.
[Figure: exo polymer and endo polymer]

Reactions Without Transition States
Some reactions proceed without barriers and discernible transition states. Radicals combine without a barrier, for example, two methyl radicals to form ethane, and typically add to multiple bonds with little or no barrier. Ions add to neutral molecules in the gas phase without an activation barrier. For example, in the gas phase SN2 addition of an anionic nucleophile to an alkyl halide occurs without activation energy. The known barrier for the process in solution is a consequence of the solvent.

Reactions of Flexible Molecules. The “Important” Conformer
For kinetically controlled processes, the important conformer will not necessarily be the lowest-energy conformer. A good example is provided by the Diels-Alder cycloaddition of 1,3-butadiene with acrylonitrile.
[Reaction scheme: 1,3-butadiene + acrylonitrile gives the cyano-substituted cycloadduct]
The diene exists primarily in a trans conformation which is unable to react. The “reactive” cis conformer is 8 kJ/mol less stable and accounts for only about 5% of butadiene molecules at room temperature. Nevertheless, the reaction occurs.

Curtin-Hammett Principle
Assuming that equilibration among conformers is much faster than chemical reaction, any higher-energy conformer that reacts will be replenished before it is needed again. This is the Curtin-Hammett Principle.
[Figure: energy diagram contrasting equilibration among conformers (“low-energy process”) with chemical reaction (“high-energy process”)]
It means that the products of a kinetically-controlled reaction need not derive from the lowest-energy conformer.

Spartan Molecular Database (SMD)
Large collections of data obtained from quantum chemical calculations are available. These include equilibrium geometries, energies and atomic charges for ~150,000 organic and main-group inorganic molecules from the HF/3-21G, HF/6-31G*, B3LYP/6-31G*, EDF1/6-31G* and MP2/6-31G* models (not all molecules are represented by all models) based on the “best conformation” assigned from the MMFF molecular mechanics model.
Smaller collections (several thousand molecules) are available for HF, B3LYP and MP2 models using the 6-311+G** basis set, for the G3(MP2) model and for transition-metal inorganic and organometallic compounds using the B3LYP/6-31G* model. These may be searched by substructure or by name.

Spartan Molecular Database (SMD)
A collection of ~100,000 closed-shell organic molecules obtained from the EDF2/6-31G* model, using the best conformer assigned from the T1 model, is under development. This will include equilibrium geometries, energies, (T1) heats of formation, infrared spectra and proton and 13C NMR spectra. The database entries will also include the wavefunction, allowing “on-the-fly” generation of graphical displays (molecular orbitals, electrostatic potential maps, etc.). In addition to substructure and name searching, this data may be searched for a match to an “unknown” infrared spectrum.

Data Mining
Tools for “mining” the data in SMD are available. These allow graphical and statistical comparisons of different atomic and molecular properties of related molecules, of a single property of related molecules from different theoretical models and of energies (or more generally changes in any property) for a specified chemical reaction or for a series of related reactions.

Concluding Remarks
Quantum chemical calculations have long been successfully employed to interpret and rationalize experimental observation, but more often than not as an afterthought. They should now be recognized as a legitimate means to explore chemistry alongside experiment or as an alternative to experiment. There is a learning curve, just as there is a learning curve for techniques of experimental chemistry. The focus should be on understanding the capabilities and limitations of practical quantum chemical models rather than on the underlying theory. As with experimental chemistry, practical expertise and confidence can and will follow only by “doing”.
Contact Information
General Sales & Licensing Questions: sales@wavefun.com
Demo Requests: Sue Kurz, sue@wavefun.com
Academic Licensing: Tyler Netherton, tyler@wavefun.com
Commercial Licensing: Sean Ohlinger, sean@wavefun.com
Invoicing & Payments: Michelle Fitzpatrick, michelle@wavefun.com
Webmaster: Pamela Ohsan, pam@wavefun.com
Wavefunction Support Team: support@wavefun.com (Sean Ohlinger, Phil Klunzinger, Jurgen Schnitker, Warren Hehre . . .)