Molecular Modeling Short Course

Wavefunction, Inc.
Goals
To formulate molecular mechanics and quantum chemical
models and to assess these models for the calculation of
equilibrium conformations and geometries, reaction energies and
infrared and NMR spectra.
To formulate, assess and illustrate graphical models to anticipate
molecular properties and chemical reactivity/selectivity.
To show the use of databases of calculated geometries, energies,
properties and spectra.
Overall … what can be calculated, how it relates to what is
measured and what to expect of the results of the calculations.
Energy Surfaces and Geometries
What is an Equilibrium Geometry?
In one dimension, a minimum on a curve of energy as a function
of coordinate corresponds to an equilibrium geometry (or
equilibrium structure). This does not necessarily mean that a
molecule with this structure can be isolated and characterized
but only that it is possible for it to exist.
Mathematical Description
of Energy Surfaces
It is not possible to actually visualize a multidimensional energy
surface. It is possible to provide a unique mathematical definition
of those few points on such a surface that correspond to stable
molecules. Note, however, that it is not possible to specify a
unique pathway (“reaction coordinate”) connecting two molecules.
Stationary Points
Stable molecules are all stationary points on an energy surface,
that is, points for which the first derivative of the energy with
respect to each geometrical coordinate is zero. In words, this
means that the energy surface is “flat”.
in one dimension: dE/dR = 0
in many dimensions: ∂E/∂R_i = 0, i = 1, 2, 3, ..., 3N–6
Energy Minima in One Dimension
It is easy to pick out a stable molecule on a one dimensional
energy surface. It is a flat point where the energy curve “goes
up” on both sides. In mathematical terms, this means that the
second derivative of the energy is positive.
d²E/dR² > 0
Energy Minima in Many Dimensions
For a molecule with N atoms, 3N-6 second derivatives are
associated with each first derivative. This makes it impossible to
say directly whether a particular stationary point is an energy
minimum or maximum. However, the original coordinates may be
combined into a new set referred to as normal coordinates, ξ, that
lead to a second derivative matrix that is diagonal.
∂²E/∂R_i∂R_j → ∂²E/∂ξ_i∂ξ_j = δ_ij ∙ ∂²E/∂ξ_i∂ξ_j
δ_ij is 1 for i=j and 0 otherwise. Each normal coordinate is now
associated with a single second derivative, making it possible to
say whether the energy is at a minimum (all second derivatives
positive).
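As a minimal sketch of this idea, the example below classifies a stationary point from the eigenvalues of its second derivative matrix. It assumes a toy surface with only two coordinates (a real molecule has 3N-6), and the function names are hypothetical.

```python
import math


def hessian_eigenvalues_2x2(h11, h12, h22):
    """Eigenvalues of the symmetric matrix [[h11, h12], [h12, h22]].

    These play the role of the second derivatives along the two
    normal coordinates of a toy two-coordinate energy surface.
    """
    mean = 0.5 * (h11 + h22)
    radius = math.hypot(0.5 * (h11 - h22), h12)
    return (mean - radius, mean + radius)


def classify_stationary_point(h11, h12, h22, tol=1e-10):
    """Label a stationary point from the signs of the diagonalized second derivatives."""
    low, high = hessian_eigenvalues_2x2(h11, h12, h22)
    if low > tol:
        return "minimum"        # energy goes up in every direction
    if high < -tol:
        return "maximum"        # energy goes down in every direction
    if low < -tol and high > tol:
        return "saddle point"   # up in one direction, down in the other
    return "undetermined"       # at least one second derivative is ~0
```

For example, `classify_stationary_point(2.0, 0.5, 1.0)` reports a minimum, while a large off-diagonal element, as in `classify_stationary_point(1.0, 2.0, 1.0)`, gives a saddle point even though both diagonal elements are positive.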
How to Determine an Equilibrium Geometry?
Finding an equilibrium structure involves an iterative process
that is terminated when all first derivatives fall below a preset
tolerance, assuming that neither the energy nor the coordinates
have changed significantly from their values in the previous
iteration. The number of iterations required will typically be of
the same order as the number of independent variables. Thus,
finding an equilibrium structure is likely to be one to two orders
of magnitude more costly in terms of computer time than
calculating an energy at a single geometry.
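The iterative procedure can be sketched in one dimension. The example below is an illustration only: it assumes a Morse curve with made-up parameters as the energy function, uses plain steepest descent rather than the methods found in real programs, and tests only the gradient for convergence.

```python
import math

# Hypothetical Morse parameters (arbitrary units), chosen only for illustration.
D_E, A, R_EQ = 1.0, 1.5, 1.2


def morse_energy(r):
    """Energy of a toy diatomic as a function of bond length r."""
    return D_E * (1.0 - math.exp(-A * (r - R_EQ))) ** 2


def morse_gradient(r):
    """First derivative dE/dr of the Morse energy."""
    x = math.exp(-A * (r - R_EQ))
    return 2.0 * D_E * A * x * (1.0 - x)


def optimize_geometry(r, step=0.1, grad_tol=1e-6, max_iter=1000):
    """Steepest descent: stop when the first derivative falls below grad_tol."""
    for iteration in range(1, max_iter + 1):
        gradient = morse_gradient(r)
        if abs(gradient) < grad_tol:
            return r, iteration
        r -= step * gradient   # move downhill along the gradient
    raise RuntimeError("geometry optimization did not converge")
```

Starting from `r = 1.0`, `optimize_geometry` walks to the minimum at `R_EQ`; each iteration requires a new gradient evaluation, which is why an optimization costs many times more than a single energy calculation.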
How to Confirm an Equilibrium Geometry?
The infrared frequency for a diatomic molecule is proportional
to the square root of the second derivative of the energy with
regard to change in coordinate (the force constant) divided by a
mass (the reduced mass).
Similarly, each of the 3N-6 frequencies for a molecule with N
atoms (3N-5 in the case of a linear molecule) is proportional to
the square root of the corresponding force constant. An energy
minimum will give rise to a “normal” infrared spectrum, with
frequencies that are all real numbers.
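The relation between force constant, reduced mass and frequency can be sketched as follows. The function names are hypothetical and the inputs in the usage note are illustrative values, not measured data. A negative force constant (a maximum along that coordinate) gives an imaginary frequency.

```python
import cmath
import math

AMU_TO_KG = 1.66053906660e-27   # atomic mass unit in kg
C_CM_PER_S = 2.99792458e10      # speed of light in cm/s


def reduced_mass(m1, m2):
    """Reduced mass of a diatomic, in the same units as m1 and m2."""
    return m1 * m2 / (m1 + m2)


def harmonic_wavenumber(force_constant, m1_amu, m2_amu):
    """Harmonic vibrational wavenumber (cm^-1) as a complex number.

    force_constant is the second derivative of the energy in N/m.
    The result is real for a positive force constant and purely
    imaginary for a negative one.
    """
    mu = reduced_mass(m1_amu, m2_amu) * AMU_TO_KG
    return cmath.sqrt(force_constant / mu) / (2.0 * math.pi * C_CM_PER_S)
```

With an illustrative force constant of 500 N/m and masses of 1 and 35 amu, `harmonic_wavenumber(500.0, 1.0, 35.0)` is real and falls in the usual range for a bond to hydrogen; changing the sign of the force constant makes the result imaginary.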
Collections of Experimental Structures
Cambridge Structural Database
The Cambridge Structural Database (CSD) is a collection of
nearly 500,000 experimental X-ray crystal structures for organic
and organometallic molecules. It is a virtual gold mine of
experimental structures and also serves to identify molecules
that can be (and have been) synthesized and purified.
There are >100 derivatives of camphor in CSD formed by
substitution on the methyl group at C1.
[Figure: structure of camphor]
Molecular Mechanics Models
Molecular Mechanics
Molecular mechanics represents a molecule in terms of a Lewis
structure and assumes that bond lengths and angles depend
predictably on atom types and hybrids. Strain energy (the price
to distort from ideal bond distances, angles and torsion angles)
needs to be added to non-bonded energy (to account for van der
Waals and Coulomb interactions).
[Figure: contributions to the molecular mechanics energy: distance
distortion, angle distortion, dihedral distortion, steric interaction
and Coulombic interaction]
Stretch and Bend Strain Energy
\[ E^{strain} = \sum_{A}^{bonds} E_A^{stretch} + \sum_{A}^{bond\ angles} E_A^{bend} + \sum_{A}^{torsion\ angles} E_A^{torsion} + \sum_{A}\sum_{B}^{non\text{-}bonded\ atoms} E_{AB}^{non\text{-}bonded} \]
Bond stretching and angle bending terms are quadratic forms.
\[ E^{stretch}(r) = \frac{1}{2} k^{stretch} (r - r^{eq})^2 \]
\[ E^{bend}(\theta) = \frac{1}{2} k^{bend} (\theta - \theta^{eq})^2 \]
r and θ are the bond distance and angle, r^{eq} and θ^{eq} are the ideal
bond length and bond angle, and k^{stretch} and k^{bend} are parameters.
Higher-order contributions and cross terms are included in real
molecular mechanics models.
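The quadratic stretch and bend terms translate directly into code. This is a minimal sketch with hypothetical function names; units are left to the caller, and the higher-order and cross terms used in real force fields are omitted.

```python
def stretch_energy(r, r_eq, k_stretch):
    """Quadratic bond-stretching energy, 1/2 k (r - r_eq)^2."""
    return 0.5 * k_stretch * (r - r_eq) ** 2


def bend_energy(theta, theta_eq, k_bend):
    """Quadratic angle-bending energy, 1/2 k (theta - theta_eq)^2."""
    return 0.5 * k_bend * (theta - theta_eq) ** 2
```

Both terms are zero at the ideal geometry and rise symmetrically on either side of it, which is reasonable only for small distortions.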
Torsional Strain Energy
\[ E^{strain} = \sum_{A}^{bonds} E_A^{stretch} + \sum_{A}^{bond\ angles} E_A^{bend} + \sum_{A}^{torsion\ angles} E_A^{torsion} + \sum_{A}\sum_{B}^{non\text{-}bonded\ atoms} E_{AB}^{non\text{-}bonded} \]
The torsional energy term must reflect the inherent periodicity of
rotation about a single bond.
\[ E^{torsion}(\phi) = k^{torsion1} [1 - \cos(\phi - \phi^{eq})] + k^{torsion2} [1 - \cos 2(\phi - \phi^{eq})] + k^{torsion3} [1 - \cos 3(\phi - \phi^{eq})] \]
φ is the torsion angle, φ^{eq} is the ideal torsion angle and k^{torsion1},
k^{torsion2} and k^{torsion3} are parameters. Higher-order contributions
are included in real molecular mechanics models.
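A minimal sketch of the three-term torsional energy is given below. The function name and argument order are assumptions made for illustration; angles are in radians.

```python
import math


def torsion_energy(phi, phi_eq, k1, k2, k3):
    """Sum of one-, two- and threefold torsional terms (angles in radians)."""
    delta = phi - phi_eq
    return (k1 * (1.0 - math.cos(delta))
            + k2 * (1.0 - math.cos(2.0 * delta))
            + k3 * (1.0 - math.cos(3.0 * delta)))
```

With only the threefold parameter non-zero, the energy repeats every 120 degrees, which is the periodicity expected for rotation about a single bond between sp3 centers.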
Non-Bonded Energy
\[ E^{strain} = \sum_{A}^{bonds} E_A^{stretch} + \sum_{A}^{bond\ angles} E_A^{bend} + \sum_{A}^{torsion\ angles} E_A^{torsion} + \sum_{A}\sum_{B}^{non\text{-}bonded\ atoms} E_{AB}^{non\text{-}bonded} \]
\[ E^{non\text{-}bonded}(r) = E^{VDW}(r) + E^{Coulombic}(r) \]
Non-bonded interactions involve a sum of van der Waals (VDW)
interactions and Coulombic interactions.
\[ E^{VDW}(r) = \varepsilon \left[ \left( \frac{r^{o}}{r} \right)^{12} - 2 \left( \frac{r^{o}}{r} \right)^{6} \right] \]
\[ E^{Coulombic}(r) = \frac{q q'}{r} \]
r is the non-bonded distance, ε and r^{o} are parameters and q and q'
are atomic charges.
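The two non-bonded terms can be sketched as follows. Function names are hypothetical, and the Coulomb term is written without a unit conversion factor, which is exact only in atomic units.

```python
def vdw_energy(r, epsilon, r0):
    """12-6 van der Waals energy with its minimum of -epsilon at r = r0."""
    ratio = r0 / r
    return epsilon * (ratio ** 12 - 2.0 * ratio ** 6)


def coulomb_energy(r, q1, q2):
    """Coulomb energy of two point charges (atomic units assumed)."""
    return q1 * q2 / r


def nonbonded_energy(r, epsilon, r0, q1, q2):
    """Total non-bonded energy for one pair of atoms."""
    return vdw_energy(r, epsilon, r0) + coulomb_energy(r, q1, q2)
```

In this form of the van der Waals term, r0 is the distance at the bottom of the well and epsilon is the well depth, so the parameters can be read directly off the curve.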
Force Fields
Molecular mechanics models need to be parameterized either to
experimental data or to the results of “good” calculations. The
combination of functional form and parameters, known as a Force
Field, defines a particular molecular mechanics model.
Spartan uses the MMFF (Merck Molecular Force Field) model,
specifically parameterized to describe conformational preferences
of organic molecules.
Limitations of Molecular Mechanics
Because the strain energy is referenced to an individual Lewis
structure, molecular mechanics may not be used to compare
energies of molecules that are represented by different Lewis
structures. It may be used to compare energies of isomers that
share the same Lewis structure, for example, stereoisomers, and
most importantly, different conformers of a molecule.
Molecular mechanics may be thought of as an elaborate
interpolation scheme. It cannot be expected to perform well as it
ventures outside the range of its experience (parameterization).
Range of Molecular Mechanics
Molecular mechanics calculations are dominated by calculation
of the non-bonded energy which scales as the square of the total
number of atoms. Molecular mechanics calculations are practical
on molecules containing thousands of atoms, for example,
proteins, and (as will be described later) on smaller molecules
that may have thousands of different shapes (conformers).
Molecular Structure in 3D
The combination of a builder and molecular mechanics provides
a powerful tool for the examination and comparison of the
molecular structures and properties derived from these structures,
for example, volumes, surface areas and polar surface areas.
PSA and Absorption through a Membrane
Polar surface area (PSA), usually defined as the area of a space-filling
model due to nitrogen, oxygen and attached hydrogens,
anticipates the ability of a molecule to move across a membrane.
High PSA signifies a hydrophilic molecule and little incentive
to move into a non-polar membrane while low PSA signifies a
hydrophobic molecule and greater incentive.
How Do Chemists Use Molecular Structure?
Chemists use molecular structure (“sterics”) to make predictions
about reactivity and selectivity. For example, a chemist might
assign the product of LiAlH4 reduction of the bicyclic ketone by
assuming that the hydride attacks the less-crowded carbonyl face.
[Scheme: LiAlH4 reduction of a bicyclic ketone to the alcohol
(Paquette, 1979)]
This is reasonable for a rigid molecule, but is problematic for a
flexible molecule where the 3-dimensional shape may not be known.
Very Few Molecules are Rigid!
The acyclic ketone shown below may exist in a variety of
shapes, and different shapes lead to different products based on
which carbonyl face is likely to be less crowded.
[Scheme: L-selectride reduction of an acyclic ketone to the alcohol
(Tsuchihashi, 1984)]
How Many Conformers Are There?
A good rule of thumb is that each additional single bond multiplies
the number of conformers by three. Start with butane which has
three conformers. Add another carbon to give pentane with nine
conformers, another to give hexane with 27 conformers, and so on.
Note that not all conformers are distinct; for example, butane has
only two distinct conformers. Three-member rings are rigid, four
and five-member rings may be assumed to be rigid, and six-member
rings comprising only sp3 centers typically exist only as (two)
“chair” conformers. Seven-member and larger rings generally
exhibit several conformers.
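The rule of thumb is easy to put into numbers. The sketch below assumes three staggered arrangements per rotatable single bond and ignores rings and symmetry, so it gives an upper bound rather than the number of distinct conformers; the labels and function names are hypothetical.

```python
from itertools import product

# Assumed labels for the three staggered arrangements about a single bond.
TORSION_STATES = ("anti", "gauche+", "gauche-")


def estimated_conformer_count(n_rotatable_bonds):
    """Rule-of-thumb count: three conformers per rotatable single bond."""
    return len(TORSION_STATES) ** n_rotatable_bonds


def enumerate_conformers(n_rotatable_bonds):
    """List every combination of torsion states (a systematic enumeration)."""
    return list(product(TORSION_STATES, repeat=n_rotatable_bonds))
```

Counting one bond for butane, two for pentane and three for hexane reproduces the values 3, 9 and 27 quoted above, and ten bonds already give 59,049 combinations.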
Properties of Flexible Molecules
To obtain the value of a property of a flexible molecule, it is
necessary to average over all possible conformers, weighting
each based on its energy according to the Boltzmann equation.
Relative to the lowest-energy conformer, an energy difference of
4 kJ/mol leads to a Boltzmann factor of ~0.2 (20%) at room
temperature, an energy difference of 8 kJ/mol to a factor of ~0.04
(4%) and a difference of 12 kJ/mol to a factor of ~0.01 (1%).
While only a few conformers are likely to contribute significantly
to the average, it may be necessary to examine all conformers to
identify these few.
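Boltzmann weighting can be sketched in a few lines. The example assumes relative conformer energies in kJ/mol and a temperature of 298.15 K; the function names are hypothetical.

```python
import math

GAS_CONSTANT = 8.314462618e-3   # kJ/(mol K)


def boltzmann_weights(energies_kj_mol, temperature=298.15):
    """Normalized Boltzmann weights for a list of conformer energies."""
    e_min = min(energies_kj_mol)
    rt = GAS_CONSTANT * temperature
    factors = [math.exp(-(e - e_min) / rt) for e in energies_kj_mol]
    total = sum(factors)
    return [f / total for f in factors]


def boltzmann_average(values, energies_kj_mol, temperature=298.15):
    """Boltzmann-weighted average of a property over all conformers."""
    weights = boltzmann_weights(energies_kj_mol, temperature)
    return sum(w * v for w, v in zip(weights, values))
```

The weights depend only on energy differences, so the choice of energy zero does not matter, and a conformer more than about 12 kJ/mol above the best one contributes very little to the average.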
Systematic Searching
Systematically “walking through” all combinations of bond and ring
conformations is the only procedure that actually guarantees that the
lowest energy conformer will be located, and that a “correct”
Boltzmann distribution will be obtained. “Real molecules” may
exhibit hundreds or thousands of conformers, and systematic
searching rapidly becomes impractical.
Monte-Carlo Searching
An alternative approach generates conformers by random
conformational changes, with the decision to keep or discard a
conformer (as a starting point for the next random move) based on
its energy relative to the best conformer yet found. A so-called
Monte-Carlo search nearly always locates the lowest-energy
conformer (or a conformer that is very close in energy), even
though it examines only a tiny fraction of the possible number
of conformers. It also produces a correct Boltzmann distribution
in the limit of a large number of moves.
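A Monte-Carlo search can be sketched on a toy model. Everything specific here is an assumption made for illustration: each torsion takes one of three states, the energy function and its penalties are invented, and moves are accepted with the standard Metropolis criterion relative to the current conformer (a common variant) while the best conformer found is tracked separately.

```python
import math
import random

STATES = ("anti", "gauche+", "gauche-")
GAUCHE_PENALTY = 3.8    # kJ/mol, hypothetical
CLASH_PENALTY = 12.0    # kJ/mol, hypothetical (adjacent gauche+/gauche-)
RT = 2.479              # kJ/mol, approximately room temperature


def toy_energy(conformer):
    """Invented energy: penalize gauche torsions and adjacent g+/g- pairs."""
    energy = sum(GAUCHE_PENALTY for state in conformer if state != "anti")
    for a, b in zip(conformer, conformer[1:]):
        if {a, b} == {"gauche+", "gauche-"}:
            energy += CLASH_PENALTY
    return energy


def monte_carlo_search(n_torsions, n_steps, seed=0):
    """Random single-torsion moves with Metropolis acceptance."""
    rng = random.Random(seed)
    current = tuple(rng.choice(STATES) for _ in range(n_torsions))
    e_current = toy_energy(current)
    best, e_best = current, e_current
    for _ in range(n_steps):
        trial = list(current)
        trial[rng.randrange(n_torsions)] = rng.choice(STATES)
        trial = tuple(trial)
        e_trial = toy_energy(trial)
        # Always accept downhill moves; accept uphill moves with
        # probability exp(-dE/RT) so the walk can escape local minima.
        if e_trial <= e_current or rng.random() < math.exp(-(e_trial - e_current) / RT):
            current, e_current = trial, e_trial
            if e_current < e_best:
                best, e_best = current, e_current
    return best, e_best
```

For this toy energy the all-anti conformer is the global minimum, so `monte_carlo_search(4, 5000)` can be checked against it. For a real molecule the answer is not known in advance, and the search length has to be chosen by experience.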
Systematic vs. Monte-Carlo Searches
Lovastatin is representative of an important class of drugs that
increase the synthesis of LDL receptor proteins and are widely
used to reduce cholesterol levels.
A Monte-Carlo search limited to 200 steps yields the same
lowest-energy conformer as a full systematic search (>4000 conformers).
Spanning all Possible Molecular Shapes
In some applications it is more important to span the range of
possible shapes that a molecule might assume than to provide a
correct Boltzmann distribution. The most common example of
this is “similarity analysis”, attempting to establish if a molecule
with a particular shape is “similar” to any conformer of a flexible
molecule. Where the flexible molecule has too many conformers
to be examined systematically, an alternative is to randomly
sample all (systematically-generated) conformers. We refer to this
as a conformer distribution.
Performance of the MMFF Model
At first glance, conformer energy differences obtained from the
MMFF model appear to be quite good … in fact as good as we
can reasonably expect from the best practical quantum chemical
models to be introduced shortly.
Conformations of Acyclic Molecules
Conformer energy differences (kJ/mol)
molecule               low-energy/high-energy conformer   MMFF   expt.
n-butane               trans/gauche                        3.3    2.8
1-butene               skew/cis                            1.3    0.9
1,3-butadiene          trans/gauche                       10.5   12.1
acrolein               trans/cis                           8.0    7.1
N-methylformamide      trans/cis                           5.4    5.9
N-methylacetamide      trans/cis                          10.9    9.6
formic acid            cis/trans                          20.5   16.3
methyl formate         cis/trans                          22.1   19.9
methyl acetate         cis/trans                          34.7   35.6
propanal               eclipsed/anti                       2.1    2.8
2-methylpropanal       eclipsed/anti                       2.5    3.3
ethanol                anti/gauche                         0.8    0.5
methyl ethyl ether     anti/gauche                         6.3    6.3
methyl vinyl ether     cis/skew                            9.2    7.1
mean absolute error                                        1.1    –
A Closer Look at MMFF
A closer look reveals that these same molecules were used to
parameterize MMFF. Comparisons that include molecules that
were not employed in the parameterization reveal a more
accurate (and discouraging) picture. Note, in particular, the poor
result for 2-chlorotetrahydropyran (a model for carbohydrates).
Conformations of Cyclic Molecules
Conformer energy differences (kJ/mol)
molecule                      low-energy/high-energy conformer   MMFF   expt.
methylcyclohexane             equatorial/axial                    5.9    7.3
tert-butylcyclohexane         equatorial/axial                   26.4   22.6
cis-1,3-dimethylcyclohexane   equatorial/axial                   21.3   23.0
fluorocyclohexane             equatorial/axial                   -1.7    0.7
chlorocyclohexane             equatorial/axial                   -1.3    2.1
piperidine                    equatorial/axial                    3.8    2.2
N-methylpiperidine            equatorial/axial                   13.8   13.2
2-chlorotetrahydropyran       axial/equatorial                   -0.4    7.5
2-methylcyclohexanone         equatorial/axial                    5.4    8.8
3-methylcyclohexanone         equatorial/axial                    2.1    5.7, 6.5
4-methylcyclohexanone         equatorial/axial                    5.4    7.3, 8.8
mean absolute error                                               3.3    –
Aside … Interpreting Energy Profiles
It is not only possible to identify stable conformers and furnish
conformer energy differences but conformational energy profiles
may also be drawn and interpreted. The functional form of the
torsional energy E(φ) is actually a truncated Fourier series, the
individual terms of which are independent and may be
interpreted independently.
\[ V(\phi) = \frac{1}{2} V_1 (1 - \cos\phi) + \frac{1}{2} V_2 (1 - \cos 2\phi) + \frac{1}{2} V_3 (1 - \cos 3\phi) = V_1(\phi) + V_2(\phi) + V_3(\phi) \]
One, Two and Threefold Terms
The V1 term in n-butane gives the difference between syn and
anti conformers and reflects the crowding of methyl groups. In
1,2-difluoroethane V1 reflects interactions of bond dipoles.
[Figure: syn vs. anti n-butane ("crowded" vs. "not crowded") and
syn vs. anti 1,2-difluoroethane (bond dipoles add vs. bond dipoles cancel)]
The V2 term in 1,3-butadiene reflects the desire to keep π systems
coplanar. In hydrazine, V2 reflects the desire to keep the lone
pairs perpendicular.
The V3 term reflects the need for single bonds to stagger.
Dimethyl Peroxide
The energy profile for rotation about the OO bond in dimethyl
peroxide shows a single broad minimum. According to the
Fourier analysis, this arises due to a combination of the V1 term
(keeping the methyl groups apart) and the V2 term (keeping the
lone pairs perpendicular). The V3 term is much less important.
E(φ) = 26 (1 - cos φ) + 11 (1 - cos 2φ) + 2 (1 - cos 3φ)
Larger Molecules
Is the MMFF model good enough to distinguish conformers that
are “reasonable” (and therefore need further consideration) from
those that are “unreasonable” (and may be discarded)? Lack of
reliable experimental data for any but very simple molecules
means that it is necessary to use calculated conformer energy
differences as a standard to tell. A high-level quantum chemical
model will be used for this purpose. We shall see later that the
model performs well for simple molecules where experimental
conformational energy differences are known.
2-Benzylamino-1-propanol
4,6-Dimethyl-1-phenyl-5-hepten-3-one
Utility of the MMFF Model
The MMFF model appears to be good enough to distinguish
conformers that are “reasonable” (and therefore need further
consideration) from those that are “unreasonable” (and may be
discarded). Conformational analysis (at least the first steps of
conformational analysis) is the most important application of
molecular mechanics to “small-molecule” chemistry.
Identifying the “Important” Conformer
If the quantity of interest pertains to a system at equilibrium or
to the product of a reaction under thermodynamic control, then
the “important” conformer is the lowest-energy conformer.
Where there are several low-energy conformers, these need to be
weighted according to the Boltzmann equation.
Note that quantum chemical and molecular mechanics models
apply to the gas phase, and the lowest-energy conformer may not be
the conformer found in the solid state or in solution or for a
molecule that is bound to a protein. Conformational changes
may occur to allow effective crystal packing or to maximize
interactions with a solvent or a protein host.
Gas vs. Aqueous Phase Conformation
The lowest-energy conformation of an isolated molecule is not
necessarily that in solution. The most conspicuous difference is
that (intramolecular) hydrogen bonding is likely to be less
important. For example, the lowest-energy conformer of the
modified nucleoside acyclovir exhibits a hydrogen bond in the
gas phase but not in aqueous solution.
[Figure: structure of acyclovir]
Solid vs. Gas-Phase Conformations
Are the conformations of molecules in the gas phase mirrored by
conformations observed in the solid? If they are, then the
~500,000 experimental X-ray crystal structures are a very rich
resource of conformational preferences for isolated molecules.
Limited comparisons suggest that the factors responsible for
crystal packing do not necessarily lead to large changes in
conformation. The conformer found in the crystal is typically
either the same as the best gas-phase conformer or a conformer
that is only a few kJ/mol higher in energy.
There will be exceptions, the most obvious being for cases that
are able to form intramolecular hydrogen bonds.
Solid vs. Gas-Phase Conformations
for Common Drugs
molecule               number of conformers   E(best) – E(crystal) (kJ/mol)
clozapine                        4            0 (same)
loratadine                      16            0 (same)
dextromoramide                  25            0 (same)
chlorpromazine                  35            2
tamoxifen                       43            6
thioridazine                    56            0 (same)
risperidone                     71            2
quinidine                       73            8
loperamide (imodium)           157            3
haloperidol                    161            5
astemizole                     210            7
cimetidine                     229            0 (same)
Conformations of Free vs. Protein-Bound
Molecules
The most obvious exceptions occur where hydrogen bonding is
possible, for example, molecules bound to proteins. The
important question is whether knowledge of the conformation of
an isolated molecule offers any insight as to whether it will “fit”
inside a protein.
Limited comparisons do not lead to a clear picture. While the
bound conformers of some compounds are the same or very
similar to those for the free molecule, the conformations of
others are quite different. This suggests that specific binding
offers greater “rewards” than crystal packing forces.
Free vs. Protein-Bound Conformations
of Small Molecules
molecule            protein PDB ID(s)   # conf   E(best) - E(crystal) (kJ/mol)
podophyllotoxin     1sa1                  17      2
indomethacin        1s2a                  17      4
ibuprofen           1eqg                  18      0 (same)
zopolrestat         1mar, 1frb            19      0 (same)
penicillin G        1fxv                  21      1
mesopram            1xm6                  30      3
piclamilast         1xm4, 1xon            31      3
loracarbef          1fcn                  34     11
cilomilast          1xlx                  35      4
diphenhydramine     2aot                  39      6
ampicillin          1nx9                  45     10
gleevec             1opj                  60      9
trifluoperazine     1a29                  87     16
Quantum-Chemical Models
Schrödinger Equation to
Molecular Orbital Theory
The Schrödinger Equation . . .
Hydrogen Atom
The motions and interactions of nuclei and electrons are
described by the Schrödinger equation which may be solved
exactly for the hydrogen atom.
\[ \left[ -\frac{1}{2} \nabla^2 - \frac{Z}{r} \right] \psi(r) = E \psi(r) \]
The quantity in brackets represents the kinetic and potential energy
of an electron at a distance r from a nucleus of charge Z, and E is
the total energy. The wavefunctions (ψ) depend on the electron
coordinates and are the familiar s, p, d atomic orbitals. ψ² times a
small volume is the electron density (probability of finding the
electron inside this volume). This is the quantity measured in an
X-ray diffraction experiment.
The Generalized Schrödinger Equation
The Schrödinger equation may be generalized.
\[ \hat{H} \Psi = E \Psi \]
Ĥ is the Hamiltonian and describes both the kinetic and potential
energies of nuclei and electrons. In atomic units, it is given by
\[ \hat{H} = -\frac{1}{2} \sum_{i}^{electrons} \nabla_i^2 - \frac{1}{2} \sum_{A}^{nuclei} \frac{1}{M_A} \nabla_A^2 - \sum_{i}^{electrons} \sum_{A}^{nuclei} \frac{Z_A}{r_{iA}} + \sum_{i<j}^{electrons} \frac{1}{r_{ij}} + \sum_{A<B}^{nuclei} \frac{Z_A Z_B}{R_{AB}} \]
Z is the nuclear charge, M_A is the ratio of atomic and electron
masses and R_{AB}, r_{iA} and r_{ij} are distances involving nuclei A, B
and electrons i, j.
Atomic Units
The atomic unit for length is the bohr and the atomic unit for
energy is the hartree.
1 bohr = 0.52917 Å = 52.917 pm
1 hartree = 2625 kJ/mol = 627.5 kcal/mol
The energy of the proton is 0 hartrees (there is no electron). The
energy of hydrogen atom is -0.5 hartrees.
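A small conversion helper makes these units concrete. The constants are the rounded values quoted above; the function names are hypothetical, and the picometer conversion uses 1 Å = 100 pm.

```python
BOHR_TO_ANGSTROM = 0.52917        # value quoted in the text
HARTREE_TO_KJ_PER_MOL = 2625.0    # value quoted in the text
HARTREE_TO_KCAL_PER_MOL = 627.5   # value quoted in the text


def bohr_to_angstrom(length_bohr):
    """Convert a length from bohr to angstroms."""
    return length_bohr * BOHR_TO_ANGSTROM


def bohr_to_pm(length_bohr):
    """Convert a length from bohr to picometers (1 angstrom = 100 pm)."""
    return bohr_to_angstrom(length_bohr) * 100.0


def hartree_to_kj_per_mol(energy_hartree):
    """Convert an energy from hartrees to kJ/mol."""
    return energy_hartree * HARTREE_TO_KJ_PER_MOL


def hartree_to_kcal_per_mol(energy_hartree):
    """Convert an energy from hartrees to kcal/mol."""
    return energy_hartree * HARTREE_TO_KCAL_PER_MOL
```

Applied to the hydrogen atom energy of -0.5 hartrees, `hartree_to_kj_per_mol(-0.5)` gives -1312.5 kJ/mol, a reminder that total energies are very large compared with the few kJ/mol that separate conformers.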
Born-Oppenheimer Approximation
The Born-Oppenheimer approximation assumes that, because
nuclei are much more massive than electrons, they do not move.
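This leads to the electronic Schrödinger equation.
\[ \hat{H}^{el} \Psi^{el} = E^{el} \Psi^{el} \]
\[ \hat{H}^{el} = -\frac{1}{2} \sum_{i}^{electrons} \nabla_i^2 - \sum_{i}^{electrons} \sum_{A}^{nuclei} \frac{Z_A}{r_{iA}} + \sum_{i<j}^{electrons} \frac{1}{r_{ij}} \]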
The nuclear kinetic energy is zero, and the Coulomb repulsion
energy between nuclei is a constant (added to the electronic
energy). Nuclear mass does not appear in the electronic
Schrödinger equation and isotope effects have a different origin.
Hartree-Fock Approximation
The Hartree-Fock approximation insists that the electrons
move independently of each other. In practice, each electron is
confined to a spin orbital, made up of a spatial function or
molecular orbital, ψ, and a spin function, α or β. The latter can
be thought of as an “accounting” device, ensuring that no more
than two electrons occupy each molecular orbital. The many-electron
wavefunction needs to be written as a sum of products
of spin orbitals, χ, in the form of a determinant.
\[ \Psi = \frac{1}{\sqrt{N!}} \begin{vmatrix} \chi_1(1) & \chi_2(1) & \cdots & \chi_n(1) \\ \chi_1(2) & \chi_2(2) & \cdots & \chi_n(2) \\ \vdots & \vdots & & \vdots \\ \chi_1(N) & \chi_2(N) & \cdots & \chi_n(N) \end{vmatrix} \]
Solving the Hartree-Fock Equations
The Hartree-Fock equations need to be solved using an iterative
procedure. An electron is selected and a one-electron
“Schrödinger equation” is solved in the presence of a field that is
made up of all the remaining electrons. The solution is then
folded back into the field, another electron is selected and the
process repeated. After all electrons have been considered, the
resulting field is compared with that at the start. If they are the
same within some preset tolerance, then the process is judged to
have converged and is terminated. If they are different, the entire
process is repeated. Such a procedure is commonly referred to as
a self-consistent-field (SCF) procedure.
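The structure of a self-consistent loop can be sketched independently of the chemistry. The example below is not a Hartree-Fock program: it assumes the "field" is a single number and uses an invented update function in place of solving the one-electron equations, purely to show the compare-and-repeat logic.

```python
def self_consistent_field(update, field, tol=1e-8, max_cycles=200):
    """Iterate field -> update(field) until input and output fields agree.

    update stands in for "solve the equations in the current field and
    build the new field from the solutions".
    """
    for cycle in range(1, max_cycles + 1):
        new_field = update(field)
        if abs(new_field - field) < tol:
            return new_field, cycle      # converged: field reproduces itself
        field = new_field                # otherwise fold the result back in
    raise RuntimeError("SCF procedure did not converge")


def toy_update(field):
    """Invented one-number stand-in for a field update."""
    return 1.0 / (1.0 + field)
```

Calling `self_consistent_field(toy_update, 0.0)` converges to the value that reproduces itself under `toy_update`. Real SCF programs apply the same convergence test to the density or Fock matrix and usually add damping or extrapolation to speed things up.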
LCAO Approximation
The molecular orbitals are written as linear combinations of a
finite set (a basis set) of prescribed functions known as basis
functions, φ.
\[ \psi_i = \sum_{\mu}^{basis\ functions} c_{\mu i} \phi_\mu \]
c are the (unknown) molecular orbital coefficients. Because the
φ are centered on the atoms, they are commonly referred to as
atomic orbitals. This is known as the Linear Combinations of
Atomic Orbitals or LCAO approximation.
Roothaan-Hall Equations
Taken together, the Hartree-Fock and LCAO approximations
lead to the Roothaan-Hall equations, the form of which is
nearly the same as the Schrödinger equation.
Fc = εSc
ε are orbital energies, S is the overlap matrix (a measure of the
extent that basis functions “see each other”), and F is the Fock
matrix (analogous to Ĥ in the Schrödinger equation). Solution
leads to the set of molecular orbital coefficients, c, and their
associated energies, ε.
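For two basis functions the Roothaan-Hall equations can be solved by hand, which makes a compact illustration. The sketch below assumes symmetric 2x2 Fock and overlap matrices supplied by the caller (in a real calculation F depends on c, so this step sits inside the SCF loop) and finds the orbital energies from det(F - εS) = 0. The function name and the numbers in the usage note are hypothetical.

```python
import math


def roothaan_hall_2x2(f, s):
    """Orbital energies for symmetric 2x2 Fock (f) and overlap (s) matrices.

    Solves det(F - eps*S) = 0, a quadratic in eps, and returns the
    two roots in ascending order.
    """
    a = s[0][0] * s[1][1] - s[0][1] ** 2
    b = 2.0 * f[0][1] * s[0][1] - f[0][0] * s[1][1] - f[1][1] * s[0][0]
    c = f[0][0] * f[1][1] - f[0][1] ** 2
    root = math.sqrt(b * b - 4.0 * a * c)
    return sorted(((-b - root) / (2.0 * a), (-b + root) / (2.0 * a)))
```

For two equivalent basis functions with diagonal element α, off-diagonal element β and overlap s, the roots are (α + β)/(1 + s) and (α - β)/(1 - s), the familiar bonding and antibonding combinations. With α = -0.5, β = -0.4 and s = 0.5 these are -0.6 and -0.2.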

Origin of Hartree-Fock Models
Schrödinger equation
   ↓ nuclei don't move
electronic Schrödinger equation
   ↓ 1. electrons move independently
   ↓ 2. molecular "solutions" written in terms of atomic solutions
Hartree-Fock "Molecular Orbital" Models
Graphical Models
Chemistry in Pictures
As we shall see shortly, quantum chemical models are able to
provide a variety of quantitative data, in the form of structures,
energies and spectra. Before we get into these, it may be useful
to talk about a more qualitative aspect, and in particular about
information that is better related as “pictures” rather than tables
of numbers. These include the electron density which relates to
molecular size and shape and the electrostatic potential which
relates to molecular charge distribution.
Isosurfaces
There are two obvious ways to display a function that depends
on three coordinates, in this case the x,y,z Cartesian coordinates
of a molecule, on a two-dimensional screen (or printed page).
One is to draw a two-dimensional cut (“slice”) through the function
and display contour lines. The other is to define a surface of
constant value, a so-called isovalue surface or isosurface.
f(x,y,z) = constant
The constant is chosen to reflect a particular physical observable
of interest, for example, the shape of a molecule in the case of
the electron density.
Molecular Orbitals
Molecular orbitals are commonly related to bonds and lone
pairs. Their shapes may be able to suggest why a particular
chemical reaction occurs as it does or does not occur as
expected. For example, the fact that the HOMO (Highest-Occupied
Molecular Orbital) in cyanide anion is more
concentrated on carbon (on the right) than on nitrogen
shows that cyanide will act as a carbon nucleophile.
N≡C⁻ + CH3–I → N≡C–CH3 + I⁻
Unoccupied Molecular Orbitals
Unoccupied molecular orbitals may also be informative. For
example, the shape of the LUMO (Lowest-Unoccupied
Molecular Orbital) of methyl iodide shows why iodide leaves
following nucleophilic attack by cyanide.
The LUMO is antibonding between carbon and iodine, meaning
that donation of the electron pair from cyanide will cause the
carbon-iodine bond to weaken and eventually break.
Hammond Postulate
The reason that molecular orbitals are able to anticipate chemical
reactivity and selectivity follows from the Hammond Postulate. This
says that for exothermic reactions (all useful reactions are exothermic)
the transition state will resemble the reactants. Thus, the properties of
reactants are expected to mirror those of transition states.
Fukui-Woodward-Hoffmann Rules
The Fukui-Woodward-Hoffmann Orbital Symmetry Rules are a direct
consequence of the Hammond Postulate. They use the shapes (symmetries)
of the HOMO and LUMO (Frontier Molecular Orbitals) to understand
why some chemical reactions proceed whereas other reactions do not. For
example, the fact that the HOMO in 1,3-butadiene may interact
constructively with the LUMO in ethylene, suggests that the two molecules
will undergo Diels-Alder cycloaddition to form cyclohexene.
[Scheme: 1,3-butadiene + ethylene → cyclohexene]
HOMO-LUMO Gap and Diels-Alder Rates
Frontier orbitals may be used to anticipate the rates of reactions.
For example, the rates of Diels-Alder reactions are known to
increase with π donors on the diene and π acceptors on the
dienophile. Donors raise the HOMO energy and acceptors lower
the LUMO energy. Decrease in HOMO-LUMO gap leads to
stronger interaction of the diene and dienophile and decrease in
activation energy.
[Figure: orbital energy diagram showing the HOMO of the diene and
the LUMO of the dienophile]
Electron Density
The sum of products of coefficients over all occupied molecular
orbitals gives rise to the density matrix, P, the elements of
which are given by
\[ P_{\mu\nu} = 2 \sum_{i}^{occupied\ molecular\ orbitals} c_{\mu i} c_{\nu i} \]
The product of an element of the density matrix and its
associated pair of atomic orbitals at a point in space, summed
over all pairs of orbitals, gives the number of electrons at that
point. This is termed the electron density, and is the quantity
obtained in an X-ray diffraction experiment.
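The two steps described above, building the density matrix and evaluating the density at a point, can be sketched as follows. The sketch assumes a closed-shell molecule with real coefficients stored as `coefficients[mu][i]` (basis function mu, molecular orbital i) and takes the basis function values at the point as given; all names are hypothetical.

```python
def density_matrix(coefficients, n_occupied):
    """P[mu][nu] = 2 * sum over occupied MOs of c[mu][i] * c[nu][i]."""
    n_basis = len(coefficients)
    return [
        [
            2.0 * sum(coefficients[mu][i] * coefficients[nu][i]
                      for i in range(n_occupied))
            for nu in range(n_basis)
        ]
        for mu in range(n_basis)
    ]


def electron_density(p, basis_values):
    """Density at one point: sum over mu, nu of P[mu][nu] * phi_mu * phi_nu."""
    n_basis = len(basis_values)
    return sum(
        p[mu][nu] * basis_values[mu] * basis_values[nu]
        for mu in range(n_basis)
        for nu in range(n_basis)
    )
```

For a single doubly occupied orbital the result equals twice the square of the orbital's value at the point, which is a convenient check on the bookkeeping.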
Different Regions of Electron Density
The electron density is largest around non-hydrogen atoms. This is
the basis of the X-ray diffraction experiment. Regions of lower
density reveal hydrogen atoms as well as “bonds” between atoms.
An even lower value of the density provides overall molecular size
and shape. The last is analogous to a conventional space-filling
model and corresponds to 98-99% enclosure of the total number of
electrons.
Electrostatic Potential
The electrostatic potential is the energy of interaction of a
positive point charge with the nuclei and electrons of the
molecule. A surface of constant negative electrostatic potential
delineates regions in a molecule that are subject to electrophilic
attack, for example, above and below the plane of the ring in
benzene, and in the ring plane at nitrogen in pyridine.
Property Map
Because a density surface that encloses ~98-99% of the total
number of electrons corresponds to the overall size and shape of
a molecule, it can be used as a “canvas” on which to “paint”
information about how a molecule presents itself to the world,
for example, whether it is hydrophilic or hydrophobic. A
property map is made by coloring each point on such a
surface according to the value of some “property”, for example, the
electrostatic potential. By convention, colors toward red are
used to designate small property values while colors toward blue
are used to designate large property values.
Electrostatic Potential Map
An electrostatic potential map presents the value of the
electrostatic potential at locations on a density surface. Red
regions show excess negative charge and blue regions excess
positive charge.
The map for benzene clearly shows that the π system attracts the
positive charge while the σ system repels the charge.
Benzene Dimer
The electrostatic potential map for benzene shows opposite
charge distribution for the σ and π systems. This explains why
benzene dimer prefers to adopt a perpendicular instead of a
parallel geometry.
Benzene Crystal
… and it explains why benzene does not crystallize in a stack,
but instead prefers a perpendicular arrangement.
Potential Area and Absorption
We saw previously that polar surface area provides a semi-quantitative
account of the rates of transport across a biological
membrane. An even better correlation is found with the polar
area, the area on an electrostatic potential map where the
absolute value of the electrostatic potential is >100 kJ/mol. This
signals hydrophilic behavior.
Electrostatic Potential Maps and pKa’s
acid                      pKa     acid                      pKa
Cl3CCO2H                  0.7     HCO2H                     3.75
HO2CCO2H                  1.23    trans-ClCH=CHCO2H         3.79
Cl2CHCO2H                 1.48    C6H5CO2H                  4.19
NCCH2CO2H                 2.45    p-ClC6H4CH=CHCO2H         4.41
ClCH2CO2H                 2.85    trans-CH3CH=CHCO2H        4.70
trans-HO2CCH=CHCO2H       3.10    CH3CO2H                   4.75
p-HO2CC6H4CO2H            3.51    (CH3)3CCO2H               5.03
Visual comparison shows that the electrostatic potential at the
acidic hydrogen anticipates acid strength. A reasonable correlation
is found between the maximum value of the electrostatic potential
and experimental pKa.
Chromium Tricarbonyl as a Substituent
In the absence of electron withdrawing groups, benzene resists
nucleophilic aromatic substitution. For example, anisole is nonreactive while 4-cyanoanisole is reactive. Chromium tricarbonyl
benzene complexes are also highly reactive.
[Scheme: anisole + Nu: gives no reaction; 4-cyanoanisole and anisole
chromium tricarbonyl + Nu: undergo substitution]
Electrostatic potential maps for anisole, 4-cyanoanisole and
anisole chromium tricarbonyl show that the effect of the
chromium tricarbonyl group is similar to that of a para cyano
group in promoting nucleophilic reactivity.
Vitamin E
Radicals can damage cells through reaction with unsaturated fatty
acids found in cellular membranes. Vitamin E plays a role in
defending cells by transferring hydrogen to radicals to give stable
products that can then be excreted. In order to be effective, vitamin
E must be soluble in the cellular membrane. An electrostatic
potential map shows a hydrophobic “hydrocarbon tail”, allowing it
to collect in cellular membranes.
Spin Density Map
The spin density is the difference between the number of “spin
up” and “spin down” electrons. A spin density map paints the
value of the spin density onto the electron density surface. It
provides a measure of “radical character”.
Back to Vitamin E
The spin density map for the radical formed upon hydrogen
atom removal from vitamin E shows extensive delocalization of
the unpaired electron. Vitamin E should form a stable radical.
Beyond Hartree-Fock Models
Electron Correlation
The Hartree-Fock approximation replaces instantaneous
interactions between individual electrons by interactions
between each electron and the average field created by all the other
electrons. As a result, electrons “get in each other’s way” more
than they should, leading to overestimation of the electron-electron
repulsion energy and to too high a total energy.
Electron correlation accounts for coupling or correlation of
electron motions and lowers the electron-electron repulsion
energy (and the total energy). The correlation energy is defined
as the difference between the Hartree-Fock energy and the
experimental energy.
Configuration Interaction Models
One way to calculate the correlation energy is to combine the
Hartree-Fock wavefunction with wavefunctions formed by
promoting electrons from molecular orbitals that are occupied to
molecular orbitals that are unoccupied. You can think of this as
combining “ground-state” and “excited-state” wavefunctions.
[Figure: electron promotion from occupied molecular orbitals to unoccupied molecular orbitals]
Full Configuration Interaction
It can be shown that the energy corresponding to a wavefunction
Ψ formed by combining the Hartree-Fock wavefunction Ψ0 and
wavefunctions Ψs resulting from all possible electron promotions
is identical to that obtained from exact solution of the
Schrödinger equation.
$$\Psi = a_0 \Psi_0 + \sum_{s>0} a_s \Psi_s$$
To do this, referred to as full configuration interaction, is not
possible because the number of excited states is infinite.
Limited Configuration Interaction Models
To reduce the number of electron promotions (excited states)
specify the number of electrons involved, for example, limit to
promotions involving only one electron, promotions involving
two electrons, etc.
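Considering single-electron promotions only (Configuration
Interaction Singles or CIS) does not lower the Hartree-Fock
energy, because singly promoted wavefunctions do not mix directly
with the Hartree-Fock ground state (Brillouin's theorem).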
CID and CISD
Configuration Interaction Models
CID (Configuration Interaction Doubles) involving two-electron
promotions is the simplest procedure that actually leads to
lowering of the Hartree-Fock energy.
CISD (Configuration Interaction Singles and Doubles) is a
slightly better model that involves both one and two-electron
promotions.
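To give a feel for why limited configuration interaction is used in practice, the following sketch counts promoted configurations for a finite set of orbitals. It is only an illustration under stated assumptions: spin orbitals are counted, spin and spatial symmetry are ignored, and the orbital counts (10 occupied, 20 unoccupied) are hypothetical. The function name `ci_space_sizes` is introduced here and is not part of any program.

```python
from math import comb


def ci_space_sizes(n_occupied: int, n_unoccupied: int) -> dict:
    """Count promoted configurations for a finite set of spin orbitals.

    Assumption: every way of moving electrons from occupied to unoccupied
    spin orbitals is counted once; symmetry restrictions are ignored.
    """
    singles = n_occupied * n_unoccupied                    # one-electron promotions
    doubles = comb(n_occupied, 2) * comb(n_unoccupied, 2)  # two-electron promotions
    # All ways of placing the electrons among all spin orbitals
    # (this total includes the Hartree-Fock configuration itself).
    full = comb(n_occupied + n_unoccupied, n_occupied)
    return {"singles": singles, "doubles": doubles, "full": full}


if __name__ == "__main__":
    sizes = ci_space_sizes(10, 20)  # hypothetical orbital counts
    for name, count in sizes.items():
        print(f"{name:8s} {count}")
```

Even for this small hypothetical case the full expansion is several thousand times larger than the doubles expansion, which is the practical motivation for CID and CISD.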
Møller-Plesset Models
Another approach, leading to what are commonly known as
Møller-Plesset models, is to assume that the Hartree-Fock
energy E0 and wavefunction Ψ0 are solutions to an equation
involving a Hamiltonian, Ĥ0, that is very close to the exact
Schrödinger Hamiltonian, Ĥ. This being the case, Ĥ can be
written as a sum of Ĥ0 and a small correction, V, scaled by a
dimensionless parameter λ.
$$\hat{H} = \hat{H}_0 + \lambda V$$
Succession of Møller-Plesset Models
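Expanding the exact energy and wavefunction as power series in
the dimensionless parameter λ, starting from the Hartree-Fock
energy and wavefunction, yields:
$$E = E^{(0)} + \lambda E^{(1)} + \lambda^2 E^{(2)} + \lambda^3 E^{(3)} + \cdots$$
$$\Psi = \Psi_0 + \lambda \Psi^{(1)} + \lambda^2 \Psi^{(2)} + \lambda^3 \Psi^{(3)} + \cdots$$
Substituting these expansions into the Schrödinger equation and
collecting terms in powers of λ leads to explicit expressions for
the energy and wavefunction corrections. The sum of E(0) and
E(1) is the Hartree-Fock energy.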
The MP2 Model
E(2) (the first correction to the Hartree-Fock energy) may be
written as a sum over occupied and unoccupied molecular
orbitals from the Hartree-Fock wavefunction.
$$E^{(2)} = \sum_{i<j}^{\text{occ}} \sum_{a<b}^{\text{unocc}} \frac{|(ij\|ab)|^2}{\varepsilon_i + \varepsilon_j - \varepsilon_a - \varepsilon_b}$$
εi and εj are energies of occupied molecular orbitals, εa and εb are
energies of unoccupied molecular orbitals and (ij||ab) are
integrals that account for changes in electron-electron
interactions as a result of electron promotion. A correction can
also be made to the wavefunction. The resulting model is termed
MP2 (second-order Møller-Plesset).
Origin of the MP2 Model
Properties of “Limiting” Hartree-Fock
and MP2 Models
Assessing Limiting Hartree-Fock
and MP2 Models
Two classes of models are now defined: Hartree-Fock models
represent the simplest possible treatment resulting from the
Schrödinger equation and MP2 models represent the simplest
possible treatment that accounts for electron correlation. Up to
this point we have been ignoring the LCAO approximation and
will continue to do so for a bit longer. The geometries and
reactions presented on the following slides make use of this
approximation, but use a collection of atomic functions that is
large enough to allow us not to worry about the details.
Equilibrium Geometries
How Good Are Experimental Geometries?
Gas-phase structures for ~1000 small molecules have been
determined by microwave spectroscopy, but this field is not very
active and very few additional structures can be expected. Bond
lengths are typically accurate to better than 0.01 Å. Structures of
~800,000 crystalline solids from X-ray diffraction have been
determined and are available in several well-maintained
collections. Except for bonds to hydrogen, which are poorly
described, bond lengths from solid-phase structures are generally
accurate to within 0.02-0.04 Å.
Calculations within 0.02 Å of experiment are a reasonable target.
Bond Lengths in Small Molecules
(bond lengths in Å)
molecule              HF/6-311+G**   MP2/6-311+G**   expt.
diborane              1.779          1.768           1.763
ethane                1.527          1.529           1.531
methylamine           1.454          1.465           1.471
methanol              1.399          1.422           1.421
methyl fluoride       1.364          1.389           1.383
methyl silane         1.878          1.877           1.867
methyl phosphine      1.856          1.856           1.862
methane thiol         1.819          1.813           1.819
methyl chloride       1.792          1.776           1.781
hydrazine             1.412          1.430           1.449
hydrogen peroxide     1.388          1.450           1.452
fluorine              1.330          1.417           1.412
disilane              2.373          2.342           2.327
hydrogen disulfide    2.075          2.083           2.055
chlorine              2.000          2.025           1.988
mean absolute error   0.024          0.010           –
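The mean absolute errors quoted for this table can be checked directly from the tabulated bond lengths. The short sketch below does so; the list names and the helper `mean_absolute_error` are introduced here for illustration only, and the numbers are copied from the table above in the same molecule order.

```python
# Bond lengths (Å) in the order: diborane, ethane, methylamine, methanol,
# methyl fluoride, methyl silane, methyl phosphine, methane thiol,
# methyl chloride, hydrazine, hydrogen peroxide, fluorine, disilane,
# hydrogen disulfide, chlorine.
hf = [1.779, 1.527, 1.454, 1.399, 1.364, 1.878, 1.856, 1.819,
      1.792, 1.412, 1.388, 1.330, 2.373, 2.075, 2.000]
mp2 = [1.768, 1.529, 1.465, 1.422, 1.389, 1.877, 1.856, 1.813,
       1.776, 1.430, 1.450, 1.417, 2.342, 2.083, 2.025]
expt = [1.763, 1.531, 1.471, 1.421, 1.383, 1.867, 1.862, 1.819,
        1.781, 1.449, 1.452, 1.412, 2.327, 2.055, 1.988]


def mean_absolute_error(calculated, reference):
    """Average of |calculated - reference| over paired entries."""
    return sum(abs(c - r) for c, r in zip(calculated, reference)) / len(reference)


if __name__ == "__main__":
    print(f"HF  mean absolute error: {mean_absolute_error(hf, expt):.3f} Å")
    print(f"MP2 mean absolute error: {mean_absolute_error(mp2, expt):.3f} Å")
```

The largest Hartree-Fock deviations come from bonds between two heteroatoms (hydrazine, hydrogen peroxide, fluorine), which is where the MP2 correction matters most.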
Rationalizing Changes in Geometry
Bond lengths increase in moving from Hartree-Fock to MP2
models. This may be rationalized by noting that MP2 involves
electron promotion from occupied molecular orbitals in the
Hartree-Fock description to unoccupied orbitals.
Where there are sufficient or excess electrons, occupied orbitals
are either net bonding or non-bonding and unoccupied orbitals
net antibonding. Electron promotion leads to bond lengthening.
Bond Lengths in Hydrocarbons
(bond lengths in Å)
bond   hydrocarbon         HF/6-311+G**   MP2/6-311+G**   expt.
C–C    but-1-yne-3-ene     1.438          1.430           1.431
       propyne             1.466          1.464           1.459
       1,3-butadiene       1.467          1.460           1.483
       propene             1.502          1.502           1.501
       cyclopropane        1.500          1.511           1.510
       propane             1.528          1.529           1.526
       cyclobutane         1.548          1.550           1.548
C=C    cyclopropene        1.276          1.305           1.300
       allene              1.295          1.314           1.308
       propene             1.320          1.341           1.318
       cyclobutene         1.323          1.352           1.332
       but-1-yne-3-ene     1.322          1.347           1.341
       1,3-butadiene       1.324          1.347           1.345
       cyclopentadiene     1.330          1.359           1.345
mean absolute error        0.010          0.006           –
Overall …
Both Hartree-Fock and MP2 models provide solid accounts of
equilibrium geometry. Specifically, both give bond lengths to
within 0.02 Å, the error commonly associated with molecular
structures obtained from X-ray crystallography.
Bond lengths from the Hartree-Fock model are almost always
shorter than experimental values (and than bond lengths
obtained from the MP2 model). This may be rationalized by
recognizing that improvement of the Hartree-Fock model
involves mixing of excited-state wavefunctions into the
ground-state wavefunction.
Reaction Energies
Total Energy vs. Heat of Formation
The heat of formation is the enthalpy at 298 K of a reaction in
which a molecule is converted to a set of standard products, one
for each different element. For example, the heat of formation of
ethylene is given by the reaction:
C2H4 → 2C (graphite) + 2H2 (gas)
The total energy is the energy at 0 K of a reaction that splits a
molecule into its isolated nuclei and electrons. For example, the
total energy of ethylene is given by the reaction:
C2H4 → 2C6+ + 4H+ + 16e–
Either is OK for thermochemical calculations.
Potential Energy Surfaces . . .
Thermodynamics
The relative energy of reactants and products (minima on a
potential energy surface) is related to the thermodynamic heat or
enthalpy (∆H) of reaction. The ratio of products to reactants
depends on temperature as given by the Boltzmann equation.
$$\frac{[\text{products}]}{[\text{reactants}]} = \exp\left[-\frac{E_{\text{products}} - E_{\text{reactants}}}{kT}\right]$$
Eproducts and Ereactants are the energies of products and reactants on
the potential energy diagram, T is the temperature (in K) and k is
the Boltzmann constant.
Potential Energy Surfaces . . .
Thermodynamic Product Ratios
Product ratios follow directly from energy differences. At room
temperature:
ΔE (kJ/mol)    major:minor
2              80:20
4              90:10
8              95:5
12             99:1
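As a rough check on how energy differences translate into product ratios, the sketch below evaluates the two-state Boltzmann expression. It assumes exactly two products with no degeneracy or entropy differences, takes room temperature as 298.15 K, and uses the gas constant R in place of k because the energies are per mole. The function name `major_minor_percent` is hypothetical. Values obtained this way need not coincide with the rounded entries tabulated above.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ/(mol K)


def major_minor_percent(delta_e_kj_mol: float, temperature_k: float = 298.15):
    """Return (major %, minor %) for two products separated by delta_e.

    Assumption: a simple two-state Boltzmann distribution.
    """
    minor_over_major = math.exp(-delta_e_kj_mol / (R_KJ * temperature_k))
    major = 100.0 / (1.0 + minor_over_major)
    return major, 100.0 - major


if __name__ == "__main__":
    for delta_e in (2.0, 4.0, 8.0, 12.0):
        major, minor = major_minor_percent(delta_e)
        print(f"{delta_e:5.1f} kJ/mol  {major:5.1f}:{minor:4.1f}")
```

The temperature argument makes it easy to see how a given energy difference becomes less selective as the temperature is raised.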
Potential Energy Surfaces . . .
Thermodynamic Product
The thermodynamic product for a reaction where two or more
different products are possible is that with the lowest energy
irrespective of pathway.
[Figure: energy versus reaction coordinate]
Thermodynamic product ratios depend only on the difference in
reactant and product energies and not on the pathway connecting
the two. They also depend on temperature.
Relating Calculations to Experimental
Thermochemical Data
Two corrections are needed to allow calculated energies to be
compared with experimental enthalpies.
The first accounts for the change in temperature from 0 K to some
value T:
$$\Delta H(T) = H_{tr}(T) + H_{rot}(T) + \Delta H_{vib}(T) + RT$$
$$H_{tr}(T) = \tfrac{3}{2}RT$$
$$H_{rot}(T) = \tfrac{3}{2}RT \quad (RT \text{ for a linear molecule})$$
$$\Delta H_{vib}(T) = H_{vib}(T) - H_{vib}(0) = Nh \sum_{i}^{\text{normal modes}} \frac{\nu_i}{e^{h\nu_i/kT} - 1}$$
νi are vibrational frequencies, R, k and h are the gas, Boltzmann’s
and Planck’s constants and N is Avogadro’s number.
Relating Calculations to Experimental
Thermochemical Data
The second correction accounts for the fact that the calculation
refers to a stationary molecule whereas the experiment refers to
a molecule in its lowest vibrational state. The so-called zero-point
energy is given (per molecule; multiply by Avogadro's number for
the molar value) by:
$$H_{vib}(0) = \varepsilon_{\text{zero-point}} = \frac{1}{2} \sum_{i}^{\text{normal modes}} h\nu_i$$
νi are vibrational frequencies and h is Planck’s constant.
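The two corrections can be sketched numerically from a list of harmonic vibrational frequencies. The example below is a minimal illustration, assuming frequencies supplied in cm⁻¹ and results reported per mole in kJ/mol. The function names are hypothetical, and the three water frequencies are approximate experimental fundamentals used only as sample input.

```python
import math

H_PLANCK = 6.62607015e-34     # J s
K_BOLTZ = 1.380649e-23        # J/K
N_AVOGADRO = 6.02214076e23    # 1/mol
C_LIGHT_CM = 2.99792458e10    # cm/s, converts cm^-1 to Hz
R_GAS = K_BOLTZ * N_AVOGADRO  # J/(mol K)


def zero_point_energy(wavenumbers_cm):
    """Molar zero-point energy in kJ/mol: (1/2) N h sum(nu_i)."""
    total_nu = sum(C_LIGHT_CM * w for w in wavenumbers_cm)
    return 0.5 * N_AVOGADRO * H_PLANCK * total_nu / 1000.0


def vibrational_correction(wavenumbers_cm, temperature=298.15):
    """H_vib(T) - H_vib(0) in kJ/mol within the harmonic approximation."""
    total = 0.0
    for w in wavenumbers_cm:
        nu = C_LIGHT_CM * w
        total += nu / math.expm1(H_PLANCK * nu / (K_BOLTZ * temperature))
    return N_AVOGADRO * H_PLANCK * total / 1000.0


def thermal_correction(wavenumbers_cm, temperature=298.15, linear=False):
    """H_tr + H_rot + vibrational correction + RT, in kJ/mol."""
    rt = R_GAS * temperature / 1000.0
    h_tr = 1.5 * rt
    h_rot = rt if linear else 1.5 * rt
    return h_tr + h_rot + vibrational_correction(wavenumbers_cm, temperature) + rt


if __name__ == "__main__":
    water = [1595.0, 3657.0, 3756.0]  # approximate values, illustration only
    print(f"zero-point energy:  {zero_point_energy(water):.1f} kJ/mol")
    print(f"thermal correction: {thermal_correction(water):.2f} kJ/mol")
```

For a small, stiff molecule such as water the zero-point term is by far the larger of the two, while the vibrational part of the temperature correction is almost negligible at room temperature.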
Entropies and Gibbs Energies
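Calculation of the Gibbs energy (G = H – TS) requires the entropy. n is
the number of moles, M is the mass, IA, IB and IC are the moments of
inertia, νi are the vibrational frequencies and s is the symmetry number.
$$S = S_{tr} + S_{rot} + S_{vib}$$
$$S_{tr} = nR\left[\frac{3}{2} + \ln\left(\left(\frac{2\pi MkT}{h^2}\right)^{3/2}\frac{nRT}{P}\right)\right]$$
$$S_{rot} = nR\left[\frac{3}{2} + \ln\left(\frac{\sqrt{\pi}}{s\,(v_A v_B v_C)^{1/2}}\right)\right]$$
$$S_{vib} = nR\sum_{i}\left[u_i\,(e^{u_i} - 1)^{-1} - \ln\left(1 - e^{-u_i}\right)\right]$$
$$v_A = \frac{h^2}{8\pi^2 I_A kT}, \quad v_B = \frac{h^2}{8\pi^2 I_B kT}, \quad v_C = \frac{h^2}{8\pi^2 I_C kT}$$
$$u_i = \frac{h\nu_i}{kT}$$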
Be Careful!
These expressions make use of the harmonic approximation, which,
while valid for large frequencies, is not appropriate for very low
frequencies. In particular, the vibrational components of the temperature
correction to the enthalpy and of the entropy are both dominated by
low-frequency vibrations, and are subject to considerable uncertainty. In
practice, these are set to ½ RT (enthalpy) and ½ R (entropy) for each
mode with a frequency below 300 cm⁻¹.
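The low-frequency replacement can be sketched for the vibrational entropy as follows. This is an illustration under stated assumptions: frequencies are given in cm⁻¹, the entropy is per mole (n = 1), each mode below the 300 cm⁻¹ cutoff contributes ½ R, and every other mode uses the harmonic expression. The function name and the sample frequencies are hypothetical.

```python
import math

H_PLANCK = 6.62607015e-34     # J s
K_BOLTZ = 1.380649e-23        # J/K
N_AVOGADRO = 6.02214076e23    # 1/mol
C_LIGHT_CM = 2.99792458e10    # cm/s
R_GAS = K_BOLTZ * N_AVOGADRO  # J/(mol K)


def vibrational_entropy(wavenumbers_cm, temperature=298.15, cutoff_cm=300.0):
    """Molar vibrational entropy in J/(mol K).

    Modes below cutoff_cm are assigned R/2 each; the rest use the
    harmonic-oscillator expression u/(e^u - 1) - ln(1 - e^-u).
    """
    entropy = 0.0
    for w in wavenumbers_cm:
        if w < cutoff_cm:
            entropy += 0.5 * R_GAS
        else:
            u = H_PLANCK * C_LIGHT_CM * w / (K_BOLTZ * temperature)
            entropy += R_GAS * (u / math.expm1(u) - math.log1p(-math.exp(-u)))
    return entropy


if __name__ == "__main__":
    modes = [120.0, 450.0, 1600.0, 3000.0]  # hypothetical frequencies
    print(f"S_vib = {vibrational_entropy(modes):.2f} J/(mol K)")
```

Setting `cutoff_cm=0.0` recovers the purely harmonic result, which makes it easy to see how much of the total comes from the low-frequency modes.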
Reaction Types
Chemical reactions may be divided into one of three categories
depending on the extent to which overall bonding is
maintained.
Reactions that lead to a change in the total number of electron
pairs, for example, homolytic bond dissociation reactions.
H–H → H• + H•   homolytic bond dissociation
You can immediately see the problem. The energy of the products
(separated hydrogen atoms) is exact within the Hartree-Fock
approximation, while the energy of the reactant is too high. The
Hartree-Fock bond energy will be too small.
Homolytic Bond Dissociation Energies
(energies in kJ/mol)
bond dissociation reaction    HF/6-311+G**   MP2/6-311+G**   expt.
CH3–CH3 → CH3• + CH3•         276            406             406
CH3–NH2 → CH3• + NH2•         238            389             389
CH3–OH → CH3• + OH•           243            410             410
CH3–F → CH3• + F•             289            469             477
NH2–NH2 → NH2• + NH2•         138            310             305
HO–OH → OH• + OH•             -8             218             230
F–F → F• + F•                 -163           121             159
mean absolute error           168            9               –
Reaction Types
Reactions that conserve the total number of bonds and the total
number of lone pairs, for example, structural isomerization.
CH2=C=CH2 → CH3C≡CH   structural isomerization
These and related reactions are among the most common.
Energies of Structural Isomers
(energies in kJ/mol, relative to the reference isomer)
formula (reference)       isomer                   HF/6-311+G**   MP2/6-311+G**   expt.
C2H3N (acetonitrile)      methyl isocyanide        88             112             88
C2H4O (acetaldehyde)      oxirane                  134            117             113
C2H4O2 (acetic acid)      methyl formate           71             75              75
C2H6O (ethanol)           dimethyl ether           46             59              50
C3H4 (propyne)            allene                   8              21              4
                          cyclopropene             117            100             92
C3H6 (propene)            cyclopropane             42             21              29
C4H6 (1,3-butadiene)      2-butyne                 29             21              38
                          cyclobutene              63             38              46
                          bicyclo[1.1.0]butane     138            92              109
mean absolute error                                12             11              –
Reaction Types
Reactions that conserve the numbers of each kind of chemical
bond, for example, basicity comparisons.
(CH3)3NH+ + NH3 → (CH3)3N + NH4+   relative basicity
In this category are many important reactions, including those
that compare regioisomers and stereoisomers. We use basicity
comparisons to take advantage of the availability of high-quality
experimental thermochemical data (gas-phase proton affinities).
Relative Base Strengths
(energies in kJ/mol)
base                   HF/6-311+G**   MP2/6-311+G**   expt.
aniline                25             21              29
methylamine            49             46              45
aziridine              66             46              52
ethylamine             61             54              58
dimethylamine          81             75              76
pyridine               77             63              76
tert-butylamine        83             71              81
cyclohexylamine        83             75              81
azetidine              95             79              90
pyrrolidine            103            92              95
trimethylamine         102            92              95
piperidine             106            92              100
diazabicyclooctane     124            105             110
N-methylpyrrolidine    117            105             112
N-methylpiperidine     117            109             118
quinuclidine           141            121             130
mean absolute error    6              6               –
Overall …
Homolytic bond dissociation energies from Hartree-Fock
models are always significantly smaller than experimental
enthalpies. This can be traced back to the difference in the
number of electron pairs. Bond energies from MP2 models are
in good accord with experimental values.
The energies of reactions that maintain overall bond count are
reasonably well described with both Hartree-Fock and MP2
models.
Energies of reactions that maintain individual bond counts are
reasonably well described with both Hartree-Fock and MP2
models.
Conformational Energy Differences
Conformations of Acyclic Molecules
(energies in kJ/mol)
molecule               low-energy/high-energy conformer   HF/6-311+G**   MP2/6-311+G**   expt.
n-butane               trans/gauche                       4.2            2.1             2.8
1-butene               skew/cis                           2.9            2.1             0.9
1,3-butadiene          trans/gauche                       13.4           10.5            12.1
acrolein               trans/cis                          8.4            9.2             7.1
N-methylformamide      trans/cis                          4.6            5.0             5.9
N-methylacetamide      trans/cis                          12.1           10.0            9.6
formic acid            cis/trans                          22.6           19.2            16.3
methyl formate         cis/trans                          25.1           23.8            19.9
methyl acetate         cis/trans                          45.2           41.0            35.6
propanal               eclipsed/anti                      3.3            3.8             2.8
2-methylpropanal       eclipsed/anti                      1.7            1.7             3.3
ethanol                anti/gauche                        0.8            0.0             0.5
methyl ethyl ether     anti/gauche                        7.5            5.9             6.3
methyl vinyl ether     cis/skew                           7.5            10.9            7.1
mean absolute error                                       2.5            1.9             –
Conformations of Cyclic Molecules
(energies in kJ/mol)
molecule                       low-energy/high-energy conformer   HF/6-311+G**   MP2/6-311+G**   expt.
methylcyclohexane              equatorial/axial                   9.6            7.1             7.3
tert-butylcyclohexane          equatorial/axial                   25.9           21.3            22.6
cis-1,3-dimethylcyclohexane    equatorial/axial                   27.6           22.2            23.0
fluorocyclohexane              equatorial/axial                   0.4            0.4             0.7
chlorocyclohexane              equatorial/axial                   3.8            2.1             2.1
piperidine                     equatorial/axial                   3.8            3.3             2.2
N-methylpiperidine             equatorial/axial                   16.3           15.5            13.2
2-chlorotetrahydropyran        axial/equatorial                   11.3           12.1            7.5
2-methylcyclohexanone          equatorial/axial                   8.4            6.3             8.8
3-methylcyclohexanone          equatorial/axial                   7.1            6.7             5.7, 6.5
4-methylcyclohexanone          equatorial/axial                   8.8            5.4             7.3, 8.8
mean absolute error                                               2.0            2.1             –
Overall …
Both Hartree-Fock and MP2 models appear to provide a
reasonable account of conformational energy differences. The
two sets of results are not the same but, given the paucity of
experimental data and significant error bounds on some of
these data, it is difficult to make generalizations.
Density Functional Theory
While MP2 models offer only modest improvements over
Hartree-Fock models for geometries and for some types of energy
comparisons, they are required to describe the energies of
reactions where bonds are made or broken, including activation
energies (barriers to chemical reactions). MP2 models are
significantly more costly than Hartree-Fock models and are more
limited in their range of application.
Density functional models provide an alternative for estimating
the correlation energy, replacing the need to mix ground and
excited-state wavefunctions with an explicit term in the
Hamiltonian. This results in significantly lower computational cost
and consequently in a significantly larger range of application.
Formulation of Density Functional Theory
The hydrogen atom is not the only problem for which the
Schrödinger equation can be solved exactly. Another is an
electron gas, solution of which leads to a functional form for the
exchange/correlation energy, Exc, in terms of the electron density
and the gradient of the density. Exc may be combined with
Hartree-Fock terms for the kinetic energy, ET, electron-nuclear
interaction energy, EV, and Coulomb energy, EJ.
E = ET + EV + EJ + EXC
Minimizing E with respect to the unknown orbital coefficients
yields the Kohn-Sham equations, that are analogous to the
Roothaan-Hall equations in Hartree-Fock theory.
Origin of Density Functional Models
Problems with Density Functional Models
The magnitude of the error in the energy obtained from density
functional models does not scale with the size of the molecule,
that is, the models are not size consistent.
Size consistency is important, and its absence is potentially a
serious problem. Even more
important is the absence of a systematic way to improve
functionals in order to achieve an arbitrary level of accuracy.
Comparing B3LYP Density Functional
and MP2 Models
Bond Lengths in Small Molecules
(bond lengths in Å)
molecule              B3LYP/6-311+G**   MP2/6-311+G**   expt.
diborane              1.765             1.768           1.763
ethane                1.530             1.529           1.531
methylamine           1.465             1.465           1.471
methanol              1.424             1.422           1.421
methyl fluoride       1.395             1.389           1.383
methyl silane         1.886             1.877           1.867
methyl phosphine      1.873             1.856           1.862
methane thiol         1.836             1.813           1.819
methyl chloride       1.806             1.776           1.781
hydrazine             1.432             1.430           1.449
hydrogen peroxide     1.454             1.450           1.452
fluorine              1.408             1.417           1.412
disilane              2.356             2.342           2.327
hydrogen disulfide    2.114             2.083           2.055
chlorine              2.054             2.025           1.988
mean absolute error   0.018             0.010           –
Bond Lengths in Hydrocarbons
(bond lengths in Å)
bond   hydrocarbon         B3LYP/6-311+G**   MP2/6-311+G**   expt.
C–C    but-1-yne-3-ene     1.423             1.430           1.431
       propyne             1.457             1.464           1.459
       1,3-butadiene       1.456             1.460           1.483
       propene             1.500             1.502           1.501
       cyclopropane        1.509             1.511           1.510
       propane             1.532             1.529           1.526
       cyclobutane         1.554             1.550           1.548
C=C    cyclopropene        1.291             1.305           1.300
       allene              1.304             1.314           1.308
       propene             1.331             1.341           1.318
       cyclobutene         1.339             1.352           1.332
       but-1-yne-3-ene     1.338             1.347           1.341
       1,3-butadiene       1.338             1.347           1.345
       cyclopentadiene     1.348             1.359           1.345
mean absolute error        0.008             0.006           –
Homolytic Bond Dissociation Energies
(energies in kJ/mol)
bond dissociation reaction    B3LYP/6-311+G**   MP2/6-311+G**   expt.
CH3–CH3 → CH3• + CH3•         384               406             406
CH3–NH2 → CH3• + NH2•         364               389             389
CH3–OH → CH3• + OH•           389               410             410
CH3–F → CH3• + F•             460               469             477
NH2–NH2 → NH2• + NH2•         289               310             305
HO–OH → OH• + OH•             205               218             230
F–F → F• + F•                 134               121             159
mean absolute error           21                9               –
Energies of Structural Isomers
(energies in kJ/mol, relative to the reference isomer)
formula (reference)       isomer                   B3LYP/6-311+G**   MP2/6-311+G**   expt.
C2H3N (acetonitrile)      methyl isocyanide        100               112             88
C2H4O (acetaldehyde)      oxirane                  121               117             113
C2H4O2 (acetic acid)      methyl formate           67                75              75
C2H6O (ethanol)           dimethyl ether           46                59              50
C3H4 (propyne)            allene                   -8                21              4
                          cyclopropene             100               100             92
C3H6 (propene)            cyclopropane             38                21              29
C4H6 (1,3-butadiene)      2-butyne                 38                21              38
                          cyclobutene              63                38              46
                          bicyclo[1.1.0]butane     130               92              109
mean absolute error                                11                11              –
Relative Base Strengths
Example reaction: Me3NH+ + NH3 → Me3N + NH4+
(energies in kJ/mol)
base                   B3LYP/6-311+G**   MP2/6-311+G**   expt.
aniline                24                21              29
methylamine            46                46              45
aziridine              56                46              52
ethylamine             60                54              58
dimethylamine          76                75              76
pyridine               79                63              76
tert-butylamine        82                71              81
cyclohexylamine        83                75              81
azetidine              89                79              90
pyrrolidine            99                92              95
trimethylamine         94                92              95
piperidine             102               92              100
diazabicyclooctane     115               105             110
N-methylpyrrolidine    110               105             112
N-methylpiperidine     117               109             118
quinuclidine           132               121             130
mean absolute error    2                 6               –
Conformations of Acyclic Molecules
(energies in kJ/mol)
molecule               low-energy/high-energy conformer   B3LYP/6-311+G**   MP2/6-311+G**   expt.
n-butane               trans/gauche                       3.8               2.1             2.8
1-butene               skew/cis                           2.9               2.1             0.9
1,3-butadiene          trans/gauche                       14.6              10.5            12.1
acrolein               trans/cis                          9.2               9.2             7.1
N-methylformamide      trans/cis                          4.2               5.0             5.9
N-methylacetamide      trans/cis                          10.0              10.0            9.6
formic acid            cis/trans                          18.8              19.2            16.3
methyl formate         cis/trans                          22.1              23.8            19.9
methyl acetate         cis/trans                          38.0              41.0            35.6
propanal               eclipsed/anti                      3.8               3.8             2.8
2-methylpropanal       eclipsed/anti                      1.7               1.7             3.3
ethanol                anti/gauche                        0.0               0.0             0.5
methyl ethyl ether     anti/gauche                        6.3               5.9             6.3
methyl vinyl ether     cis/skew                           8.4               10.9            7.1
mean absolute error                                       1.4               1.9             –
Conformations of Cyclic Molecules
(energies in kJ/mol)
molecule                       low-energy/high-energy conformer   B3LYP/6-311+G**   MP2/6-311+G**   expt.
methylcyclohexane              equatorial/axial                   10.0              7.1             7.3
tert-butylcyclohexane          equatorial/axial                   22.2              21.3            22.6
cis-1,3-dimethylcyclohexane    equatorial/axial                   25.1              22.2            23.0
fluorocyclohexane              equatorial/axial                   0.8               0.4             0.7
chlorocyclohexane              equatorial/axial                   2.9               2.1             2.1
piperidine                     equatorial/axial                   2.9               3.3             2.2
N-methylpiperidine             equatorial/axial                   16.7              15.5            13.2
2-chlorotetrahydropyran        axial/equatorial                   15.5              12.1            7.5
2-methylcyclohexanone          equatorial/axial                   8.4               6.3             8.8
3-methylcyclohexanone          equatorial/axial                   6.7               6.7             5.7, 6.5
4-methylcyclohexanone          equatorial/axial                   8.4               5.4             7.3, 8.8
mean absolute error                                               2.9               2.1             –
Performance of B3LYP and MP2 Models
Geometries obtained from B3LYP and MP2 models are very
similar and generally improved over geometries obtained from
Hartree-Fock models. There are exceptions. For example, the
geometries of molecules incorporating second-row elements are
better described with Hartree-Fock models.
Bond dissociation energies from the B3LYP model are slightly
inferior to those from the MP2 model, although both are greatly
improved relative to Hartree-Fock models. B3LYP and MP2
models provide comparable results for reactions where bonding
is partially or fully maintained.
Conformational energy differences obtained from B3LYP and
MP2 models are comparable in quality.
Practical Models …. LCAO Approximation
The third approximation connecting the Schrödinger equation to
Hartree-Fock and post-Hartree-Fock models is the LCAO
approximation, which writes the molecular orbitals ψ as linear
combinations of a finite set (a basis set) of prescribed functions
φ (basis functions).
$$\psi_i = \sum_{\mu}^{\text{basis functions}} c_{\mu i}\,\phi_\mu$$
c are the (unknown) molecular orbital coefficients. There are two
issues: the kind of functions and the number of functions.
Gaussian Basis Sets
The obvious choice for the individual functions is a polynomial
in the Cartesian coordinates times an exponential function, that
is, the form of the exact solutions for the hydrogen atom.
However, exponential functions give rise to expressions that are
difficult to solve analytically. Gaussian functions (an
exponential in the square of the distance from the origin rather
than the distance itself), lead to simpler mathematics and are
used instead.
All practical quantum chemical calculations now use Gaussian
functions, although in the past there was interest in “thinking”
about using exponential functions.
Minimal Basis Sets … STO-3G
A minimal basis set comprises the smallest set of functions
required to accommodate all of the atom’s electrons, and still
maintain its overall spherical symmetry. This is a single (1s)
function for hydrogen and helium, a set of five functions (1s, 2s,
2px, 2py, 2pz) for lithium to neon and a set of nine functions (1s,
2s, 2px, 2py, 2pz, 3s, 3px, 3py, 3pz) for sodium to argon.
Each of the functions in the STO-3G minimal basis set is
expanded in terms of three Gaussian functions. Gaussian
exponents and linear coefficients have been determined by least
squares as best fits to so-called Slater-type (exponential)
functions, that is, hydrogen atom solutions. This is a “have your cake
and eat it too” approach.
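How well three Gaussians can mimic an exponential function can be checked numerically. The sketch below compares a 1s Slater-type function (exponent 1.0) with its three-Gaussian expansion and evaluates their overlap by simple numerical integration. The exponents and coefficients are the commonly tabulated STO-3G 1s values for a Slater exponent of 1.0, quoted here from memory as an assumption; the function names and the integration grid are introduced for illustration only.

```python
import math

# Commonly tabulated STO-3G 1s parameters for a Slater exponent of 1.0
# (assumed values, not taken from the course material).
EXPONENTS = (0.109818, 0.405771, 2.22766)
COEFFICIENTS = (0.444635, 0.535328, 0.154329)


def slater_1s(r, zeta=1.0):
    """Normalized 1s Slater-type (exponential) function."""
    return math.sqrt(zeta ** 3 / math.pi) * math.exp(-zeta * r)


def sto3g_1s(r):
    """Sum of three normalized s-type Gaussians with fixed coefficients."""
    return sum(d * (2.0 * a / math.pi) ** 0.75 * math.exp(-a * r * r)
               for a, d in zip(EXPONENTS, COEFFICIENTS))


def radial_overlap(f, g, r_max=30.0, steps=30000):
    """Integrate 4*pi*r^2 f(r) g(r) from 0 to r_max (trapezoid rule).

    The integrand vanishes at r = 0 and is negligible at r_max,
    so the two end points are omitted.
    """
    h = r_max / steps
    total = 0.0
    for i in range(1, steps):
        r = i * h
        total += 4.0 * math.pi * r * r * f(r) * g(r)
    return total * h


if __name__ == "__main__":
    print(f"overlap of Slater 1s with STO-3G 1s: "
          f"{radial_overlap(slater_1s, sto3g_1s):.4f}")
    for r in (0.0, 0.5, 1.0, 2.0, 4.0):
        print(f"r = {r:3.1f}  Slater {slater_1s(r):.4f}  STO-3G {sto3g_1s(r):.4f}")
```

The point-by-point comparison shows where the fit is weakest: a Gaussian expansion has zero slope at the nucleus, whereas the exponential function has a cusp there.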
Shortcomings of Minimal Basis Sets
Minimal basis sets suffer from two shortcomings. The first is
that the basis functions are either themselves spherical or come
in sets that taken together describe a sphere. This implies that
atoms with spherical or nearly spherical molecular environments
will be better described than atoms with aspherical
environments.
Split-Valence Basis Sets … 3-21G
A split-valence basis set provides “inner” and “outer” sets of
valence basis functions that may be combined to account for
different environments. For example, a p orbital from the inner
set may be emphasized to construct a σ bond, while a p orbital from
the outer set may be emphasized to construct a π bond.
[Figure: pσ and pπ orbitals, each formed as a combination of inner and outer p functions with different weights]
The 3-21G basis set uses three Gaussians for each of the core
orbitals and two and one Gaussians for each of the orbitals in the
inner and outer valence sets.
Basis Set Nomenclature
Basis sets may be thought of as divided into core and valence
regions. “Chemistry” is a function of the valence, and to the
maximum extent possible this is where emphasis (functions)
need to be placed. Typically each core atomic orbital is
represented by a single set of functions and each valence
atomic orbital by two (or more) sets of functions. The number
in the basis set designation to the left of the “–” indicates the
number of Gaussian functions used to construct core atomic
orbitals, and the numbers to the right indicate the numbers of
Gaussians used to construct valence atomic orbitals. For
example, 3-21G uses three Gaussians to describe each of the
core orbitals and two and one Gaussians to describe each of
the orbitals in the inner and outer valence sets.
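The bookkeeping behind minimal and split-valence basis sets can be sketched by counting basis functions per atom. The example assumes the counts given earlier for a minimal basis (1 function for hydrogen and helium, 5 for lithium to neon, 9 for sodium to argon) and one extra copy of the valence functions for a split-valence set such as 3-21G. The table `CORE_VALENCE`, the function name and the sample molecules are introduced here for illustration.

```python
# (core orbitals, valence orbitals) per atom, by period of the periodic table.
CORE_VALENCE = {
    1: (0, 1),  # H, He: 1s is the valence shell
    2: (1, 4),  # Li-Ne: 1s core; 2s, 2px, 2py, 2pz valence
    3: (5, 4),  # Na-Ar: 1s, 2s, 2p core; 3s, 3px, 3py, 3pz valence
}

PERIOD = {}
for symbol in ("H", "He"):
    PERIOD[symbol] = 1
for symbol in ("Li", "Be", "B", "C", "N", "O", "F", "Ne"):
    PERIOD[symbol] = 2
for symbol in ("Na", "Mg", "Al", "Si", "P", "S", "Cl", "Ar"):
    PERIOD[symbol] = 3


def count_basis_functions(atoms, valence_sets=1):
    """Total basis functions for a list of element symbols.

    valence_sets=1 corresponds to a minimal basis (e.g. STO-3G);
    valence_sets=2 to a split-valence basis (e.g. 3-21G).
    """
    total = 0
    for symbol in atoms:
        core, valence = CORE_VALENCE[PERIOD[symbol]]
        total += core + valence_sets * valence
    return total


if __name__ == "__main__":
    water = ["O", "H", "H"]
    methyl_chloride = ["C", "H", "H", "H", "Cl"]
    for name, atoms in (("water", water), ("methyl chloride", methyl_chloride)):
        print(name,
              "minimal:", count_basis_functions(atoms, 1),
              "split-valence:", count_basis_functions(atoms, 2))
```

Polarization and diffuse functions are deliberately left out of this count; they add further functions per atom on top of the split-valence total.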
Shortcomings of Minimal Basis Sets
The second shortcoming of a minimal basis set is that the basis
functions are atom centered. This restricts their flexibility to
describe off-center electron distributions.
The “obvious” solution is to move the functions away from the
nuclei. This is not a viable option, as it raises the question of “where
to put the functions”.
Polarization Basis Sets 6-31G* and 6-31G**
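A polarization basis set provides d-type functions on main-group
elements (*) and (optionally) p-type functions on hydrogen
(**) to allow displacement of electron distributions from the
nuclei. These are available to combine (hybridize) with the
valence orbitals.
[Figure: a p function combined with a d function, and an s function combined with a p function, each giving a hybrid displaced from the nucleus]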
6-31G* and 6-31G** basis sets are examples of the two types of
polarization basis sets. Here, each polarization function is made
up of a single Gaussian function.
Larger Basis Sets … 6-311+G**
Larger basis sets have been formulated to provide more
extensive splitting of the valence shell and to incorporate
functions that extend far from the nuclei, so-called diffuse or
“+” functions. The latter may be needed for calculations on
anions and on excited states, where electrons may drift far from
nuclear centers. The 6-311+G** basis set splits the valence into
three parts, adds polarization functions to all atoms and adds a
diffuse function (made up of a single Gaussian function) to each
non-hydrogen atom.
Theoretical Models
Specification of approximations to the Schrödinger equation and
a basis set leads to a Theoretical Model. Most important, a
theoretical model needs to be well defined, meaning that it
depends only on the locations and identities of the nuclei, on the
total number of electrons and the number of unpaired electrons,
and it needs to be practical for molecules and problems of
interest. Only slightly less important, it should be unbiased,
meaning that little or no “chemical intuition” is used in its
formulation, and it should be size consistent, meaning that the
error in the energy scales with the size of the molecule. If
possible, a theoretical model should be variational, meaning that
the energy is higher than the energy from exact solution of the
Schrödinger equation.
Properties of Theoretical Models
Hartree-Fock models are well defined, practical for molecules
with up to 50 heavy (non-hydrogen) atoms, unbiased, size
consistent and variational.
MP2 models (and all MPn models) are well defined, applicable to
molecules with fewer than 20 heavy atoms, unbiased and size
consistent, but they are not variational.
Density functional models are applicable to molecules with fewer
than 50 heavy atoms and are unbiased but are neither size
consistent nor variational. While density functional models are
well defined given the form of the term added to the Hartree-Fock
Hamiltonian, there is no obvious way to “improve” on this term.
Simplifying Hartree-Fock Models
Are Simpler Models Justified?
Is there a need for quantum chemical models that are simpler
(“less costly”) than any that have been presented thus far? Certainly
much less than there was only a decade ago and much more than
there will be in another decade. Present-generation personal
computers can easily perform Hartree-Fock and density
functional calculations on molecules comprising 50 heavy atoms and more
(getting close to the maximum size of molecules that can actually
be made). Missing are quantum chemical models that are
applicable to biopolymers (proteins), with thousands to tens of
thousands of atoms. While these can already be handled using
molecular mechanics, methods based on quantum mechanics
might be of interest.
Semi-Empirical Models
The NDDO Approximation
Semi-empirical models follow from the Hartree-Fock model by
introducing a rather draconian approximation. Known as the NDDO
(Neglect of Diatomic Differential Overlap) approximation, this
insists that atomic orbitals residing on different atomic centers
do not overlap (“see each other”). This reduces the size
dependence of the limiting step in Hartree-Fock models from
O(N^4) to O(N^2), where N is the total number of functions in the
atomic basis set. A minor step in Hartree-Fock models (matrix
diagonalization) is O(N^3) and dominates the computation for
semi-empirical models, limiting practical calculations to 200-300
atoms at most. Proteins are not in the cards.
Form and Parameters for Semi-Empirical
Models
The functional form of all present generation semi-empirical
models is nearly the same. What differs are the values of atomic
parameters introduced in order to overcome the “damage”
incurred by the NDDO approximation. In practice, upwards of
20 parameters per atom are used in an attempt to reproduce
geometries, heats of formation, dipole moments and ionization
potentials. Experience suggests that this is too ambitious a goal.
Basis Sets for Semi-Empirical Models
Semi-empirical models use a minimal valence basis set of
exponential functions. Hydrogen is represented by a single (1s)
function. Main-group elements are represented by a single s-type
function and a set of three p-type functions.
first-row element:    2s, 2px, 2py, 2pz
second-row element:   3s, 3px, 3py, 3pz
Transition metals are represented by a set of five d-type functions,
one s-type function and a set of three p-type functions.
first-row:    3dxx-yy, 3dzz, 3dxy, 3dxz, 3dyz, 4s, 4px, 4py, 4pz
second-row:   4dxx-yy, 4dzz, 4dxy, 4dxz, 4dyz, 5s, 5px, 5py, 5pz
Origin of Semi-Empirical Models
Choosing a Theoretical Model
The choice of theoretical model ultimately rests with a balance
of how well it performs for the quantity of interest (geometry,
reaction energy or conformation as already discussed, or other
chemically important quantities such as activation energies and
spectra) and how easily (if at all) it can be applied to the
molecules of interest.
Theoretical Model Chemistry
A theoretical model gives rise to a “chemistry” (a Theoretical
Model Chemistry), that is, a set of results realized from its
application. This chemistry is distinct from that provided by any
other theoretical model, and from experiment. As the severity of
approximations used to construct the model is reduced, results
should approach experimental results.
Performance of
Practical Theoretical Models
Practical Theoretical Models
Among the quantum chemical models that have proven to be
reliable and practical for routine use are Hartree-Fock models
with 3-21G and 6-31G* basis sets, the B3LYP/6-31G* density
functional model and the MP2/6-31G* model. Less reliable but
more widely applicable is the PM3 semi-empirical model.
The utility of these five models rests on their ability to provide
accurate molecular properties.
We start with equilibrium geometries, reaction energies and
conformational energy differences.
Geometries of Small Molecules
(bond lengths in Å)
molecule              HF/3-21G   HF/6-31G*   B3LYP/6-31G*   MP2/6-31G*   PM3     expt.
diborane              1.786      1.778       1.769          1.754        1.773   1.763
ethane                1.542      1.527       1.531          1.527        1.504   1.531
methylamine           1.471      1.453       1.465          1.465        1.469   1.471
methanol              1.441      1.400       1.419          1.424        1.395   1.421
methyl fluoride       1.404      1.365       1.383          1.392        1.351   1.383
methyl silane         1.883      1.888       1.889          1.884        1.863   1.867
methyl phosphine      1.855      1.861       1.876          1.860        1.866   1.862
methane thiol         1.823      1.817       1.836          1.817        1.801   1.819
methyl chloride       1.806      1.785       1.804          1.778        1.764   1.781
hydrazine             1.451      1.413       1.437          1.439        1.440   1.449
hydrogen peroxide     1.473      1.393       1.456          1.467        1.482   1.452
fluorine              1.402      1.345       1.403          1.421        1.350   1.412
disilane              2.342      2.353       2.351          2.338        2.396   2.327
hydrogen disulfide    2.057      2.064       2.098          2.069        2.034   2.055
chlorine              1.996      1.990       2.042          2.015        2.035   1.988
mean absolute error   0.012      0.020       0.016          0.009        0.025   –
Geometries of Hydrocarbons
bond  hydrocarbon          HF/3-21G  HF/6-31G*  B3LYP/6-31G*  MP2/6-31G*  PM3    expt.
C-C   but-1-yne-3-ene      1.432     1.439      1.424         1.429       1.414  1.431
      propyne              1.466     1.468      1.461         1.463       1.433  1.459
      1,3-butadiene        1.479     1.467      1.458         1.458       1.456  1.483
      propene              1.510     1.503      1.502         1.499       1.480  1.501
      cyclopropane         1.513     1.497      1.509         1.504       1.499  1.510
      propane              1.541     1.528      1.532         1.526       1.512  1.526
      cyclobutane          1.543     1.548      1.553         1.545       1.542  1.548
C=C   cyclopropene         1.282     1.276      1.295         1.303       1.314  1.300
      allene               1.292     1.296      1.307         1.313       1.297  1.308
      propene              1.316     1.318      1.333         1.338       1.328  1.318
      cyclobutene          1.326     1.322      1.341         1.347       1.349  1.332
      but-1-yne-3-ene      1.320     1.322      1.341         1.344       1.332  1.341
      1,3-butadiene        1.320     1.323      1.340         1.344       1.331  1.345
      cyclopentadiene      1.329     1.329      1.349         1.354       1.352  1.345
mean absolute error        0.011     0.011      0.006         0.007       0.015  –
[18] Annulene
There are cases where HF/6-31G* and B3LYP/6-31G* models
disagree on structure. For example, the Hartree-Fock model finds
a geometry for [18] annulene with alternating single and double
bonds, while the density functional model shows that carbon-carbon
bond lengths vary only slightly. The latter is in much
better accord with the experimental (X-ray) structure, which
shows that the bonds vary only slightly, from 1.38 to 1.41 Å.
Geometries of Organometallics
organometallic          bond  B3LYP/6-31G*  PM3   expt.
CO3Cr (benzene)               1.85          1.90  1.84
CO4Cr (Dewar benzene)   ax    1.90          1.96  1.86
                        eq    1.86          1.92  1.83
CO5Cr=C(Me)NH(Me)       ax    1.89          1.91  1.86-1.88
                        eq    1.90          1.91  1.88-1.91
CO3Fe (cyclobutadiene)        1.78          1.74  1.79
CO3Fe (butadiene)             1.78          1.75  1.76
CO4Fe (acetylene)       ax    1.85          1.82  1.77
                        eq    1.79          1.75  1.76
CO4Fe (ethylene)        ax    1.81          1.81  1.78
                        eq    1.79          1.75  1.81
CO3Co (allyl)                 1.78          1.81  1.77
mean absolute error           0.02          0.04  –
Dichloro[ethane-1,2-diylbis(cyclopentadienyl)]zirconium
Dichloro[ethane-1,2-diylbis(cyclopentadienyl)]zirconium acts as a
catalyst in homogeneous olefin polymerization.
The PM3 equilibrium geometry of this complex is nearly identical
to the X-ray structure in the Cambridge Structural Database.
Overall …
All five models generally provide a solid account of equilibrium
geometries. The B3LYP/6-31G* and MP2/6-31G* models are best
and the PM3 model is worst, but the differences are not large.
Hartree-Fock models and the MP2/6-31G* model provide a poor
account of the geometries of transition-metal inorganic and
organometallic compounds. The B3LYP/6-31G* model and, to a
lesser extent, the PM3 model provide reasonable equilibrium
geometries.
Bond Dissociation Energies
bond dissociation reaction  HF/3-21G  HF/6-31G*  B3LYP/6-31G*  MP2/6-31G*  PM3  expt.
CH3-CH3 → CH3 + CH3         285       293        406           414         310  406
CH3-NH2 → CH3 + NH2         247       243        372           385         285  389
CH3-OH → CH3 + OH           222       247        402           410         347  410
CH3-F → CH3 + F             247       289        473           473         423  477
NH2-NH2 → NH2 + NH2         155       142        293           305         209  305
HO-OH → OH + OH             13        0          226           230         192  230
F-F → F + F                 -121      -138       176           159         247  159
mean absolute error         190       171        9             2           77   –
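The "mean absolute error" row in these tables is an average of unsigned deviations from experiment. As a minimal sketch (the function name is illustrative and not part of any program mentioned in the course), the B3LYP/6-31G* entry for the bond dissociation energies above can be checked as follows:

```python
def mean_absolute_error(calculated, experimental):
    """Average of |calculated - experimental| over paired values."""
    if len(calculated) != len(experimental):
        raise ValueError("columns must have the same length")
    total = sum(abs(c - e) for c, e in zip(calculated, experimental))
    return total / len(calculated)


# B3LYP/6-31G* and experimental bond dissociation energies, copied from
# the table above in the order in which the reactions are listed.
b3lyp = [406, 372, 402, 473, 293, 226, 176]
expt = [406, 389, 410, 477, 305, 230, 159]

mae_b3lyp = mean_absolute_error(b3lyp, expt)
print(round(mae_b3lyp))
```

The same function applies to any other column of any of the tables in this section.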
Energies of Structural Isomers
formula (reference)   isomer                HF/3-21G  HF/6-31G*  B3LYP/6-31G*  MP2/6-31G*  PM3  expt.
C2H3N (acetonitrile)  methyl isocyanide     88        100        113           121         130  88
C2H4O (acetaldehyde)  oxirane               142       130        117           112         151  113
C2H4O2 (acetic acid)  methyl formate        54        54         50            59          63   75
C2H6O (ethanol)       dimethyl ether        25        29         21            38          38   50
C3H4 (propyne)        allene                13        8          -8            21          29   4
                      cyclopropene          167       109        92            96          117  92
C3H6 (propene)        cyclopropane          59        33         33            17          42   29
C4H6 (1,3-butadiene)  2-butyne              17        29         33            17          -4   38
                      cyclobutene           75        54         50            33          29   46
                      bicyclo[1.1.0]butane  192       126        117           88          160  109
mean absolute error                         32        13         11            15          29   –
Nitrogen Base Strengths
base                 HF/3-21G  HF/6-31G*  B3LYP/6-31G*  MP2/6-31G*  PM3  expt.
aniline              4         29         21            29          3    29
methylamine          42        46         42            42          -8   45
aziridine            67        59         46            38          -25  52
ethylamine           54        59         54            50          0    58
dimethylamine        71        75         67            67          -17  76
pyridine             59        75         67            54          0    76
tert-butylamine      75        79         79            71          25   81
cyclohexylamine      75        84         79            75          21   81
azetidine            92        92         79            75          0    90
pyrrolidine          96        96         92            88          0    95
trimethylamine       88        92         79            79          -25  95
piperidine           92        100        92            88          13   100
diazabicyclooctane   109       117        100           96          -21  110
N-methylpyrrolidine  105       109        100           96          -8   112
N-methylpiperidine   109       117        105           100         4    118
quinuclidine         121       130        117           113         4    130
mean absolute error  8         2          8             12          86   –
Overall …
Only B3LYP/6-31G* and MP2/6-31G* models properly account
for homolytic bond dissociation energies. Except for PM3, all
models provide acceptable descriptions of the energetics of
reactions in which total electron-pair count is maintained and good
descriptions of reactions that maintain individual bond counts. The
PM3 model appears to be unreliable for all reaction energy
calculations.
Aside … Heats of Formation
Were calculations able to reliably provide accurate heats of
formation, we would not have to worry about carefully choosing
reactions. There have been numerous attempts, the simplest of
which (known as the G3(MP2) recipe) is able to reproduce
experimental heats to within 8 kJ/mol (mean absolute error).
However, G3(MP2) requires an MP2/6-31G* geometry, an
HF/6-31G* frequency, a very large basis set MP2 energy
calculation and a QCISD(T)/6-31G* energy calculation. The
last of these is most problematic as it scales with the seventh
power of the size. In practice, G3(MP2) is applicable only to
very small molecules (less than 10 heavy atoms).
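To see why seventh-power scaling is so restrictive, it helps to compare how the cost grows when the size of the molecule is doubled. The short sketch below is simple arithmetic only; the function name is illustrative, and "size" is treated as a single number as in the text.

```python
def relative_cost(size_ratio, power):
    """Factor by which the cost grows when the size grows by size_ratio,
    for a step whose cost scales as size**power."""
    return size_ratio ** power


# Doubling the size of the molecule:
seventh_power_step = relative_cost(2, 7)   # a step scaling as size^7
cubic_step = relative_cost(2, 3)           # for comparison, a step scaling as size^3

print(seventh_power_step, cubic_step)
```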
The T1 Recipe
The T1 recipe has recently been formulated with the objective of
closely reproducing G3(MP2) heats of formation while requiring two
to three orders of magnitude less computation. It replaces the
MP2/6-31G* geometry with an HF/6-31G* geometry, eliminates
both the HF/6-31G* frequency calculation and most importantly
the QCISD(T)/6-31G* energy calculation. It substitutes the large
basis set MP2 energy calculation with an RI-MP2 calculation
using a dual basis set. Parameters based on bond orders are
introduced.
T1 is easily applicable to molecules with 25-30 heavy atoms. At
present, it is restricted to uncharged, closed-shell molecules
comprising H, C, N, O, Si, P, S, F, Cl and Br only.
T1 vs. G3(MP2)
Heats of formation obtained from the T1 and G3(MP2) recipes
differ by less than 2 kJ/mol (mean absolute error). In effect the
two procedures yield identical heats.
T1 vs. Experimental Heats of Formation
The mean absolute error between T1 and experimental heats of
formation for the molecules in the NIST database is <9 kJ/mol.
T1 Reaction Energies
T1 reproduces experimental reaction energies better than any
other practical theoretical model surveyed, and typically leads to
mean absolute errors that are <4 kJ/mol (comparable to errors in
the experimental data). It may offer a viable alternative to
experimental data which are both limited (combustion requires
large amounts of material which is in turn destroyed), and prone
to error.
Conformational Energy Differences
How Well do Quantum Chemical Models
Reproduce Conformational Differences?
Experimental data with which to assess the theoretical models are
limited to systems with a single degree of freedom. Because PM3 and
HF/3-21G models yield poor results, comparisons are restricted to
HF/6-31G*, B3LYP/6-31G* and MP2/6-31G* models.
Experimental data on conformer energy differences derive
primarily from equilibrium measurements, and may become less
and less accurate as the conformer energy differences increase.
Conformations of Acyclic Molecules
molecule            low-energy/high-energy conformer  HF/6-31G*  B3LYP/6-31G*  MP2/6-31G*  expt.
n-butane            trans/gauche                      4.2        3.3           2.9         2.8
1-butene            skew/cis                          2.9        1.7           2.1         0.9
1,3-butadiene       trans/gauche                      13.0       15.1          10.9        12.1
acrolein            trans/cis                         7.1        7.1           6.3         7.1
N-methylformamide   trans/cis                         4.6        2.9           4.2         5.9
N-methylacetamide   trans/cis                         12.6       11.3          11.7        9.6
formic acid         cis/trans                         25.5       21.8          25.5        16.3
methyl formate      cis/trans                         25.9       22.1          25.9        19.9
methyl acetate      cis/trans                         39.3       35.6          46.0        35.6
propanal            eclipsed/anti                     4.6        5.0           4.6         2.8
2-methylpropanal    eclipsed/anti                     3.3        3.3           4.2         3.3
ethanol             anti/gauche                       0.4        -1.3          0.4         0.5
methyl ethyl ether  anti/gauche                       7.1        5.9           7.1         6.3
methyl vinyl ether  cis/skew                          8.4        9.6           7.9         7.1
mean absolute error: 2.3  1.7  2.7  1.9  –
Conformations of Cyclic Molecules
molecule                     low-energy/high-energy conformer  HF/6-31G*  B3LYP/6-31G*  MP2/6-31G*  expt.
methylcyclohexane            equatorial/axial                  9.6        8.8           7.9         7.3
tert-butylcyclohexane        equatorial/axial                  25.5       22.2          23.4        22.6
cis-1,3-dimethylcyclohexane  equatorial/axial                  27.2       25.1          23.8        23.0
fluorocyclohexane            equatorial/axial                  -1.3       -0.8          -2.9        0.7
chlorocyclohexane            equatorial/axial                  4.2        3.8           2.9         2.1
piperidine                   equatorial/axial                  3.3        1.3           2.5         2.2
N-methylpiperidine           equatorial/axial                  15.1       14.2          15.1        13.2
2-chlorotetrahydropyran      axial/equatorial                  10.5       15.5          11.7        7.5
2-methylcyclohexanone        equatorial/axial                  9.6        10.0          9.2         8.8
3-methylcyclohexanone        equatorial/axial                  7.1        6.7           3.8         5.7,6.5
4-methylcyclohexanone        equatorial/axial                  8.8        8.4           6.3         7.3,8.8
mean absolute error: 2.1  2.1  1.7  2.1  –
T1 Conformer Energy Differences
As measured by mean absolute error (kJ/mol), T1 is the most
successful practical model surveyed with regard to reproducing
conformer energy differences.
                  acyclic molecules  cyclic molecules
T1                1.1                1.1
HF/6-31G*         2.3                2.1
B3LYP/6-31G*      1.7                2.1
B3LYP/6-311+G**   1.4                2.9
MP2/6-31G*        2.7                1.7
MP2/6-311+G**     1.9                2.1
MMFF              1.1                3.3
Standards for Conformational Analysis
The lack of reliable experimental data for any but very simple
molecules means that it is necessary to use calculated conformer
energy differences as a standard with which to judge the
performance of practical theoretical models. The standard needs
to accurately reproduce existing experimental data and be
applicable to more complex molecules.
The B3LYP/6-311+G**//6-31G* model (B3LYP/6-311+G**
energy based on HF/6-31G* geometry) is used here as the
standard. In the future, this standard will probably be replaced
by T1.
2-Benzylamino-1-propanol
4,6-Dimethyl-1-phenyl-5-hepten-3-one
Performance of Practical Theoretical
Models for Conformational Analysis
From comparisons with B3LYP/6-311+G** results:
Both HF/6-31G* and B3LYP/6-31G* models properly identify
the lowest-energy conformer (or suggest a very similar “best”
conformer), and provide a reasonable account of conformer
energy differences.
HF/3-21G and PM3 models commonly fail to properly assign
the lowest-energy conformer, and neither provides a good
account of conformer energy differences.
What Does it Cost to Fit a Molecule into a
Protein?
The anticancer drug Gleevec has been crystallized inside a
protein, and this complex is available in the PDB database
(accessible from Spartan).
According to the Hartree-Fock 6-31G* model, the protein-bound
conformer is 9 kJ/mol less stable than the lowest-energy
conformer for the isolated molecule.
Molecular Charge Distributions
Dipole Moments
In addition to geometry (“sterics”), organic chemists often refer to
charge-charge interactions (“electrostatics”) to judge whether a
molecule is likely to be favorable. The dipole moment reflects
opposing contributions of positively-charged nuclei and
negatively-charged electrons and accounts for overall polarity.
μ (debyes) = 2.5416 [ ΣA ZA rA – Σμ Σν Pμν rμν ]
Summations are over atoms (A) and pairs of atomic basis functions
(φμ φν). ZA is the atomic number of atom A, rA is the position of atom
A (relative to an arbitrary origin), Pμν is an element of the density
matrix and rμν is an integral.
rμν = ∫ φμ r φν dτ
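The dipole moment expression can be sketched in a few lines of code. This is an illustration only, assuming positions and integrals in atomic units and that the density matrix and the dipole integrals have already been obtained from a quantum chemical calculation; the function and argument names are hypothetical.

```python
DEBYE_PER_ATOMIC_UNIT = 2.5416  # conversion factor quoted in the text


def dipole_debyes(atomic_numbers, positions, density, dipole_integrals):
    """Nuclear term minus electronic term, converted to debyes.

    atomic_numbers   : Z_A for each atom
    positions        : (x, y, z) of each atom, in atomic units (assumed)
    density          : density matrix elements, density[m][n]
    dipole_integrals : (x, y, z) integral for each pair of basis functions,
                       dipole_integrals[m][n]
    """
    dipole = [0.0, 0.0, 0.0]
    # Positively charged nuclei.
    for z, r in zip(atomic_numbers, positions):
        for k in range(3):
            dipole[k] += z * r[k]
    # Negatively charged electrons, summed over pairs of basis functions.
    n_basis = len(density)
    for m in range(n_basis):
        for n in range(n_basis):
            for k in range(3):
                dipole[k] -= density[m][n] * dipole_integrals[m][n][k]
    return [DEBYE_PER_ATOMIC_UNIT * component for component in dipole]


# Toy input (hypothetical): one unit positive charge displaced by one
# atomic unit along z from one electron in a single basis function whose
# dipole integral about the origin is zero.
toy_dipole = dipole_debyes([1], [(0.0, 0.0, 1.0)], [[1.0]], [[(0.0, 0.0, 0.0)]])
print(toy_dipole)
```

For a neutral molecule the result does not depend on the choice of origin, which is why the origin can be arbitrary.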
Dipole Moments for Small Molecules
molecule             HF/3-21G  HF/6-31G*  B3LYP/6-31G*  MP2/6-31G*  PM3  expt.
CO                   0.4       0.3        0.1           0.2         0.2  0.11
PH3                  0.9       0.9        1.0           1.0         1.2  0.58
H2S                  1.4       1.4        1.4           1.5         1.8  0.97
HCl                  1.5       1.5        1.5           1.5         1.4  1.08
NH3                  1.8       1.9        1.9           2.0         1.6  1.47
HF                   2.2       2.0        1.9           1.9         1.4  1.82
H2O                  2.4       2.2        2.1           2.2         1.7  1.85
CH3F                 2.3       2.0        1.7           1.9         1.4  1.85
CH3Cl                2.3       2.3        2.1           2.0         1.4  1.87
CS                   1.4       1.3        1.5           2.0         1.4  1.98
H2CO                 2.7       2.7        2.2           2.3         2.2  2.34
HCN                  3.0       3.2        2.9           3.0         2.7  2.99
LiH                  6.0       6.0        5.6           5.8         5.7  5.83
LiF                  5.8       6.2        5.6           5.9         5.3  6.28
LiCl                 7.8       7.7        7.1           7.3         6.5  7.12
mean absolute error  0.4       0.3        0.4           0.2         0.4  –
Dipole Moments for Hydrocarbons
formula  hydrocarbon            HF/3-21G  HF/6-31G*  B3LYP/6-31G*  MP2/6-31G*  PM3  expt.
C3H4     propyne                0.7       0.6        0.7           0.6         0.4  0.75
         cyclopropene           0.5       0.6        0.5           0.5         0.4  0.45
C3H6     propene                0.3       0.3        0.4           0.3         0.2  0.36
C4H6     cyclobutene            0.1       0.0        0.1           0.1         0.2  0.13
         1,2-butadiene          0.4       0.4        0.4           0.3         0.2  0.40
         1-butyne               0.7       0.7        0.7           0.6         0.3  0.80
         methylenecyclopropane  0.3       0.4        0.4           0.3         0.2  0.40
         bicyclo[1.1.0]butane   0.8       0.7        0.8           0.8         0.4  0.68
         1-methylcyclopropene   0.9       0.9        0.9           0.8         0.6  0.80
C4H8     isobutene              0.5       0.5        0.5           0.4         0.4  0.50
         cis-2-butene           0.1       0.1        0.2           0.2         0.3  0.26
         cis-1-butene           0.4       0.4        0.4           0.3         0.2  0.44
         methylcyclopropane     0.1       0.1        0.1           0.1         0.1  0.14
C5H6     cyclopentadiene        0.4       0.3        0.4           0.4         0.5  0.42
mean absolute error             0.1       0.1        0.0           0.1         0.2  –
Atomic Charges
The “charge” on an atom in a molecule (the sum of its nuclear
charge and the charges of “associated” electrons) may neither be
measured nor calculated unambiguously. The problem is how to
associate electrons with a particular atom. Consider an electron
density surface for hydrogen fluoride that encloses a large
fraction of the electrons.
While it shows that a large fraction of the electrons “belong” to
fluorine, there is no “correct” way to actually know this fraction.
Atomic Charges
from Fits to Electrostatic Potentials
Define atomic charges such that they reproduce as closely as
possible the electrostatic potential:
i) Select points located outside the van der Waals surface.
The number and location of the points are not unique,
meaning that the resulting charges are not unique.
ii) Calculate the electrostatic potential at these points.
iii) Fit the points to a potential in which atomic charges have
replaced nuclei and electrons, subject to the requirement
that the charges sum to the total charge on the molecule.
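The three steps above amount to a least-squares fit with one constraint. The sketch below is one way to set this up, not the procedure used by any particular program: it assumes atomic units, uses a Lagrange multiplier to enforce the total charge, and solves the resulting linear equations by Gaussian elimination. All names, coordinates and charges are hypothetical, and the "reference" potential is generated from known point charges so that the fit can be checked.

```python
import math


def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - known) / aug[row][row]
    return x


def fit_esp_charges(atom_positions, grid_points, potentials, total_charge=0.0):
    """Atomic charges that best reproduce the potential at the grid points,
    constrained to sum to total_charge."""
    n = len(atom_positions)
    # inv_dist[p][a] = 1 / (distance from grid point p to atom a)
    inv_dist = [[1.0 / math.dist(p, r) for r in atom_positions] for p in grid_points]
    matrix = [[0.0] * (n + 1) for _ in range(n + 1)]
    rhs = [0.0] * (n + 1)
    for i in range(n):
        for j in range(n):
            matrix[i][j] = sum(row[i] * row[j] for row in inv_dist)
        matrix[i][n] = 1.0   # Lagrange multiplier column
        matrix[n][i] = 1.0   # constraint row: charges sum to total_charge
        rhs[i] = sum(row[i] * v for row, v in zip(inv_dist, potentials))
    rhs[n] = total_charge
    return solve_linear(matrix, rhs)[:n]


# Step i: hypothetical atoms and sampling points well away from them.
atoms = [(0.0, 0.0, 0.0), (0.0, 0.0, 1.7)]
grid = [(3.0, 0.0, 0.0), (0.0, 3.0, 0.0), (0.0, 0.0, 4.0),
        (0.0, 0.0, -3.0), (2.0, 2.0, 1.0), (-2.0, 1.0, 3.0)]
# Step ii: here the "calculated" potential comes from known point charges.
reference_charges = [0.4, -0.4]
potentials = [sum(q / math.dist(p, r) for q, r in zip(reference_charges, atoms))
              for p in grid]
# Step iii: fit.
fitted_charges = fit_esp_charges(atoms, grid, potentials, total_charge=0.0)
print(fitted_charges)
```

With a real molecule the potential would come from the wavefunction, and a different choice of sampling points would give somewhat different charges.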
Spectroscopy
Infrared Spectra
As mentioned earlier, the frequency of each of the lines in the infrared
spectrum is proportional to the square root of the second derivative of
the energy with respect to change in coordinate (the force constant).
The force constant is the first finite term in a power series expansion of
the energy as a function of the coordinate.
E = Eo + (dE/dR)(R – Ro) + ½ (d²E/dR²)(R – Ro)² + higher-order terms
Eo is a constant and Ro is the value of the coordinate about which the
energy is expanded. dE/dR is zero only at a stationary point, which means
that infrared spectra calculations need to be carried out using the
correct equilibrium geometry. Higher-order (anharmonic) terms are
ignored and the energy goes to infinity and not to zero (separated
atoms) with increasing distance. This means that the potential will be
too steep and the calculated frequency will be too large.
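For a diatomic molecule, the relationship between force constant and frequency can be made concrete. The sketch below uses the standard harmonic oscillator expression, in which the frequency also depends on the reduced mass (a detail not spelled out above). The force constant is an illustrative number of roughly the size appropriate to carbon monoxide, not a calculated value.

```python
import math

SPEED_OF_LIGHT_CM_PER_S = 2.99792458e10
AMU_TO_KG = 1.66053907e-27


def harmonic_wavenumber(force_constant, mass_a, mass_b):
    """Harmonic stretching frequency in cm^-1.

    force_constant : second derivative of the energy, in N/m
    mass_a, mass_b : atomic masses in amu
    """
    reduced_mass = mass_a * mass_b / (mass_a + mass_b) * AMU_TO_KG
    angular_frequency = math.sqrt(force_constant / reduced_mass)
    return angular_frequency / (2.0 * math.pi * SPEED_OF_LIGHT_CM_PER_S)


# Illustrative force constant (assumption), carbon and oxygen masses.
co_like = harmonic_wavenumber(1900.0, 12.000, 15.995)
# Doubling the force constant raises the frequency by the square root of 2.
stiffer = harmonic_wavenumber(2.0 * 1900.0, 12.000, 15.995)
print(co_like, stiffer / co_like)
```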
Calculation of Infrared Spectra
Evaluation of infrared frequencies involves calculation of the
second derivatives of the energy with respect to displacements
of the Cartesian coordinates. For Hartree-Fock and density
functional models, the “computational cost” is roughly that of four to
six steps of geometry optimization, and molecules with
molecular weights <400 amu are in range. Infrared spectra
calculation with MP2 models is significantly more difficult and
presently limited to small molecules.
The intensity of an infrared absorption is proportional to the
change in the dipole moment in response to motion along the
coordinate.
Performance of Theoretical Models
for Infrared Spectra
Hartree-Fock models overestimate frequencies associated with
bond-stretching motions by ~12%, for example, the CO bond
stretch in cyclohexanone (left). The direction of the error is
consistent with the tendency to underestimate bond lengths. It is
greatly reduced for B3LYP models (right).
NIST Infrared Database
Spartan provides on-line access to the NIST database of infrared
spectra (~6,000 compounds). Measured spectra may be
displayed on top of calculated spectra.
Greenhouse Gases
Blackbody radiation from the earth exhibits “holes” due to
absorption in the infrared of “greenhouse gases”, CO2 most
important among them.
MTBE is a Greenhouse Gas
The fuel additive MTBE (methyl tert-butyl ether) is also likely
to be a greenhouse gas.
Matching Calculated Infrared Spectra
Scaled to account for systematic errors and broadened to
simulate finite temperatures, calculated infrared spectra
closely fit experimental spectra.
This suggests that a database of calculated infrared spectra
could be used in lieu of experimental spectra to identify unknown
molecules.
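Scaling and broadening of a calculated spectrum can be sketched as follows. The Lorentzian line shape is an assumption made here for illustration (the text does not specify one), and the frequencies, intensities, scale factor and width are hypothetical.

```python
def broadened_spectrum(frequencies, intensities, grid, scale=1.0, width=10.0):
    """Sum of Lorentzian lines on a frequency grid.

    frequencies : calculated line positions (cm^-1)
    intensities : calculated line intensities (peak heights)
    grid        : frequencies at which the spectrum is evaluated
    scale       : overall factor applied to every calculated frequency
    width       : full peak width at half height (cm^-1)
    """
    half_width = width / 2.0
    spectrum = []
    for x in grid:
        total = 0.0
        for frequency, intensity in zip(frequencies, intensities):
            centre = scale * frequency
            total += intensity * half_width ** 2 / ((x - centre) ** 2 + half_width ** 2)
        spectrum.append(total)
    return spectrum


# Hypothetical single line at 1750 cm^-1, scaled by 0.96.
grid = list(range(1500, 1901))
spectrum = broadened_spectrum([1750.0], [1.0], grid, scale=0.96, width=10.0)
peak_position = grid[spectrum.index(max(spectrum))]
print(peak_position)
```

Fitting to an experimental spectrum would then amount to varying `scale` and `width` until the two curves agree as closely as possible.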
Spartan Infrared Spectral Database
The Spartan Infrared Database contains ~50,000 spectra for
organic molecules obtained from the EDF2/6-31G* density
functional model. The database may be searched by “pattern
matching” to an input (experimental) spectrum, where overall
scaling and peak width at half height are individually
optimized. Because each spectrum in the database needs to be
optimized for best fit to the unknown, searching will require
significantly more computer time than searching a database of
measured spectra.
NMR Spectroscopy
Nuclear spins either align parallel or antiparallel to an applied
magnetic field, giving rise to nuclear spin states, the difference
in energy (ΔE) between which depends on the type of nucleus
and on the strength of the magnetic field (B0) at the nucleus.
ΔE = γħB0
γ is the gyromagnetic ratio which depends on the nucleus and ħ
is Planck’s constant/2π. The applied magnetic field is weakened
by electrons around the nucleus. Nuclei that are well shielded by
the electron cloud experience a lesser field than those that are
poorly shielded, and show a smaller energy splitting. The
splittings, relative to a standard, are termed chemical shifts.
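The size of the splitting can be estimated directly from ΔE = γħB0. The sketch below uses the proton gyromagnetic ratio and an illustrative field strength; the function names are hypothetical.

```python
import math

HBAR = 1.054571817e-34          # Planck's constant / 2π, in J s
GAMMA_PROTON = 2.6752218744e8   # proton gyromagnetic ratio, rad s^-1 T^-1


def spin_state_splitting(gamma, field_tesla):
    """Energy difference between the two spin states, in joules."""
    return gamma * HBAR * field_tesla


def resonance_frequency_mhz(gamma, field_tesla):
    """Frequency corresponding to the splitting, in MHz."""
    return gamma * field_tesla / (2.0 * math.pi) / 1.0e6


# Illustrative field of 11.74 tesla.
delta_e = spin_state_splitting(GAMMA_PROTON, 11.74)
frequency = resonance_frequency_mhz(GAMMA_PROTON, 11.74)
print(delta_e, frequency)
```

The splitting is tiny on a chemical energy scale, which is why small differences in shielding between nuclei are conveniently reported as relative shifts.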
Calculation of Chemical Shifts
Chemical shifts require calculation of the second derivatives of
the energy with respect to an external magnetic field at each
nuclear position. The “computational cost” of chemical shift
calculations follows the cube of the total number of basis
functions, and in practice is comparable to two or three steps of
geometry optimization. Molecules with molecular weights <500
amu are within range.
NMR shift calculations are available for both Hartree-Fock and
density functional models.
Presentation of Spectra
Only chemical shifts are calculated. Intensities for both proton
and 13C spectra are proportional to the number of equivalent
protons/carbons. HH coupling constants are obtained from an
empirical fit to the geometry. 13C DEPT spectra may be drawn.
Chemical Shift Database
Spartan provides on-line access to a database of ~15,000
compounds from the University of Cologne. Measured spectra
may be displayed on top of calculated spectra.
Direct Calculation of 13C Chemical Shifts
The simplest approach is to fit the calculated shifts to the
experimental using least-squares. B3LYP/6-31G* calculations
show an (rms) error of 4.4 ppm and significant outliers. The
HF/6-31G* model gives a poorer correlation.
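A least-squares fit of calculated to experimental shifts, and the resulting rms error, can be sketched as follows. The shifts used here are made-up numbers for illustration only; they are not taken from the comparison described above.

```python
import math


def linear_fit(x, y):
    """Slope and intercept of the least-squares line y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rms_error(predicted, observed):
    """Root-mean-square deviation between two lists of shifts."""
    squared = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    return math.sqrt(squared / len(observed))


# Hypothetical calculated and experimental 13C shifts (ppm).
calculated = [12.0, 30.5, 75.0, 128.0, 170.0]
experimental = [14.1, 31.0, 72.3, 131.5, 168.9]

slope, intercept = linear_fit(calculated, experimental)
fitted = [slope * c + intercept for c in calculated]
fit_error = rms_error(fitted, experimental)
raw_error = rms_error(calculated, experimental)
print(slope, intercept, fit_error, raw_error)
```

Outliers show up as individual carbons whose deviation from the fitted line is much larger than the rms error.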
[Figure: experimental vs. B3LYP 13C chemical shifts, both axes from 0 to 250 ppm]
Chemical Shifts in Organometallics
Cyclopentenebromonium Ion
The 13C spectrum of cyclopentenebromonium ion contains lines
at 18.8, 31.8 and 114.6 ppm. These might either arise from a
structure with bromine bonded to both sp2 carbons (bridged) or
from an equilibrium between a pair of equivalent structures with
bromine attached only to one carbon (open).
[Structures: bridged and open forms of the ion]
Calculated 13C chemical shifts for the bridged structure fit the observed
NMR spectrum much more closely than those for the open form.
9,10-Dihydroxytetrahydrodicyclopentadiene
Calculated 13C shifts (in red) are sufficiently accurate to allow
the exo and endo stereoisomers of 9,10-dihydroxytetrahydrodicyclopentadiene to be distinguished.
[Structures: exo and endo stereoisomers, with positions 1, 3, 7, 9 and 10 labeled]
position(s)  exo          endo         ∆
1,3          30.3 (33.1)  25.0 (25.8)  5.3 (7.3)
9,10         71.5 (73.5)  67.7 (68.9)  3.8 (4.6)
What Works?
The two forms of cyclopentenebromonium ion are very different
and their NMR spectra are easily distinguished. Stereoisomers of
9,10-dihydroxytetrahydrodicyclopentadiene are very similar and
the calculations benefit from error cancellation.
A good example of what happens where the molecules are not
different enough for their NMR spectra to be easily distinguished,
and not similar enough to benefit from error cancellation, is
provided by attempts to assign the product of biosynthesis related
to the known pathway for lambertellol (Masaru Hashimoto,
Hirosaki University, Japan).
Biosynthetic Pathway
[Scheme: biosynthetic pathway showing neolambertellin, lambertellin and the isolated product, whose structure is in question]
Identification of Biosynthesis Products
The data are not sufficiently accurate to allow definitive structure
assignment. None of the possibilities (including the correct one
shown below) reproduces the overall “pattern” of the experimental
spectrum.
Restricted Calculation of 13C Chemical Shifts
Chemical shift predictions can be improved by restricting
comparisons to carbons that are closely related, for example, to
carbons in alkenes (left) and methyl group carbons (right).
Regression Fits to 13C Chemical Shifts
This suggests that calculated shifts can be “improved” simply by
taking the local environment into account, for example, by
“counting” the number of each kind of bond attached to a
particular carbon. In effect, this corresponds to using a series of
“standards” instead of just a single standard (tetramethylsilane).
The scheme implemented in Spartan’08 is based on a linear
regression with the calculated chemical shift and bond counts as
variables. An improved scheme that uses bond orders instead of
bond counts will be available in Spartan’10.
Regression Fits to 13C Chemical Shifts
Regression fits to experimental 13C shifts for sp2 (left) and sp3
(right) carbons show a factor of two reduction in (rms) error over
simple least-squares fits. There are no significant outliers.
Identification of Biosynthesis Products
Corrected shifts for one of the isomers match the experimental 13C
spectrum, while those for the remaining structures do not. This is
the structure supported by labeling experiments and by comparison of
calculated and experimental UV/vis spectra.
Honest Assessment
13C chemical shifts that have been “corrected” by taking the local
environment about the carbon into account offer significant
improvement over “uncorrected” shifts, and may now be adequate
for the purpose of distinguishing among molecules that are
structurally close (the “difficult” cases). This situation will
continue to improve with the development of more accurate
correction schemes.
Strychnine
Hesperidin
δ-3,4-trans-Tetrahydrocannabinol
Onocerin
Nicotine
Galanthamine
Caulophylline
Chamazulene
Prednisone
Cnicin
Morphine
NMR Timescale
The time required for a transition between nuclear spin states is
comparable to that required for some chemical processes, for
example, protonation/deprotonation, and several orders of
magnitude longer than that required to reach equilibrium among
conformers. This means that NMR spectra may depend on
temperature, and that a “high-temperature” spectrum represents
an average of the spectra of individual chemical species or
individual conformers of one species.
NMR Spectra of Flexible Molecules
Because the time for relaxation of nuclear spin states is much
longer than the time required to reach conformational
equilibrium, the NMR spectrum of a flexible molecule needs to
be calculated as a Boltzmann-weighted sum of spectra of the
individual conformers. While only a few conformers are likely to
contribute significantly to the average, it is necessary to consider
all conformers to identify these few.
Boltzmann Distributions
The difficulty is in calculating accurate Boltzmann weights. The
T1 recipe (described in the next section) appears to provide
reliable results, and shows that ~94% of atropine molecules exist
in one of five conformers. Averaging over these allows calculation
of the 13C spectrum for atropine.
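Boltzmann weighting can be sketched in a few lines. The conformer energies and shifts below are hypothetical numbers chosen for illustration; in practice the energies would come from a model such as T1 and the shifts from a separate calculation on each conformer.

```python
import math

GAS_CONSTANT = 8.314462618e-3  # kJ mol^-1 K^-1


def boltzmann_weights(energies, temperature=298.15):
    """Fractional populations from conformer energies in kJ/mol."""
    lowest = min(energies)
    factors = [math.exp(-(e - lowest) / (GAS_CONSTANT * temperature))
               for e in energies]
    total = sum(factors)
    return [f / total for f in factors]


def averaged_shift(shifts, weights):
    """Boltzmann-weighted average of one carbon's shift over conformers."""
    return sum(s * w for s, w in zip(shifts, weights))


# Hypothetical relative energies (kJ/mol) of four conformers and the
# hypothetical shift (ppm) of one carbon in each of them.
energies = [0.0, 2.0, 5.0, 9.0]
shifts = [34.2, 35.0, 31.8, 36.5]

weights = boltzmann_weights(energies)
print(weights, averaged_shift(shifts, weights))
```

Because the weights fall off exponentially with energy, errors of a few kJ/mol in the conformer energies change the populations substantially, which is why accurate energies are the difficult part.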
Proton Chemical Shifts
Proton chemical shifts correlate reasonably well with experimental
shifts for both HF/6-31G* and B3LYP/6-31G* models, for
example, for methyl group hydrogens. A correction scheme to take
account of local environment is under development.
Application … Magnetic Anisotropy
In response to an external magnetic field, the π electrons in
benzene generate a local field that subtracts from the external
field directly above and below the ring plane (“shielding”), and
adds to the external field in the periphery (“deshielding”).
shielding
region
deshielding
region
deshielding
region
shielding
region
Protons that are shielded will exhibit smaller chemical shifts
than expected while those for protons that are deshielded will
exhibit larger shifts.
m-Cyclophane
The proton spectrum of m-cyclophane shows resonances assigned
to the two benzene rings at 4.27, 6.97 and 7.24 ppm in a ratio of
1:2:1.
The spectrum obtained from the HF/6-31G* model is in agreement.
2D Spectra
In addition to proton and 13C spectra, Spartan can display
COSY, NOESY, HSQC and HMBC “2D” spectra. Note that
Spartan only calculates chemical shifts. Coupling constants are
not calculated, but instead are assigned from knowledge of the
actual structure (the opposite to what is done experimentally).
The objective is to provide connections between what is actually
observed in an NMR experiment and the quantities that are actually
calculated.
HMBC Spectrum of Cnicin
HSQC Spectrum of Capsanthin
UV/Visible Spectra
UV/visible spectra (λmax) calculations require consideration of
both the ground state and a series of excited states.
B3LYP models (so-called time dependent density functional
models for excited states) generally provide an acceptable
account, although basis sets that incorporate diffuse functions
may be needed. In practice, UV/visible spectra calculations are
limited to molecules with 20-30 heavy atoms.
Display of UV/Visible Spectra
The resulting data may either be presented in terms of a series of
ground to excited state energy transitions or more conventionally
as a spectrum. The experimental spectrum displayed with the
calculated spectrum is from the NIST database (comprising
~1500 compounds) accessible on-line from Spartan.
λmax Calculations for Related Compounds
Absorption maxima obtained from the B3LYP/6-31+G* model
(6-31G* with diffuse functions on heavy atoms) for a series of
coumarin derivatives correlate with experimental λmax values.
Emission Spectra of Coumarin Derivatives
Some molecules emit light following absorption. While it is not
practical to directly calculate emission spectra, they may be
related to quantities that can be calculated, specifically, λmax and
absorption intensity.
Chemical Reactivity and Selectivity
Revisiting the Hammond Postulate
As stated earlier, the justification for using key molecular orbitals on
the reactants to anticipate chemical reactivity and selectivity follows
from the Hammond Postulate: the transition state for an exothermic
reaction will resemble the reactants.
This same justification allows application of other graphical models.
We will consider two of these, local ionization potential maps to
anticipate reactivity/selectivity of electrophilic additions and LUMO
maps to do the same for nucleophilic additions.
Local Ionization Potential Map
The local ionization potential indicates the ease of electron
removal (ionization) at a location in the vicinity of a molecule. As
such, it is an indicator of electrophilic reactivity. The lower the
local ionization potential, the more loosely bound the electron and
the greater the likelihood for electrophilic attack. A local
ionization potential map paints the value of the local ionization
potential onto a density surface. Easily ionized regions are colored
red and regions that are difficult to ionize are colored blue.
"electron density"
Local Ionization Potential Map
Local ionization potential maps for benzene, aniline and
nitrobenzene show both positional selectivity in electrophilic
aromatic substitution (NH2 directs ortho/para, NO2 directs
meta), and the fact that π-donors such as NH2 activate benzene
while π-acceptors such as NO2 deactivate benzene.
Stereospecific Alkylation of Enolates
Alkylation of each of the enolates formed from the substituted
cyclodecanones shown below gives rise to a single product.
[Scheme: the two substituted cyclodecanones (CH3O2C and CN substituents), each treated with LiH; CH3I]
Local ionization potential maps properly show the observed
alkylation product.
LUMO Map
The lowest-unoccupied molecular orbital or LUMO indicates where
a pair of electrons (a nucleophile) will be most likely to add to a
molecule. A LUMO map paints the absolute value of the LUMO
onto a density surface, and provides a model for nucleophilic
reactivity. Regions colored blue have high LUMO concentration
and regions colored red have low LUMO concentration.
LUMO
non-bonded
electron pair
"electron density"
LUMO Map for Cyclohexenone
The map for cyclohexenone exhibits two regions of high LUMO
concentration. One is over the carbonyl carbon, consistent with
carbonyl addition, while the other is over the β carbon, consistent
with conjugate or Michael addition.
[Scheme: CH3Li gives carbonyl addition; (CH3)2CuLi gives Michael addition]
Nucleophilic Addition to Camphor
Addition of LiAlH4 to 2-norbornone occurs from the equatorial
face of the carbonyl group, while addition to camphor occurs
from the axial face.
Examination of LUMO maps for 2-methyl-2-norbornene and
7,7-dimethyl-2-norbornene clearly shows that this change is due
to the pair of methyl groups at the 7 position.
Silaolefins
With the exception of phosphorus ylides, compounds
incorporating a double bond between carbon and a second-row
element are rare. A search of CSD turns up only a very few
compounds incorporating a carbon-silicon double bond.
Reactivity of Silaolefins
What all the known compounds have in common is a crowded
environment around the double bond. This suggests that it is
necessary to keep reagents away. Local ionization potential and
LUMO maps for tetramethylsilaethylene, Me2Si=CMe2, and its
carbon analog 2,3-dimethyl-2-butene, Me2C=CMe2, show that
the silaolefin is more reactive toward both electrophiles and
nucleophiles than the olefin.
Back to Basics … Energy Surfaces
What is a Transition State?
In one dimension, a maximum on an energy curve corresponds
to a transition state. This does not necessarily mean that it
corresponds to a transition state for a “useful” chemical reaction,
but merely that it connects two minima (stable molecules).
What is a Reaction Coordinate?
Chemists represent multi-dimensional systems in terms of
reaction coordinate diagrams where focus is drawn to the
“important” coordinate (the reaction coordinate). For example,
interconversion of equivalent chair conformers of cyclohexane
through a twist-boat intermediate is thought of in terms of a
continuous motion.
[Figure: energy vs. reaction coordinate for chair → twist boat → chair, passing through two transition states]
Reaction Coordinate for Cyclohexane
Interconversion
In reality, the motion is probably much more complex than
portrayed in such a diagram. But the important point is that the
pathway followed from reactants to products is ill defined and a
reaction coordinate diagram is nothing more than an expression
of preconceived ideas.
The real world analogy (albeit only in 2D) is climbing a mountain.
The starting point (“reactants”) and the summit (“transition state”)
are well defined, but there are many possible paths (“reaction
coordinates”) connecting the two.
Diels-Alder Cycloaddition
of 1,3-Butadiene and Acrylonitrile
A further example is provided by the Diels-Alder cycloaddition
of 1,3-butadiene and acrylonitrile to form 4-cyanocyclohexene.
As the transition state involves hybridization and bond length
changes (relative to reactants), it seems unlikely that any single
simple coordinate will be able to provide an adequate description.
If we give up the idea of actually “seeing” a multi-dimensional
energy surface, we can make progress, as it is possible to provide
a mathematical definition of those few points on such a surface
that correspond to transition states.
Back to Basics …
As stated earlier, the “important” points are all stationary
points, that is, points for which the first derivative of the energy
with respect to each geometrical coordinate is zero.
in one dimension: dE/dR = 0
in many dimensions: ∂E/∂R_i = 0,  i = 1, 2, 3, ..., 3N–6
Energy Minimum vs. Energy Maximum
In one dimension, an energy maximum (transition state) is a
point where the second derivative is negative.
d²E/dR² < 0
Recall that we can generalize this to many dimensions simply by
replacing the original coordinates, R, by normal coordinates, ξ,
that lead to a second derivative matrix that is diagonal.
∂²E/∂R_i∂R_j → ∂²E/∂ξ_i∂ξ_j = δ_ij ∙ ∂²E/∂ξ_i∂ξ_j
δ_ij is 1 for i = j and 0 otherwise. Along each normal coordinate, a
stationary point may now be assigned as either an energy minimum or an
energy maximum.
Defining a Transition State
We have already stated that points at which the energy is a
minimum in all dimensions correspond to “stable” molecules.
Points at which the energy is a minimum for all but one
dimension and a maximum in one dimension correspond to
“transition states”.
Liken the latter to a mountain pass. One does not go over a
summit (an energy maximum) to cross a mountain range but
rather through a pass. To repeat an earlier comment, while the
energy minima and the transition state are well defined, there are
many pathways (“reaction coordinates”), in the same way that
there can be many roads leading up to and away from a
mountain pass.
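The distinction between a minimum and a transition state can be sketched numerically. The example below is only an illustration on a hypothetical two-dimensional model surface, E = (x² – 1)² + y², chosen because it has two minima and one pass between them. It is not a molecular calculation: the surface, the function names and the finite-difference step are all assumptions, and the curvatures are taken in plain rather than mass-weighted coordinates.

```python
import math

def energy(x, y):
    """Hypothetical model surface: minima at (+/-1, 0), a pass at (0, 0)."""
    return (x * x - 1.0) ** 2 + y * y

def hessian(f, x, y, h=1e-4):
    """Second derivatives of f at (x, y) by central finite differences."""
    fxx = (f(x + h, y) - 2.0 * f(x, y) + f(x - h, y)) / (h * h)
    fyy = (f(x, y + h) - 2.0 * f(x, y) + f(x, y - h)) / (h * h)
    fxy = (f(x + h, y + h) - f(x + h, y - h)
           - f(x - h, y + h) + f(x - h, y - h)) / (4.0 * h * h)
    return fxx, fxy, fyy

def curvatures(fxx, fxy, fyy):
    """Eigenvalues of the symmetric 2x2 matrix [[fxx, fxy], [fxy, fyy]]."""
    mean = 0.5 * (fxx + fyy)
    spread = math.hypot(0.5 * (fxx - fyy), fxy)
    return mean - spread, mean + spread

def classify(f, x, y):
    """Label a stationary point by counting negative curvatures."""
    negatives = sum(1 for c in curvatures(*hessian(f, x, y)) if c < 0.0)
    if negatives == 0:
        return "minimum"
    if negatives == 1:
        return "transition state"
    return "maximum in more than one dimension"

for point in [(-1.0, 0.0), (0.0, 0.0), (1.0, 0.0)]:
    print(point, classify(energy, *point))
```

Diagonalizing the second-derivative matrix plays the role of the change to normal coordinates: once the matrix is diagonal, counting negative entries is all that is needed.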
How to Guess a Transition State?
We start at what appears to be a disadvantage, as there are no
experimental transition-state structures. Transition states cannot
even be detected, let alone isolated and characterized, simply
because they “do not exist”. However, there is a significant
body of information from quantum chemical calculations about
the geometries of transition states. This can be searched by
substructure for a “best match” to the reaction of interest.
Alternatively, Spartan will do this automatically.
How to Find a Transition State?
The procedure used to locate a transition state is identical to that
used to find an equilibrium structure, except that the algorithm is
instructed to search out a geometry that is an energy maximum
in one dimension. The procedure terminates only when all
energy first derivatives closely approach zero and all geometrical
variables reach constant values.
Success requires a good guess, but even if you have one,
locating a transition state will generally require two or three times the
number of steps required to find an equilibrium geometry.
Finally, don’t be surprised if you turn up a transition state for an
“unexpected” reaction.
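Why the starting guess matters can be illustrated with a plain Newton-Raphson search, which simply walks to the nearest stationary point whatever its character. This is a minimal sketch on the hypothetical surface E = (x² – 1)² + y², with analytic derivatives. It is not the algorithm used by Spartan or any other program, and the function names and tolerances are assumptions made for illustration.

```python
def gradient(x, y):
    """First derivatives of the model surface E = (x^2 - 1)^2 + y^2."""
    return 4.0 * x * (x * x - 1.0), 2.0 * y

def second_derivatives(x, y):
    """Diagonal second derivatives (the cross term is zero for this surface)."""
    return 12.0 * x * x - 4.0, 2.0

def newton_search(x, y, tol=1e-10, max_steps=50):
    """Step toward the nearest stationary point and return its coordinates."""
    for _ in range(max_steps):
        gx, gy = gradient(x, y)
        hxx, hyy = second_derivatives(x, y)
        dx, dy = -gx / hxx, -gy / hyy   # Newton-Raphson displacement
        x, y = x + dx, y + dy
        gx, gy = gradient(x, y)
        # Stop when the first derivatives vanish and the coordinates stop moving.
        if max(abs(gx), abs(gy)) < tol and max(abs(dx), abs(dy)) < tol:
            return x, y
    raise RuntimeError("no convergence; try a better starting guess")

print("guess near the pass:   ", newton_search(0.2, 0.3))
print("guess near the minimum:", newton_search(0.9, 0.3))
```

The first guess converges to the pass at (0, 0) and the second to the minimum at (1, 0). The two termination tests mirror the criteria stated above: first derivatives close to zero and coordinates that have stopped changing.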
How to Verify a Transition State?
The “infrared spectrum” of a transition state needs to contain a
single imaginary frequency. This follows from the fact that
frequency is proportional to the square root of the second
derivative (divided by a mass). One of the second derivatives for
a transition state is negative (the energy curves downward),
meaning that the square root is an imaginary number.
Existence of a single imaginary frequency is a necessary, but not
sufficient, requirement for a transition state. In addition, an
acceptable transition state needs to be on a pathway that actually
connects the reactants and products of the chemical reaction of
interest.
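The origin of the imaginary frequency can be seen from the harmonic oscillator expression, wavenumber = (1/2πc)·sqrt(k/μ). The sketch below evaluates it for one positive and one negative force constant. The numerical inputs are illustrative assumptions (the first pair is roughly HCl-like, the second is invented), and the function name is hypothetical.

```python
import cmath
import math

C_CM_PER_S = 2.99792458e10   # speed of light, cm/s
AMU_KG = 1.66053907e-27      # atomic mass unit, kg

def harmonic_wavenumber(force_constant, reduced_mass_amu):
    """Harmonic wavenumber (cm^-1) from a force constant (N/m) and a reduced mass (amu).

    A negative force constant gives a purely imaginary result.
    """
    omega = cmath.sqrt(force_constant / (reduced_mass_amu * AMU_KG))
    return omega / (2.0 * math.pi * C_CM_PER_S)

stable = harmonic_wavenumber(516.0, 0.98)     # assumed, roughly HCl-like values
unstable = harmonic_wavenumber(-100.0, 6.0)   # assumed negative curvature

print(f"positive curvature: {stable.real:.0f} cm^-1")
print(f"negative curvature: {unstable.imag:.0f}i cm^-1")
```

In practice a frequency calculation does this for every normal coordinate at once, and a transition state shows exactly one imaginary value.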
Performance of Practical Theoretical
Models for Transition-State Geometries
There are no experimental geometries for transition states, and
the only way to assess the performance of practical models is to
compare them with results obtained from a model that has been
previously established to properly describe the structures of
stable molecules. The MP2/6-311+G** model has been selected
as the standard.
According to this measure, all five models are satisfactory (with
the B3LYP and MP2 models the best), although large individual
deviations are seen for single bonds that are breaking or forming.
While not shown here, the PM3 model occasionally fails to
provide a reasonable transition state.
Transition State Geometries
[Reaction and transition-state drawings, with key bonds labeled a to f, are not reproduced. Surviving reaction labels: “+ C2H4”, “+ CO2”. Reactions are numbered in the order shown.]

Bond lengths in Å.

reaction  bond  HF/3-21G  HF/6-31G*  B3LYP/6-31G*  MP2/6-31G*  PM3   MP2/6-311+G**
1         a     1.88      1.92       1.90          1.80        1.68  1.80
          b     1.29      1.26       1.29          1.31        1.30  1.30
          c     1.37      1.37       1.38          1.38        1.40  1.39
          d     2.14      2.27       2.31          2.20        1.94  2.22
          e     1.38      1.38       1.38          1.39        1.40  1.39
          f     1.39      1.39       1.40          1.41        1.42  1.41
2         a     1.40      1.40       1.42          1.43        1.41  1.43
          b     1.37      1.38       1.39          1.39        1.39  1.39
          c     2.11      2.12       2.12          2.03        1.97  2.07
          d     1.40      1.40       1.41          1.41        1.40  1.41
          e     1.45      1.45       1.48          1.55        1.51  1.53
          f     1.35      1.36       1.32          1.25        1.29  1.25
3         a     1.39      1.38       1.40          1.40        1.40  1.40
          b     1.37      1.37       1.38          1.38        1.38  1.38
          c     2.12      2.26       2.19          2.08        2.02  2.06
          d     1.23      1.22       1.24          1.25        1.24  1.24
          e     1.88      1.74       1.78          1.83        1.93  1.83
          f     1.40      1.43       1.42          1.41        1.40  1.41
mean absolute error
                0.05      0.05       0.03          0.01        0.05  –
Ene Reaction
The ene reaction involves addition of an electron-poor double bond
to an alkene with an allylic hydrogen. The hydrogen is transferred
and a new carbon-carbon bond formed, for example, in the
addition of maleic anhydride and propene.
[Scheme: maleic anhydride + propene → ene adduct]
An animation of the motion associated with the imaginary
frequency shows that bond making and bond breaking occur
simultaneously, consistent with this being a concerted reaction.
Transition States for Derivative Reactions
Transition states for reactions that differ only by remote changes
in structure or in substitution closely resemble each other. This
means that transition states for simplified reactions may be used
to guess transition states for complex reactions.
A comparison of transition states for pyrolysis of ethyl formate
(leading to ethylene and formic acid) and cyclohexyl formate
(leading to cyclohexene and formic acid) shows that the parts in
common are nearly identical.
Stereoselective Claisen Rearrangements
In some cases, simply looking at transition-state geometries
leading to different products may suggest why one is favored
over the other. For example, the observed product in the Claisen
rearrangement shown below arises from a chair-like transition
state whereas the product that is not observed would require a
boat-like transition state.
Absolute Reaction Rate
Absolute reaction rate depends on the product of a rate constant
and the concentrations of the reactants, [A]^a, [B]^b, ...
rate = rate constant × [A]^a [B]^b [C]^c ...
Arrhenius Equation
The rate constant provides a “molecular interpretation” of reaction
rate, and is contained in the Arrhenius equation.
k = A exp(–E‡/RT)
E‡ is the activation energy (difference between the transition state
and the reactants), T is the temperature and R is the gas constant. A
accounts for the efficiency of molecular collisions.
The underlying assumption behind the Arrhenius equation is that all
molecules pass through the transition state. This allows us to use
thermodynamic arguments. This is not entirely reasonable, as some
molecules will have excess energy and be able to “fly over” the
transition state as they move from reactants to products.
Too Slow and Too Fast
A good rule of thumb is that reactions with activation energies
>200 kJ/mol will not occur at normal temperatures, while
reactions with activation energies <100 kJ/mol will be
unstoppable under the same conditions.
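How steeply the rate depends on the activation energy can be seen by evaluating the Arrhenius equation directly. The sketch below does this at room temperature. The pre-exponential factor of 10^13 s^-1 is an assumption (a typical order of magnitude for a first-order process, not a value from this course), so the printed half-lives are indicative only and shift if A is changed.

```python
import math

R = 8.314462618  # gas constant, J/(mol K)

def arrhenius_rate_constant(activation_energy_kj, temperature=298.15, prefactor=1.0e13):
    """k = A exp(-E/RT); the default prefactor (s^-1) is an assumed value."""
    return prefactor * math.exp(-activation_energy_kj * 1000.0 / (R * temperature))

def half_life_seconds(rate_constant):
    """Half-life of a first-order process."""
    return math.log(2.0) / rate_constant

for barrier in (50.0, 100.0, 150.0, 200.0):
    k = arrhenius_rate_constant(barrier)
    print(f"E = {barrier:5.0f} kJ/mol   k = {k:9.2e} s^-1   t1/2 = {half_life_seconds(k):9.2e} s")
```

Each additional 50 kJ/mol changes the rate constant by many orders of magnitude, which is why rules of thumb of this kind work at all.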
Nexium
The anti-ulcer drug esomeprazole (Nexium) is actually the S
enantiomer of an older unresolved drug no longer on patent.
While both enantiomers are active, the R enantiomer is
metabolized faster than the S enantiomer. This means that the
pure S compound is longer lasting than the racemic mixture. In
order for esomeprazole to qualify as a “new drug” subject to
patent protection it must not racemize.
Will Esomeprazole Racemize?
Chirality in esomeprazole is due to the sulfoxide group. There
are two distinct racemization pathways. The obvious one
involves inversion at sulfur through a planar transition state, and
the less obvious one involves a pair of [2,3] sigmatropic
rearrangements and an achiral intermediate.
Performance of Practical Models for
Absolute Activation Energy Comparisons
The Arrhenius equation is a simplified model of “reality” and absolute
activation energies derived from experimental rates based on it
may not accurately reflect calculated values. It is probably more
appropriate to compare calculated activation energies with
results obtained from a model that has been previously
established to properly describe the structures of stable
molecules. We will use the MP2/6-311+G** model as the
standard.
Absolute Activation Energies
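Activation energies in kJ/mol.

reaction                     HF/3-21G  HF/6-31G*  B3LYP/6-31G*  MP2/6-31G*  MP2/6-311+G**  PM3
CH3NC → CH3CN                238       192        172           180         172            243
HCO2CH2CH3 → HCO2H + C2H4    259       293        222           251         234            251
[drawing not reproduced]     192       238        142           117         109            –
[drawing not reproduced]     176       205        121           109         109            146
[drawing not reproduced]     126       167        84            50          38             134
[drawing not reproduced]     314       356        243           251         230            255
mean absolute error          81        93         15            11          –              91

[Four reactions are drawn as structures in the original; surviving label: “+ C2H4”.]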
Absolute Activation Energies (cont’d)
Activation energies in kJ/mol.

reaction                     HF/3-21G  HF/6-31G*  B3LYP/6-31G*  MP2/6-31G*  MP2/6-311+G**  PM3
HCNO + C2H2 → [ring, N, O]   105       146        50            33          38             427
[drawing not reproduced]     230       247        163           159         142            –
[drawing not reproduced]     276       197        151           155         142            267
[drawing] → [drawing] + CO2  247       251        167           184         172            276
[drawing] → [drawing] + SO2  205       205        92            105         92             234
mean absolute error          81        93         15            11          –              91
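The mean absolute errors quoted in the two tables can be checked from the individual entries. The sketch below does this for the four models that have a value for every reaction, taking MP2/6-311+G** as the reference. The numbers are transcribed from the tables on the assumption that each run of six values is one reaction in the column order given in the headers, and that the units are kJ/mol. PM3 is left out because two of its entries are missing.

```python
# One row per reaction: HF/3-21G, HF/6-31G*, B3LYP/6-31G*, MP2/6-31G*,
# then the MP2/6-311+G** reference value (all assumed to be in kJ/mol).
ROWS = [
    (238, 192, 172, 180, 172),
    (259, 293, 222, 251, 234),
    (192, 238, 142, 117, 109),
    (176, 205, 121, 109, 109),
    (126, 167, 84, 50, 38),
    (314, 356, 243, 251, 230),
    (105, 146, 50, 33, 38),
    (230, 247, 163, 159, 142),
    (276, 197, 151, 155, 142),
    (247, 251, 167, 184, 172),
    (205, 205, 92, 105, 92),
]
MODELS = ("HF/3-21G", "HF/6-31G*", "B3LYP/6-31G*", "MP2/6-31G*")

def mean_absolute_errors(rows, models=MODELS):
    """Mean absolute deviation of each model from the last column of each row."""
    errors = {}
    for column, model in enumerate(models):
        deviations = [abs(row[column] - row[-1]) for row in rows]
        errors[model] = sum(deviations) / len(deviations)
    return errors

for model, error in mean_absolute_errors(ROWS).items():
    print(f"{model:13s} {error:5.1f} kJ/mol")
```

Under these assumptions the results round to 81, 93, 15 and 11, matching the tabulated row for these four models.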
Performance of Practical Theoretical Models
for Absolute Activation Energies
Absolute activation energies from Hartree-Fock models are
larger than those obtained from the MP2/6-311+G** model.
This parallels the fact that Hartree-Fock models underestimate
bond dissociation energies. Transition states are likely to be
more compact than reactants, meaning that electron motions are
likely to be more tightly coupled. Therefore, Hartree-Fock
models will do better for reactants than for transition states.
B3LYP/6-31G* and MP2/6-31G* models provide a better
account of absolute activation energies, although both show
sizable errors in some cases.
The PM3 model does not provide a satisfactory account.
Combining Different Theoretical Models
While geometry is well described by simple models, accurate
description of reaction and activation energies typically requires
“better” and more costly (in terms of computation) models. It
may be advantageous to combine different models rather than to
use a single model, for example, to use the HF/6-31G* model to
furnish geometry and to use the B3LYP/6-311+G** model to
provide energies.
Ireland Rearrangement
The Ireland rearrangement provides a route to allyl vinyl ethers
from allyl esters. The second step in this reaction is a Claisen
rearrangement.
Activation energies from the B3LYP/6-31G* model for the
Claisen step (for R=Me) change only slightly with use of
HF/3-21G or HF/6-31G* structures instead of the “exact”
(B3LYP/6-31G*) structures. A larger change is seen with use of
PM3 structures.
Kinetic Product
The kinetic product is that resulting from the lowest-energy
transition state, irrespective of whether or not this is the lowest-energy product.
[Figure: energy versus reaction coordinate]
Kinetic product ratios depend on activation energy differences in
the same way that thermodynamic product ratios depend on the
difference in reactant and product energies.
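The size of the effect can be estimated with a Boltzmann factor. The sketch below is a rough illustration that assumes two competing pathways from common reactants with equal pre-exponential factors, so that the ratio of products is exp(ΔΔE‡/RT). The function name and the sample energy differences are illustrative only.

```python
import math

R = 8.314462618  # gas constant, J/(mol K)

def product_ratio(delta_kj, temperature=298.15):
    """Major:minor product ratio for an energy difference delta_kj (kJ/mol, >= 0)."""
    return math.exp(delta_kj * 1000.0 / (R * temperature))

for delta in (2.0, 5.0, 10.0):
    ratio = product_ratio(delta)
    share = 100.0 * ratio / (1.0 + ratio)
    print(f"difference {delta:4.1f} kJ/mol -> {ratio:6.1f} : 1  ({share:.1f}% major product)")
```

The same expression gives a thermodynamic product ratio if the difference in product energies is used in place of the difference in transition-state energies.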
Relative Activation Energies
Establishing the kinetic product of a reaction typically does not
require knowledge of absolute activation energy, but rather only
of the difference in activation energies for closely-related
reactions. For example, the kinetic preference of a particular
regio- or stereoisomer formed in a reaction requires only
differences in transition-state energies leading from a common
set of reactants to each of the products.
Analogy with reaction energies (thermodynamics) suggests that
such types of comparisons conserve bonding and should benefit
from error cancellation. Even simple quantum chemical models
would be expected to yield acceptable results.
Chiral Hydroboration
Hydroboration may occur from either the “top” or “bottom” of
the alkene shown below. However, only one diastereomer
results. (Oxidation follows hydroboration.)
[Scheme: hydroboration, then oxidation, of a chiral alkene bearing a CH2OCH2Ph substituent, giving the alcohol as a single diastereomer]
Relative transition state energies show a strong preference for
formation of the observed diastereomer.
Thermodynamic vs. Kinetic Control
of Chemical Reactions
If thermodynamic and kinetic products differ, it may be possible
to control the overall product distribution by changing reaction
conditions, in particular, by changing the temperature.
Calculations provide the means to say if thermodynamic and
kinetic preferences are different and, if they are, which better fits
the experimental data. This knowledge can then be used to
suggest reaction conditions that yield the desired products.
Radical Ring Closure Reactions
Loss of bromine from 6-bromohexene yields products derived
primarily from cyclopentylmethyl radical, rather than from
cyclohexyl radical.
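[Scheme: 6-bromohexene + Bu3SnH/AIBN → hex-5-enyl radical, which rearranges to cyclopentylmethyl radical or cyclohexyl radical. Product yields: 17% from hex-5-enyl radical, 81% from cyclopentylmethyl radical, 2% from cyclohexyl radical.]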
Thermodynamic vs. Kinetic Product
Cyclohexyl radical is more stable than cyclopentylmethyl
radical, not unexpected given that secondary radicals are more
stable than primary radicals and six-membered rings are more
stable than five-membered rings. The fact that cyclohexane is
not the observed ring-closure product means that the reaction is
not under thermodynamic control.
The transition state for closure of hex-5-enyl radical to
cyclopentylmethyl radical is lower in energy than that for
closure to cyclohexyl radical. Methylcyclopentane is the kinetic
product. The fact that it is the observed product suggests that the
reaction is under kinetic control.
Polymerization of Cyclopentadiene
Cyclopentadiene undergoes a Diels-Alder reaction with itself.
The resulting dimer adds cyclopentadiene yielding a trimer, and
so forth. However, the reaction stops around the 20-mer.
Addition may occur with either exo or endo stereochemistry. The
former is thermodynamically favored while the latter is
kinetically favored.
[Scheme: exo addition and endo addition of cyclopentadiene]
Thermodynamic vs. Kinetic Product
The exo polymer (the thermodynamic product) shows a helical
structure. The endo polymer (the kinetic product) closes on itself
around the 20-mer. The reaction that prematurely terminates
appears to be under kinetic control, and raising the temperature
might cause a longer (and different) polymer to form.
[Figure: exo polymer and endo polymer]
Reactions Without Transition States
Some reactions proceed without barriers and discernible
transition states. Radicals combine without a barrier, for
example, two methyl radicals to form ethane, and typically add
to multiple bonds with little or no barrier. Ions add to neutral
molecules in the gas phase without an activation barrier. For
example, in the gas phase SN2 addition of an anionic nucleophile
to an alkyl halide occurs without activation energy. The known
barrier for the process in solution is a consequence of the
solvent.
Reactions of Flexible Molecules.
The “Important” Conformer
For kinetically controlled processes, the important conformer
will not necessarily be the lowest-energy conformer. A good
example is provided by the Diels-Alder cycloaddition of 1,3-butadiene with acrylonitrile.
[Scheme: 1,3-butadiene + acrylonitrile → 4-cyanocyclohexene]
The diene exists primarily in a trans conformation which is
unable to react. The “reactive” cis conformer is 8 kJ/mol less
stable and accounts for only about 5% of butadiene molecules at
room temperature. Nevertheless, the reaction occurs.
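The population of a minor conformer follows from a Boltzmann distribution. The sketch below is a simple two-state estimate that uses only the 8 kJ/mol energy difference quoted above and ignores entropy and degeneracy; the function name is hypothetical. It gives a population of a few percent, the same order as the figure quoted above. The exact value depends on the energy difference used and on the terms left out.

```python
import math

R = 8.314462618  # gas constant, J/(mol K)

def boltzmann_populations(relative_energies_kj, temperature=298.15):
    """Fractional populations for conformers with the given relative energies (kJ/mol)."""
    weights = [math.exp(-e * 1000.0 / (R * temperature)) for e in relative_energies_kj]
    total = sum(weights)
    return [w / total for w in weights]

trans, cis = boltzmann_populations([0.0, 8.0])
print(f"trans: {100.0 * trans:.1f}%   cis: {100.0 * cis:.1f}%")
```

Passing more than two energies gives the populations of a full set of conformers in the same way.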
Curtin-Hammett Principle
Assuming that equilibration among conformers is much faster
than chemical reaction, any higher-energy conformer that
reacts will be replenished before it is needed again. This is the
Curtin-Hammett Principle.
[Figure: energy diagram contrasting equilibration among conformers ("low-energy process") with chemical reaction ("high-energy process")]
It means that the products of a kinetically-controlled reaction
need not derive from the lowest-energy conformer.
Spartan Molecular Database (SMD)
Large collections of data obtained from quantum chemical
calculations are available. These include equilibrium geometries,
energies and atomic charges for ~150,000 organic and main-group inorganic molecules from the HF/3-21G, HF/6-31G*,
B3LYP/6-31G*, EDF1/6-31G* and MP2/6-31G* models (not all
molecules are represented by all models) based on the “best
conformation” assigned from the MMFF molecular mechanics
model. Smaller collections (several thousand molecules) are
available for HF, B3LYP and MP2 models using the 6-311+G**
basis set, for the G3(MP2) model and for transition-metal
inorganic and organometallic compounds using the B3LYP/6-31G* model. These may be searched by substructure or by
name.
Spartan Molecular Database (SMD)
A collection of ~100,000 closed-shell organic molecules obtained
from the EDF2/6-31G* model using the best conformer assigned from the
T1 model is under development. This will include equilibrium
geometries, energies, T1 heats of formation, infrared spectra and
proton and 13C NMR spectra. The database entries will also include the
wavefunction allowing “on-the-fly” generation of graphical
displays (molecular orbitals, electrostatic potential maps, etc).
In addition to substructure and name searching, this data may be
searched for a match to an “unknown” infrared spectrum.
Data Mining
Tools for “mining” the data in SMD are available. These allow
graphical and statistical comparisons of different atomic and
molecular properties of related molecules, of a single property of
related molecules from different theoretical models and of energies
(or more generally changes in any property) for a specified
chemical reaction or for a series of related reactions.
Concluding Remarks
Quantum chemical calculations have long been successfully
employed to interpret and rationalize experimental observation,
but more often than not, as an afterthought. They should now be
recognized as a legitimate means to explore chemistry alongside
experiment or as an alternative to experiment.
There is a learning curve, just as there is a learning curve for
techniques of experimental chemistry. The focus should be on
understanding the capabilities and limitations of practical
quantum chemical models rather than on the underlying theory.
As with experimental chemistry, practical expertise and
confidence can and will follow only by “doing”.
Contact Information
General Sales & Licensing Questions: sales@wavefun.com
Demo Requests: Sue Kurz, sue@wavefun.com
Academic Licensing: Tyler Netherton, tyler@wavefun.com
Commercial Licensing: Sean Ohlinger, sean@wavefun.com
Invoicing & Payments: Michelle Fitzpatrick, michelle@wavefun.com
Webmaster: Pamela Ohsan, pam@wavefun.com
Wavefunction Support Team: support@wavefun.com
Sean Ohlinger, Phil Klunzinger, Jurgen Schnitker, Warren Hehre. . .