Chemistry 6440 / 7440
Computational Chemistry and Molecular Modeling
Midterm Review

Types of Molecular Models
• Wish to model molecular structure, properties and reactivity
• Range from simple qualitative descriptions to accurate, quantitative results
• Costs range from trivial to months of supercomputer time
• Some compromises necessary between cost and accuracy of modeling methods

Molecular mechanics
• Ball and spring description of molecules
• Better representation of equilibrium geometries than plastic models
• Able to compute relative strain energies
• Cheap to compute
• Lots of empirical parameters that have to be carefully tested and calibrated
• Limited to equilibrium geometries
• Does not take electronic interactions into account
• No information on properties or reactivity
• Cannot readily handle reactions involving the making and breaking of bonds

Semi-empirical molecular orbital methods
• Approximate description of valence electrons
• Obtained by solving a simplified form of the Schrödinger equation
• Many integrals approximated using empirical expressions with various parameters
• Semi-quantitative description of electronic distribution, molecular structure, properties and relative energies
• Cheaper than ab initio electronic structure methods, but not as accurate

Ab Initio Molecular Orbital Methods
• More accurate treatment of the electronic distribution using the full Schrödinger equation
• Can be systematically improved to obtain chemical accuracy
• Does not need to be parameterized or calibrated with respect to experiment
• Can describe structure, properties, energetics and reactivity
• Expensive

Potential Energy Surfaces
• The concept of potential energy surfaces is central to computational chemistry
• The structure, energetics, properties, reactivity, spectra and dynamics of molecules can be readily understood in terms of potential energy surfaces
• Except in very simple cases, the potential energy surface cannot be obtained from experiment
•
The field of computational chemistry has developed a wide array of methods for exploring potential energy surfaces
• The challenge for computational chemistry is to explore potential energy surfaces with methods that are efficient and accurate enough to describe the chemistry of interest
• Equilibrium molecular structures correspond to the positions of the minima in the valleys on a PES
• Energetics of reactions can be calculated from the energies or altitudes of the minima for reactants and products
• A reaction path connects reactants and products through a mountain pass
• A transition structure is the highest point on the lowest energy path
• Reaction rates can be obtained from the height and profile of the potential energy surface around the transition structure
• The shape of the valley around a minimum determines the vibrational spectrum
• Each electronic state of a molecule has a separate potential energy surface, and the separation between these surfaces yields the electronic spectrum
• Properties of molecules such as dipole moment, polarizability, NMR shielding, etc. depend on the response of the energy to applied electric and magnetic fields

Asking the Right Questions
• molecular modeling can answer some questions more easily than others
• stability and reactivity are not precise concepts
– need to give a specific reaction
• similar difficulties with other general concepts:
– resonance
– nucleophilicity
– leaving group ability
– VSEPR
– etc.
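The potential energy surface picture above, and the advice to name a specific reaction, can be illustrated with a toy calculation. The sketch below is only an illustration under stated assumptions: the one-dimensional model surface V(x) = x^4 - 2x^2 + 0.3x (arbitrary units), the bracketing intervals and all function names are hypothetical and are not taken from the course. It locates the stationary points, classifies them by the sign of the second derivative, and reports the reaction energy and the barrier as energy differences.

```python
# Toy 1-D potential energy surface (hypothetical model, arbitrary units).
def energy(x):
    return x**4 - 2.0 * x**2 + 0.3 * x

def gradient(x):
    return 4.0 * x**3 - 4.0 * x + 0.3

def curvature(x):
    return 12.0 * x**2 - 4.0

def bisect_root(f, lo, hi, tol=1e-12):
    """Find a zero of f in [lo, hi] by bisection; f(lo) and f(hi) must differ in sign."""
    f_lo = f(lo)
    if f_lo * f(hi) > 0:
        raise ValueError("root not bracketed")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

# Stationary points are zeros of the gradient; the brackets were chosen by inspection.
brackets = [(-1.5, -0.7), (-0.5, 0.5), (0.7, 1.5)]
points = [bisect_root(gradient, lo, hi) for lo, hi in brackets]

# Positive curvature: minimum. Negative curvature: transition structure.
kinds = ["minimum" if curvature(x) > 0 else "transition structure" for x in points]

reactant, ts, product = points
reaction_energy = energy(product) - energy(reactant)
barrier = energy(ts) - energy(reactant)

for x, kind in zip(points, kinds):
    print(f"x = {x:8.4f}  E = {energy(x):8.4f}  {kind}")
print(f"reaction energy = {reaction_energy:.4f}, barrier = {barrier:.4f}")
```

Both reported quantities are differences between stationary points on the same surface, which is the form in which a modeling question is easiest to answer.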
Asking the Right Questions
• phrase questions in terms of energy differences, energy derivatives, geometries, electron distributions
• trends easier than absolute numbers
• gas phase much easier than solution
• structure and electron distribution easier than energetics
• vibrational spectra and NMR easier than electronic spectra
• bond energies, IP, EA, activation energies are hard (PA not quite as hard)
• excited states much harder than ground states
• solvation is feasible with polarizable continuum models (very hard by dynamics)

Molecular Mechanics
• PES calculated using empirical potentials fitted to experimental and calculated data
• composed of stretch, bend, torsion and non-bonded components
$E = E_{str} + E_{bend} + E_{torsion} + E_{non\text{-}bond}$
• e.g. the stretch component has a term for each bond in the molecule

Bond Stretch Term
• many force fields use just a quadratic term, but the energy is too large for very elongated bonds
$E_{str} = \sum_i k_i (r - r_0)^2$
• Morse potential is more accurate, but is usually not used because of expense
$E_{str} = \sum_i D_e [1 - \exp(-\beta (r - r_0))]^2$
• a cubic polynomial has the wrong asymptotic form, but a quartic polynomial is a good fit for the bond lengths of interest
$E_{str} = \sum_i \{ k_i (r - r_0)^2 + k'_i (r - r_0)^3 + k''_i (r - r_0)^4 \}$
• The reference bond length, $r_0$, is not the same as the equilibrium bond length, because of non-bonded contributions

Angle Bend Term
• usually a quadratic polynomial is sufficient
$E_{bend} = \sum_i k_i (\theta - \theta_0)^2$
• for very strained systems (e.g. cyclopropane) a higher polynomial is better
$E_{bend} = \sum_i \{ k_i (\theta - \theta_0)^2 + k'_i (\theta - \theta_0)^3 + k''_i (\theta - \theta_0)^4 + \dots \}$
• alternatively, special atom types may be used for very strained atoms

Torsional Term
• most force fields use a single cosine with appropriate barrier multiplicity, n
$E_{tors} = \sum_i V_i \cos[n(\phi - \phi_0)]$
• some use a sum of cosines for 1-fold (dipole), 2-fold (conjugation) and 3-fold (steric) contributions
$E_{tors} = \sum_i \{ V_i \cos[(\phi - \phi_0)] + V'_i \cos[2(\phi - \phi_0)] + V''_i \cos[3(\phi - \phi_0)] \}$

Non-Bonded Terms
• Lennard-Jones potential
– $E_{vdW} = \sum_{ij} 4 \epsilon_{ij} ( (\sigma_{ij} / r_{ij})^{12} - (\sigma_{ij} / r_{ij})^{6} )$
– easy to compute, but $r^{-12}$ rises too rapidly
• Buckingham potential
– $E_{vdW} = \sum_{ij} A \exp(-B r_{ij}) - C r_{ij}^{-6}$
– QM suggests exponential repulsion is better, but it is harder to compute
• tabulate $\epsilon$ and $\sigma$ for each atom
– obtain mixed terms as arithmetic and geometric means
– $\sigma_{AB} = (\sigma_{AA} + \sigma_{BB})/2$; $\epsilon_{AB} = (\epsilon_{AA} \epsilon_{BB})^{1/2}$

Parameterization
• difficult, computationally intensive, inexact
• fit to structures (and properties) for a training set of molecules
• recent generation of force fields fit to ab initio data at minima and distorted geometries
• trial and error fit, or least squares fit (need to avoid local minima, excessive bias toward some parameters at the expense of others)
• different parameter sets and functional forms can give similar structures and energies but different decomposition into components
• don't mix and match

Applications
• good geometries and relative energies of conformers of the same molecule (provided that electronic interactions are not important)
• effect of substituents on geometry and strain energy
• well parameterized for organics, less so for inorganics
• specialty force fields available for proteins, DNA, for liquid simulation
• molecular mechanics cannot be used for reactions that break bonds
• useful for simple organic problems: ring strain in cycloalkanes, conformational analysis, Bredt's rule, etc.
• high end biochemistry problems: docking of substrates into active sites, refining x-ray structures, determining structures from NMR data, free energy simulations

Schrödinger Equation
$\hat{H} \Psi = E \Psi$
• $\hat{H}$ is the quantum mechanical Hamiltonian for the system (an operator containing derivatives)
• $E$ is the energy of the system
• $\Psi$ is the wavefunction (contains everything we are allowed to know about the system)
• $|\Psi|^2$ is the probability distribution of the particles

Hamiltonian for a Molecule
$\hat{H} = -\sum_i^{electrons} \frac{\hbar^2}{2 m_e} \nabla_i^2 - \sum_A^{nuclei} \frac{\hbar^2}{2 m_A} \nabla_A^2 - \sum_i^{electrons} \sum_A^{nuclei} \frac{e^2 Z_A}{r_{iA}} + \sum_{i<j}^{electrons} \frac{e^2}{r_{ij}} + \sum_{A<B}^{nuclei} \frac{e^2 Z_A Z_B}{r_{AB}}$
• kinetic energy of the electrons
• kinetic energy of the nuclei
• electrostatic interaction between the electrons and the nuclei
• electrostatic interaction between the electrons
• electrostatic interaction between the nuclei

Variational Theorem
• the expectation value of the Hamiltonian is the variational energy
$E_{var} = \frac{\int \Psi^* \hat{H} \Psi \, d\tau}{\int \Psi^* \Psi \, d\tau} \ge E_{exact}$
• the variational energy is an upper bound to the lowest energy of the system
• any approximate wavefunction will yield an energy higher than the ground state energy
• parameters in an approximate wavefunction can be varied to minimize $E_{var}$
• this yields a better estimate of the ground state energy and a better approximation to the wavefunction

Born-Oppenheimer Approximation
• the nuclei are much heavier than the electrons and move more slowly than the electrons
• in the Born-Oppenheimer approximation, we freeze the nuclear positions, $R_{nuc}$, and calculate the electronic wavefunction, $\Psi_{el}(r_{el}; R_{nuc})$, and energy $E(R_{nuc})$
• $E(R_{nuc})$ is the potential energy surface of the molecule (i.e.
the energy as a function of the geometry)
• on this potential energy surface, we can treat the motion of the nuclei classically or quantum mechanically

Hartree Approximation
• assume that a many electron wavefunction can be written as a product of one electron functions
$\Psi(r_1, r_2, r_3, \dots) = \phi_1(r_1) \phi_2(r_2) \phi_3(r_3) \cdots$
• if we use the variational energy, solving the many electron Schrödinger equation is reduced to solving a series of one electron Schrödinger equations
• each electron interacts with the average distribution of the other electrons

Hartree-Fock Approximation
• the Pauli principle requires that a wavefunction for electrons must change sign when any two electrons are permuted
• the Hartree-product wavefunction must be antisymmetrized
• can be done by writing the wavefunction as a determinant
$\Psi = \frac{1}{\sqrt{n!}} \begin{vmatrix} \phi_1(1) & \phi_1(2) & \cdots & \phi_1(n) \\ \phi_2(1) & \phi_2(2) & \cdots & \phi_2(n) \\ \vdots & \vdots & & \vdots \\ \phi_n(1) & \phi_n(2) & \cdots & \phi_n(n) \end{vmatrix} = | \phi_1 \phi_2 \cdots \phi_n |$

Fock Equation
• take the Hartree-Fock wavefunction
$\Psi = | \phi_1 \phi_2 \cdots \phi_n |$
• put it into the variational energy expression
$E_{var} = \frac{\int \Psi^* \hat{H} \Psi \, d\tau}{\int \Psi^* \Psi \, d\tau}$
• minimize the energy with respect to changes in the orbitals
$\partial E_{var} / \partial \phi_i = 0$
• yields the Fock equation
$\hat{F} \phi_i = \epsilon_i \phi_i$

Fock Operator
$\hat{F} = \hat{T} + \hat{V}_{NE} + \hat{J} - \hat{K}$
• Coulomb operator (electron-electron repulsion)
$\hat{J} \phi_i = \left\{ \sum_j^{electrons} \int \phi_j \frac{e^2}{r_{ij}} \phi_j \, d\tau \right\} \phi_i$
• exchange operator (purely quantum mechanical; arises from the fact that the wavefunction must switch sign when you exchange two electrons)
$\hat{K} \phi_i = \left\{ \sum_j^{electrons} \int \phi_j \frac{e^2}{r_{ij}} \phi_i \, d\tau \right\} \phi_j$

Solving the Fock Equations
$\hat{F} \phi_i = \epsilon_i \phi_i$
1. obtain an initial guess for all the orbitals $\phi_i$
2. use the current $\phi_i$ to construct a new Fock operator
3. solve the Fock equations for a new set of $\phi_i$
4. if the new $\phi_i$ are different from the old $\phi_i$, go back to step 2.

LCAO Approximation
• numerical solutions for the Hartree-Fock orbitals are only practical for atoms and diatomics
• diatomic orbitals resemble linear combinations of atomic orbitals
• e.g.
sigma bond in H2, $\sigma \approx 1s_A + 1s_B$
• for polyatomics, approximate the molecular orbital by a linear combination of atomic orbitals (LCAO)
$\phi = \sum_\mu c_\mu \chi_\mu$

Roothaan-Hall Equations
• basis set expansion leads to a matrix form of the Fock equations
$F C_i = \epsilon_i S C_i$
• $F$ – Fock matrix
• $C_i$ – column vector of the molecular orbital coefficients
• $\epsilon_i$ – orbital energy
• $S$ – overlap matrix

Slater-type Basis Functions
$\phi_{1s}(r) = (\zeta_{1s}^3 / \pi)^{1/2} \exp(-\zeta_{1s} r)$
$\phi_{2s}(r) = (\zeta_{2s}^5 / 96\pi)^{1/2} \, r \exp(-\zeta_{2s} r / 2)$
$\phi_{2p_x}(r) = (\zeta_{2p}^5 / 32\pi)^{1/2} \, x \exp(-\zeta_{2p} r / 2)$
• exact for hydrogen atom
• used for atomic calculations
• right asymptotic form
• correct nuclear cusp condition
• 3 and 4 center two electron integrals cannot be done analytically

Gaussian-type Basis Functions
$g_s(r) = (2\alpha / \pi)^{3/4} \exp(-\alpha r^2)$
$g_x(r) = (128 \alpha^5 / \pi^3)^{1/4} \, x \exp(-\alpha r^2)$
$g_{xx}(r) = (2048 \alpha^7 / 9\pi^3)^{1/4} \, x^2 \exp(-\alpha r^2)$
$g_{xy}(r) = (2048 \alpha^7 / \pi^3)^{1/4} \, xy \exp(-\alpha r^2)$
• die off too quickly for large r
• no cusp at nucleus
• all two electron integrals can be done analytically

Minimal Basis Set
• only those shells of orbitals needed for a neutral atom
• e.g.
1s, 2s, 2px, 2py, 2pz for carbon
• STO-3G
– 3 gaussians fitted to a Slater-type orbital (STO)
– STO exponents obtained from atomic calculations, adjusted for a representative set of molecules
• also known as single zeta basis set (zeta, $\zeta$, is the exponent used in Slater-type orbitals)

Double Zeta Basis Set (DZ)
• each function in a minimal basis set is doubled
• one set is tighter (closer to the nucleus, larger exponents), the other set is looser (further from the nucleus, smaller exponents)
• allows for radial (in/out) flexibility in describing the electron cloud
• if the atom is slightly positive, the density will be somewhat contracted
• if the atom is slightly negative, the density will be somewhat expanded

Split Valence Basis Set
• only the valence part of the basis set is doubled (fewer basis functions means less work and faster calculations)
• core orbitals are represented by a minimal basis, since they are nearly the same in atoms and molecules
• 3-21G (3 gaussians for 1s, 2 gaussians for the inner 2s,2p, 1 gaussian for the outer 2s,2p)
• 6-31G (6 gaussians for 1s, 3 gaussians for the inner 2s,2p, 1 gaussian for the outer 2s,2p)

Polarization Functions
• higher angular momentum functions added to a basis set to allow for angular flexibility
• e.g.
p functions on hydrogen, d functions on carbon
• large basis Hartree-Fock calculations without polarization functions predict NH3 to be flat
• without polarization functions the strain energy of cyclopropane is too large
• 6-31G(d) (also known as 6-31G*) – d functions on heavy atoms
• 6-31G(d,p) (also known as 6-31G**) – p functions on hydrogen as well as d functions on heavy atoms
• DZP – DZ with polarization functions

Diffuse Functions
• functions with very small exponents added to a basis set
• needed for anions, very electronegative atoms, calculating electron affinities and gas phase acidities
• 6-31+G – one set of diffuse s and p functions on heavy atoms
• 6-31++G – a diffuse s function on hydrogen as well as one set of diffuse s and p functions on heavy atoms

Correlation-Consistent Basis Functions
• a family of basis sets of increasing size
• can be used to extrapolate to the basis set limit
• cc-pVDZ – DZ with d's on heavy atoms, p's on H
• cc-pVTZ – triple split valence, with 2 sets of d's and one set of f's on heavy atoms, 2 sets of p's and 1 set of d's on hydrogen
• cc-pVQZ, cc-pV5Z, cc-pV6Z
• can also be augmented with diffuse functions (aug-cc-pVXZ)

Molecular Orbital Plots
$\phi_i(r) = \sum_\mu c_{\mu i} \chi_\mu(r)$
• plot a surface where $|\phi_i(r)|^2 = c$
• $\phi_i(r)$ can have positive and negative values
• shade in different colors
• only the change in sign matters, not the absolute sign

Population Analysis
$P_{\mu\nu} = 2 \sum_i^{occ} c_{\mu i}^* c_{\nu i}$ – $P$ density matrix, $S$ overlap matrix
$\sum_{\mu\nu} P_{\mu\nu} S_{\mu\nu} = N_e$ – partition into contributions from different atoms and basis functions
$M_{\mu\nu} = P_{\mu\nu} S_{\mu\nu}$ – Mulliken population analysis matrix
$M_{AB} = \sum_{\mu \in A} \sum_{\nu \in B} P_{\mu\nu} S_{\mu\nu}$ – condensed to atoms
$q_A = Z_A - \sum_B M_{AB}$ – atomic charge

Dipole Moment
• for Hartree-Fock wavefunctions, the dipole is the expectation value of the classical expression for the dipole
• can be written in terms of the density matrix and a set of dipole integrals over the basis functions
$\mu = \int \Psi^* \left( \sum_i -e r_i \right) \Psi \, d\tau + \sum_A e Z_A R_A$
$\quad = 2 \sum_i^{occ} \int \phi_i^* (-e r) \phi_i \, d\tau + \sum_A e Z_A R_A$
$\quad = \sum_{\mu\nu} P_{\mu\nu} \int \chi_\mu^* (-e r) \chi_\nu \, d\tau + \sum_A e Z_A R_A$

Electron Density
$\rho(r) = \sum_{\mu\nu} P_{\mu\nu} \chi_\mu(r) \chi_\nu(r)$

Electrostatic Potential
•
energy of a unit test charge placed at $r_C$
$ESP(r_C) = \int \Psi^* \left( \sum_i -e / r_{iC} \right) \Psi \, d\tau + \sum_A e Z_A / R_{AC}$
$\quad = \sum_{\mu\nu} P_{\mu\nu} \int \chi_\mu^* (-e / r_C) \chi_\nu \, d\tau + \sum_A e Z_A / R_{AC}$

[Figure: Features of Potential Energy Surfaces]

[Flowchart: geometry optimization]
1. Initial guess for geometry & Hessian
2. Calculate energy and gradient
3. Minimize along line between current and previous point
4. Update Hessian (Powell, DFP, MS, BFGS, Berny, etc.)
5. Take a step using the Hessian (Newton, RFO, Eigenvector following)
6. Check for convergence on the gradient and displacement
– yes: DONE
– no: update the geometry and return to step 2

Testing Minima
• Compute the full Hessian (the partial Hessian from an optimization is not accurate enough and contains no information about lower symmetries).
• Check the number of negative eigenvalues:
– 0 required for a minimum.
– 1 (and only 1) for a transition state
• For a minimum, if there are any negative eigenvalues, follow the associated eigenvector to a lower energy structure.
• For a transition state, if there are no negative eigenvalues, follow the lowest eigenvector uphill.

Algorithms for Finding Transition States
• Surface fitting
• Linear and quadratic synchronous transit
• Coordinate driving
• Hill climbing, walking up valleys, eigenvector following
• Gradient norm method
• Quasi-Newton methods
• Newton methods

Gradient Based Transition Structure Optimization Algorithms
• Quadratic Model
– fixed transition vector
– constrained transition vector
– associated surface
– fully variable transition vector
• Non Quadratic Models: GDIIS
• Eigenvector following/RFO for stepsize control
• Bofill update of Hessian, rather than BFGS
• Test Hessian for correct number of negative eigenvalues

Testing Transition Structures
• Compute the full Hessian (the partial Hessian from an optimization is not accurate enough and contains no information about lower symmetries).
• Check the number of negative eigenvalues:
– 1 and only 1 for a transition state.
• Check the nature of the transition vector (it may be necessary to follow the reaction path to be sure that the transition state connects the correct reactants and products).
• If there are too many negative eigenvalues, follow the appropriate eigenvector to a lower energy structure.

Reaction Paths
Taylor expansion of reaction path
$x(s) = x(0) + s \, \nu^0(0) + \frac{1}{2} s^2 \nu^1(0) + \frac{1}{6} s^3 \nu^2(0) + \dots$
Tangent
$\nu^0(s) = \frac{d x(s)}{ds} = -\frac{g}{|g|}$
Curvature
$\nu^1(s) = \frac{d \nu^0(s)}{ds} = \frac{d^2 x(s)}{ds^2}$
$\nu^1 = -( H \nu^0 - (\nu^{0t} H \nu^0) \nu^0 ) / |g|$

Harmonic Vibrational Frequencies for a Polyatomic Molecule
$\hat{H}_{nuc} = -\frac{\hbar^2}{2} \sum_i \frac{\partial^2}{\partial q_i^2} + \frac{1}{2} \sum_i \omega_i^2 q_i^2$
$(L^t \tilde{k} L)_{i,j} = (L^t M k M L)_{i,j} = \omega_i^2 \delta_{i,j}$
$q = L^t \tilde{x} = L^t M^{-1} x, \quad M_{i,j} = \delta_{i,j} / \sqrt{m_i}$
$\omega_i^2$ – eigenvalues of the mass weighted Cartesian force constant matrix
$q_i$ – normal modes of vibration

Calculating Vibrational Frequencies
• optimize the geometry of the molecule
• calculate the second derivatives of the Hartree-Fock energy with respect to the x, y and z coordinates of each nucleus
• mass-weight the second derivative matrix and diagonalize
• 3 modes with zero frequency correspond to translation
• 3 modes with zero frequency correspond to overall rotation (if the forces are not zero, the normal modes for rotation may have non-zero frequencies; hence it may be necessary to project out the rotational components)

Pople, J. A.; Schlegel, H. B.; Krishnan, R.; DeFrees, D. J.; Binkley, J. S.; Frisch, M. J.; Whiteside, R. A.; Hout, R. F.; Hehre, W. J. Molecular orbital studies of vibrational frequencies. Int. J. Quantum Chem., Quantum Chem. Symp., 1981, 15, 269-278.
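The mass-weight-and-diagonalize recipe above can be sketched for the smallest possible case. This is a minimal illustration, assuming a one-dimensional diatomic (two atoms on a line joined by a harmonic spring); the force constant of 1902 N/m for CO and the function names are illustrative assumptions, not values from the course. In one dimension there are two degrees of freedom, so one mode is the zero-frequency translation and the other is the stretch.

```python
import math

AMU_TO_KG = 1.66053906660e-27   # atomic mass unit in kg
C_CM_PER_S = 2.99792458e10      # speed of light in cm/s

def mass_weighted_hessian(k, m1, m2):
    """2x2 mass-weighted Hessian for two atoms on a line joined by a spring k."""
    # The Cartesian Hessian is [[k, -k], [-k, k]]; element ij is divided by sqrt(m_i m_j).
    off = -k / math.sqrt(m1 * m2)
    return [[k / m1, off], [off, k / m2]]

def eigenvalues_sym_2x2(h):
    """Eigenvalues of a symmetric 2x2 matrix in ascending order (closed form)."""
    a, b, d = h[0][0], h[0][1], h[1][1]
    mean = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), b)
    return [mean - radius, mean + radius]

def wavenumber(eigenvalue):
    """Convert an eigenvalue (s^-2) to cm^-1; a negative eigenvalue gives a negative number."""
    omega = math.copysign(math.sqrt(abs(eigenvalue)), eigenvalue)
    return omega / (2.0 * math.pi * C_CM_PER_S)

k_co = 1902.0                    # N/m, illustrative force constant (assumed)
m_c = 12.000 * AMU_TO_KG
m_o = 15.995 * AMU_TO_KG

hessian = mass_weighted_hessian(k_co, m_c, m_o)
freqs = [wavenumber(lam) for lam in eigenvalues_sym_2x2(hessian)]
print("translation: %.4f cm^-1, stretch: %.1f cm^-1" % (freqs[0], freqs[1]))
```

The nonzero eigenvalue equals k divided by the reduced mass, so the stretch agrees with the textbook diatomic result. For a real polyatomic molecule the same steps are applied to the full 3N x 3N Cartesian Hessian with a general symmetric eigensolver.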
Scaling of Vibrational Frequencies
• calculated harmonic frequencies are typically 10% higher than experimentally observed vibrational frequencies
• due to the harmonic approximation, and due to the Hartree-Fock approximation
• recommended scale factors for frequencies
– HF/3-21G 0.9085
– HF/6-31G(d) 0.8929
– MP2/6-31G(d) 0.9434
– B3LYP/6-31G(d) 0.9613
• recommended scale factors for zero point energies
– HF/3-21G 0.9409
– HF/6-31G(d) 0.9135
– MP2/6-31G(d) 0.9676
– B3LYP/6-31G(d) 0.9804

Electron Correlation Energy
• in the Hartree-Fock approximation, each electron sees the average density of all of the other electrons
• two electrons cannot be in the same place at the same time
• electrons must move to avoid each other, i.e. their motion must be correlated
• for a given basis set, the difference between the exact energy and the Hartree-Fock energy is the correlation energy
• ca. 20 kcal/mol correlation energy per electron pair

Goals for Correlated Methods
• well defined
– applicable to all molecules with no ad-hoc choices
– can be used to construct model chemistries
• efficient
– not restricted to very small systems
• variational
– upper limit to the exact energy
• size extensive
– E(A+B) = E(A) + E(B)
– needed for proper description of thermochemistry
• hierarchy of cost vs. accuracy
– so that calculations can be systematically improved

Configuration Interaction
• determine CI coefficients using the variational principle
$\Psi = \Psi_0 + \sum_{ia} t_i^a \Psi_i^a + \sum_{ijab} t_{ij}^{ab} \Psi_{ij}^{ab} + \sum_{ijkabc} t_{ijk}^{abc} \Psi_{ijk}^{abc} + \dots$
minimize $E = \int \Psi^* \hat{H} \Psi \, d\tau / \int \Psi^* \Psi \, d\tau$ with respect to $t$
• CIS – include all single excitations
– useful for excited states, but not for correlation of the ground state
• CISD – include all single and double excitations
– most useful for correlating the ground state
– ca. $O^2 V^2$ determinants (O = number of occ. orb., V = number of unocc. orb.)
• CISDT – singles, doubles and triples
– limited to small molecules, ca. $O^3 V^3$ determinants
• Full CI – all possible excitations
– $((O+V)! / O! \, V!)^2$ determinants
– exact for a given basis set
– limited to ca. 14 electrons in 14 orbitals

Møller-Plesset Perturbation Theory
• choose $\hat{H}_0$ such that its eigenfunctions are determinants of molecular orbitals
$\hat{H}_0 = \sum_i \hat{F}_i$
• expand perturbed wavefunctions in terms of the Hartree-Fock determinant and singly, doubly and higher excited determinants
$\Psi_1 = \sum_{ia} a_i^a \Psi_i^a + \sum_{ijab} a_{ij}^{ab} \Psi_{ij}^{ab} + \sum_{ijkabc} a_{ijk}^{abc} \Psi_{ijk}^{abc} + \dots$
• perturbational corrections to the energy
$E_{HF} = E_0 + E_1 = \int \Psi_0 \hat{H}_0 \Psi_0 \, d\tau + \int \Psi_0 \hat{V} \Psi_0 \, d\tau$
$E_2 = \int \Psi_0 \hat{V} \Psi_1 \, d\tau$
$E_{MP2} = E_{HF} + E_2 = E_{HF} - \sum_{i<j, a<b} \frac{[\int \Psi_0 \hat{V} \Psi_{ij}^{ab} \, d\tau]^2}{\epsilon_a + \epsilon_b - \epsilon_i - \epsilon_j}$

Coupled Cluster Theory
• CISD can be written as
$\Psi_{CISD} = (1 + \hat{T}_1 + \hat{T}_2) \Psi_0$
• $\hat{T}_1$ and $\hat{T}_2$ generate all possible single and double excitations with the appropriate coefficients
$\hat{T}_2 \Psi_0 = \sum_{i<j, a<b} t_{ij}^{ab} \Psi_{ij}^{ab}$
• coupled cluster theory wavefunction
$\Psi_{CCSD} = \exp(\hat{T}_1 + \hat{T}_2) \Psi_0$

Theoretical Basis for Density Functional Theory
• Hohenberg and Kohn (1964)
– energy is a functional of the density, $E[\rho]$
– the functional is universal, independent of the system
– the exact density minimizes $E[\rho]$
– applies only to the ground state
• Kohn and Sham (1965)
– variational equations for a local functional
$E[\rho] = T[\rho] + V_{NE}[\rho] + J[\rho] + E_{xc}[\rho] + V_{nuc}$
– density can be written as a single determinant of orbitals (but orbitals are not the same as Hartree-Fock)
– $E_{xc}$ takes care of electron correlation as well as exchange

Density Functional Theory
• local functionals (LSDA)
– depend only on the density
– exchange and correlation functional from electron gas
• generalized gradient approximation (GGA)
– depends on $|\nabla\rho| / \rho^{4/3}$
– BLYP, BP86, BPW91, PBE
• hybrid functionals
– mix some Hartree-Fock exchange
– B3LYP, PBE1PBE, B3PW91

Semi-empirical MO Methods
• the high cost of ab initio MO calculations is largely due to the many integrals that need to be calculated (esp.
two electron integrals)
• semi-empirical MO methods start with the general form of ab initio Hartree-Fock calculations, but make numerous approximations for the various integrals
• many of the integrals are approximated by functions with empirical parameters
• these parameters are adjusted to improve the agreement with experiment

Zero Differential Overlap (ZDO)
• two electron repulsion integrals are one of the most expensive parts of ab initio MO calculations
$(\mu\nu | \lambda\sigma) = \int\!\!\int \chi_\mu(1) \chi_\nu(1) \frac{1}{r_{12}} \chi_\lambda(2) \chi_\sigma(2) \, d\tau_1 d\tau_2$
• neglect integrals if orbitals are not the same
$(\mu\nu | \lambda\sigma) = \delta_{\mu\nu} \delta_{\lambda\sigma} (\mu\mu | \lambda\lambda)$, where $\delta_{\mu\nu} = 1$ if $\mu = \nu$, $0$ if $\mu \ne \nu$
• approximate integrals by using s orbitals only
• CNDO, INDO and MINDO semi-empirical methods

Neglect of Diatomic Differential Overlap (NDDO)
• fewer integrals neglected
$(\mu\nu | \lambda\sigma) = \int\!\!\int \chi_\mu(1) \chi_\nu(1) \frac{1}{r_{12}} \chi_\lambda(2) \chi_\sigma(2) \, d\tau_1 d\tau_2$
• neglect integrals if $\mu$ and $\nu$ are not on the same atom or $\lambda$ and $\sigma$ are not on the same atom
• integral approximations are more accurate and have more adjustable parameters than in ZDO methods
• parameters are adjusted to fit experimental data and ab initio calculations
• MNDO, AM1 and PM3 semi-empirical methods

Model Chemistries
• A theoretical model chemistry is a complete algorithm for the calculation of the energy of any molecular system.
• It cannot involve subjective decisions in its application.
• It must be size consistent so that the energy of every molecular species is uniquely defined.
• A simple model chemistry employs a single theoretical method and basis set.
• A compound model chemistry combines several theoretical methods and basis sets to achieve higher accuracy at lower cost.
• A model chemistry is useful if for some class of molecules it is the most accurate calculation we can afford to do.

Model Chart
Method axis: HF, MP2, MP3, MP4, QCISD(T), ..., Full CI
Basis axis:
Minimal – STO-3G
Split-Valence – 3-21G
Polarized – 6-31G*, 6-311G*
Diffuse – 6-311+G*
High Ang. Mom.
6-311+G(3df,p)
…
Limit of both axes (Full CI in a complete basis): Schrödinger Equation

Development of a Model Chemistry
• Set targets
– accuracy goals
– cost/size goals
– validation data set
• Define and implement methods
– Specify level of theory for geometry optimization, electronic energy, vibrational zero point energy
• Test model on validation data set

Compound Model Chemistries: G2 and G2(MP2)
• Proposed by J. Pople and co-workers
• Goal: Atomization energies to 2 kcal/mol
• Strategy: Approximate QCISD(T)/6-311+G(3df,p) by assuming that basis set and correlation corrections are additive
• Mean absolute error of 1.21 kcal/mol in 125 comparisons

CBS Extrapolation
The slow, $N^{-1}$, convergence of the correlation energy vs the one-electron basis set expansion is the result of the universal cusp in wave functions as interelectronic distances $r_{ij} \to 0$. Thus, we can reasonably expect the $N^{-1}$ form to also be universal.

[Chart: methods SCF, MP2, MP4(SDQ), MP4(SDTQ), QCISD(T), FCI versus basis sets 6-31G, 6-31G†, 6-31+G†, 6-31+G††, 6-311G(d,p), 6-31+G(d(f),d,p), 6-311+G(d,p), 6-311G(2df,p), 6-311+G(2df,p), 6-311+G(3df,2p), 6-311+G(3d2f,2df,p), 6-311++G(3d2f,2df,2p), [6s6p3d2f,4s2p1d], CBS; the corner of the chart is labeled Exact]

[Figure: RMS error on the G2 test set (kcal/mol) versus maximum number of heavy atoms, for CBS-4, CBS-q, G2(MP2), G2, CBS-QB3, CBS-Q and CBS-QCI/APNO]

Thermodynamic Functions
• U(T) – internal energy at absolute temperature T
• H(T) = U(T) + PV = U(T) + RT – enthalpy
• S(T) – entropy
• G(T) = H(T) – T S(T) – free energy

Thermodynamic Functions
• at absolute zero, T = 0
U(0) = H(0) = G(0)
U(0) = electronic energy + zero point energy
S(0) = 0 for a pure crystalline substance (third law of thermodynamics)

Thermodynamic Functions at T ≠ 0
• $U(T) = U(0) + \int C_v \, dT$ – heat at constant volume, molecule gains energy for translation (3/2 RT), rotation (3/2 RT) and vibration ($N_A \sum_i h\nu_i / (\exp(h\nu_i / kT) - 1)$)
• $H(T) = H(0) + \int C_p \, dT$ – heat at constant pressure, molecule gains additional energy from expansion
• S(T) > 0 – more states become accessible as the temperature increases
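The thermal corrections listed above can be evaluated directly from a set of harmonic frequencies. The sketch below is a minimal illustration, assuming an ideal gas of nonlinear molecules in the rigid-rotor, harmonic-oscillator picture; the function names are hypothetical, and the three wavenumbers are approximate experimental fundamentals of water, used only as illustrative input. It returns the zero point energy, U(T) - U(0) and H(T) - H(0) in kcal/mol.

```python
import math

H = 6.62607015e-34        # Planck constant, J s
KB = 1.380649e-23         # Boltzmann constant, J/K
NA = 6.02214076e23        # Avogadro constant, 1/mol
C_CM = 2.99792458e10      # speed of light, cm/s
J_PER_KCAL = 4184.0
R = KB * NA / J_PER_KCAL  # gas constant, kcal/(mol K)

def zero_point_energy(wavenumbers):
    """Harmonic zero point energy, the sum of h*nu/2 over modes, in kcal/mol."""
    return sum(0.5 * H * C_CM * w for w in wavenumbers) * NA / J_PER_KCAL

def vibrational_thermal_energy(wavenumbers, temperature):
    """Vibrational energy gained above the zero point level, in kcal/mol."""
    total = 0.0
    for w in wavenumbers:
        quantum = H * C_CM * w                       # h*nu in J
        total += quantum / math.expm1(quantum / (KB * temperature))
    return total * NA / J_PER_KCAL

def thermal_corrections(wavenumbers, temperature):
    """Return (U(T)-U(0), H(T)-H(0)) for a nonlinear ideal-gas molecule."""
    translation = 1.5 * R * temperature
    rotation = 1.5 * R * temperature
    vibration = vibrational_thermal_energy(wavenumbers, temperature)
    du = translation + rotation + vibration
    return du, du + R * temperature                  # H = U + RT for an ideal gas

water = [1595.0, 3657.0, 3756.0]   # cm^-1, approximate values (assumed input)
zpe = zero_point_energy(water)
du, dh = thermal_corrections(water, 298.15)
print(f"ZPE = {zpe:.2f}, U(T)-U(0) = {du:.2f}, H(T)-H(0) = {dh:.2f} kcal/mol")
```

For stiff modes like these, the vibrational contribution at room temperature is small compared with the translational and rotational terms; low-frequency modes would contribute much more.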