Supplementary Information for
Pseudobond parameters for QM/MM studies involving nucleosides, nucleotides and their analogs

Robin Chaudret, Jerry M. Parks and Weitao Yang

Contents
a. General dihedral behavior
b. Dihedral angles in additional molecules
c. Bond Dissociation Error table
d. Full Gaussian 09 reference

a. General dihedral behavior

In this section we discuss the differences in dihedral angles between molecules containing pseudobonds (PsMols) and the corresponding standard reference molecules (StdMols). In general, the pseudobond method performs well for geometric quantities, including bonds, angles and dihedrals. However, significant deviations are found for some of the dihedrals, evident as off-diagonal points in Figure S1. Here, we review the reasons for this behavior.

The RMSE for the Tot pseudobond (parameters optimized against the total training set) is 9.3° (maximum errors are -31.5° and 35.1°). One point near (-180°, 180°) represents only a small modification of a dihedral with values close to 180°: because dihedral angles are periodic, -180° and 180° describe the same geometry. The largest errors often come from the rotation of highly flexible groups such as the 2' or 3' hydroxyls or the phosphates. In addition, the absence of the MM subsystem, or the rotation of some of its substituent atoms, induces a relaxation of the ribose sugar ring because the potential energy surface is very flat. However, such flexibility of the phosphate group is not expected in a more realistic QM/MM system, where it would be bonded either to another nucleic acid or to a diphosphate group, for example in ATP.

[Figure S1: scatter plot; horizontal axis: Standard, vertical axis: Pseudobond; both axes span -200 to 200.]

Figure S1: Correlation of dihedrals in molecules containing pseudobonds (PsMol) and standard molecules (StdMol) for the Total test set.

b. Dihedral angles in additional molecules

The dihedral RMSE values were computed individually for each PsMol and were found to be quite low (1.9-3.3°) for clofarabine, emtricitabine, tenofovir, and triglycine polypeptide (Table S1). However, the dihedral RMSE is large for acyclovir (14.1°). The large deviations in the dihedrals have the same origin as the larger deviations in the angles (see main text). For the acyclovir StdMol, a hydrogen bond is present between the hydroxyethoxymethyl tail and a guanosyl nitrogen atom, which forms an effective nine-membered ring and therefore restrains the conformation slightly (Figure S2-a). However, the PsMol lacks the nucleobase completely, so geometry optimization of the PsMol alters the conformation of the hydroxyethoxymethyl tail of acyclovir significantly.

Full QM/MM optimization of acyclovir, which includes the guanosyl group explicitly in the MM subsystem, leads to a structure that more closely matches the StdMol (Figure S2-b). Although the RMSE of the QM/MM minimization is no better (18.1°, versus 14.1° for the PsMol), most of the error comes from the rotation of the hydroxyl hydrogen; removing it from the calculation decreases the RMSE to 4.8° for the QM/MM calculation but only to 13.2° for the PsMol. This shows that the heavy-atom positions are much better conserved during QM/MM minimization of the whole molecule than during QM minimization of the PsMol.

Table S1: RMS errors for dihedral angles in acyclovir, clofarabine, emtricitabine, ribavirin and tenofovir.

Molecule        Dihedral RMSE (°)
acyclovir       14.1
clofarabine      3.3
emtricitabine    3.2
tenofovir        3.1
tetraglycine     1.9

Figure S2: Representation of (a) the optimized (StdMol) structure of acyclovir and (b) a comparison of the traces of the StdMol (gold), PsMol (blue) and QM/MM-optimized molecule (purple). The PsMol does not contain any MM atoms, so the guanosyl base is absent. In (a), the intramolecular hydrogen bond between the guanosyl group and the terminal hydroxyethoxymethyl group is shown in red.

c. Bond Dissociation Error table

Table S2: Bond Dissociation (BD) energies and errors in kcal/mol, and BD errors as a percentage of the BD energies, for the total training set molecules.

Molecule   BD Energy (kcal/mol)   BD Error (kcal/mol)   BD Error (% BD Energy)
ADE        346.4                   9.7                  2.8
THY        346.4                   8.8                  2.5
GUA        346.4                  16.6                  4.8
CYT        346.5                   6.8                  2.0
URA        346.4                   9.6                  2.8
rADE       421.7                   9.6                  2.3
rTHY       421.7                   1.7                  0.4
rGUA       421.6                  16.6                  3.9
rCYT       421.7                   2.5                  0.6
rURA       421.7                   2.7                  0.6

d. Full Gaussian 09 reference

Gaussian 09, R. A., Frisch, M. J.; Trucks, G. W.; Schlegel, H. B.; Scuseria, G. E.; Robb, M. A.; Cheeseman, J. R.; Scalmani, G.; Barone, V.; Mennucci, B.; Petersson, G. A.; Nakatsuji, H.; Caricato, M.; Li, X.; Hratchian, H. P.; Izmaylov, A. F.; Bloino, J.; Zheng, G.; Sonnenberg, J. L.; Hada, M.; Ehara, M.; Toyota, K.; Fukuda, R.; Hasegawa, J.; Ishida, M.; Nakajima, T.; Honda, Y.; Kitao, O.; Nakai, H.; Vreven, T.; Montgomery, Jr., J. A.; Peralta, J. E.; Ogliaro, F.; Bearpark, M.; Heyd, J. J.; Brothers, E.; Kudin, K. N.; Staroverov, V. N.; Kobayashi, R.; Normand, J.; Raghavachari, K.; Rendell, A.; Burant, J. C.; Iyengar, S. S.; Tomasi, J.; Cossi, M.; Rega, N.; Millam, J. M.; Klene, M.; Knox, J. E.; Cross, J. B.; Bakken, V.; Adamo, C.; Jaramillo, J.; Gomperts, R.; Stratmann, R. E.; Yazyev, O.; Austin, A. J.; Cammi, R.; Pomelli, C.; Ochterski, J. W.; Martin, R. L.; Morokuma, K.; Zakrzewski, V. G.; Voth, G. A.; Salvador, P.; Dannenberg, J. J.; Dapprich, S.; Daniels, A. D.; Farkas, Ö.; Foresman, J. B.; Ortiz, J. V.; Cioslowski, J.; Fox, D. J. Gaussian, Inc., Wallingford CT, 2009.
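
Note on comparing dihedral angles. The dihedral comparison discussed in section a can be illustrated with a minimal Python sketch. It assumes that differences between PsMol and StdMol dihedrals are wrapped into the interval [-180°, 180°) before the RMSE is taken, so that a pair such as (179°, -179°) counts as a 2° deviation rather than 358°. Whether the RMSE values reported here were computed with wrapped differences is not stated, and the function names and example angles below are hypothetical illustrations, not data or code from this work.

```python
import math


def wrapped_difference(ps_deg, std_deg):
    """Signed difference (ps - std) in degrees, mapped into [-180, 180).

    Assumption: deviations are measured along the shortest arc, because
    dihedral angles are periodic with period 360 degrees.
    """
    return (ps_deg - std_deg + 180.0) % 360.0 - 180.0


def dihedral_rmse(ps_dihedrals, std_dihedrals):
    """Root-mean-square of the wrapped differences between two angle lists."""
    if len(ps_dihedrals) != len(std_dihedrals):
        raise ValueError("both lists must contain the same number of dihedrals")
    if not ps_dihedrals:
        raise ValueError("at least one dihedral is required")
    squared = [
        wrapped_difference(ps, std) ** 2
        for ps, std in zip(ps_dihedrals, std_dihedrals)
    ]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    # Illustrative angles only (hypothetical, not taken from this work).
    ps_example = [179.0, 60.0, -65.0]
    std_example = [-179.0, 58.0, -60.0]
    print(wrapped_difference(179.0, -179.0))
    print(dihedral_rmse(ps_example, std_example))
```

With this convention, a point plotted near (-180°, 180°) in a correlation plot such as Figure S1 contributes only a small deviation to the RMSE, even though it lies far from the diagonal.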