Supplementary Materials

Properties of the Nucleic-acid Bases in Free and Watson-Crick Hydrogen-bonded States: Computational Insights into the Sequence-dependent Features of Double-helical DNA

A. R. Srinivasan,1 Ronald R. Sauers,1 Marcia O. Fenley,3,4 Alexander H. Boschitsch,5 Atsushi Matsumoto,1,6,7 Andrew V. Colasanti,1,‡ and Wilma K. Olson1,2,*

1 Department of Chemistry & Chemical Biology
2 BioMaPS Institute for Quantitative Biology
Rutgers, the State University of New Jersey, Wright-Rieman Laboratories, 610 Taylor Road, Piscataway, NJ 08854-8087, USA

3 Department of Physics
4 Institute of Molecular Biophysics
Florida State University, Tallahassee, FL 32306-4380, USA

5 Continuum Dynamics, Inc., 34 Lexington Avenue, Ewing, NJ 08618-2302, USA

6 Quantum Bioinformatics Team, Center for Computational Science and Engineering
7 Research Unit for Quantum Beam Life Science Initiative, Quantum Beam Science Directorate
Japan Atomic Energy Agency, 8-1 Umemidai, Kizugawa, Kyoto 619-0215, Japan

‡ Current address: Provid Pharmaceuticals Inc., 671 U.S. Route 1, North Brunswick, NJ 08902.
* To whom correspondence should be addressed: Tel: 732-445-3993; Fax: 732-445-5958; Email: wilma.olson@rutgers.edu.

Highlighted References

Frisch MJ, Trucks GW, Schlegel HB et al. (2003) Gaussian 03. Gaussian, Inc., Pittsburgh, PA
Frisch MJ, Trucks GW, Schlegel HB et al. (2001) Gaussian 98. Gaussian, Inc., Pittsburgh, PA
These suites of ab initio quantum chemistry programs successfully predict numerous properties of molecules and reactions, in the gas phase and in solution, including the energies, structures, atomic charges, electrostatic potentials, and normal modes of the nucleic acid bases discussed herein.

Boschitsch AH, Fenley MO, Zhou H-X (2002) Fast boundary element method for the linear Poisson-Boltzmann equation. J Phys Chem B 106:2741-2754
Boschitsch AH, Fenley MO (2004) Hybrid boundary element and finite difference method for solving the nonlinear Poisson-Boltzmann equation. J Comp Chem 25:935-955
The novel algorithms developed in these papers produce useful, highly detailed electrostatic potential surfaces of nucleic acids and other biomolecules in a simulated aqueous salt environment.

Lu X-J, Olson WK (2003) 3DNA: a software package for the analysis, rebuilding, and visualization of three-dimensional nucleic acid structures. Nucleic Acids Res 31:5108-5121
Lu X-J, Olson WK (2008) 3DNA: a versatile, integrated software system for the analysis, rebuilding, and visualization of three-dimensional nucleic-acid structures. Nature Protocols 3:1213-1227
This versatile, integrated software system facilitates the analysis, reconstruction, and visualization of three-dimensional nucleic-acid-containing structures.

Berman HM, Olson WK, Beveridge DL et al. (1992) The Nucleic Acid Database: a comprehensive relational database of three-dimensional structures of nucleic acids. Biophys J 63:751-759
This relational database assembles and distributes information about the high-resolution structures of nucleic acids.

List of Supporting Tables

S1. Residual atomic charges, in esu, of nucleic-acid bases in free and Watson-Crick paired forms.
S2. Database identities, refinement information, sequences, base-pair contents, and literature citations of high-resolution B-DNA structures surveyed in this study.
S3. Comparative features of exocyclic amino groups in energy-optimized vs. observed DNA bases
S4. Average surface electrostatic potential, in kcal mole–1e–1, of selected surface atoms on isolated DNA bases and Watson-Crick base pairs
S5. Mean step parameters and deformational properties of AA·TT and GG·CC dimers constructed from optimized base pairs and subjected to configurational sampling

Table S1.
Residual atomic charges, in esu, of nucleic-acid bases in free and Watson-Crick paired forms.†

Purines

Adenine
Atom     Free      Paired
N1       –0.8316   –0.7495
C2       0.6585    0.5662
H2       0.0347    0.0716
N3       –0.7795   –0.7470
C4       0.5241    0.4933
C5       –0.0703   –0.0227
C6       0.7657    0.7216
N6       –0.8117   –0.8876
H61      0.3643    0.4339
H62      0.3501    0.3903
N7       –0.5818   –0.6003
C8       0.2194    0.2267
H8       0.1267    0.1218
N9       –0.1424   –0.1498
C1´      0.0084    0.0399
H1C1´    0.0583    0.0658
H2C1´    0.0366    0.0279
H3C1´    0.0706    0.0451

Guanine
Atom     Free      Paired
N1       –0.8967   –0.9123
H1       0.4473    0.5174
C2       0.9656    0.9845
N2       –0.8687   –0.9123
H21      0.3695    0.4186
H22      0.3551    0.3745
N3       –0.7611   –0.7794
C4       0.4105    0.3743
C5       –0.0760   –0.0357
C6       0.8724    0.8515
O6       –0.6592   –0.7406
N7       –0.5253   –0.5416
C8       0.1367    0.1130
H8       0.1394    0.1367
N9       –0.0654   –0.0223
C1´      –0.0466   –0.0553
H1C1´    0.0700    0.0494
H2C1´    0.0543    0.0542
H3C1´    0.0781    0.0878

Pyrimidines

Thymine
Atom     Free      Paired
N1       –0.1129   –0.1612
C2       0.8476    0.8381
O2       –0.6559   –0.6413
N3       –0.7918   –0.7790
H3       0.4063    0.3995
C4       0.8665    0.8905
O4       –0.6470   –0.6869
C5       –0.1808   –0.2249
C5M      –0.1162   –0.1337
H51      0.0635    0.0640
H52      0.0424    0.0469
H53      0.0635    0.0680
C6       –0.1144   –0.0636
H6       0.2081    0.1963
C1´      –0.1962   –0.0789
H1C1´    0.0907    0.0670
H2C1´    0.1358    0.0742
H3C1´    0.0907    0.0777

Cytosine
Atom     Free      Paired
N1       –0.2442   –0.2366
C2       1.0039    0.9694
O2       –0.6951   –0.7103
N3       –0.8906   –0.9998
C4       0.9500    1.1582
N4       –0.8817   –1.1693
H41      0.3820    0.5703
H42      0.3539    0.4541
C5       –0.6436   –0.6905
H5       0.2111    0.2122
C6       0.1530    0.1840
H6       0.1550    0.1524
C1´      –0.0795   –0.1091
H1C1´    0.0608    0.0834
H2C1´    0.0896    0.0729
H3C1´    0.0754    0.0962

† Structures of free and paired bases obtained from calculations based on second-order Møller-Plesset perturbation theory within the Gaussian 98 and Gaussian 03 suites of programs [1, 2] starting with standard nucleic-acid base [3] and base-pair [4] models.
Partial atomic charges computed at the MP2/6-311+G**//MP2/6-31G* level of model chemistry and fitted to the electrostatic potential obtained through the CHelpG (CHarges from electrostatic potentials using a Grid-based method) scheme [5] as incorporated in Gaussian.

Table S2. Database identities, refinement information, sequences, base-pair contents, and literature citations of high-resolution B-DNA structures surveyed in this study.

NDB_ID† Resolution (Å) R-value (%) DNA sequence A·T G·C Reference
BD0001 1.6 17.3 5´-d(ApCpCpGpApCpGpTpCpGpGpT)-3´ 2 6 1
BD0005 1.75 21.8 5´-d(CpGpCpGpApApTpTpCpGpCpG)-3´ 4 4 2
BD0006 1.15 17.22 5´-d(GpGpCpCpApApTpTpGpG)-3´ 4 4 4 3
BD0007 1.1 16.2 5´-d(CpGpCpGpApApTAFpTpCpGpCpG)-3´
BD0009 1.6 19.6 5´-d(CpGpCpGpAOCH3pApTpCpCpGpCpG)-3´ 2 5 2 6
BD0010 2 23.2 5´-d(CpGpCpGpAOCH3pApTpTpCpGpCpG)-3´
BD0012 1.2 17.9 5´-d(CpGpCpGpApApTAFpTpCpGpCpG)-3´ 4 7 4 7 2 8 6 9
BD0013 1.5 18.6 5´-d(CpGpCpGpApApTAFpTpCpGpCpG)-3´
BD0014 1.45 21.72 5´-d(CpGpCpGpApApTpTpCpGpCpG)-3´ 4
5´-d(CpBrUpCpCpBrUpCpCpGpCpGpCpG)-3´
BD0017 1.8 21 5´-d(CpGpCpGpCpGpGpApG)-3´
BD0018 1.3 18.2 5´-d(GpCpGpApApTpTpCpGpCpG)-3´ 4 2 10
BD0019 1.7 19.4 5´-d(GpGpCpGpApApTpTpCpGpCpG)-3´ 4 2 10 4
BD0021 1.55 20.31 5´-d(CpCpApGpGXL1pCpCpTpGpG)-3´
BD0023 0.74 10.5 5´-d(CpCpApGpTpApCpTpGpG)-3´ 1 1 12
BD0029 1.82 20.9 5´-d(CpGpCpGpApApTpTpCpGpCpG)-3´ 4 4 13
BD0030 0.95 16 5´-d(CpGpCpGpApApTAFpTpCpGpCpG)-3´ 4 14 2 6 1 15 11
BD0031 1.6 21.8 5´-d(CpGpCpGpAOCH3pApTpTpCpGpCpG)-3´
BD0032 1.8 22.4 5´-d(CpGpCpGpApApTpTpCpGpCpG)-3´ 1
BD0033 0.98 14.09 5´-d(CpCpApApCpGpTpTpGpG)-3´ 2 16
BD0034 0.98 11.83 5´-d(CpCpApApCpGpTpTpGpG)-3´ 2 16
BD0035 0.98 14.05 5´-d(CpCpApGpCpGpCpTpGpG)-3´ 1 1 16
BD0036 0.98 12.14 5´-d(CpCpApGpCpGpCpTpGpG)-3´ 1 1 16
BD0037 0.89 13.5 5´-d(GpCpGpApApTpTpCpG)-3´ 4
BD0038 1.43 19.8 5´-d(CpGpCpGpApApTLCpCTLpCpGpCpG)-3´
BD0041 1.2 14.1 5´-d(CpGpCpGpApApTpTpCpGpCpG)-3´
BD0043 1.57 21 17 2 18 4 19 5´-d(CpGpCpGpApApTpUFORpCpGpCpG)-3´ 2 20 2 20 4
BD0044 1.55 20.7 5´-d(CpGpCpGpApApTpUFORpCpGpCpG)-3´
BD0045 1.85 24 5´-d(CpGpCpGpApApTpUFORpCpGpCpG)-3´ 2 20 2 20
BD0046 1.8 20.8 5´-d(CpGpCpGpApApTpUFORpCpGpCpG)-3´
BD0048 1.6 22.4 5´-d(CpGpCpGpApApTpTpCOCH3pGpCpGP)-3´
BD0050 2 24 5´-d(GpGpCpGpCpC)-3´
BD0051 1.6 18.4 5´-d(CpCpTpTpTpApApApGpG)-3´ 6 23 2 24 2 21 2 22
BD0053 1.6 21.8 5´-d(CpGpCpApApApTpTpCOCH3pGpCpG)-3´
BD0054 1.2 16.3 5´-d(CpGpCpGpApApTpTpCpGpCpG)-3´ 4
BD0055 2 13.4 5´-d(CpCpApApIpApTpTpGpG)-3´ 1
BD0060 1.2 16 5´-d(CpGpCpGpApApTTFpTpCpGpCpG)-3´
BD0066 1.6 21.6 5´-d(CpGpCpApApTpTpGpCpG)-3´ 1
BD0067 1.53 19.9 5´-d(CpGpCpApApApTpTpTpGpCpG)-3´ 6 2 29
BD0070 1.6 20.7 5´-d(CpGpCpTpGpGpApApApTpTpTpCpCpApGpC)-3´ 3 2 30
BD0071 0.89 12.63 5´-d(CpTpTpTpTpApApApApG)-3´ 6 31 2 32 33 4 25 26 4 27 28
BD0072 1.65 22.1 5´-d(CpGpCpGpApApTpTpC5FpGpCpG)-3´
BD0073 2 23.7 5´-d(CpCpApTpTpApApTpGpG)-3´ 6
BD0077 1.5 25.3 5´-d(CpCpGpTpTpApApCpGpG)-3´ 4 2 34
BD0079 0.99 29 5´-d(CpCpApGpCpGpCpTpGpG)-3´ 1 2 34
BD0080 1.05 27.6 5´-d(CpCpGpTpCpGpApCpGpG)-3´ 2 4 34
BD0081 1.65 23.3 5´-d(CpCpGpCpCpGpGpCpGpG)-3´ 6 34
BD0082 2 26.4 5´-d(CpCpGpApTpApTpCpGpG)-3´ 4 2 34
BD0084 1.75 31.4 5´-d(CpCpGpApGpCpTpCpGpG)-3´ 2 4 34
BD0087 0.94 42.8 5´-d(CpCpGpApApTpTpCpGpG)-3´ 1 1 34
BDJ008 1.3 16.4 5´-d(CpCpApApGpApTpTpGpG)-3´ 1
BDJ017 1.6 16 5´-d(CpCpApGpGpCpCpTpGpG)-3´ 1
BDJ019 1.4 16 5´-d(CpCpApApCpGpTpTpGpG)-3´ 2
BDJ025 1.5 16.1 5´-d(CpGpApTpCpGpApTpCpG)-3´ 4
BDJ031 1.5 15.7 5´-d(CpGpApTpTpApApTpCpG)-3´ 6 38
BDJ036 1.7 17.8 5´-d(CpGpApTpApTpApTpCpG)-3´, Calcium 6 39
BDJ037 2 16.5 5´-d(CpGpApTpApTpApTpCpG)-3´, Magnesium 6 39
BDJ051 2 19.6 5´-d(CpApTpGpGpCpCpApTpG)-3´ 2 4 40
BDJ052 1.9 17.9 5´-d(CpCpApApGpCpTpTpGpG)-3´, Calcium 4 2 41
BDJ060 1.7 20 5´-d(CpTpCpTpCpGpApGpApG)-3´ 2 4 42
BDJ061 1.95 17 5´-d(CpCpApCpTpApGpTpGpG)-3´ 4 2 43
BDJ081 1.85 23.3 5´-d(CpApApApGpApApApApG)-3´ 15 3 44
BDJB44 1.3 15.2 5´-d(CpCpApApCpIpTpTpGpG)-3´, monoclinic 1
BDL001 1.9 17.8 5´-d(CpGpCpGpApApTpTpCpGpCpG)-3´, 290 K 4 4 46
BDL005 1.9 14.9 4 4 47 35 1 36 35 2 37 45 5´-d(CpGpCpGpApApTpTpCpGpCpG)-3´, 290 K, (anisotropic thermal motion model)
BDL020 1.9 18.8 5´-d(CpGpCpGpApApTpTpCpGpCpG)-3´, 290 K, (re-refinement) 4 4 48
BDL084 1.4 19.7 5´-d(CpGpCpGpApApTpTpCpGpCpG)-3´ 4 4 49
BDLB13 2 16.9 5´-d(CpGpCpGpApACH3pTpTpCpGpCpG)-3´ 4 50
BDLB26 2 18.5 5´-d(CpGpCpGCH3pApApTpTpTpGpCpG)-3´ 2
BDLB84 1.55 20.8 5´-d(CpGpCpGpApApTFlOpTFlOpCpGpCpG)-3´ 2 52 4 52 51
BDLB85 1.55 21.8 5´-d(CpGpCpGpApApTFlOpTpCpGpCpG)-3´
UD0023 1.97 22.3 5´-d(TpCpGpGpTpApCpCpGpA)-3´ 1 4 53
UD0024 2 24.6 5´-d(CpCpGpGpTpApCpCpGpG)-3´ 1 4 53
UD0025 1.8 20.3 5´-d(CpCpGpGpTpApCpCpGpG)-3´ 1 4 53
UD0026 1.5 20.69 5´-d(TpCpGpGpTpApCpCpGpA)-3´ 1 4 54
UD0028 1.7 22.9 5´-d(CpCpGpGpCpGpCpCpGpG)-3´ 5 55
UD0029 2 23.6 5´-d(CpCpApGpTpApCpTpGpG)-3´ 1 1 55
UD0030 1.9 21.5 5´-d(CpCpApGpTpApCpBrUpGpG)-3´ 1
UDJ049 2 20.9 5´-d(GpGpCpCpApApTpTpGpG)-3´ 4 55 1 56

† NDB_ID refers to the identification code of the B-DNA structure in the Nucleic Acid Database (Berman, H. M., Olson, W. K., Beveridge, D. L., Westbrook, J., Gelbin, A., Demeny, T., Hsieh, S.-H., Srinivasan, A. R., Schneider, B. (1992) “The Nucleic Acid Database: a comprehensive relational database of three-dimensional structures of nucleic acids.” Biophys. J. 63, 751-759).

References to Table S2:

1. H. Rozenberg, D. Rabinovich, F. Frolow, R.S. Hegde & Z. Shakked (1998) “Structural code for DNA recognition revealed in crystal structures of Papillomavirus E2-DNA targets.” Proc. Natl. Acad. Sci., USA 95, 15194-15199.
2. X. Shui, C. C. Sines, L. McFail-Isom, D. VanDerveer & L. D. Williams (1998) “Structure of the potassium form of CGCGAATTCGCG: DNA deformation by electrostatic collapse around inorganic cations.” Biochemistry 37, 16877-16887.
3. D. Vlieghe, J.P. Turkenburg & L. Van Meervelt (1999) “B-DNA at atomic resolution reveals extended hydration patterns.” Acta Crystallogr. Sect. D 55, 1495-1502.
4. Tereshko, V., Minasov, G., Egli, M. (1999) “The Dickerson-Drew B-DNA dodecamer revisited at atomic resolution.” J. Am. Chem. Soc. 121, 470-471.
5. Chatake, T., Ono, A., Ueno, Y., Matsuda, A., Takenaka, A. (1999) “Crystallographic studies on damaged DNAs. I.
An N(6)-methoxyadenine residue forms a Watson-Crick pair with a cytosine residue in a B-DNA duplex.” J. Mol. Biol. 294, 1215-1222.
6. Chatake, T., Hikima, T., Ono, A., Ueno, Y., Matsuda, A., Takenaka, A. (1999) “Crystallographic studies on damaged DNAs. II. N(6)-methoxyadenine can present two alternate faces for Watson-Crick base-pairing, leading to pyrimidine transition mutagenesis.” J. Mol. Biol. 294, 1223-1230.
7. Tereshko, V., Minasov, G., Egli, M. (1999) “A ‘hydrat-ion spine’ in a B-DNA minor groove.” J. Am. Chem. Soc. 121, 3590-3595.
8. J. Liu & J.A. Subirana (1999) “Structure of d(CGCGAATTCGCG) in the presence of Ca2+ ions.” J. Biol. Chem. 274, 24749-24752.
9. Rhee, S., Han, Z., Liu, K., Miles, H.T., Davies, D.R. (1999) “Structure of a triple helical DNA with a triplex-duplex junction.” Biochemistry 38, 16810-16815.
10. G. Minasov, V. Tereshko & M. Egli (1999) “Atomic-resolution crystal structures of B-DNA reveal specific influences of divalent metal ions on conformation and packing.” J. Mol. Biol. 291, 83-99.
11. van Aalten, D.M., Erlanson, D.A., Verdine, G.L., Joshua-Tor, L. (1999) “A structural snapshot of base-pair opening in DNA.” Proc. Natl. Acad. Sci., USA 96, 11809-11814.
12. C. L. Kielkopf, S. Ding, P. Kuhn & D. C. Rees (2000) “Conformational flexibility of B-DNA at 0.74 Å resolution: d(CCAGTACTGG)2.” J. Mol. Biol. 296, 787-801.
13. K.K. Woods, L. McFail-Isom, C.C. Sines, S.B. Howerton, R.K. Stephens & L.D. Williams (2000) “Monovalent cations sequester within the A-tract minor groove of [d(CGCGAATTCGCG)]2.” J. Am. Chem. Soc. 122, 1546-1547.
14. Egli, M., Tereshko, V., Teplova, M., Minasov, G., Joachimiak, A., Sanishvili, R., Weeks, C.M., Miller, R., Maier, M.A., An, H., Dan Cook, P., Manoharan, M. (1998) “X-ray crystallographic analysis of the hydration of A- and B-form DNA at atomic resolution.” Biopolymers 48, 234-252.
15. Johansson, E., Parkinson, G., Neidle, S. (2000) “A new crystal form for the dodecamer C-G-C-G-A-A-T-T-C-G-C-G: symmetry effects on sequence-dependent DNA structure.” J. Mol. Biol. 300, 551-561.
16. Chiu, T.K., Dickerson, R.E. (2000) “1 Å crystal structures of B-DNA reveal sequence-specific binding and groove-specific bending of DNA by magnesium and calcium.” J. Mol. Biol. 301, 915-945.
17. Soler-Lopez, M., Malinina, L., Subirana, J.A. (2000) “Solvent organization in an oligonucleotide crystal. The structure of d(GCGAATTCG)2 at atomic resolution.” J. Biol. Chem. 275, 23034-23044.
18. Minasov, G., Teplova, M., Nielsen, P., Wengel, J., Egli, M. (2000) “Structural basis of cleavage by RNase H of hybrids of arabinonucleic acids and RNA.” Biochemistry 39, 3525-3532.
19. Sines, C.C., McFail-Isom, L., Howerton, S.B., VanDerveer, D., Williams, L.D. (2000) “Cations mediate B-DNA conformational heterogeneity.” J. Am. Chem. Soc. 122, 11048-11056.
20. Tsunoda, M., Karino, N., Ueno, Y., Matsuda, A., Takenaka, A. (2001) “Crystallization and preliminary X-ray analysis of a DNA dodecamer containing 2´-deoxy-5-formyluridine; what is the role of magnesium cation in crystallization of Dickerson-type DNA dodecamers?” Acta Crystallogr. Sect. D 57, 345-348.
21. Hossain, M.T., Chatake, T., Hikima, T., Tsunoda, M., Sunami, T., Ueno, Y., Matsuda, A., Takenaka, A. (2001) “Crystallographic studies on damaged DNAs: III. N(4)-methoxycytosine can form both Watson-Crick type and wobbled base pairs in a B-form duplex.” J. Biochem. (Tokyo) 130, 9-12.
22. Vargason, J.M., Henderson, K., Ho, P.S. (2001) “A crystallographic map of the transition from B-DNA to A-DNA.” Proc. Natl. Acad. Sci., USA 98, 7265-7270.
23. Mack, D.R., Chiu, T.K., Dickerson, R.E. (2001) “Intrinsic bending and deformability at the T-A step of CCTTTAAAGG: a comparative analysis of T-A and A-T steps within A-tracts.” J. Mol. Biol. 312, 1037-1049.
24. Hossain, M.T., Sunami, T., Tsunoda, M., Hikima, T., Chatake, T., Ueno, Y., Matsuda, A., Takenaka, A. (2001) “Crystallographic studies on damaged DNAs IV. N(4)-methoxycytosine shows a second face for Watson-Crick base-pairing, leading to purine transition mutagenesis.” Nucleic Acids Res. 29, 3949-3954.
25. Howerton, S.B., Sines, C.C., VanDerveer, D., Williams, L.D. (2001) “Locating monovalent cations in the grooves of B-DNA.” Biochemistry 40, 10023-10031.
26. Lipanov, A.A., Kopka, M.L., Kaczor-Grzeskowiak, M., Dickerson, R.E. (1993) “Structure of the B-DNA decamer C-C-A-A-C-I-T-T-G-G in two different space groups: conformational flexibility of B-DNA.” Biochemistry 32, 1373-1389.
27. Wilds, C.J., Wawrzak, Z., Krishnamurthy, R., Eschenmoser, A., Egli, M. (2002) “Crystal structure of a B-form DNA duplex containing (L)-alpha-threofuranosyl (3´-->2´) nucleosides: a four-carbon sugar is easily accommodated into the backbone of DNA.” J. Am. Chem. Soc. 124, 13716-13721.
28. Valls, N., Wright, G., Steiner, R.A., Murshudov, G.N., Subirana, J.A. (2004) “DNA variability in five crystal structures of d(CGCAATTGCG).” Acta Crystallogr. Sect. D 60, 680-685.
29. Woods, K.K., Maehigashi, T., Howerton, S.B., Sines, C.C., Tannenbaum, S., Williams, L.D. (2004) “High-resolution structure of an extended A-tract: [d(CGCAAATTTGCG)]2.” J. Am. Chem. Soc. 126, 15330-15331.
30. Huang, D.B., Phelps, C.B., Fusco, A.J., Ghosh, G. (2005) “Crystal structure of a free kB DNA: insights into DNA recognition by transcription factor NF-kB.” J. Mol. Biol. 346, 147-160.
31. Han, G.W., Langs, D., Kopka, M.L., Dickerson, R.E. (to be published) “The ultra-high resolution structure of d(CTTTTAAAAG)2: modulation of bending by T-A steps and its role in DNA recognition.”
32. Kimura, K., Ono, A., Watanabe, K., Takenaka, A. (to be published) “X-Ray analyses of oligonucleotides containing 5-formylcytosine, suggest a structural reason for the codon-anticodon recognition of mitochondrial tRNA-Met.”
33. Arai, S., Chatake, T., Ohhara, T., Kurihara, K., Tanaka, I., Suzuki, N., Fujimoto, Z., Mizuno, H., Niimura, N. (2005) “Complicated water orientations in the minor groove of the B-DNA decamer d(CCATTAATGG)2 observed by neutron diffraction measurements.” Nucleic Acids Res. 33, 3017-3024.
34. Hays, F.A., Teegarden, A.T., Jones, Z.J.R., Harms, M., Raup, D., Watson, J., Cavaliere, E., Ho, P.S. (2005) “How does sequence define structure? A crystallographic map of DNA structure and conformation.” Proc. Natl. Acad. Sci., USA 102, 7157-7162.
35. G. G. Privé, K. Yanagi & R. E. Dickerson (1991) “Structure of the B-DNA decamer C-C-A-A-C-G-T-T-G-G and comparison with isomorphous decamers C-C-A-A-G-A-T-T-G-G and C-C-A-G-G-C-C-T-G-G.” J. Mol. Biol. 217, 177-199.
36. U. Heinemann & C. Alings (1989) “Crystallographic study of one turn of G/C-rich B-DNA.” J. Mol. Biol. 210, 369-381.
37. K. Grzeskowiak, K. Yanagi, G. G. Privé & R. E. Dickerson (1991) “The structure of B-helical C-G-A-T-C-G-A-T-C-G and comparison with C-C-A-A-C-G-T-T-G-G. The effect of base pair reversals.” J. Biol. Chem. 266, 8861-8883.
38. J. R. Quintana, K. Grzeskowiak, K. Yanagi & R. E. Dickerson (1992) “The structure of a B-DNA decamer with a central T-A step: C-G-A-T-T-A-A-T-C-G.” J. Mol. Biol. 225, 379-395.
39. H. Yuan, J. Quintana & R. E. Dickerson (1992) “Alternative structures for alternating poly(dA-dT) tracts: the structure of the B-DNA decamer C-G-A-T-A-T-A-T-C-G.” Biochemistry 31, 8009-8021.
40. D. S. Goodsell, M. L. Kopka, D. Cascio & R. E. Dickerson (1993) “Crystal structure of CATGGCCATG and its implications for A-tract bending models.” Proc. Natl. Acad. Sci., USA 90, 2930-2934.
41. K. Grzeskowiak, D. S. Goodsell, M. Kaczor-Grzeskowiak, D. Cascio & R. E. Dickerson (1993) “Crystallographic analysis of C-C-A-A-G-C-T-T-G-G and its implications for bending in B-DNA.” Biochemistry 32, 8923-8931.
42. D. S. Goodsell, K. Grzeskowiak & R. E. Dickerson (1995) “Crystal structure of C-T-C-T-C-G-A-G-A-G: implications for the structure of the Holliday junction.” Biochemistry 34, 1022-1029.
43. Z. Shakked, G. Guzikevich-Guerstein, F. Frolow, D. Rabinovich, A. Joachimiak & P. B. Sigler (1994) “Determinants of repressor/operator recognition from the structure of the trp operator binding site.” Nature 368, 469-473.
44. G. W. Han, M. L. Kopka, D. Cascio, K. Grzeskowiak & R. E. Dickerson (1997) “Structure of a DNA analog of the primer for HIV-1 RT second strand synthesis.” J. Mol. Biol. 269, 811-826.
45. Lipanov, A., Kopka, M.L., Kaczor-Grzeskowiak, M., Quintana, J., Dickerson, R.E. (1993) “Structure of the B-DNA decamer C-C-A-A-C-I-T-T-G-G in two different space groups: conformational flexibility of B-DNA.” Biochemistry 32, 1373-1389.
46. H. R. Drew, R. M. Wing, T. Takano, C. Broka, S. Tanaka, K. Itakura & R. E. Dickerson (1981) “Structure of a B-DNA dodecamer: conformation and dynamics.” Proc. Natl. Acad. Sci., USA 78, 2179-2183.
47. S. R. Holbrook, R. E. Dickerson & S.-H. Kim (1985) “Anisotropic thermal-parameter refinement of the DNA dodecamer CGCGAATTCGCG by the segmented rigid-body method.” Acta Crystallogr. Sect. B 41, 255-262.
48. E. Westhof (1987) “Re-refinement of the B-dodecamer d(CGCGAATTCGCG) with a comparative analysis of the solvent in it and in the Z-hexamer d(5BrCG5BrCG5BrCG).” J. Biomol. Struct. Dynam. 5, 581-600.
49. X. Shui, L. McFail-Isom, G. G. Hu & L. D. Williams (1998) “The B-DNA dodecamer at high resolution reveals a spine of water on sodium.” Biochemistry 37, 8341-8355.
50. Frederick, C.A., Quigley, G.J., van der Marel, G.A., van Boom, J.H., Wang, A.H., Rich, A. (1988) “Methylation of the EcoRI recognition site does not alter DNA conformation: the crystal structure of d(CGCGAm6ATTCGCG) at 2.0-Å resolution.” J. Biol. Chem. 263, 17872-17879.
51. Leonard, G.A., Thomson, J., Watson, W.P., Brown, T. (1990) “High-resolution structure of a mutagenic lesion in DNA.” Proc. Natl. Acad. Sci., USA 87, 9573-9576.
52. Berger, I., Tereshko, V., Ikeda, H., Marquez, V.E., Egli, M.
(1998) “Crystal structures of B-DNA with incorporated 2´-deoxy-2´-fluoro-arabinofuranosyl thymines: implications of conformational preorganization for duplex stability.” Nucleic Acids Res. 26, 2473-2480.
53. Cardin, C.J., Gale, B.C., Thorpe, J.H., Teixeira, S.C.M., Gan, Y., Moraes, M.I.A.A., Brogden, A.L. (to be published) “Structural analysis of two Holliday junctions formed by the sequences TCGGTACCGA and CCGGTACCGG.”
54. Cardin, C.J., Thorpe, J.H., Gale, B.C., Teixeira, S.C.M. (to be published) “Strontium, a MAD target for the DNA Holliday junction.”
55. Hays, F.A., Vargason, J.M., Ho, P.S. (2003) “Effect of sequence on the conformation of DNA Holliday junctions.” Biochemistry 42, 9586-9597.
56. Vlieghe, D., Van Meervelt, L., Dautant, A., Gallois, B., Precigoux, G., Kennard, O. (1996) “Parallel and antiparallel (G.GC)2 triple helix fragments in a crystal structure.” Science 273, 1702-1705.
57. Berman, H. M., Olson, W. K., Beveridge, D. L., Westbrook, J., Gelbin, A., Demeny, T., Hsieh, S.-H., Srinivasan, A. R., Schneider, B. (1992) “The Nucleic Acid Database: a comprehensive relational database of three-dimensional structures of nucleic acids.” Biophys. J. 63, 751-759.

Table S3. Comparative features of exocyclic amino groups in energy-optimized vs. observed DNA bases.†

Torsion angles (deg)

Adenine
Method        State  C5–C6–N6–H62    N1–C6–N6–H61    C4–C5–C6–N6      C2–N1–C6–N6
Prediction    free   ±21.4           19.2            ±177.1           176.8
Prediction    pair   –11.6 (11.6)    11.6 (–11.6)    –178.1 (178.0)   178.1 (–178.1)
Neutron [6]   free   13.1            ±5.3            ±178.6           178.9
Neutron [7]   free   2.8             ±6.9            ±179.4           ±179
Infrared [8]  free   –               –               ~ 160°           ~ –160°
X-ray         pair   –               –               177.3 (±2.2)     –178.1 (±1.3)

Guanine
Method        State  N3–C2–N2–H22    N1–C2–N2–H21    C6–N1–C2–N2      C4–N3–C2–N2
Prediction    free   ±11.7           43.9            177.0            ±175.5
Prediction    pair   –16.3 (16.3)    23.0 (–22.6)    176.2 (–176.3)   –175.9 (176.0)
X-ray         pair   –               –               –179.9 (±1.6)    178.6 (±2.2)

Cytosine
Method        State  C5–C4–N4–H42    N3–C4–N4–H41    C2–N3–C4–N4      C6–C5–C4–N4
Prediction    free   28.1            ±15.0           ±176.5           177.0
Prediction    pair   –5.6 (5.8)      7.1 (–7.2)      –179.8 (179.8)   –179.7 (179.7)
Neutron [9]   free   ±0.1            ±3.9            179.2            ±177.7
Infrared [10] free   –               ~ 16°           –                –
X-ray         pair   –               –               –178.9 (±2.5)    178.9 (±2.3)

† Predicted base nonplanarity obtained from calculations based on second-order Møller-Plesset perturbation theory compared to experimental findings. Numerical values in parentheses correspond to base pairs in a secondary, higher energy minimum with positive propeller twist. (The minimum-energy base-pair structure has negative propeller twist; see text). Observations come from neutron-diffraction studies of unpaired (free) bases [6, 7, 9], infrared measurements of the vibrational transition moments of free bases [8, 10], and analyses of ultra-high resolution (0.99 Å or better resolution) B-DNA crystal structures (Table S2). Mean values and standard deviations (subscripted values in parentheses) based on the designated torsions in 27 A·T pairs from 16 different structures and 36 G·C pairs from 15 different structures.

Table S4.
Average surface electrostatic potential, in kcal mole–1e–1, of selected surface atoms on isolated DNA bases and Watson-Crick base pairs.1

       Purine               Base-pair            Base-pair            Pyrimidine
Pair   potential   Atom2    potential   Edge3    potential   Atom2    potential
G·C    0.13        G(N1)    –0.14       WC       –0.08       C(N3)    –0.24
       0.19        G(N2)§   –0.19       WC,m     –0.17       C(O2)    –0.40
       0.05        G(N3)    –0.14       m
       –0.33       G(O6)    –0.40       WC,M     –0.06       C(N4)4   0.04
       –0.29       G(N7)    –0.38       M        0.18        C(C5)    0.16
A·T    –0.14       A(N1)    0.06        WC       –0.04       T(N3)    0.06
       –0.09       A(N3)    –0.16       m        –0.18       T(O2)    –0.11
       –0.07       A(N6)§   –0.17       WC,M     –0.20       T(O4)    –0.18
       –0.11       A(N7)    –0.23       M

1 Electrostatic potentials of specific atomic sites calculated by taking the average of the potential determined at accessible points in the aqueous milieu surrounding the given atoms. Sampled points reside on the surface of spheres 1.0 Å beyond the van der Waals’ radii of the selected atoms. The potential is determined by treating the DNA solute as a low dielectric region with a dielectric constant of 2 to mimic solute polarizability and the exterior solvent region with a dielectric constant of 80, and then solving the Poisson equation (at 298 K and zero ionic strength) with a fast-multipole accelerated boundary-element method [11, 12]. The static atomic point-charge distribution is represented by the derived partial charges. The set of atomic radii R (in Å) used to define the molecular surface is based on the Parse parameter set [13]: RH = 1.0; RC = 1.7; RN = 1.5; RO = 1.4. The dielectric interface separating solute and solvent regions is defined by the solvent-excluded molecular surface [14], obtained with a solvent-probe radius of 1.4 Å. The MSMS algorithm [15] is used to tessellate the solvent-excluded surface, which is represented by up to 10,000 curved triangular elements. A second-order multipole and Taylor series are invoked and the maximum number of boundary elements per terminal octree box is set to 2. All other default code parameters are employed [12].
2 Atoms that are significantly neutralized by Watson-Crick base-pair formation (i.e., the average surface electrostatic potential differs by 0.10 kcal mole–1e–1 or more) are highlighted in boldface.
3 WC: Watson-Crick hydrogen-bonded edge; m: minor-groove edge; M: major-groove edge.
4 Value omits contribution of pendant hydrogens.

Table S5. Mean step parameters and deformational properties of AA·TT and GG·CC dimers constructed from optimized base pairs and subjected to configurational sampling.1

Dimer   Tilt (°)      Roll (°)      Twist (°)     Shift, Dx (Å)   Slide, Dy (Å)   Rise, Dz (Å)   Emin    ln z

Free optimized base pairs2
AA·TT   –0.3 (±1.5)   1.0 (±4.9)    34.0 (±3.5)   0.02 (±0.51)    0.15 (±0.60)    3.26 (±0.10)   –26.1   6.3
GG·CC   0.2 (±1.6)    –1.5 (±3.8)   33.7 (±3.5)   –0.09 (±0.52)   0.29 (±0.58)    3.23 (±0.07)   –27.5   6.0

Free B-DNA base pairs2
AA·TT   –0.1 (±1.7)   1.0 (±4.0)    33.3 (±3.2)   –0.01 (±0.46)   0.20 (±0.51)    3.21 (±0.05)   –27.2   5.6
GG·CC   0.2 (±1.6)    –1.2 (±3.7)   33.7 (±3.5)   –0.09 (±0.51)   0.35 (±0.62)    3.22 (±0.07)   –27.6   6.1

Observed base-pair steps3
AA·TT   –1.2 (±2.5)   0.6 (±4.7)    34.8 (±3.7)   0.08 (±0.38)    –0.18 (±0.37)   3.23 (±0.16)
GG·CC   –0.2 (±3.4)   5.1 (±4.0)    33.3 (±4.1)   –0.10 (±0.69)   –0.43 (±0.65)   3.41 (±0.22)

1 Boltzmann-averaged step parameters and standard deviations computed over 124,215 configurational states where: tilt = [–3°, …]; roll = [–18°, +18°] at 3° intervals; twist = [30°, 42°] at 2° intervals; Dx = [–1 Å, +1 Å] at 0.5 Å intervals; Dy = [–2.4 Å, +2.4 Å] at 0.4 Å intervals; Dz = [3.2 Å, 3.6 Å] at 0.2 Å intervals. Emin is the minimum energy between free base pairs, expressed in kcal/mole, and z = Σi exp(–Ei/RT) is the configurational partition function evaluated over all states i sampled at 298 K. Energy contributions based on an updated nucleic-acid force field that accounts for the sequence-dependent conformational features of the Dickerson-Drew dodecamer in both the solid state and the aqueous liquid-crystalline phase [16]. Thymine C5 and C1´ methyl groups treated as united atoms with a van der Waals’ radius of 2.39 Å. Dielectric constant set to 4 throughout.
2 Optimized base pairs assigned the base-pair parameters in Table 1 found from ab initio calculations; B-DNA base pairs assigned the mean parameters found in high-resolution X-ray structures. Atoms assigned partial atomic charges from Table S1. Standard deviations of roll vs. tilt and twist, and shift and slide compared to rise mimic observed deformations. The relative bending, twisting, and sliding of AA·TT and GG·CC base-pair steps, however, differ from experiment. If the charges are reduced by half, to simulate interactions with solvent, the fraction of GG·CC pairs assuming positive rather than negative values of roll increases from 0.35 to 0.39 and the fraction of AA·TT pairs decreases from 0.58 to 0.57.
3 Data based on the analysis, within 3DNA [17], of 421 AA·TT and 317 GG·CC steps in 239 DNA-protein crystal complexes of 2.5 Å or better resolution without chemical modifications, mismatches, or drugs from the Nucleic Acid Database [18]. The dataset includes 101 structures of double-helical DNA bound to enzymes, 121 duplexes in the presence of regulatory proteins, 16 complexes with structural proteins, and one DNA associated with a multifunctional protein [19]. Structures filtered to exclude over-represented complexes in order to obtain a balanced sample of spatial and functional forms. Mean values and standard deviations (subscripted values in parentheses) exclude terminal base pairs, side groups attached to nicked backbone strands, and base pairs stacked against modified or mispaired 3´- and 5´-nucleotides.

References to Supplementary Materials

1. M.J. Frisch, G.W. Trucks, H.B. Schlegel, G.E. Scuseria, M.A. Robb, et al. 2001. Gaussian 98. Pittsburgh, PA: Gaussian, Inc.
2. M.J. Frisch, G.W. Trucks, H.B. Schlegel, G.E. Scuseria, M.A. Robb, et al. 2003. Gaussian 03. Pittsburgh, PA: Gaussian, Inc.
3. L. Clowney, S.C. Jain, A.R. Srinivasan, J. Westbrook, W.K. Olson & H.M. Berman (1996) Geometric parameters in nucleic acids: nitrogenous bases, J. Am. Chem. Soc. 118, 509-518.
4. W.K. Olson, M. Bansal, S.K. Burley, R.E. Dickerson, M. Gerstein, S.C. Harvey, U. Heinemann, X.-J. Lu, S. Neidle, Z. Shakked, H. Sklenar, M. Suzuki, C.-S. Tung, E. Westhof, C. Wolberger & H.M. Berman (2001) A standard reference frame for the description of nucleic acid base-pair geometry, J. Mol. Biol. 313, 229-237.
5. C.M. Breneman & K.B. Wiberg (1990) Determining atom-centered monopoles from molecular electrostatic potentials. The need for high sampling density in formamide conformational analysis, J. Comp. Chem. 11, 361-373.
6. R.K. McMullan, P. Benci & B.M. Craven (1980) The neutron crystal structure of 9-methyladenine at 126 K, Acta Cryst. B36, 1424-1430.
7. W.T. Klooster, J.R. Ruble, B.M. Craven & R.K. McMullan (1991) Structure and thermal vibrations of adenosine from neutron diffraction data at 123 K, Acta Cryst. B47, 376-383.
8. F. Dong & R.E. Miller (2002) Vibrational transition moment angles in isolated biomolecules: a structural tool, Science 298, 1227-1230.
9. H.P. Weber, B.M. Craven & R.K. McMullan (1980) The structure of deuterated cytosine monohydrate at 82 K by neutron diffraction, Acta Cryst. B36, 645-649.
10. M.Y. Choi, F. Dong & R.E. Miller (2005) Multiple tautomers of cytosine identified and characterized by infrared laser spectroscopy in helium nanodroplets: probing structure using vibrational transition moment angles, Philos. Transact. A Math. Phys. Eng. Sci. 363, 393-412.
11. A.H. Boschitsch, M.O. Fenley & W.K. Olson (1999) A fast adaptive multipole algorithm for calculating screened Coulomb (Yukawa) interactions, J. Comp. Phys. 151, 212-241.
12. A.H. Boschitsch, M.O. Fenley & H.-X. Zhou (2002) Fast boundary element method for the linear Poisson-Boltzmann equation, J. Phys. Chem. B 106, 2741-2754.
13. D. Sitkoff, K.A. Sharp & B. Honig (1994) Accurate calculation of hydration free energies using macroscopic solvent models, J. Phys. Chem. 98, 1978-1988.
14. B. Lee & F.M. Richards (1971) The interpretation of protein structures: estimation of static accessibility, J. Mol. Biol. 55, 379-400.
15. M. Sanner, A.J. Olson & J.C. Spehner (1996) Reduced surface: an efficient way to compute molecular surfaces, Biopolymers 38, 305-320.
16. L. Wang, B.E. Hingerty, A.R. Srinivasan, W.K. Olson & S. Broyde (2002) Accurate representation of B-DNA double helical structure with implicit solvent and counterion, Biophys. J. 83, 382-406.
17. X.-J. Lu & W.K. Olson (2003) 3DNA: a software package for the analysis, rebuilding, and visualization of three-dimensional nucleic acid structures, Nucleic Acids Res. 31, 5108-5121.
18. H.M. Berman, W.K. Olson, D.L. Beveridge, J. Westbrook, A. Gelbin, T. Demeny, S.-H. Hsieh, A.R. Srinivasan & B. Schneider (1992) The Nucleic Acid Database: a comprehensive relational database of three-dimensional structures of nucleic acids, Biophys. J. 63, 751-759.
19. W.K. Olson, A.V. Colasanti, L. Czapla & G. Zheng (2009) Insights into the sequence-dependent macromolecular properties of DNA from base-pair level modeling, in Coarse-Graining of Condensed Phase and Biomolecular Systems, G.A. Voth, ed., pp. 205-223, Boca Raton, FL: Taylor and Francis Group, LLC.
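
Illustrative sketch for footnote 1 of Table S5. The configurational sampling described there can be mimicked with a short Python example that counts the grid states and forms Boltzmann-weighted averages. This is a minimal sketch under stated assumptions, not the authors' code: the function names, the numerical value of the gas constant, and the toy harmonic roll energy (with a hypothetical stiffness k and rest value roll0) are introduced here for illustration only, and the force field of ref. 16 is not reproduced. The tilt range is incomplete in the footnote; the sketch assumes –3° to +3° at 1° intervals, because the five fully specified grids give 13 × 7 × 5 × 13 × 3 = 17,745 states and 124,215 / 17,745 = 7 tilt values.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def grid(lo, hi, step):
    """Evenly spaced values from lo to hi, both ends included."""
    n = round((hi - lo) / step)
    return [lo + i * step for i in range(n + 1)]


# Sampling ranges quoted in footnote 1 of Table S5.
GRIDS = {
    "roll": grid(-18.0, 18.0, 3.0),   # degrees
    "twist": grid(30.0, 42.0, 2.0),   # degrees
    "shift": grid(-1.0, 1.0, 0.5),    # Dx, angstroms
    "slide": grid(-2.4, 2.4, 0.4),    # Dy, angstroms
    "rise": grid(3.2, 3.6, 0.2),      # Dz, angstroms
}
# ASSUMPTION: the tilt range is truncated in the footnote; a symmetric
# 7-point grid is inferred from the quoted total of 124,215 states.
GRIDS["tilt"] = grid(-3.0, 3.0, 1.0)

# Total number of grid states (product of the six grid sizes).
N_STATES = math.prod(len(values) for values in GRIDS.values())


def boltzmann_stats(values, energies, temperature=298.0):
    """Return (mean, standard deviation, ln z), with z = sum_i exp(-E_i/RT)."""
    rt = R_KCAL * temperature
    e_min = min(energies)
    # Shift energies by e_min before exponentiating for numerical stability.
    weights = [math.exp(-(e - e_min) / rt) for e in energies]
    total = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / total
    var = sum(w * (v - mean) ** 2 for w, v in zip(weights, values)) / total
    ln_z = math.log(total) - e_min / rt
    return mean, math.sqrt(var), ln_z


def toy_roll_energy(roll, k=0.02, roll0=0.0):
    """HYPOTHETICAL harmonic energy (kcal/mol); not the force field of ref. 16."""
    return 0.5 * k * (roll - roll0) ** 2


if __name__ == "__main__":
    print("number of grid states:", N_STATES)
    rolls = GRIDS["roll"]
    mean, sd, ln_z = boltzmann_stats(rolls, [toy_roll_energy(r) for r in rolls])
    print(f"toy roll: mean = {mean:.2f} deg, sd = {sd:.2f} deg, ln z = {ln_z:.2f}")
```

Replacing the toy energy with a six-parameter energy function evaluated over the full grid (for example with itertools.product over the six lists) would yield averages of the kind reported in Table S5; the values printed by this sketch are not comparable to the tabulated ones.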