Supplementary Material for Robust Experimental Standards for Amide I Vibrational Spectroscopy

Mike Reppert (1,2), Anish R. Roy (2), Andrei Tokmakoff (2)
(1) Department of Chemistry, Massachusetts Institute of Technology, Cambridge, MA 02139
(2) Department of Chemistry, University of Chicago, Chicago, IL 60637

Isotope Enrichment of Amino Acids

Isotopically enriched starting materials were purchased from Cambridge Isotope Laboratories. For 13C labeling, Ala, Gly, Leu, Met, Phe, and Val with 13C at the 1-C position were used without further purification. For 18O enrichment, isotope-enriched H2(18)O water (Cambridge Isotope Laboratories) was saturated with HCl by bubbling dry HCl gas through it (final concentration ~12 M). The dry HCl gas was produced according to the method of Ref. 1 by adding 12 M HCl / H2O dropwise to concentrated H2SO4 and bubbling the evolved gas through an additional layer of dry H2SO4 to ensure removal of any residual un-enriched H2O.

For production of 13C18O-enriched amino acids, 1 mmol of 13C-enriched amino acid was dissolved in H2(18)O acidified with 1.3 mmol HCl to ensure complete protonation of the C-terminus, facilitate isotopic exchange, and increase solubility. For all amino acids, the molar equivalents of amino acid:HCl:H2(18)O were fixed at 1:1.3:22. After heating for eight hours at 90 °C, the sample was lyophilized to remove the isotopically diluted H2(16/18)O water. The procedure was then repeated by re-dissolving the residue in 22 mmol H2(18)O containing 0.3 mmol HCl; these quantities maintain the 1:1.3:22 ratio in the second exchange, since 1 equivalent of HCl remains in the acid salt after lyophilization. The final product was neutralized with 1 M NaOH before lyophilization and characterization by ESI mass spectrometry; isotope enrichment was found to be ~97% for all samples studied.
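As a rough illustration of why the exchange is carried out twice, the expected 18O fraction of the carboxyl oxygens can be estimated from the 1:22 amino acid:water ratio. The sketch below is not part of the original procedure. It assumes that the two carboxyl oxygens equilibrate statistically with the water in each round, that lyophilization removes all water between rounds, and that natural-abundance 18O is negligible; the function name and the `water_purity` parameter (18O fraction of the fresh water) are hypothetical.

```python
def exchange_rounds(n_rounds, water_equiv=22.0, exch_oxygens=2.0, water_purity=1.0):
    """Estimate the carboxyl 18O fraction after each exchange round.

    Assumptions (not from the source): full statistical equilibration between
    the exchangeable carboxyl oxygens and the water oxygens in every round,
    and complete removal of the diluted water before the next round.
    """
    fractions = []
    f = 0.0  # starting 18O fraction of the carboxyl oxygens (natural abundance neglected)
    for _ in range(n_rounds):
        # Pool the 18O from the amino acid and the fresh water, then share it
        # evenly over all exchangeable oxygens.
        total_18o = exch_oxygens * f + water_equiv * water_purity
        f = total_18o / (exch_oxygens + water_equiv)
        fractions.append(f)
    return fractions


if __name__ == "__main__":
    for i, f in enumerate(exchange_rounds(2), start=1):
        print(f"round {i}: {100.0 * f:.1f}% 18O per carboxyl oxygen")
```

Under these idealized assumptions a single round cannot exceed 22/24 of the water enrichment, while a second round with fresh water comes much closer to it. Incomplete exchange or a lower water enrichment would reduce both values.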
All amino acids were prepared in the 200 mg quantities required for the isotope-enriched expression procedure described below. We note that in the final production run for the results presented here, significant quantities of enriched Ala and Gly were lost due to a mishap during the intermediate lyophilization step. As a result, for these two amino acids, the final expression medium contained only 100 - 150 mg of isotope-enriched material. Fortunately, this loss is largely compensated for by the lower molecular weight of these two amino acids, so that the final molar concentration of enriched amino acid in the growth medium was comparable to that of the other labels.

Expression and Purification of the NuG2b Peptide

NuG2b (sequence MDTYKLVIVLNGTTFTYTTEAVDAATAEKVFKQYANDAGVDGEWTYDAATKTFTVTE) was expressed and purified in isotope-enriched medium following the guidelines of Ref. 2. Briefly, BL21-CodonPlus (DE3)-RIL competent cells (Stratagene, Cat# 230245) were transformed with the NuG2b plasmid kindly provided by the group of Tobin Sosnick at the University of Chicago. For routine expressions, working glycerol stocks were prepared by mixing overnight growth culture with 80% glycerol and freezing at -70 °C. For each expression run, a culture of LB broth (Fisher Scientific) containing 100 μg/mL ampicillin was inoculated with the glycerol stock and grown overnight at 37 °C. The next morning, a 15 mL aliquot was centrifuged for 5 minutes at 10,000 x g; the supernatant was discarded, and the cell pellet was re-suspended in 10 mL glucose-based minimal medium (following the recipe of Ref. 2). This suspension was used to inoculate one liter of glucose-based minimal medium (100 μg/mL ampicillin) split between four 1 L baffled Erlenmeyer flasks.
Incubation continued at 37 °C until the culture reached an OD600 of 0.6 - 0.8, at which point 200 mg of isotopically enriched amino acid was added; after an additional 30 minute equilibration period, expression was induced by the addition of 0.25 g IPTG. Expression was allowed to proceed for 2 hours before the cells were pelleted by centrifugation at 10,000 x g for 20 minutes at 4 °C. The cell pellet was frozen overnight at -20 °C before being resuspended in 5% acetic acid and sonicated on ice (3 minutes of 2 seconds sonication / 1 second rest) for cell disruption. After centrifuging the samples at 12,000 RPM for 30 minutes (10 °C), the supernatant was dialyzed against H2O for 6 hours at room temperature to remove small-molecule impurities. The product was then lyophilized for characterization and storage. Overall yield was typically 40 mg per liter of expression medium.

The expressed sample was assessed for purity by reverse-phase HPLC and MALDI-TOF mass spectrometry and was found to contain three N-terminal variants of the desired protein sequence. The major product (~70 - 90%) was the formyl-methionine capped peptide, apparently due to incomplete cleavage of the formyl group from the peptide terminus after expression. In addition, some samples showed significant (up to ~30%) oxidation of the terminal methionine side chain. The uncapped peptide (free NH3+) occurred with a population of ~5 - 10%. Because the capping state of the N-terminal Met residue was largely inconsequential for our purposes, samples were used as obtained, without further attempt to separate these three variants.

Infrared Spectroscopy

Fourier transform infrared (FTIR) absorption spectra were collected on a Bruker Confocheck system at 2 cm-1 resolution.
All measurements were performed in D2O containing 250 mM potassium phosphate buffer at a pD of ~1.5 - 2 to ensure complete protonation of the C-terminal and Glu/Asp side chain carboxylic acids (whose absorptions otherwise overlap the 13C18O isotope label region near 1580 cm-1). Before FTIR measurements, hydrogen/deuterium exchange was accomplished by heating a 2 mg/mL solution of NuG2b in 50:50 acetonitrile:D2O at 50 °C for 2 hours. (The addition of acetonitrile facilitated unfolding of the protein and allowed for exchange at much lower temperatures than possible in pure D2O.) After cooling to room temperature, the samples were spun down briefly on a desktop centrifuge, and the supernatant was removed (leaving behind the aggregate that sometimes formed during heating) and re-lyophilized in preparation for measurement. The solubility of the NuG2b peptide was found to be near 100 mg/mL at pH 2, allowing FTIR measurements to be performed under quite concentrated conditions. The measurements reported here correspond to NuG2b concentrations of 50 - 75 mg/mL and were referenced against lower-concentration spectra (~20 mg/mL). For all measured spectra, a linear baseline was subtracted from the raw spectrum to provide a flat line between 1525 cm-1 and 1800 cm-1.

Baseline-corrected xy-data for the experimental FTIR spectra presented in Figure 2 of the main text are included in separate comma-delimited text files for reference. All file names follow the format “DesG_XXX_C13.txt” for 13C labels or “DesG_XXX_C13O18.txt” for 13C18O labels, where “XXX” is the corresponding (all caps) three-letter amino acid code.

MD Simulations

MD simulations were performed using the GROMACS 4.6 simulation package (Ref. 3) for the CHARMM27, GROMOS53a6, and OPLS-AA force fields.
The starting structure was obtained by removing the poly-His tail and making the triple mutant replacements N37A/A46D/D47A (numbering starting at the Asp residue) in the experimental NuG2 crystal structure (Protein Data Bank ID 1MI0, Ref. 4) using the Chimera molecular modeling system. The starting structure was centered in a cubic box with a minimum distance of 1 nm between the protein and each box wall (~5.6 nm side length) and solvated with ~5500 explicit water molecules, either SPC (GROMOS53a6 simulations) or SPC/E water (CHARMM27 and OPLS-AA). The box was neutralized by the addition of five chloride anions, energy minimized, and equilibrated by a 100 ps NVT trajectory and a 1 ns NPT trajectory, both with constrained protein coordinates. Production simulations were carried out for 10 ns in the NVT ensemble (2 fs time step, with bond lengths constrained using the LINCS algorithm, Ref. 5) and were continued from the NPT trajectory frame with volume closest to the average volume across the final 750 ps of the NPT run. Long-range electrostatics were treated with the Particle Mesh Ewald method with a grid spacing of 0.12 nm and a cutoff length of 0.9 nm. Temperature was maintained at 300 K using the Nosé-Hoover thermostat with separate coupling groups for the protein and solvent/ions.

Spectral Calculations

From each production run, coordinates were saved every 20 fs for spectral analysis using in-house C code to calculate site frequencies, transition dipole moments, and coupling constants using various spectral maps from the literature, denoted JO (Jansen/OPLS-AA, Refs. 6, 7), SG (Skinner/GROMOS53a6, Ref. 8), DC (dipeptide/CHARMM27, Ref. 9), and DO (dipeptide/OPLS-AA, Ref. 9). For each frame in the MD simulation, the electrostatic variables, namely potential (Φ), field (Ea), and field gradient (Gab), were calculated at each map site by explicitly summing over the electrostatic contributions from each atom in the box, excluding those atoms specified by each map.
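The explicit sum described above can be sketched as follows. This is a minimal illustration, not the in-house C code: it treats every atom as a point charge, sets the Coulomb constant to 1, ignores periodic boundary conditions, and adopts the conventions E = -∇Φ and Gab = ∂Ea/∂b, which are assumptions since sign and unit conventions differ between maps. The function name and argument layout are hypothetical.

```python
def electrostatics_at_site(site, atoms, excluded=frozenset()):
    """Sum point-charge contributions to potential, field, and field gradient.

    site     -- (x, y, z) position of the map site
    atoms    -- list of (charge, (x, y, z)) tuples for all atoms in the box
    excluded -- indices into `atoms` that the map's exclusion list removes

    Returns (phi, field, grad): a float, a 3-vector, and a 3x3 matrix, with
    the Coulomb constant set to 1 (assumed unit convention).
    """
    phi = 0.0
    field = [0.0, 0.0, 0.0]
    grad = [[0.0, 0.0, 0.0] for _ in range(3)]
    for idx, (charge, pos) in enumerate(atoms):
        if idx in excluded:
            continue
        # Vector from the charge to the site.
        r = [site[k] - pos[k] for k in range(3)]
        r2 = sum(c * c for c in r)
        r1 = r2 ** 0.5
        r3 = r2 * r1
        r5 = r3 * r2
        phi += charge / r1
        for a in range(3):
            # E_a = q r_a / r^3
            field[a] += charge * r[a] / r3
            for b in range(3):
                delta = 1.0 if a == b else 0.0
                # G_ab = dE_a/db = q (delta_ab / r^3 - 3 r_a r_b / r^5)
                grad[a][b] += charge * (delta / r3 - 3.0 * r[a] * r[b] / r5)
    return phi, field, grad
```

Passing the exclusion list as a set of atom indices keeps the same routine usable for any of the map conventions, since only the index set changes from map to map.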
For the Jansen and Skinner maps of Refs. 6-8, the excluded atom selections follow the convention of Ref. 6, i.e., for the amide bond linking residues n and n+1, the atom exclusion list consists of all non-side-chain atoms of residues n and n+1 as well as the CA, HA, C, and O atoms of residue n-1 and the N, H, CA, and HA atoms of residue n+2 (Figure S1A). For the dipeptide maps of Ref. 9, all protein atoms were included in the electrostatic calculation with the exception of the C, O, N, H, and flanking CA atoms of the amide bond under consideration (Figure S1B). For reference, the electrostatic frequency shift parameters employed by each map are listed in Table S1.

Figure S1. Excluded atom conventions for JO and SG electrostatic maps (A) and DC/DO electrostatic maps (B). Atoms displayed in blue are included in electrostatic frequency shift calculations for the amide bond displayed in red. The bond atoms themselves (red) and non-side-chain atoms of neighboring amino acids (grey) are excluded from electrostatic calculations.

Transition dipole moments for JO map simulations were calculated according to the electrostatic dipole map of Ref. 6; dipoles in the dipeptide simulations were calculated using the zero-field value of the same model, i.e., neglecting the non-Condon dependence of the transition dipole moment on the local electrostatic field. For SG map simulations, transition dipole moments were generated according to the ab initio calculations of Torii and Tasumi (summarized concisely following Eq. 14 of Ref. 8). For all four maps, the nearest-neighbor dihedral angle/coupling map of Ref. 6 was used to calculate nearest-neighbor coupling constants. Through-space coupling constants for the Jansen and dipeptide maps were calculated using the transition charge coupling (TCC) model of Ref. 6; through-space couplings for the Skinner map were calculated using the transition dipole coupling (TDC) model of Torii and Tasumi (Eq. 14 of Ref. 8).
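For orientation, the general point-dipole form behind a TDC model can be written in a few lines. The sketch below uses the textbook dipole-dipole interaction, not the specific Torii and Tasumi parametrization of Ref. 8 (which fixes the dipole magnitude, orientation, and location on each amide unit). The default prefactor assumes dipoles in Debye and distances in Å, and the function name is hypothetical.

```python
import math

# 1 D^2 / Å^3 expressed in cm^-1 (assumed unit convention for this sketch).
D2_PER_A3_IN_CM1 = 5034.1


def _dot(u, v):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(u, v))


def tdc_coupling(mu_i, mu_j, r_i, r_j, prefactor=D2_PER_A3_IN_CM1):
    """Point-dipole coupling between two transition dipoles.

    J = prefactor * [mu_i . mu_j - 3 (mu_i . n)(mu_j . n)] / r^3,
    where r is the distance between the dipole positions r_i and r_j
    and n is the unit vector joining them.
    """
    sep = [b - a for a, b in zip(r_i, r_j)]
    dist = math.sqrt(_dot(sep, sep))
    n = [c / dist for c in sep]
    angular = _dot(mu_i, mu_j) - 3.0 * _dot(mu_i, n) * _dot(mu_j, n)
    return prefactor * angular / dist ** 3


if __name__ == "__main__":
    # Two parallel 1 D dipoles, side by side and head to tail, 4 Å apart.
    side = tdc_coupling((0, 0, 1), (0, 0, 1), (0, 0, 0), (4, 0, 0))
    head = tdc_coupling((1, 0, 0), (1, 0, 0), (0, 0, 0), (4, 0, 0))
    print(f"side by side: {side:.1f} cm-1, head to tail: {head:.1f} cm-1")
```

The sign change between the side-by-side and head-to-tail geometries is the main qualitative feature: parallel dipoles stacked side by side couple positively, while collinear dipoles couple negatively with twice the magnitude.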
Map                         Atom   Ex        Ey         Gxy        Gzz
Jansen (JO)                 C      2188      171        -3953.9    -90
                            O      990       898.3      -469.4     433.6
                            N      470.3     -805.1     -4141.1    -1070.9
                            D      230.2     -1854.8    -2049.4    -454.8
Skinner (SG)                C      7729      --         --         --
                            N      -3576     --         --         --
Dipeptide / CHARMM27 (DC)   O      2765      --         --         --
Dipeptide / OPLS-AA (DO)    O      3608.3    --         --         --

Table S1. Frequency shift coefficients for the JO, SG, DC, and DO electrostatic maps used for spectral simulations. All coefficients are in units of cm-1 per elementary electrostatic unit (Eh/e for potential, Eh/(a0 e) for electric field, and Eh/(a0^2 e) for field gradient, where Eh indicates Hartrees, e is the elementary charge, and a0 is the Bohr radius).

All absorption spectral calculations were performed using the dynamic method of Torii to account explicitly for nonadiabatic motion of the excitation through the Amide I one-quantum manifold (Ref. 10). For isotope-labeled simulations, a frequency shift of -65 cm-1 was assumed for 13C18O labels; this value is based both on literature assignments (Refs. 11, 12) and on the observed difference between 13C and 13C18O peak frequencies in our data. For each MD trajectory and set of isotope labels, spectra were calculated both for the total protein (56 amide bonds) and for the isolated subset of isotope labels, i.e., for a truncated Hamiltonian including coupling between isotope-labeled sites but excluding the unlabeled portion of the protein.

Gaussian Fits to Isotope Label Peaks

For detailed analysis, Gaussian fits (red curves of Figure 5) were performed on the difference spectrum (black dashed curves) between various 13C18O isotope-labeled peptides and the unlabeled reference spectrum. The fit was performed in MATLAB (R2013a, The MathWorks, Inc.) over the positive-valued portion of the difference spectrum (black solid curve) and consists of either two or three Gaussian curves (dashed red curve), as necessary to provide acceptable overall fits (open circles) to the experimental data. Full fit parameters are provided in Table SII below. (Partial data are presented in Table II of the main text.)
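The model behind such a fit can be sketched as a sum of Gaussians evaluated over the positive part of a difference spectrum. The fits reported here were performed in MATLAB; the Python sketch below only illustrates the model and the residual that a least-squares optimizer would minimize. It assumes that each component is described by a peak amplitude, a center frequency, and a width taken as the full width at half maximum (the width convention is an assumption), and all function names are hypothetical.

```python
import math


def gaussian(x, amplitude, center, fwhm):
    """Single Gaussian parameterized by peak height, center, and FWHM (assumed)."""
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    return amplitude * math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))


def gaussian_sum(x, components):
    """Sum of Gaussians; `components` is a list of (amplitude, center, fwhm)."""
    return sum(gaussian(x, a, c, w) for a, c, w in components)


def positive_region(freqs, labeled, reference):
    """Return (frequency, difference) pairs where labeled - reference > 0."""
    return [(f, l - r) for f, l, r in zip(freqs, labeled, reference) if l - r > 0.0]


def sum_squared_residual(points, components):
    """Least-squares objective for a trial set of Gaussian components."""
    return sum((y - gaussian_sum(f, components)) ** 2 for f, y in points)


if __name__ == "__main__":
    # Synthetic example only: these numbers are illustrative, not measured data.
    trial = [(1.0, 1580.0, 25.0), (0.4, 1592.0, 12.0)]
    freqs = [1560.0 + 2.0 * i for i in range(26)]
    labeled = [gaussian_sum(f, trial) for f in freqs]
    reference = [0.05 for _ in freqs]
    points = positive_region(freqs, labeled, reference)
    print(len(points), sum_squared_residual(points, trial))
```

Restricting the objective to the positive-valued points mirrors the choice described above of fitting only the positive portion of the difference spectrum.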
Table SII. Gaussian fit parameters.

Labeled Site   Component 1                      Component 2                      Component 3
Ala            0.86, 1583.6 cm-1, 30.2 cm-1     0.14, 1587.9 cm-1, 13.8 cm-1     --
Gly            0.88, 1579.0 cm-1, 30.1 cm-1     0.12, 1596.8 cm-1, 14.8 cm-1     --
Leu            0.56, 1585.6 cm-1, 10.7 cm-1     0.44, 1580.8 cm-1, 23.5 cm-1     --
Phe            0.60, 1594.0 cm-1, 11.7 cm-1     0.27, 1581.9 cm-1, 12.0 cm-1     0.13, 1574.3 cm-1, 27.9 cm-1
Val            0.56, 1576.2 cm-1, 27.8 cm-1     0.38, 1586.9 cm-1, 11.0 cm-1     0.06, 1596.7 cm-1, 8.4 cm-1

References

1 R.N. Maxson, Inorg. Synth. 1, 147 (1939).
2 M. Samuel-Landtiser, C. Zachariah, C.R. Williams, A.S. Edison, and J.R. Long, Curr. Protoc. Protein Sci. Chapter 26, Unit 26.3 (2007).
3 B. Hess, C. Kutzner, D. van der Spoel, and E. Lindahl, J. Chem. Theory Comput. 4, 435 (2008).
4 S. Nauli, B. Kuhlman, I. Le Trong, R.E. Stenkamp, D. Teller, and D. Baker, Protein Sci. 11, 2924 (2002).
5 B. Hess, H. Bekker, H.J.C. Berendsen, and J.G.E.M. Fraaije, J. Comput. Chem. 18, 1463 (1997).
6 T. la Cour Jansen, A.G. Dijkstra, T.M. Watson, J.D. Hirst, and J. Knoester, J. Chem. Phys. 125, 044312 (2006).
7 T. la Cour Jansen and J. Knoester, J. Chem. Phys. 124, 044502 (2006).
8 L. Wang, C.T. Middleton, M.T. Zanni, and J.L. Skinner, J. Phys. Chem. B 115, 3713 (2011).
9 M. Reppert and A. Tokmakoff, J. Chem. Phys. 138, 134116 (2013).
10 H. Torii, J. Phys. Chem. A 110, 4822 (2006).
11 C. Fang, J. Wang, Y.S. Kim, A.K. Charnley, W. Barber-Armstrong, A.B. Smith III, S.M. Decatur, and R.M. Hochstrasser, J. Phys. Chem. B 108, 10415 (2004).
12 A.M. Woys, A.M. Almeida, L. Wang, C.-C. Chiu, M. McGovern, J.J. de Pablo, J.L. Skinner, S.H. Gellman, and M.T. Zanni, J. Am. Chem. Soc. 134, 19118 (2012).