Supplementary Material for

Utilization of paramagnetic relaxation enhancements for high-resolution NMR structure determination of a soluble loop-rich protein with sparse NOE distance restraints

Kyoko Furuita, Saori Kataoka, Toshihiko Sugiki, Yoshikazu Hattori, Naohiro Kobayashi, Takahisa Ikegami, Kazuhiro Shiozaki, Toshimichi Fujiwara and Chojiro Kojima

This file includes:
Supplementary Materials and Methods
Supplementary Discussions
Supplementary References
Supplementary Tables S1 to S4
Supplementary Figures S1 to S12

Supplementary Materials and Methods

Preparation of Sin1CRIM protein

Construction of the pCold-GST expression vector encoding Sin1CRIM (SpSin1, amino acids 247-400), and expression and purification of unlabeled, 15N-labeled and 13C,15N-labeled Sin1CRIM protein, were performed as previously described (Kataoka et al. 2014). Briefly, the cDNA encoding Sin1CRIM was amplified by PCR and inserted into the pCold-GST vector (Hayashi and Kojima 2008). Recombinant Sin1CRIM was overexpressed in E. coli Rosetta™ (DE3) (Novagen). Cells overexpressing Sin1CRIM were harvested by centrifugation and then disrupted by ultrasonication. The crude membrane debris was pelleted by ultracentrifugation, and the supernatant was loaded onto a Glutathione Sepharose 4B column (GE Healthcare). Sin1CRIM was eluted using a buffer containing 50 mM reduced glutathione. The N-terminal GST tag of the Sin1CRIM protein was removed by digestion with human rhinovirus (HRV) 3C protease. Following protease treatment, the Sin1CRIM protein sample was further purified by size-exclusion chromatography (SEC).

PRE-derived distance restraints

PRE-derived distance restraints were calculated as follows. First, the contribution of the oxidized spin label to the relaxation rates was calculated from the intensity ratios of 1H-15N HSQC spectra in the paramagnetic and diamagnetic states (Figures S2 and S3), according to equation (1) (Battiste and Wagner 2000).
$$ \frac{I_{ox}}{I_{red}} = \frac{R_2 \exp(-R_2^{sp} t)}{R_2 + R_2^{sp}} \qquad (1) $$

where $I_{ox}$ and $I_{red}$ are the peak intensities in the paramagnetic and diamagnetic states, respectively, and $t$ is the total INEPT evolution time of the 1H-15N HSQC (10 ms). $R_2$ is the transverse relaxation rate of the amide spin in the diamagnetic state, and $R_2^{sp}$ is the contribution of the electron spin to the relaxation rate in the paramagnetic state. $R_2^{sp}$ was then converted into distances using the following equation,

$$ r = \left[ \frac{K}{R_2^{sp}} \left( 4\tau_c + \frac{3\tau_c}{1 + \omega_h^2 \tau_c^2} \right) \right]^{1/6} $$

where $r$ is the distance between the electron center on MTSL and the nuclear spin, $\tau_c$ is the correlation time for the electron-nuclear interaction, $\omega_h$ is the Larmor frequency of the proton nuclear spin, and $K$ is $1.23 \times 10^{-32}\ \mathrm{cm^6\,s^{-2}}$, a combination of physical constants (Battiste and Wagner 2000). For calculating the distances, $\tau_c$ was approximated by the global correlation time of the protein estimated from its molecular weight (10 ns), and $R_2$ was estimated from the line width at half-height ($\Delta\nu_{1/2}$) in the proton dimension of 1H-15N HSQC spectra using the equation $R_2 = \pi \Delta\nu_{1/2}$, under reduced conditions (76.8 ms on average). Line widths and peak intensities were estimated using the program Sparky. All peak intensity ratios were calculated from peak height ratios.

The error range of the peak intensity ratio was much larger than that expected from the noise level. To evaluate the error range experimentally, we focused on the peak intensity ratios greater than 1. Because the true peak intensity ratio lies between 0 and 1, ratios greater than 1 can serve as indicators of the experimental error. The average value and error range of the peak intensity ratios greater than 1 were 1.06 ± 0.15 (2.5σ). In other words, errors of up to 20% were expected. Therefore, only intensity ratios below 0.8 were regarded as being influenced by PRE.
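As a rough illustration of the two-step conversion described above, the Python sketch below inverts equation (1) for the paramagnetic contribution by bisection and then applies the distance equation. It is a minimal sketch under stated assumptions, not the procedure used in this work: the function names, the 600 MHz proton frequency and the R2 value in the usage lines are assumptions made for illustration, and only t = 10 ms, the 10 ns correlation time and the constant K are taken from the text.

```python
import math

K = 1.23e-32        # cm^6 s^-2, constant quoted in the text (Battiste and Wagner 2000)
T_INEPT = 10e-3     # s, total INEPT evolution time quoted in the text
TAU_C = 10e-9       # s, correlation time quoted in the text


def intensity_ratio(r2sp, r2, t=T_INEPT):
    """Equation (1): Iox/Ired for a paramagnetic contribution r2sp (s^-1)."""
    return r2 * math.exp(-r2sp * t) / (r2 + r2sp)


def solve_r2sp(ratio, r2, t=T_INEPT, tol=1e-9):
    """Invert equation (1) by bisection; ratio must lie strictly between 0 and 1."""
    if not 0.0 < ratio < 1.0:
        raise ValueError("ratio must be between 0 and 1 (exclusive)")
    lo, hi = 0.0, 1.0
    while intensity_ratio(hi, r2, t) > ratio:    # expand until the root is bracketed
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if intensity_ratio(mid, r2, t) > ratio:  # the ratio decreases as r2sp grows
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def pre_distance(r2sp, proton_mhz=600.0, tau_c=TAU_C):
    """Distance equation; returns r in angstroms. proton_mhz is an assumed field."""
    omega_h = 2.0 * math.pi * proton_mhz * 1e6   # proton Larmor frequency, rad/s
    spectral = 4.0 * tau_c + 3.0 * tau_c / (1.0 + (omega_h * tau_c) ** 2)
    r_cm = (K / r2sp * spectral) ** (1.0 / 6.0)
    return r_cm * 1e8                            # cm -> angstrom


# Illustrative values only: r2 = 50 s^-1 is not taken from the source.
r2sp = solve_r2sp(ratio=0.5, r2=50.0)
print(f"R2sp = {r2sp:.1f} s^-1, r = {pre_distance(r2sp):.1f} A")
```

Bisection is used here only because equation (1) has no closed-form inverse and the ratio decreases monotonically with the paramagnetic contribution; any one-dimensional root finder would serve equally well.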
PRE-derived distance restraints were introduced between amide protons and the Cβ atoms of the mutated residues with an error of ± 7 Å. A smaller error of ± 6 Å gave larger CYANA target function values, and a larger error of ± 8 Å did not give significant improvements. Thus, an error of ± 7 Å is reasonable in our system, although it is much larger than the previously reported values of 2 - 4 Å (see Table S4 and references therein). This large error could arise from the flexibility of MTSL. If this flexibility is a key factor, ensemble representations of spin labels combined with two time-point measurements (Iwahara et al. 2004, 2007) will be useful. In any case, further studies are necessary to understand why the error is so large in our system.

Structure calculation for structure determination

Structure calculations were performed in combination with automated NOE assignments, in the absence or presence of PRE-derived distance restraints obtained from 9 spin-labeled mutants (T280C, S282C, R291C, S301C, K312C, L332C, S371C, T384C and A394C). The input data for each structure calculation are provided in Table 1. The 10 structures with the lowest target function calculated by CYANA were further refined using Xplor-NIH 2.31 (Schwieters et al. 2003, 2006). The initial structures for the Xplor-NIH refinements were generated with a single MTSL nitroxide label at each mutated position. In the Xplor-NIH structure calculations, PRE distance restraints were introduced for distances between the NS1 atoms of the MTSL labels and amide protons with an error of 4 Å. The structure refinements were performed with NOE distance, PRE distance and dihedral angle restraints. Ten structures were calculated starting from each structure. The 10 lowest energy structures were selected and analyzed.
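The ± 7 Å error window applied to the CYANA PRE restraints above can be turned into upper and lower distance limits along the following lines. This is a minimal sketch, not the procedure used in this work: the text does not state how amides were assigned to the upper-only, lower-only and two-sided classes, so the classification rule, the function name and the `bleach_cutoff` value are assumptions. Only the 7 Å error and the 0.8 intensity-ratio cutoff are taken from the text.

```python
from typing import NamedTuple, Optional


class PreBounds(NamedTuple):
    lower: Optional[float]   # angstrom, None if no lower limit is applied
    upper: Optional[float]   # angstrom, None if no upper limit is applied


def pre_bounds(ratio, distance, error=7.0, no_pre_cutoff=0.8, bleach_cutoff=0.1):
    """Classify one amide by its Iox/Ired ratio and return distance limits.

    distance is the PRE-derived distance (angstrom) for this amide.
    error=7.0 and no_pre_cutoff=0.8 follow the text; bleach_cutoff is hypothetical.
    """
    lower = max(distance - error, 0.0)
    upper = distance + error
    if ratio >= no_pre_cutoff:        # no measurable PRE: lower limit only
        return PreBounds(lower, None)
    if ratio <= bleach_cutoff:        # peak almost broadened out: upper limit only
        return PreBounds(None, upper)
    return PreBounds(lower, upper)    # intermediate ratio: restrain both sides


print(pre_bounds(0.5, 18.0))    # two-sided restraint
print(pre_bounds(0.95, 22.0))   # lower limit only
```

Keeping the error and both cutoffs as parameters makes it straightforward to repeat a comparison such as the ± 6, ± 7 and ± 8 Å trials without touching the classification logic.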
The atomic coordinates of the refined structures of Sin1CRIM and the structural restraints, including PREs and RDCs, have been deposited in the Protein Data Bank with accession code 2RUJ.

Structure calculations using the fixed list of NOE upper distance limits

Calculations were performed in the absence or presence of PRE-derived distance restraints obtained from the 9 spin-labeled mutants referred to in the previous section. Apart from the PRE-derived distance restraints, the structural restraints were the NOE upper distance and dihedral angle restraints created by CYANA in the structure calculations combined with automated NOE assignments in the presence of PRE-derived distance restraints (described in the previous section), together with the experimentally determined φ, ψ and χ1 dihedral angles.

Structure calculations with varied number of PRE restraints

Sets of PRE-derived distance restraints comprising 12.5, 25, 37.5, 50, 62.5, 75 or 87.5% of the original PRE-derived distance restraints, derived from the 9 spin-labeled mutants, were prepared by randomly selecting restraints from the original restraints using Microsoft Excel. Ten sets of PRE-derived distance restraints were prepared for each percentage group. Each set of PRE-derived distance restraints was introduced in structure calculations combined with automated NOE assignments. Except for the PRE-derived distance restraints, the input data were the same as those used in the structure calculations for the structure determination described above.

Structure calculations using modified NOE peak lists

First, manual peak picking was applied to the 13C- and 15N-edited NOESY spectra at the lowest threshold. The threshold was set to 4 and 6 times the noise level for the 13C- and 15N-edited NOESY-HSQC spectra, respectively. Then a series of NOE peak lists was prepared by increasing the threshold of the NOESY spectra from 100% to 120%, 140%, 160%, 180% and 200%.
These peak lists were then used in structure calculations combined with automated NOE assignments. Except for the peak lists, the input data were the same as those used in the structure calculations for the structure determination described above.

Supplementary Discussions

Impact of the quality of NOESY spectra on structure determination

The impact of the quality of NOESY spectra on structure determination was investigated by modifying the NOE peak lists. First, manual peak picking was applied to the 13C- and 15N-edited NOESY spectra at the lowest thresholds. The structure was calculated by the automated NOE assignment procedure using PRE-derived distance restraints and a series of NOE peak lists (Figure S12). These NOE peak lists were prepared by increasing the threshold of the NOESY spectra from 100% to 200%; as the threshold increases, the number of NOE peaks decreases. All structures except for those shown in Figure S12 were calculated using the NOE peak lists prepared at the 140% threshold. When the threshold of the NOESY spectra was higher than 140% of the original, the backbone RMSD increased significantly (Figure S12). When the threshold was lower than 140% of the original, no significant change in backbone RMSD was observed (Figure S12). These results indicate that a certain level of NOESY spectral quality is required for convergence of the calculated structure, even if PRE-derived distance restraints are used. Furthermore, it is conceivable that some weak NOEs are critical for convergence. On the other hand, the RDC correlation coefficients did not show a clear dependence on the threshold of the NOESY spectra (Figure S12). These results indicate that the accuracy of the structure tends to be maintained independently of the quality of the NOESY spectra, if PRE-derived distance restraints are used.
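The threshold series discussed above can be sketched in a few lines of Python. This is an illustrative sketch only: the function name, the peak-list representation and the example intensities are hypothetical, and the peak lists in this work were prepared from the NOESY spectra themselves rather than with this code. The base threshold in units of the noise level (4 or 6) and the 100% to 200% scaling steps follow the text.

```python
def threshold_series(peaks, noise, multiple, factors=(1.0, 1.2, 1.4, 1.6, 1.8, 2.0)):
    """Return {factor: peaks kept} for each scaled threshold.

    peaks    : list of (label, intensity) pairs picked at the lowest threshold
    noise    : spectral noise level, in the same units as the intensities
    multiple : base threshold in units of the noise level (4 or 6 in the text)
    """
    base = multiple * noise
    return {
        f: [(label, inten) for label, inten in peaks if abs(inten) >= f * base]
        for f in factors
    }


# Hypothetical peak list; labels and intensities are made up for illustration.
peaks = [("p1", 4.1), ("p2", 5.0), ("p3", 6.2), ("p4", 7.9), ("p5", 12.0)]
series = threshold_series(peaks, noise=1.0, multiple=4)
for factor, kept in series.items():
    print(f"{round(factor * 100)}% threshold: {len(kept)} peaks")
```

Each list in the series is a subset of the one before it, which is why raising the threshold can only remove NOE peaks, weak ones first.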
Technical implementation of PRE data in structure calculation

Battiste and Wagner utilized distances derived from paramagnetic broadening of 1H-15N HSQC spectra in protein structure determinations, where the flexibility of the spin label was accounted for by using a wide distance error range (Battiste and Wagner 2000). Another way to account for the flexibility of the spin label was proposed by Iwahara et al., in which the spin label is represented as an ensemble of spin labels (Iwahara et al. 2004). With this approach, 1H-PRE data arising from a flexible paramagnetic group could be accurately utilized in structure refinement (Iwahara et al. 2004). To measure PRE relaxation rates accurately, a two time-point measurement has been proposed (Iwahara et al. 2007). In this study, we used the method proposed by Battiste and Wagner. This method was used for 19 out of the 20 NMR structures of membrane proteins found in the database 'Membrane Proteins of Known Structure Determined by NMR' (http://www.drorlist.com/nmr/MPNMR.html) (Table S4).

Supplementary References

Battiste JL & Wagner G (2000) Utilization of site-directed spin labeling and high-resolution heteronuclear nuclear magnetic resonance for global fold determination of large proteins with limited nuclear Overhauser effect data. Biochemistry 39: 5355–65.
Bhattacharya A, Tejero R & Montelione GT (2007) Evaluating protein structures determined by structural genomics consortia. Proteins 66: 778–95.
Bowie JU, Lüthy R & Eisenberg D (1991) A method to identify protein sequences that fold into a known three-dimensional structure. Science 253: 164–70.
Hayashi K & Kojima C (2008) pCold-GST vector: a novel cold-shock vector containing GST tag for soluble protein production. Protein Expr Purif 62: 120–127.
Kataoka S, Furuita K, Hattori Y, Kobayashi N, Ikegami T, Shiozaki K, Fujiwara T & Kojima C (2014) 1H, 15N and 13C resonance assignments of the conserved region in the middle domain of S. pombe Sin1 protein. Biomol NMR Assign, in press.
Iwahara J, Schwieters CD & Clore GM (2004) Ensemble approach for NMR structure refinement against 1H paramagnetic relaxation enhancement data arising from a flexible paramagnetic group attached to a macromolecule. J Am Chem Soc 126: 5879–96.
Iwahara J, Tang C & Clore GM (2007) Practical aspects of 1H transverse paramagnetic relaxation enhancement measurements on macromolecules. J Magn Reson 184: 185–195.
Laskowski RA, Rullmann JA, MacArthur MW, Kaptein R & Thornton JM (1996) AQUA and PROCHECK-NMR: programs for checking the quality of protein structures solved by NMR. J Biomol NMR 8: 477–86.
Lüthy R, Bowie JU & Eisenberg D (1992) Assessment of protein models with three-dimensional profiles. Nature 356: 83–5.
Sippl MJ (1993) Recognition of errors in three-dimensional structures of proteins. Proteins 17: 355–62.

Table S1. Mutants of Sin1CRIM designed for the site-directed spin labeling.

Mutant   constructionᵃ   expressionᵇ   solubilityᵇ   purification   NMR measurement
S248C    × - - - -
S256C    × - - - -
D260C    × - - - -
S269C
T280C
S282C
S287C    × - - - -
R291C
S298C    × - - -
S301C
K304C    × - - - -
K312C    Δ
S317Cᶜ
S319C    × - - - -
G321Cᶜ
Q331C    Δ Δ
L332C
V333C    × - - - -
Q341C
R349C    × - - - -
G355Cᶜ
E359C    × - - - -
D360C    × - - - -
F361Cᶜ
A363C    × - - - -
R366C
S371C
K382C    × - - - -
T384C
A386Cᶜ
Q392C    × - - - -
A393C    NAᵈ - - -
A394C
Y395C    × - - - -
S399C

ᵃThe generation of the plasmid containing the appropriate mutation was either successful () or unsuccessful (×).
ᵇThe expression level and solubility of the mutants were either high () or low (Δ).
ᶜMutants designed after the structure determination.
ᵈNot attempted.

Table S2. Backbone RMSDs and RDC correlation coefficients of structures determined in the presence of PRE-derived distance restraints derived from a single spin-labeled sample or 9 spin-labeled samples.
labeled residue   backbone RMSD (Å)   RDC correlation coefficient
280               2.32 ± 0.29         0.64 ± 0.10
282               2.35 ± 0.73         0.60 ± 0.10
291               2.80 ± 0.58         0.73 ± 0.11
301               3.70 ± 0.77         0.49 ± 0.15
312               2.40 ± 0.51         0.62 ± 0.06
332               1.96 ± 0.85         0.38 ± 0.20
371               3.60 ± 0.98         0.64 ± 0.07
384               3.19 ± 0.84         0.49 ± 0.11
394               2.67 ± 0.77         0.58 ± 0.05
9 residuesᵃ       0.91 ± 0.17         0.86 ± 0.05

ᵃResidues 280, 282, 291, 301, 312, 332, 371, 384 and 394.

Table S3. Structural statistics for refined structures of Sin1CRIM.

Completeness of resonance assignment (%)
  Backbone                                  93
  Side chain                                71
  Aromatic                                  20
Conformationally restricting restraints
  Distance restraints
    NOE
      Total                                 929
      Short range (|i-j| ≤ 1)               612
      Medium range (1 < |i-j| < 5)          126
      Long range (|i-j| ≥ 5)                191
    PRE
      Total                                 867
      Upper distance restraints             163
      Lower distance restraints             704
    Hydrogen-bond restraints                0
    Disulfide restraints                    0
  Dihedral angle restraints
    Total                                   212
    Backbone                                200
    Side chain                              12
Distance violation > 0.5 Å                  0
Dihedral angle violation > 10°              0
Model quality
  Rmsd backbone atoms (Å)¹                  1.0
  Rmsd heavy atoms (Å)¹                     1.5
  Rmsd bond lengths (Å)                     0.011
  Rmsd bond angles (°)                      1.4
  RDC correlation coefficient               0.89 ± 0.03
PROCHECK Ramachandran statistics¹,²,³
  Most favored regions (%)                  84.2
  Additionally allowed regions (%)          12.6
  Generously allowed regions (%)            2.4
  Disallowed regions (%)                    0.7
Global quality scores (raw/Z score)³
  Verify3D⁴                                 0.20/-4.17
  ProsaII⁵                                  0.26/-1.61
  PROCHECK (φ-ψ)¹,²                         -0.54/-1.81
  PROCHECK (all)¹,²                         -0.41/-2.42
  MolProbity clash score⁶                   30.57/-3.72

¹Calculated for amino acids 275-395.
²Laskowski et al. 1996.
³Calculated using PSVS version 1.5 (Bhattacharya et al. 2007).
⁴Sippl 1993.
⁵Bowie et al. 1991; Lüthy et al. 1992.
⁶Davis et al. 2007.

Table S4. Membrane proteins determined using PRE restraints found in the database, “Membrane Proteins of Known Structure Determined by NMR”.

Protein                            PDB ID   Method¹                  Reference
Mistic                             1YGM     A (TROSY)                Roosild et al. 2005
FXYD1                              2JO1     A (HSQC)                 Teriete et al. 2007
KCNE1                              2K21     A (TROSY)                Kang et al. 2008
DsbB                               2K73     A (TROSY)                Zhou et al. 2008
DsbB                               2K74     A (TROSY)                Zhou et al. 2008
DAGK                               2KDC     A (TROSY)                Van Horn et al. 2009
Rv1761c                            2K3M     A (HSQC)                 Page et al. 2009
ArcB                               2KSD     A (TROSY)                Maslennikov et al. 2010
QseC                               2KSE     A (TROSY)                Maslennikov et al. 2010
KdpD                               2KSF     A (TROSY)                Maslennikov et al. 2010
UCP2                               2LCK     A (TROSY-HNCO)           Berardi et al. 2011
Proteorhodopsin                    2L6X     Combination of A and B   Reckel et al. 2011
HIGD1A                             2LOM     A (TROSY)                Klammt et al. 2012
HIGD1B                             2LON     A (TROSY)                Klammt et al. 2012
TMEM14A                            2LOP     A (TROSY)                Klammt et al. 2012
FAM14B                             2LOQ     A (TROSY)                Klammt et al. 2012
TMEM141                            2LOR     A (TROSY)                Klammt et al. 2012
TMEM14C                            2LOS     A (TROSY)                Klammt et al. 2012
Human glycine receptor alpha1 TM   2M6I     A (HSQC)                 Mowrey et al. 2013
t-SNARE Syntaxin-1A                2M8R     A (TROSY)                Liang et al. 2013

¹Methods that were used to implement PRE in structure determination. A, method proposed by Battiste and Wagner; experiments that were used to measure PRE are in parentheses. B, method proposed by Iwahara et al.

Figure S1. Concentration dependence of the peak intensities of the 1H-15N HSQC spectra of MTSL-conjugated Sin1CRIM (K312C). The peak height ratios between 200 and 50 μM, and between 100 and 50 μM, are shown as gray and black points, respectively. The averaged values are 0.90 ± 0.04 and 1.03 ± 0.03 for 200 μM / 50 μM (gray) and 100 μM / 50 μM (black), respectively. These values indicate that the peak intensity at 200 μM is 10% lower than expected.

Figure S2. Overlay of 1H-15N HSQC spectra of Sin1CRIM WT (blue) and MTSL-conjugated mutants in the diamagnetic (green) and paramagnetic (red) states. The G321C, Q341C and G355C mutants showed dramatic chemical shift changes. In the case of Q331C, the 1H-15N HSQC spectrum could not be measured with a sufficient signal-to-noise ratio. The 1H-15N HSQC spectrum of R366C was completely altered upon the change from the oxidized to the reduced state.

Figure S3.
Intensity ratio of 1H-15N HSQC peaks of the paramagnetic and diamagnetic states. Error bars indicate experimental uncertainties based on the noise level in the NMR spectra.

Figure S4. Time dependence of the average peak heights of 1H-15N HSQC spectra of MTSL-conjugated Sin1CRIM (K312C). The spectra were measured serially 15 times.

Figure S5. Concentration dependence of the PRE values of MTSL-conjugated Sin1CRIM (K312C). PRE ratios between 100 and 50 μM of protein are shown. The PRE values were evaluated from the intensity ratio of 1H-15N HSQC spectra in the presence or absence of 1 mM ascorbic acid. The errors were calculated from the root-mean-square of the spectral noise. The averaged value is 1.03 ± 0.05, indicating that the PRE values are the same at the two protein concentrations, 100 and 50 μM.

Figure S6. Location of PRE-derived distance restraints obtained for each mutant. MTSL-conjugated cysteine residues are shown by yellow spheres. Red, residues restrained by upper distances; magenta, residues restrained by both upper and lower distances; cyan, residues restrained by lower distances.

Figure S7. The correlation plots of back-calculated versus experimental RDCs for the lowest energy structure calculated by CYANA in the absence of PRE (a), the lowest energy structure calculated by CYANA in the presence of PRE (b) and the lowest energy structure refined by Xplor-NIH (c). The correlation coefficients are 0.63 (a), 0.92 (b) and 0.87 (c). The slope of the line through the origin is 1.

Figure S8. (a) A superimposed representation of the 10 lowest energy structures. (b) A ribbon representation of the lowest energy structure.

Figure S9. NOEs used for the structure calculations are shown by thin black lines on the lowest target function structure of Sin1CRIM.

Figure S10. RMSD values of backbone atoms and correlation coefficients between experimental RDC values and back-calculated RDC values obtained from one of the final 10 structures with the lowest target function, which were calculated using reduced PRE distance restraints (left) and PRE distance restraints obtained with any one of S77C, F121C or A146C in addition to 100% of the PRE distance restraints (right).

Figure S11. NOE-derived long-range distance restraints that increased by employing PRE-derived distance restraints in the automated NOE assignments by CYANA. NOEs are shown by lines on the lowest target function structure of Sin1CRIM.

Figure S12. Influence of the quality of NOESY spectra on structure calculations. A series of NOE peak lists, which were prepared by increasing the threshold of the NOESY spectra from 100% to 200%, were used in structure calculations. The RMSD values of backbone atoms and the correlation coefficients between experimental and back-calculated RDC values for the calculated structures are shown.