Crystal Structure of the CorA Mg2+ Transporter

Vladimir V. Lunin1*, Elena Dobrovetsky2*, Galina Khutoreskaya2, Rongguang Zhang3, Andrzej Joachimiak3, Declan A. Doyle4, Alexey Bochkarev2,5,6, Michael E. Maguire7, Aled M. Edwards1,2,3,5,6, Christopher M. Koth2,8

1 Department of Medical Biophysics, University of Toronto, 112 College Street, Toronto, ON, Canada M5G 1L6
2 Banting and Best Department of Medical Research, University of Toronto, 112 College Street, Toronto, ON, Canada M5G 1L6
3 Structural Biology Center & Midwest Center for Structural Genomics, Biosciences Division, Argonne National Laboratory, 9700 S. Cass Av. Argonne, IL 60439
4 Structural Genomics Consortium Botnar Research Centre, Oxford, Oxon OX3 7LD, United Kingdom
5 Department of Medical Genetics and Microbiology, University of Toronto, 112 College Street, Toronto, ON, Canada M5G 1L6
6 Structural Genomics Consortium, Banting Institute, 100 College Street, Toronto, ON, Canada M5G 1L5
7 Department of Pharmacology, Case Western Reserve University, Cleveland, OH 44106-4965
* These authors contributed equally to this work.
8 Present address: Vertex Pharmaceuticals Incorporated, 130 Waverly St., Cambridge, MA 02139 USA

Supplementary Information

Methods

Expression and purification of full-length and soluble domain of T. maritima CorA. The full coding sequence of T. maritima CorA was amplified by PCR from genomic DNA using high fidelity Pfu polymerase (Stratagene). The restriction site NdeI was added to the 5' end of the PCR product. The restriction site BamHI was added to the 3' end of the PCR product. The digested PCR product was inserted directionally between the corresponding sites of a modified pET15b (Novagen) T7 polymerase expression vector in which the thrombin cleavage site (LVPR^GS) had been replaced with a TEV protease recognition site (ENLYFQ^G), generating plasmid pET-TMCORA. The recombinant protein was expressed as a fusion to an N-terminal six-histidine tag and a TEV protease site.
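The tag-removal step implied by this construct can be illustrated with a short sketch. It relies only on the cleavage position given above (between Q and G of ENLYFQ^G). The function name `tev_cleave` and the toy fusion sequence are hypothetical and do not reproduce the actual pET-TMCORA linker or CorA sequence.

```python
TEV_SITE = "ENLYFQG"  # recognition site; cleavage occurs between Q and G


def tev_cleave(sequence):
    """Split a protein sequence at the first TEV site.

    Returns (tag_fragment, released_protein), or None if no site is found.
    """
    start = sequence.find(TEV_SITE)
    if start == -1:
        return None
    # The cut falls just before the final G of the site, so the G
    # stays on the released protein.
    cut = start + len(TEV_SITE) - 1
    return sequence[:cut], sequence[cut:]


# Hypothetical fusion: Met + His6 tag + TEV site + placeholder residues
# (illustration only; not the real construct sequence).
toy_fusion = "M" + "H" * 6 + TEV_SITE + "AAAKKK"
tag_fragment, released = tev_cleave(toy_fusion)
print(tag_fragment)  # MHHHHHHENLYFQ
print(released)      # GAAAKKK
```

In this toy case the His-tagged fragment would be retained on a Ni-NTA resin while the released protein flows through.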
Nine other prokaryotic CorA homologues were similarly cloned and tested for expression, and their purification was attempted (data not shown). A construct for expression of the soluble domain of CorA (residues 1-266) was generated by inserting a stop codon (TGA) in plasmid pET-TMCORA, immediately after the sequence coding for residue E266, using QuikChange mutagenesis (Stratagene).

E. coli cells containing the T. maritima CorA expression plasmids described above were grown in 12 L of Luria Broth at 37°C to an OD600 of 0.6-0.8. IPTG was then added to a final concentration of 1.0 mM. The cells were grown for an additional 12 hours at 16°C, harvested by centrifugation and frozen in liquid nitrogen. The frozen pellet was resuspended in 250 mL ice-cold Lysis Buffer (50 mM HEPES, pH 7.5, 500 mM NaCl, 5% glycerol and protease inhibitors (Complete Protease Inhibitor Cocktail, Roche, used according to the technical manual)) and lysed by French press. All subsequent procedures were carried out at 4°C unless otherwise specified. The suspension was centrifuged for 30 minutes at 100 000 x g. The pellets were then solubilized in 250 mL Binding Buffer (50 mM HEPES, pH 7.5, 500 mM NaCl, 5% glycerol, 10 mM imidazole, 1% n-dodecyl-β-D-maltopyranoside (DDM, Anatrace), protease inhibitors) and stirred gently for 12 hours. The sample was then centrifuged for 30 minutes at 100 000 x g, and the supernatant was loaded onto a 1 x 10 cm Ni-NTA gravity column equilibrated with Binding Buffer. The column was washed with 20 column volumes of Wash Buffer (50 mM HEPES, pH 7.5, 500 mM NaCl, 5% glycerol, 35 mM imidazole, 0.02% DDM). Bound protein was eluted with Elution Buffer (50 mM HEPES, pH 7.5, 500 mM NaCl, 5% glycerol, 200 mM imidazole, 0.02% DDM) and dialyzed overnight against Dialysis Buffer (50 mM HEPES, pH 7.5, 500 mM NaCl, 5% glycerol).
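As a rough guide to the volumes involved in the wash step, the arithmetic can be sketched as follows. This assumes that "1 x 10 cm" denotes a 1 cm inner diameter and a fully packed 10 cm bed height, and that imidazole is added from a 1 M stock; both are assumptions made for illustration, and the function names are hypothetical.

```python
import math


def bed_volume_ml(diameter_cm, height_cm):
    """Volume of a cylindrical resin bed in mL (1 cm^3 = 1 mL)."""
    radius_cm = diameter_cm / 2.0
    return math.pi * radius_cm ** 2 * height_cm


def stock_volume_ml(final_mM, stock_mM, final_volume_ml):
    """Solve C1 * V1 = C2 * V2 for the stock volume V1."""
    return final_mM * final_volume_ml / stock_mM


# Assumed geometry: 1 cm diameter, 10 cm packed bed height.
column_volume = bed_volume_ml(1.0, 10.0)

# 20 column volumes of Wash Buffer, as in the protocol.
wash_volume = 20 * column_volume

# Imidazole needed for 35 mM final, from an assumed 1 M (1000 mM) stock.
imidazole_stock = stock_volume_ml(35.0, 1000.0, wash_volume)

print(f"column volume:   {column_volume:.2f} mL")
print(f"wash volume:     {wash_volume:.1f} mL")
print(f"imidazole stock: {imidazole_stock:.2f} mL")
```

Under these assumptions one column volume is just under 8 mL, so the 20-column-volume wash corresponds to roughly 157 mL of Wash Buffer.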
During dialysis, removal of the hexahistidine tag was facilitated by the addition of histidine-tagged TEV protease (Invitrogen), according to the TEV protease technical manual (Invitrogen). Digestions were monitored by SDS-polyacrylamide gel electrophoresis. The resulting proteins contained three additional residues at the N terminus (Gly-Ser-His). TEV protease and the histidine tag were separated from CorA by collecting the flow-through from a second Ni-NTA column purification, as described earlier. Expression of selenomethionine-labelled protein was essentially as described (Korolev et al, 2002). The level of selenomethionine incorporation was regulated by the addition of unlabelled methionine to the growth media and monitored by mass spectrometry (data not shown).

Crystallization of T. maritima CorA. For crystallization, protein solutions were used immediately after purification or stored at –78°C. Crystals were grown by the hanging drop method at 22°C, at a protein concentration of 2-4 mg/ml. Initial 'hits' were obtained using the Nextal MB Class I and II screens (Nextal). Starting from these initial conditions, the reservoir solutions were optimized. In the case of full-length T. maritima CorA, 2 µl protein was mixed with 2 µl reservoir solution containing 20% (w/v) PEG 2000 (Fluka), 0.3 M Mg(NO3)2 and 0.1 M Tris pH 8.0. Needle-like crystals appeared after 3-5 days and matured to full size within 2-3 weeks. In the case of the CorA soluble domain, 2 µl protein was mixed with 2 µl reservoir solution containing 35% (w/v) PEG 3350 (Fluka), 0.2 M MgCl2 and 0.1 M Tris pH 8.5. Diamond-shaped crystals formed after two days and matured to full size within 1-2 weeks.

Structure determination. Native crystals of full-length T. maritima CorA, which formed only in the presence of magnesium, diffracted X-rays to 3.9 Å. Despite extensive efforts, heavy-metal derivatives of full-length T. maritima CorA suitable for phase determination could not be obtained.
Selenomethionine-containing crystals were grown and a dataset collected, but the crystals formed only from selenomethionine-containing protein preparations that had little or no incorporation (<20%) of selenomethionine. The low occupancy prevented us from solving the selenium sub-structure from these data. Phase information for the crystals of the full-length transporter was eventually obtained by first determining the 1.85 Å structure of the amino-terminal soluble domain of CorA (residues 1-244, Figure S1) and using this structure as a search model to solve the structure of the full-length protein by molecular replacement.

The structure of a selenomethionine-labelled soluble domain (amino acids 1-266) was solved using single wavelength anomalous dispersion (SAD). Diffraction data were collected to 1.85 Å using a home source generator (Rigaku FR-E with RAXIS-IV++ detector) and at beamline X25 of the National Synchrotron Light Source at Brookhaven National Labs. Data were processed using HKL2000 (Otwinowski and Minor, 1997). Three selenium sites were found using the interface Bake’N’Phase (Weeks et al, 2002), and the obtained phases were improved by solvent flattening using RESOLVE (Terwilliger, 2000). The software packages ARP/WARP (Perrakis et al, 1999) and REFMAC5 (Murshudov et al, 1997) were used to complete the model and refine it to a final R/Rfree of 0.196/0.233 (see Table 1 of the paper). The final model contains residues 13-117, 120-199 and 207-244, 236 water molecules and two metal ions, assigned as Mg2+ and Na+. The Mg2+ ion is coordinated by Asp89 Oδ1 and five water molecules (Figure 5a of paper). Four molecules of detergent (DDM) were also modelled in regions of long, continuous density within the detergent environment. Data for the full-length protein were collected at beamlines 17ID and 19ID at the Advanced Photon Source, Argonne National Labs.
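The R and Rfree values reported above follow the conventional crystallographic residual, R = Σ| |Fobs| - |Fcalc| | / Σ|Fobs|, with Rfree evaluated over reflections that were excluded from refinement. The sketch below illustrates that standard definition on made-up amplitudes. It is not the REFMAC5 implementation, it omits scaling between observed and calculated amplitudes, and the function names and data are hypothetical.

```python
def r_factor(f_obs, f_calc):
    """Conventional residual: sum(| |Fo| - |Fc| |) / sum(|Fo|)."""
    if len(f_obs) != len(f_calc):
        raise ValueError("f_obs and f_calc must have the same length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


def r_work_and_free(f_obs, f_calc, free_flags):
    """Split reflections by the free flag and return (R_work, R_free)."""
    work = [(o, c) for o, c, free in zip(f_obs, f_calc, free_flags) if not free]
    test = [(o, c) for o, c, free in zip(f_obs, f_calc, free_flags) if free]
    r_work = r_factor([o for o, _ in work], [c for _, c in work])
    r_free = r_factor([o for o, _ in test], [c for _, c in test])
    return r_work, r_free


# Made-up structure-factor amplitudes, for illustration only.
f_obs = [100.0, 80.0, 60.0, 40.0, 20.0]
f_calc = [90.0, 85.0, 60.0, 30.0, 25.0]
free_flags = [False, False, True, False, True]  # True = held out of refinement

r_work, r_free = r_work_and_free(f_obs, f_calc, free_flags)
print(f"R_work = {r_work:.3f}, R_free = {r_free:.3f}")
```

Because the free reflections never influence refinement, a large gap between the two values is commonly read as a sign of overfitting.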
The full-length model was built first by fitting the known structure of the N-terminal domain and fitting its last helix (residues 208-244) into electron density. Five positions for the soluble domain, arranged as a pentameric ring, were found using the program PHASER (Storoni et al, 2004). The soluble domain core (residues 13-199) fit well in the calculated 2Fo-Fc electron density maps, while helix 207-244 was out of the density. After fitting that helix and re-calculating 2Fo-Fc maps, five long helices, initially as 60-residue polyalanine templates, were placed according to newly resolved features. When this model was refined and helical templates fitted into the density, five additional 30-residue polyalanine templates were added to the model, as new features developed. Data collected from selenomethionine-labelled full-length protein crystals (with ~20% incorporation of selenomethionine) were used to calculate an anomalous Fourier map with phases taken from this current model. A peak search in this map revealed 3 peaks per soluble domain, corresponding to methionine sidechains. Five additional peaks per molecule were found: two inside the pore and three outside the pore. These peaks were used to place methionine residues and thread the amino acid sequence onto the model. The assigned sequence corresponds well with two kinks in the stalk helix at positions Gly274 and Pro303. Five-fold averaging alternated with solvent flattening using PHASES (Furey and Swaminathan, 1997) was used to improve the quality of electron density maps. The final model was refined to 3.9 Å. The final full-length model contains residues 9-315 and 323-349. The loop from residues 316-322 remains disordered or invisible in electron density maps. After refinement using REFMAC5 and CNS (Brunger et al, 1998), the R-factor was 0.361, and the Rfree 0.406. All statistics are summarized in Table 1, below.

DALI Analysis.
The fold of the soluble, intracellular domain of each subunit is unique in comparison to all other transporters and ion channels of known structure. The DALI (Holm and Sander, 1993) search performed for the soluble domain structure (residues 9-200) returned a closest match with a Z-score of 3.0 and an r.m.s.d. of 3.5 Å over 88 Cα atoms in 11 fragments, which is too weak to be counted as a structural homologue. The DALI search performed for the full-length protein structure provided a list of structures with Z-scores up to 10 and r.m.s.d.'s down to 2.5 Å over 108 Cα atoms (e.g. PDB ID 1UUR). These structures contained three consecutive α-helices and could thus be superimposed with the two willow helices and the beginning of the stalk helix in CorA. No other structural features of T. maritima CorA were identified.

Table 1. X-ray refinement statistics

                        Soluble domain    Full-length
Resolution (Å)          1.85              3.9
Rwork/Rfree             0.196/0.233       0.361/0.406
Number of atoms
  Protein               1915              13805
  Ligand/ion            134               n.d.
  Water                 236               n.d.
B-factors
  Protein               36.3              163.3
  Ligand/ion            66.8              n.d.
  Water                 46.3              n.d.
R.m.s deviations
  Bond lengths (Å)      0.017             0.007
  Bond angles (°)       1.73              1.16

Supplementary References

Brunger A.T., Adams P.D., Clore G.M., DeLano W.L., Gros P., Grosse-Kunstleve R.W., Jiang J.-S., Kuszewski J., Nilges M., Pannu N.S., Read R.J., Rice L.M., Simonson T., and Warren G.L. Crystallography and NMR system (CNS): A new software suite for macromolecular structure determination. Acta Cryst. D54, 905-921 (1998).
Furey, W. and Swaminathan, S. PHASES-95: A Program Package for the Processing and Analysis of Diffraction Data from Macromolecules. Methods in Enzymology, Volume 276: Macromolecular Crystallography, vol 277, Part B, chapter 31, eds. C. Carter & R. Sweet, Academic Press, Orlando, Fl. (1997).
Holm, L. and Sander, C. Protein structure comparison by alignment of distance matrices. J. Mol. Biol. 233, 123-138 (1993).
Kehres, D.G., Lawyer, C.H. and Maguire, M.E. The CorA magnesium transporter gene family.
Microbial & Comparative Genomics 43:151-169 (1998).
Knoop, V., Groth-Malonek, M., Gebert, M., Eifler, K. and Weyand, K. Transport of magnesium and other divalent cations: evolution of the 2-TM-GxN proteins in the MIT superfamily. Mol. Genet. Genomics: 1-12 (2005).
Korolev, S., Ikeguchi, Y., Skarina, T., Beasley, S., Arrowsmith, C.H., Edwards, A.M., Joachimiak, A., Pegg, A.E. and Savchenko, A. The crystal structure of spermidine synthase with a multisubstrate adduct inhibitor. Nature Struct. Biol. 9, 27-31 (2002).
Murshudov, G.N., Vagin, A.A., and Dodson, E.J. Refinement of Macromolecular Structures by the Maximum-Likelihood Method. Acta Crystallogr. D53, 240-255 (1997).
Otwinowski, Z. and Minor, W. Processing of X-ray Diffraction Data Collected in Oscillation Mode. Methods in Enzymology, Volume 276: Macromolecular Crystallography, part A, p. 307-326, C.W. Carter, Jr. & R.M. Sweet, Eds., Academic Press (New York) (1997).
Perrakis, A., Morris, R. and Lamzin, V.S. Automated protein model building combined with iterative structure refinement. Nature Struct. Biol. 6, 458-463 (1999).
Petrey, D. & Honig, B. GRASP2: visualization, surface properties, and electrostatics of macromolecular structures and sequences. Methods Enzymol. 374, 492-509 (2003).
Rost, B., Fariselli, P. & Casadio, R. Topology prediction for helical transmembrane proteins at 86% accuracy. Protein Sci. 5, 1704-1718 (1996).
Rost, B., Yachdav, G. & Liu, J. The PredictProtein server. Nucleic Acids Res. 32, W321-W326 (2004).
Storoni, L.C., McCoy, A.J., and Read, R.J. Likelihood-enhanced fast rotation functions. Pushing the boundaries of molecular replacement with maximum likelihood. Acta Cryst. D60, 432-438 (2004).
Terwilliger, T.C. Maximum likelihood density modification. Acta Cryst. D56, 965-972 (2000).
Weeks, C.M., Blessing, R.H., Miller, R., Mungee, R., Potter, S.A., Rappleye, J., Smith, G.D., Xu, H. and Furey, W. Towards automated protein structure determination: BnP, the SnB-PHASES interface. Z. Kristallogr. 217, 686-693 (2002).