Comparison of human Solute Carriers

Avner Schlessinger, Pär Matsson, James E. Shima, Ursula Pieper, Sook Wah Yee, Libusha Kelly, Leonard Apeltsin, Robert M. Stroud, Thomas E. Ferrin, Kathleen M. Giacomini, Andrej Sali

SUPPORTING INFORMATION

Fig. S1. Similarity map using permissive E-value cutoff. The E-value is the number of alignments with an equal or better score that are expected to occur by chance in a database search. The lower the E-value, the more likely it is that the alignment reflects an evolutionary relationship. The similarity map is constructed as in Figure 1; each link represents a pairwise alignment with an E-value of less than 0.1.

Fig. S2. Similarity map using stringent E-value cutoff. The similarity map is constructed as in Figure 1 and Figure S1. Each link represents a pairwise alignment with an E-value of less than 0.001, corresponding to reliable alignments that are likely to have biological significance.

Fig. S3. Conservation and topology in SLC2 and SLC22 families. (A) The topologies of the extensively studied SLC2A1 (GLUT1) and SLC22A1 (OCT1) transporters from the SLC2 and SLC22 families, respectively, are shown. The residues are colored according to sequence conservation, from white (no conservation) to brown (fully conserved). The conservation scores were derived from a multiple sequence alignment of all members of the SLC2 and SLC22 families. A thick black line marks residues with structural or functional importance (e.g., decreased transport, changed substrate affinity, or topology changes). Dashed lines mark positions at which other members of the family have functional residues, based on the multiple sequence alignment of the family. For SLC2A1, the transport pathway is thought to consist of transmembrane helices 2, 4, 5, 7, 8, and 10. Many of the residues that are highly conserved within the SLC2 and SLC22 cluster fall within these helices.
In addition, the intracellular loops are generally well conserved. (B) Urate transporters from different families are analyzed. A thick black border marks residues in SLC2A9 that are conserved between the two transporters (based on pairwise alignment). To distinguish inter-family conservation from conservation within the SLC2 family, residues are colored according to their conservation with sugar transporters in the SLC2 family: residues are colored white if the intra-family conservation is below 50% or if the residue in SLC2A9 differs from the SLC2 family consensus. Residues in blue are conserved within the SLC2 family.

Fig. S4. Rate of uptake of uric acid in HEK cells expressing human URAT1 (SLC22A12) or GLUT9 (SLC2A9). HEK-293T cells were transiently transfected with the empty pcDNA5 vector (EV) or the pcDNA5 vector containing human URAT1 (white bars) or GLUT9 (black bars) cDNA. Radiolabeled uptake experiments were performed at least 5 separate times, and the uptake values, determined at 2 minutes for URAT1 and 3 minutes for GLUT9, were normalized to total protein. The results are shown as the mean and the standard error of the mean. SLC2A9L and SLC2A9S correspond to isoforms 1 and 2 of SLC2A9, respectively.

Fig. S5. Cataloging Solute Carriers using the Network Filtration Protocol. This alternative clustering method suggests that the SLC16 and SLC22 families are hubs that are similar to many other Solute Carrier families.

Fig. S6. Structural coverage of Solute Carriers. Comparative models and alignments for all Solute Carrier sequences that can be related to at least one known structure were computed with default settings by ModPipe [1] and are available in ModBase (http://modbase.compbio.ucsf.edu/projects/SLC/) [2]; briefly, an overlap of at least 70% of the target sequence with a sequence of known structure at a sequence identity cutoff of 30% was required.
We also included models that are estimated to have the correct fold using several statistical potentials, i.e., their corresponding Z-DOPE [3] scores were lower than 0 or their MPQS [1] scores were higher than 1.1. The map is constructed using the same sequence identity cutoffs as in Fig. 2B. The yellow nodes represent modelable sequences.

REFERENCES

1. Eswar N, John B, Mirkovic N, Fiser A, Ilyin VA, Pieper U, Stuart AC, Marti-Renom MA, Madhusudhan MS, Yerkovich B and others. Tools for comparative protein structure modeling and analysis. Nucleic Acids Res 2003;31(13):3375-80.
2. Pieper U, Eswar N, Webb BM, Eramian D, Kelly L, Barkan DT, Carter H, Mankoo P, Karchin R, Marti-Renom MA and others. MODBASE, a database of annotated comparative protein structure models and associated resources. Nucleic Acids Res 2009;37(Database issue):D347-54.
3. Shen MY, Sali A. Statistical potential for assessment and prediction of protein structures. Protein Sci 2006;15(11):2507-24.
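
The selection criteria quoted in the Fig. S6 caption can be illustrated with a minimal Python sketch. It is not the ModPipe implementation: the record type, its field names, and the example values are hypothetical, and two readings of the caption are assumed, namely that the 30% identity cutoff means "at least 30%" and that the alignment-based criterion and the two score-based criteria are combined with a logical OR.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModelRecord:
    """Hypothetical summary of one comparative model (illustrative field names)."""
    target: str
    coverage: float                  # fraction of the target aligned to the template (0-1)
    seq_identity: float              # target-template sequence identity, in percent
    z_dope: Optional[float] = None   # Z-DOPE score, if available
    mpqs: Optional[float] = None     # MPQS score, if available


def is_modelable(m: ModelRecord) -> bool:
    """Apply the thresholds quoted in the Fig. S6 caption (OR combination assumed)."""
    # Alignment-based criterion: at least 70% overlap at (assumed) >= 30% identity.
    aligned_well = m.coverage >= 0.70 and m.seq_identity >= 30.0
    # Score-based criteria: Z-DOPE lower than 0, or MPQS higher than 1.1.
    good_fold = (m.z_dope is not None and m.z_dope < 0.0) or (
        m.mpqs is not None and m.mpqs > 1.1
    )
    return aligned_well or good_fold


def modelable_targets(models):
    """Return the sorted names of targets with at least one accepted model."""
    return sorted({m.target for m in models if is_modelable(m)})


# Made-up toy records, not data from the study.
models = [
    ModelRecord("target_A", coverage=0.85, seq_identity=35.0),
    ModelRecord("target_B", coverage=0.60, seq_identity=22.0, z_dope=-0.4),
    ModelRecord("target_C", coverage=0.55, seq_identity=18.0, z_dope=0.9, mpqs=0.7),
]
print(modelable_targets(models))  # ['target_A', 'target_B']
```

In this toy input, target_A passes on alignment coverage and identity, target_B passes only on its Z-DOPE score, and target_C fails every criterion. Targets accepted this way would correspond to the yellow (modelable) nodes of the map.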