F07811A_SM4_1

Supplemental Material 4: Subcellular Profiling of Plasmodium falciparum Proteins Expressed in Sporozoites, Merozoites, Trophozoites and Gametocytes.

From all four stages analysed, we identified 439 proteins known or predicted to have at least one transmembrane segment or a GPI addition ‘signal’ at the C-terminus (18% of the dataset), 45% of which also contained an N-terminal leader sequence. Our dataset also included 304 soluble proteins with a signal sequence, i.e. proteins potentially secreted or targeted to organelles (Supplementary Table 4). These structural predictions were based on the TMHMM (1), big-PI Predictor (2) and SignalP (3-5) algorithms, which were run against the entire Plasmodium falciparum genome. Well-characterized integral membrane proteins (e.g., proton-pumping vacuolar pyrophosphatase, P-type calcium-translocating ATPase, and nucleoside transporter 1) and secreted proteins (e.g., KAHRP and CLAG) were identified. However, over half of the potentially secreted and integral membrane proteins detected were annotated as hypothetical (Supplementary Table 4). Cell surface proteins constituted the main class of known proteins with one or two transmembrane domains, whereas proteins with six or more transmembrane domains were mostly classified as having transport functions (Supplementary Table 4). Similarly, known proteins with a signal peptide were mostly proteases and organellar proteins. By association, some hypothetical proteins with signal peptides may be as yet uncharacterised components of the apical organelles or cell surface.

To determine whether the predicted membrane-associated proteins localize specifically to membranes, we systematically sorted the proteins identified from soluble and insoluble protein fractions, which were digested and analysed independently (Supplementary Table 4).
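As an illustration of how the outputs of the three predictors can be combined into the structural classes discussed here, the following minimal Python sketch assigns each protein to one class. The record layout (`Prediction`), the field names and the class labels are hypothetical and are not taken from the original analysis. The sketch assumes that per-protein predictions have already been obtained from TMHMM, big-PI Predictor and SignalP; it does not run those tools.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prediction:
    """Per-protein prediction summary (hypothetical record layout)."""
    protein_id: str
    tm_segments: int       # number of predicted transmembrane segments
    gpi_signal: bool       # predicted C-terminal GPI addition signal
    signal_peptide: bool   # predicted N-terminal signal sequence


def structural_class(p: Prediction) -> str:
    """Assign one of three structural classes (labels are illustrative)."""
    # Membrane class: at least one TM segment or a GPI addition signal,
    # regardless of whether a signal peptide is also predicted.
    if p.tm_segments >= 1 or p.gpi_signal:
        return "membrane"
    # Soluble protein carrying a signal sequence (potentially secreted
    # or targeted to organelles).
    if p.signal_peptide:
        return "signal_peptide_only"
    # No predicted membrane-interacting or targeting feature.
    return "soluble"


# Made-up example records, for illustration only.
examples = [
    Prediction("prot_A", tm_segments=7, gpi_signal=False, signal_peptide=False),
    Prediction("prot_B", tm_segments=0, gpi_signal=True, signal_peptide=True),
    Prediction("prot_C", tm_segments=0, gpi_signal=False, signal_peptide=True),
    Prediction("prot_D", tm_segments=0, gpi_signal=False, signal_peptide=False),
]

classes = {p.protein_id: structural_class(p) for p in examples}
for protein_id, label in classes.items():
    print(protein_id, label)
```

Testing the membrane condition first means that a protein with both a signal peptide and a transmembrane segment is counted once, in the membrane class, which matches the way the 439 membrane proteins and the 304 soluble signal-sequence proteins are reported as separate groups.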
Compiling results from all four stages, soluble and insoluble runs contributed 2,164 and 836 proteins, respectively, with 585 proteins found in both (Supplementary Table 4). The majority of these common proteins were abundant soluble proteins, such as histones and ribosomal proteins, contaminating the membrane preparations (Supplementary Table 4). When we compared our experimental solubility-based fractionation results with the computational predictions, proteins known or predicted to be secreted and/or integral to the membranes were found throughout the soluble and insoluble fractions (Supplementary Figure 4a). Since no protease inhibitors were used during sample preparation, endogenous proteases may have degraded some proteins, releasing soluble peptides from integral membrane proteins that were then detected in the soluble fractions. This would most likely explain why many membrane proteins were identified from soluble fractions. Whereas the membrane protein class constituted about 16% of all proteins detected in soluble fractions, 30% of the proteins uniquely identified from pellets were predicted to be integral to the membrane (Supplementary Figure 4b), suggesting that the insoluble fractions were enriched in membrane-associated proteins.

[Supplementary Figure 4, panels a and b. Panel a: Number of Proteins (y-axis, 0 to 1600) by Fractions (x-axis: Both, Supernatant only, Pellet only). Panel b: Number of Proteins (% of Total Number of Proteins identified in Fraction) (y-axis, 0 to 100) by Fractions (x-axis: Both, Supernatant only, Pellet only). Legend: SP = 0 and TM = 0; SP = 1 and TM = 0; TM >= 1.]

Supplementary Figure 4: Proteins identified in our analysis were sorted according to the biochemical fractions in which they were detected. Three fractions were defined: proteins identified in i) both soluble and insoluble analyses, ii) supernatant only, or iii) membrane pellets only.
For each fraction, 3 structural classes were detailed: proteins known or predicted i) to be entirely soluble (SP = 0 and TM = 0), ii) to contain 1 signal peptide (SP = 1 and TM = 0) and iii) to have at least 1 transmembrane segment or GPI modification site (TM >= 1) (a). The number of proteins in each structural class is plotted as a percentage of the total number of proteins detected in the solubility-based fractions (b).

Interestingly, 144 proteins found uniquely in carbonate-extracted membrane fractions did not display in their primary sequences any of the known membrane-interacting features (Supplementary Table 4). Among these were a putative phospholipase and a phosphatidylserine decarboxylase (Supplementary Table 4), which modify phospholipids but have no predicted transmembrane component. An O-sialoglycoprotein endopeptidase, which digests membrane-bound O-sialoglycoproteins, was detected uniquely in the membrane fractions. Soluble subunits, such as cytochrome c oxidase subunit II, can also associate with transmembrane complexes (Supplementary Table 4). Finally, proteins such as kinases and phosphatases, which were identified from membrane fractions (Supplementary Table 4), associate with and modify membrane receptors as part of the signal transduction cascade. Of the 43 known proteins found uniquely in membrane fractions, 26 have been shown to function at the lipid interface (Supplementary Table 4). Proteins involved in membrane functions are therefore not limited to integral membrane proteins; soluble proteins are also found at the lipid interface, where they are involved in many different processes, from energy transduction to signalling. Since it is problematic to identify proteins that associate with a lipid interface based solely on primary amino acid sequence, the differential proteomic method we describe here should help in the characterization of soluble proteins that appear to function at the lipid membrane.
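The fraction counts reported above can be cross-checked with a short calculation. The Python sketch below is a minimal illustration, assuming that the 2,164 soluble and 836 insoluble totals each include the 585 proteins found in both runs; that reading is an assumption, although it is consistent with 439 membrane proteins representing about 18% of the dataset. The function name `partition_counts` and the variable names are hypothetical.

```python
def partition_counts(n_soluble: int, n_insoluble: int, n_both: int) -> dict:
    """Split two overlapping totals into three disjoint fractions.

    Assumes n_soluble and n_insoluble each already include n_both.
    """
    if n_both > min(n_soluble, n_insoluble):
        raise ValueError("overlap cannot exceed either total")
    return {
        "both": n_both,
        "supernatant_only": n_soluble - n_both,
        "pellet_only": n_insoluble - n_both,
        # Inclusion-exclusion: shared proteins are counted only once.
        "total": n_soluble + n_insoluble - n_both,
    }


# Counts taken from the text.
counts = partition_counts(n_soluble=2164, n_insoluble=836, n_both=585)

# Share of the whole dataset predicted to be membrane proteins (439 proteins).
membrane_share = 439 / counts["total"]

# Share of pellet-only proteins with no predicted membrane-interacting
# feature (144 proteins).
no_feature_share = 144 / counts["pellet_only"]

print(counts)
print(f"membrane share of dataset: {membrane_share:.1%}")
print(f"pellet-only proteins without predicted features: {no_feature_share:.1%}")
```

Under this assumption, the supernatant-only fraction contains 1,579 proteins, the pellet-only fraction 251, and the combined dataset 2,415.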
Taken together, our results indicate that the soluble proteins found uniquely in membrane fractions, 101 of which are hypothetical, can be inferred to have membrane-associated functions (Supplementary Table 4).

1. Krogh, A., Larsson, B., von Heijne, G. & Sonnhammer, E. L. Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. J Mol Biol 305, 567-80 (2001).
2. Eisenhaber, B., Bork, P. & Eisenhaber, F. Sequence properties of GPI-anchored proteins near the omega-site: constraints for the polypeptide binding site of the putative transamidase. Protein Eng 11, 1155-61 (1998).
3. Nielsen, H., Engelbrecht, J., Brunak, S. & von Heijne, G. Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites. Protein Eng 10, 1-6 (1997).
4. Nielsen, H. & Krogh, A. Prediction of signal peptides and signal anchors by a hidden Markov model. Proc Int Conf Intell Syst Mol Biol 6, 122-30 (1998).
5. Nielsen, H., Engelbrecht, J., Brunak, S. & von Heijne, G. A neural network method for identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites. Int J Neural Syst 8, 581-99 (1997).