Improved phosphoproteome analysis by multi-enzymatic digestions targeting the Trypsin-Resistant Proteome (TReP)

Bao Tran1; Celine Hernandez1,2; Alexandra Potts1; Patrice Waridel1; Frederique Lisacek2; Manfredo Quadroni1
1 University of Lausanne, Lausanne, Switzerland; 2 Swiss Institute of Bioinformatics, Geneva, Switzerland

OVERVIEW

• Identification of large peptides by shotgun MS is not efficient.
• 20-30% of a trypsinised proteome consists of peptides with Mw ≥ 3000 Da (TReP).
• Concept: after trypsin digestion, isolation of all polypeptides with Mw ≥ 3000 Da, and digestion with alternative proteases.
• Isolation of large peptides by SEC and digestion with four cleaving agents yields MS-suitable products, with increased sequence coverage and identification.
• Application to human phosphoproteomics: 410 unique phosphosites (not found with trypsin alone), of which 41% are not annotated in databases. Comparison: only 8% non-annotated phosphosites in our trypsin data.
• TReP analysis reveals a «hidden» part of the proteome.

1. INTRODUCTION

Analysis of complex samples by liquid chromatography-electrospray MS after trypsin cleavage is the most popular approach for bottom-up proteomics. However, comprehensive mapping of post-translational modifications is limited, among other things, by the presence of long sequences without trypsin cleavage sites.

We aimed to:
• Complement the trypsin-based workflow with a strategy that specifically targets the fraction of the proteome that cannot be digested by trypsin (the Trypsin-Resistant Proteome, TReP), in order to obtain peptides with masses compatible with shotgun MS.
• Apply the method to the characterization of trypsin-resistant protein sequences:
  • Enhance sequence coverage.
  • Increase the number of phosphosite identifications.

2. METHOD

Samples:
• Saccharomyces cerevisiae vacuolar membrane fraction (SCVMprot)
• Human melanoma cell line lysate (SKMel28 cells)

SEC fractionation: Äkta purifier 10 system, Superdex Peptide 10/300 GL column.
• SCVMprot: 80 µg
• SKMel: 4 mg
• Collection of 11 fractions of 1 mL

Second digestion: chymotrypsin, Glu-C, Asp-N, FA.

RPLC-MS/MS analysis: Agilent nano 1100 HPLC system coupled to LTQ-Orbitrap-XL.

Data processing:
• Database search of the MS/MS spectra sets of trypsin or 2nd enzymes.
• Phosphorylation site localization: Ascore ≥ 19 (ref. 1).
• Site mapping (custom-made scripts):
  • Compare the phosphosite sets from different proteases.
  • Find the new sites not annotated in the Uniprot and phosphosite.org phospho-databases.

3. RESULTS

3.1. In silico tryptic peptide analysis

[Figure: In silico tryptic peptides of human proteome. Covered sequence as a function of peptide length (1'218'741 peptides of human proteome, Uniprot/Swissprot version 15.13), full trypsin cleavage. x-axis: Length of peptide (AA); y-axis: Total number of AA; series: Small peptide, Medium peptide, Higher peptide, All peptides.]

Proteome covered by:
• “small” peptides (≤ 800 Da): 19.3%
• “medium” peptides (3000 > Mw > 800 Da): 55.7%
• “large” peptides (≥ 3000 Da): 25.0%

[Figure: Amino acid frequency ratio of in silico large tryptic peptides (69'571) to medium ones (465'160) in human proteome. x-axis: Amino acid; y-axis: Amino acid frequency ratio.]

Frequency of S, T, Y residues in large peptides is higher than in medium peptides.

3.2. Fractionation of tryptic digests according to peptide size

[Figure: SEC chromatograms (mAU vs. elution volume in ml) of SCVMprot and Human SKMel with fractions 1-11 marked, overlaid with the LogMw calibration (markers: 31 kDa, 10 kDa, 3.1 kDa, 1.0 kDa, 0.3 kDa). Regions are labelled “Peptide Mw ≥ 2.4 kDa” and “Peptide Mw < 2.4 kDa”.]

Calibration curve: logMw = -0.2088V + 5.9066

Fractions 1-6, containing large peptides, were subjected to a 2nd digestion with other cleaving agents. Fraction 3 of SCVMprot was used to examine the LC-MS behavior of large tryptic peptides before and after second digestion (Glu-C). Fractions of SKMel were used for phosphosite identification experiments.

3.3. Improvement of LC-MS/MS mapping of large tryptic peptides

[Figure: Total Ion Current (TIC) of SEC fraction 3 of sample SCVMprot analyzed directly (Trypsin) and after Glu-C digestion (Trypsin + Glu-C). Axes: Relative Abundance vs. Time (min); NL: 2.50E7. Inset: spectrum, Relative Abundance vs. m/z. Peptide sequence shown: SDNEDNDFDEDDEDDAALVAAMTNAPGNSLDIEESVGYGATSAPTSNTNHVVESANAAYYQR]

The inset spectrum was acquired at RT = 31.21 min and assigned to one peptide (displayed in red) of protein ID B3LQ90_YEAS1. The corresponding extended tryptic peptide shown is 63 AA long (Mw: 6.5 kDa).

A second digestion may generate analyzable fragments out of much larger peptides which, in their intact form, would be very difficult to detect and identify by LC-MS/MS.

3.4. Phosphopeptide analysis

[Figure: Phosphosite mapping workflow. Phosphopeptides with Ascore ≥ 19 are tested in turn: Localized? Site annotated? Phosphoprotein? Outcomes: annotated phosphosites, new phosphosites, new phosphoproteins.]

[Figure: Composition of SKMel phosphosites. Number of phosphosites per protease (trypsin, chymo, Glu-C, Asp-N, FA, and total of the 4 second enzymes), split into annotated and non-annotated sites.]

• 703 phosphorylated peptides; 440 phosphoproteins (19 not annotated).
• 410 other phosphosites with the 4 second enzymes, in addition to the trypsin phosphosites.
• 225 sites not annotated in the Uniprot/Swiss-Prot and phosphosite.org databases.

Example: Tryptic phosphopeptide (24.8 kDa) of protein NFAT5_HUMAN, and the corresponding peptide generated with Glu-C (red, identified in fraction 3).

K.QPPPNMIFNPNQNPMANQEQQNQSIFHQQSNMAPMNQEQQPMQFQ
SQSTVSSLQNPGPTQSESSQTPLFHSSPQIQLVQGSPSSQEQQVTLFLS
PASMSALQTSINQQDMQQSPLYSPQNNMPGIQGATSSPQPQATLFHNT
AGGTMNQLQNSPGSSQQTSGMFLFGIQNNCSQLLTSGPATLPDQLMAI
SQPGQPQNEGQPPVTTLLSQQMPENpSPLASSINTNQNIEK.I

The phosphosite (pS) and the protein phosphorylation were not annotated.

4. CONCLUSION

The developed method uses a second digestion step targeting large tryptic peptides. It was successfully employed to discover trypsin-inaccessible phosphorylation sites in the proteome of a human melanoma cell line, substantially increasing the number of identified phosphosites. Many of the identified phosphosites and phosphoproteins are not annotated in the UNIPROT and phosphosite.org databases. This strategy complements regular trypsin-based analysis.

5. REFERENCE

1. Beausoleil, S. A., Villen, J., Gerber, S. A., Rush, J., and Gygi, S. P. (2006) A probability-based approach for high-throughput protein phosphorylation analysis and site localization. Nat Biotech 24, 1285-1292.
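
The SEC calibration curve quoted on the poster (logMw = -0.2088V + 5.9066) can be inverted to estimate where a given mass cutoff elutes. The sketch below is only an illustration: it assumes that the logarithm is base 10, that V is the elution volume in ml, and that Mw is in Da, none of which the poster states explicitly. The function names are hypothetical.

```python
import math

# Coefficients as printed on the poster: logMw = -0.2088 * V + 5.9066
SLOPE = -0.2088
INTERCEPT = 5.9066


def mw_from_volume(volume_ml: float) -> float:
    """Estimated Mw (Da) at an elution volume (assumes log10 and ml)."""
    return 10 ** (SLOPE * volume_ml + INTERCEPT)


def volume_from_mw(mw_da: float) -> float:
    """Elution volume (ml) at which a peptide of the given Mw is expected."""
    return (math.log10(mw_da) - INTERCEPT) / SLOPE


if __name__ == "__main__":
    # Mass cutoffs mentioned on the poster: 3000 Da, 2.4 kDa and 800 Da.
    for mw in (3000.0, 2400.0, 800.0):
        print(f"{mw:7.0f} Da is expected near {volume_from_mw(mw):.2f} ml")
```

Because the slope is negative, larger peptides elute earlier, which is consistent with the early fractions being the ones that contain the large peptides.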
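
The in silico analysis (full trypsin cleavage, then sequence coverage per peptide size class) can be sketched as follows. This is not the authors' script. It assumes cleavage after every K or R with no proline exception, uses monoisotopic residue masses (the poster does not say which mass type was used), and runs on a made-up toy sequence rather than the Uniprot/Swissprot proteome. Only the 800 Da and 3000 Da cutoffs come from the poster.

```python
# Monoisotopic residue masses in Da (standard 20 amino acids only;
# sequences containing other letters such as X or U raise KeyError).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056

# Hypothetical toy sequence: one small, one large and one medium peptide.
TOY_PROTEIN = "GAK" + "A" * 40 + "R" + "SSEEDDLLVVK"


def tryptic_digest(sequence):
    """Cleave after every K or R (assumed rule: no proline exception)."""
    peptides, start = [], 0
    for i, aa in enumerate(sequence):
        if aa in "KR":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mass(peptide):
    """Monoisotopic mass of the neutral peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def size_class(mass):
    """Size classes using the poster's cutoffs (800 Da and 3000 Da)."""
    if mass >= 3000.0:
        return "large"
    if mass > 800.0:
        return "medium"
    return "small"


def coverage_by_class(sequences):
    """Fraction of all residues that fall in each peptide size class."""
    counts = {"small": 0, "medium": 0, "large": 0}
    for seq in sequences:
        for pep in tryptic_digest(seq):
            counts[size_class(peptide_mass(pep))] += len(pep)
    total = sum(counts.values())
    if total == 0:
        return {k: 0.0 for k in counts}
    return {k: v / total for k, v in counts.items()}


if __name__ == "__main__":
    for name, frac in coverage_by_class([TOY_PROTEIN]).items():
        print(f"{name:6s} {100 * frac:5.1f}% of residues")
```

Counting residues rather than peptides matches the poster's "proteome covered by" framing: a few long peptides can account for a large share of the sequence.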
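
The site-mapping step (compare phosphosite sets between proteases, then look for sites absent from the annotation databases) reduces to set operations once each site is keyed by protein and position. The sketch below is a guess at that logic, not the custom-made scripts mentioned on the poster. The `SiteHit` record, the function names, and all protein identifiers and values in the toy data are hypothetical; only the Ascore ≥ 19 cutoff and the protease names are taken from the poster.

```python
from typing import NamedTuple


class SiteHit(NamedTuple):
    """One phosphosite observation (hypothetical record layout)."""
    protein: str
    position: int
    ascore: float
    protease: str


ASCORE_CUTOFF = 19.0  # localization threshold quoted on the poster
SECOND_ENZYMES = {"chymotrypsin", "Glu-C", "Asp-N", "FA"}


def localized_sites(hits, proteases):
    """Set of (protein, position) keys confidently localized by the given proteases."""
    return {
        (h.protein, h.position)
        for h in hits
        if h.ascore >= ASCORE_CUTOFF and h.protease in proteases
    }


def compare_sites(hits, annotated):
    """Return (sites unique to second enzymes, subset of those not annotated)."""
    trypsin_sites = localized_sites(hits, {"trypsin"})
    second_sites = localized_sites(hits, SECOND_ENZYMES)
    unique = second_sites - trypsin_sites
    return unique, unique - annotated


# Toy data with made-up identifiers and scores, for illustration only.
TOY_HITS = [
    SiteHit("PROT_A", 10, 25.0, "trypsin"),
    SiteHit("PROT_A", 10, 30.0, "Glu-C"),         # also seen with trypsin
    SiteHit("PROT_A", 55, 21.0, "Glu-C"),         # unique, already annotated
    SiteHit("PROT_B", 7, 40.0, "Asp-N"),          # unique and not annotated
    SiteHit("PROT_B", 99, 12.0, "chymotrypsin"),  # below the Ascore cutoff
]
TOY_ANNOTATED = {("PROT_A", 10), ("PROT_A", 55)}

if __name__ == "__main__":
    unique, new = compare_sites(TOY_HITS, TOY_ANNOTATED)
    print("unique to second enzymes:", sorted(unique))
    print("not annotated:", sorted(new))
```

In a real analysis the annotated set would be the union of the Uniprot and phosphosite.org entries, and protein identifiers would need to be harmonised between the two sources before comparison.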