The Mtb Proteome Library: Development and application of assays for targeted MS analysis of the complete proteome of Mycobacterium tuberculosis by SRM and SWATH-MS Olga Schubert Group of Prof. Ruedi Aebersold ETH Zurich Skyline User Meeting 2013 Minneapolis Mycobacterium tuberculosis (Mtb) • Mycobacterium tuberculosis is the causative agent of Tuberculosis (TB) • One third of the world’s population latently infected with Mtb • 1.7 million deaths from TB each year More efficient treatments urgently needed • Limited availability of techniques to measure proteins with high sensitivity, selectivity and reproducibility Traditional approach: Antibodies Only for few targets, high cost, low throughput Aim To generate a resource of validated assays for the sensitive detection and accurate quantification of every protein of Mycobacterium tuberculosis, even in complex backgrounds. The Mtb Proteome Library contains SRM assays for the entire proteome of Mtb A Phase I Proteome Mapping Bacterial cultures B Phase II Proteome Library Generation Prediction Scoring C Phase III Proteome Library Validation Bacterial cultures Peptide selection Protein extraction Protein extraction Peptide synthesis Proteolysis + Proteolysis - pI = pH MS Rv0001 Rv0002 Rv0003 Rv0004 Rv0005 Rv0006 Rv0007 MS accessible proteome Peptide fractionation Discovery MS Pools of 96 peptides MS Peptide and protein identification SRM-MS2 No fractionation MS Fragment ion selection Synthetic proteome library SRM Quantitative SRM trace Validated proteome library Schubert et al. Cell Host & Microbe 2013 The Mtb Proteome Library: A Resource of Assays to Quantify the Complete Proteome of Mycobacterium tuberculosis. Saturation Definition of the MS-accessible Mtb proteome by discovery MS A B Extensive fractionation and shotgun MS on Orbitrap XL ≥3 pep/prot 2 pep/prot 1 pep/prot 0 pep/prot 23% 4% 4% 69% Isoelectric focusing (off-gel electrophoresis) 3074 proteins (77%) Proteome mapping C 100% 3,074 (77%) Discovery MS 80% 80% 60% 40% 20% 0% RNA sequencing 456 (13%) 60% Saturation 100% 414 (12%) 2,618 (75%) 40% Acquired data Simulation all Simulation true 20% 0% 0 0 0 0 0 0 0 00 ,00 ,00 ,00 ,00 ,00 , 50 100 150 200 250 300 RNA-seq data: Arnvig et al.,matches PLoS Pathogens 2011 Peptide-spectrum Total: 3,488 (87%) 100% corresponds to the 4,012 annotated ORFs in Mtb (TubercuList v2.3) Generation of SRM assays using crude synthetic peptides 100% 80% 60% 85% ≥3 pep/prot 2 pep/prot 1 pep/prot 0 pep/prot C 20% Not detected 28% 39% 16% ≥3 pep/prot 2 pep/prot 1 pep/prot 0 pep/prot 17% 20% SRM validated Mtb Proteome Library 0% 62,884 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 (72%) Fraction peptidesper per bin bin Fraction of of peptides Fraction of peptides per bin Peptides with assay 100% 40% 60% 40% 60% No MS Previously evidence observed Total Mtb unscheduled Mtb scheduled C Mtb+human unsche 20% Peptides with assay Not detected Mtb+human schedu Peptides with assay Not detected 100% 0% 100% 10 0 1 2 3 4 5 6 80% 80% 8 Transitions 40% 60% 60% 40% 40% 20% 20% 0% 0% 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 Fraction of peptides per bin B C 60% 80% 80% 0% B 80% 100% 100%20% 40% Synthetic 0% Mtb Proteome Library No MS (97%) Total 3894 proteins 3,894 (97%) Previously evidence observed B Fraction of peptides with SRM assay 8% Fraction of peptides with SRM assay A A 3% 4% Peptides A 6 7Hydrophobicity 8 9 10 11 12(SSRCalc) 13 14 15 16 17 18 19 20 21 Length (amino acids) Length (amino acids) d-Score < 1.78 m-Score < 0.01 d-Score < 1.78 1 .0 m-Score < 0.01 17,463 crude synthetic peptides (JPT) measured in pools ofE ~100 on a Qtrap 4000 in SRM-triggered MS2 mode D 6 4 2 Increasing SRM/SWATH assay specificity and throughput by using iRTs and scheduled SRM iRT peptides spiked into each sample allow to determine a chromatography-independent retention time (iRT) for each peptide. Escher et al., Proteomics 2012, Using iRT, a normalized retention time for more targeted measurement of peptides Theoretical assessment of SRM assay specificity using the SRMCollider C 100% Peptides 80% 60% Mtb unscheduled Mtb scheduled Mtb+human unscheduled Mtb+human scheduled 40% 20% 0% 0 1 2 3 4 Transitions 5 6 Röst et al., MCP 2012, A computational tool to detect and avoid redundancy in selected reaction monitoring 80% 60% Validation of the Mtb Proteome Library in C unfractionated whole cell lysates by SRM 20% 0% 40% mProphet analysis SRM validatedMtb MtbProteome Proteome Library SRM2884 validated Library proteins (72%) 2884 at(72%) 1% FDR 2,884 (72% of the Mtb proteome) 2 3 4 Transitions 0 .8 4 0 0 0 1 3 0 0 0 0 5 6 0 .2 17% d e c o y t a r g e t decoy target q 0 .0 16% ≥3 pep/prot 2 pep/prot 1 pep/prot 0 pep/prot Mtb unscheduled Mtb scheduled d-Score < 1.78 m-Score < 0.01 E unsched Mtb+human Mtb+human scheduls 2 0 0 0 39% Transition groups 28% 0% 1 .0 D 20% 0% 20% Length (amino acids) 60% 1 0 0 0 B 40% 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 0 Synthetic Mtb Proteome Library 3,894 (97%) Peptides 80% 60% 0 .6 100% Fraction fr a c tio n 85% 40% 80% 0 .4 ≥3 pep/prot 2 pep/prot 1 pep/prot 0 pep/prot Fraction of peptides per Fraction of peptides per 8% −3 −2 . 9 4 −2 −2 . 0 1 −1 −1 . 0 7 0 1 3 −0 . 1 30 . 4 91 . 1 11 . 7 42 2 . 3 62 . 9 9 n o r m a liz e d d is c r im in a n ts c o r e ( d _ s c o r e ) Normalised discriminant score (d-Score) To validate all these SRM assays, over 200 scheduled SRM runs were needed. Reiter et al., Nature Methods 2011, mProphet: automated data processing and statistical validation for large-scale SRM experiments −3 The Mtb Proteome Library contains SRM assays for the entire proteome of Mtb A Phase I Proteome Mapping Bacterial cultures B Phase II Proteome Library Generation Prediction Scoring C Phase III Proteome Library Validation Bacterial cultures Peptide selection Protein extraction Protein extraction Peptide synthesis Proteolysis + Proteolysis - pI = pH MS Rv0001 Rv0002 Rv0003 Rv0004 Rv0005 Rv0006 Rv0007 Peptide fractionation Discovery MS Pools of 96 peptides MS Peptide and protein identification MS accessible proteome 3074 (77%) proteins SRM-MS2 No fractionation MS Quantitative SRM trace Fragment ion selection Synthetic proteome library 3894 (97%) proteins SRM Validated proteome library 2884 (72%) proteins Schubert et al. Cell Host & Microbe 2013 The Mtb Proteome Library: A Resource of Assays to Quantify the Complete Proteome of Mycobacterium tuberculosis. The Mtb Proteome Library is a publicly available resource of MS reference data, SRM assays and their validation www.PeptideAtlas.org/PASSEL www.PeptideAtlas.org www.SRMAtlas.org Application of the Mtb Proteome Library to study the dynamics of the DosR regulon of Mtb under hypoxia Mycobacterium bovis BCG A Culture density (OD600) 4 No aeration Re-aeration A B 3 2 1 0 -5 -4 -3 -2 -1 0 1 2 3 4 5 6 7 8 9 10 11 Time (days) C Re-aeration Fold change 512 256 128 64 32 16 8 4 2 1 0.5 0.25 0 6h 1 2 3 4 5 Time (days) 6 7 8 9 Application of the Mtb Proteome Library to study the dynamics of the DosR regulon of Mtb under hypoxia Log2 Log 2 fold change fold change ≤-2.0 ≤-2.0 -1.5 -1.5 -1.0 -1.0 -0.5 -0.50.0 0.00.5 0.51.0 1.01.5 1.5 ≥2.0 ≥2.0 Rv1998c|Rv1998c Rv1998c|Rv1998c Rv1735c|Rv1735c Rv1735c|Rv1735c Rv2830c|vapB22 Rv2830c|vapB22 Rv1812c|Rv1812c Rv1812c|Rv1812c Rv2027c|dosT Rv2027c|dosT Rv0573c|pncB2 Rv0573c|pncB2 Rv0570|nrdZ Rv0570|nrdZ Rv1736c|narX Rv1736c|narX Rv0082|Rv0082 Rv0082|Rv0082 Rv2006|otsB1 Rv2006|otsB1 Rv3841|bfrB Rv0083|Rv0083 Rv3841|bfrB Rv2630|Rv2630 Rv0083|Rv0083 Rv2631|Rv2631 Rv2630|Rv2630 Rv0571c|Rv0571c Rv2631|Rv2631 Rv2004c|Rv2004c Rv0571c|Rv0571c Rv2005c|Rv2005c Rv2004c|Rv2004c Rv0081|Rv0081 Rv2005c|Rv2005c Rv2003c|Rv2003c Rv0081|Rv0081 Rv1997|ctpF Rv2003c|Rv2003c Rv2629|Rv2629 Rv1997|ctpF Rv3132c|devS Rv2629|Rv2629 Rv3133c|devR Rv3132c|devS Rv1813c|Rv1813c Rv3133c|devR Rv2028c|Rv2028c Rv1813c|Rv1813c Rv2624c|Rv2624c Rv2028c|Rv2028c Rv1996|Rv1996 Rv2624c|Rv2624c Rv0572c|Rv0572c Rv0079|Rv0079 Rv1996|Rv1996 Rv2625c|Rv2625c Rv0572c|Rv0572c Rv3134c|Rv3134c Rv0079|Rv0079 Rv0080|Rv0080 Rv2625c|Rv2625c Rv2627c|Rv2627c Rv3134c|Rv3134c Rv0569|Rv0569 Rv0080|Rv0080 Rv2029c|pfkB Rv2627c|Rv2627c Rv2030c|Rv2030c Rv0569|Rv0569 Rv2623|TB31.7 Rv2029c|pfkB Rv1738|Rv1738 Rv2030c|Rv2030c Rv3131|Rv3131 Rv2623|TB31.7 Rv2032|acg Rv1738|Rv1738 Rv2007c|fdxA Rv3131|Rv3131 Rv3127|Rv3127 0 6h 1 2 4 6 7 8 Rv2032|acg Rv3130c|tgs1 Rv2626c|hrp1 Time (days) Rv2007c|fdxA Rv2031c|hspX Rv3127|Rv3127 0 6h 1 2 4 6 7 8 Rv3130c|tgs1 Time (days) Rv2626c|hrp1 Rv2031c|hspX Statistical analysis of SRM data by SRMstats Chang et al., Mol Cell Proteomics 2012. Protein significance analysis in SRM measurements Absolute label-free quantification by SRM exploits the linear correlation of the sum of the top transitions of the top peptides per protein and the protein concentration C A B Mean 100 1.2 6 Concentration (fmol/µg) 5 2.30 0.1 2.15 1 2 2.25 1 3 0.0 2.20 10 4 1.0 0 2.0 2.35 6 y = 0.97x – 5.63 R2 = 0.83 n = 34 Density 0.4 0.8 3.0 fold error 1000 Transitions 2 3 4 5 Log10(concentration) [fmol/ug] A -1.0 0.01 5 6 7 8 log10(MS intensity) 9 1 C 2 Proteins 3 4 6 21955 proteins Peptides Linear correlation established using 34 anchor proteins quantified by AQUA peptides MS intensity: sum of 2 most intense transitions of the 3 most intense peptides per protein Ludwig et al., MCP 2011, Estimation of absolute protein quantities of unlabeled samples by SRM Proteome-wide absolute abundance estimates for Mtb 2-Component Systems Ribosome CH Met Nucleotide Metabolism Lipid Metabolism Vit Met AA Met ABC Transporter Xenobiotics Degradation TCA A 1000 Concentration (fmol/µg) Bacterial Secretion 100 10 1 Colour code corresponds to subfigure A. 0.1 0.01 Proteins SWATH-MS: Data-independent Targeted mass spectrometric techniques: SRM acquisition and SWATH with targeted data extraction SRM SWATH-MS TARGETED data acquisition data-independent acquisition Q1 Q2 0.7 Da Q3 Q1 0.7 Da precursor selection fragmentation Q2 TOF 10 ppm 25 Da fragment selection precursor selection fragmentation scanning intensity intensity TARGETED data extraction time time time time Gillet et al., MCP 2012, Targeted data extraction of the MS/MS spectra generated by data-independent acquisition: a new concept for consistent and accurate proteome analysis. Figure by Christina Ludwig Generation of the Mtb SWATH library Synthetic peptides Fractionated lysates Whole-cell lysates of different stress conditions SWATH-MS TripleTOF shotgun mode Rv0569 GATIDQPDHR 2+ • Search each dataset with Mascot and Sequest • iProphet combination of all datasets • Alignment of runs into a common RT space (iRT) • Consensus spectral library generation with SpectraST • Extraction of the top transitions per precursor Rv0569 FGAVQSAILHAR 3+ Mtb SWATH library 3931 proteins (98%) 39,479 peptides Rv1738 ELVGVGLAR 2+ SRM SWATH-MS allows reproducible, high proteome coverage measurements of Mtb in a single run SWATH-MS 33% 44% 14% 9% ≥3 pep/prot 2 pep/prot 1 pep/prot 0 pep/prot 28% 39% 16% Rv0569 GATIDQPDHR 2+ 17% Proteome coverage by SWATH-MS Proteome coverage by SRM single injection 2683 proteins (67%) >200 injections 2884 proteins (72%) (openSWATH software) (mProphet software) Rv0569 FGAVQSAILHAR 3+ Rv1738 ELVGVGLAR 2+ SRM Summary • The Mtb Proteome Library is a public resource of SRM assays for the entire proteome of Mtb – SRM assays generated from crude synthetic peptides and validated in whole cell lysates – Data analysis with Skyline and supporting tools: iRT peptides, SRMCollider, mProphet, SRMstats – Data can be browsed on and downloaded from www.SRMAtlas.org, www.PeptideAtlas.org/passel • Application of the Mtb Proteome Library to study the dynamics of the DosR regulon of Mtb under hypoxia • Absolute label-free quantification by SRM exploits the linear correlation of the sum of the top transitions of the top peptides per protein and its absolute concentration. • Expansion of the Mtb Proteome Library for use with SWATH-MS • SWATH-MS allows reproducible, high proteome coverage measurements of Mtb in a single run ETH Zurich o Prof. Ruedi Aebersold o Christina Ludwig o Jeppe Mouritsen o George Rosenberger o Hannes Röst (Mon 5:45 pm and WP 595, Wed) o Ludovic Gillet (TOD pm, Tue 5:30 pm) o Ben Collins (WP 688, Wed) o Alessio Maiolica, Mariette Matondo Acknowledgements University of Ghana o Dr. Patrick K. Arthur Max Planck Institute Berlin o Prof. Stefan Kaufmann o Dr. Martin Gengenbacher ISB Seattle o Prof. Rob Moritz o Dave Campbell o Zhi Sun o Terry Farrah o Samuel Bader (ABSciex, Mon 7 am) orig. tríptico SysteMTb exterior ok traz.pdf 20/7/10 21:03:17 University of Washington o Brendan MacLean (TP 499, Tue) C M UBS Promedica Foundation Institute of Molecular Systems Biology