Proteomics Informatics Workshop
Part III: Protein Quantitation
David Fenyö
February 25, 2011

• Metabolic labeling – SILAC
• Chemical labeling
• Label-free quantitation
• Spectrum counting
• Stoichiometry
• Protein processing and degradation
• Biomarker discovery and verification

Proteomics Informatics
[Diagram: Biological System, Experimental Design, Samples, Sample Preparation, Measurements (MS, MS/MS), Data Analysis, Information about each sample, Information Integration, Information about the biological system]
What does the sample contain? How much?

Proteomic Bioinformatics – Quantitation
Indices: sample i, protein j, peptide k.
Workflow: Lysis, Fractionation, Digestion, LC-MS, MS.
Each step contributes a factor p: $p^{L}_{ij}$ (lysis), $p^{Pr}_{ij}$ (fractionation), $p^{D}_{ijk}$ (digestion), $p^{Pep}_{ik}$ (peptide), $p^{LC}_{ik}$ (LC), $p^{MS}_{ik}$ (MS).
The measured intensity $I_{ik}$ of peptide k is related to the quantity $C_{ij}$ of protein j in sample i by
$I_{ik} = C_{ij} \, p^{L}_{ij} \, p^{Pr}_{ij} \, p^{D}_{ijk} \, p^{Pep}_{ik} \, p^{LC}_{ik} \, p^{MS}_{ik}$

Quantitation – Label-Free (Standard Curve)
Workflow: Lysis, Fractionation, Digestion, LC-MS, MS.
$C_{ij} = f_{k}(I_{ik})$

Quantitation – Label-Free (MS)
Workflow: Lysis, Fractionation, Digestion, LC-MS, MS.
Assumption: $p^{L}_{ij} \, p^{Pr}_{ij} \, p^{D}_{ijk} \, p^{Pep}_{ik} \, p^{LC}_{ik} \, p^{MS}_{ik}$ is constant for all samples, so that
$C_{i_n j} / C_{i_m j} = I_{i_n k} / I_{i_m k}$

Quantitation – Metabolic Labeling
Samples: Light ($C^{L}_{i_n j}$) and Heavy ($C^{H}_{i_m j}$), mixed with factors $p^{M}_{i_n j}$ and $p^{M}_{i_m j}$.
Workflow after mixing: Lysis, Fractionation, Digestion, LC-MS, MS.
Measured: $I^{L}_{i_n k}$ and $I^{H}_{i_m k}$.
Assumption: All losses after mixing are identical for the heavy and light isotopes and
$p^{M}_{i_n j} = p^{M}_{i_m j}$
Oda et al. PNAS 96 (1999) 6591
Ong et al. MCP 1 (2002) 376

Comparison of metabolic labeling and label-free quantitation
Label-free assumption: $p^{L}_{ij} \, p^{Pr}_{ij} \, p^{D}_{ijk} \, p^{Pep}_{ik} \, p^{LC}_{ik} \, p^{MS}_{ik}$ is constant for all samples.
Metabolic labeling assumption: $p^{M}_{ij}$ is constant for all samples and the behavior of heavy and light isotopes is identical.
[Figure: distributions of log2(ratio), axis from -1 to 1, for SILAC (metabolic) and label-free quantitation]
G. Zhang et al., JPR 8 (2008) 1285-1292

Intensity variation between runs
Replicates: 1 IP, 1 fractionation, 1 digestion (1-1-1) vs 3 IPs, 3 fractionations, 1 digestion (3-3-1).
[Figure: distributions of log2(ratio), axis from -1 to 1]
G. Zhang et al., JPR 8 (2008) 1285-1292

How significant is a measured change in amount?
It depends on the size of the random variation of the amount measurement, which can be obtained by repeated measurement of identical samples.
[Figure: distributions of log2(ratio), axis from -1 to 1, for SILAC (metabolic) and label-free quantitation]

Protein Complexes – specific/non-specific binding
Tackett et al. JPR 2005

Protein Turnover
Move heavy labeled cells to light medium. Newly produced proteins will have the light label.
$\frac{dC^{H}_{j}(t)}{dt} = -(K_C + K_T) \, C^{H}_{j}(t)$
$C^{H}_{j}(t) = C^{H}_{j}(0) \, e^{-(K_C + K_T) t}$
$K_C = \log(2)/t_C$, where $t_C$ is the average time it takes for cells to go through the cell cycle, and $K_T = \log(2)/t_T$, where $t_T$ is the time it takes for half the proteins to turn over.
$\log\left( \frac{I^{H}_{j}(t)}{I^{L}_{j}(t) + I^{H}_{j}(t)} \right) = -t \left( \frac{1}{t_C} + \frac{1}{t_T} \right) \log(2)$

Super-SILAC
Geiger et al., Nature Methods 2010

Quantitation – Protein Labeling
Samples: Light, Heavy.
Workflow: Lysis, Fractionation, Digestion, LC-MS, MS (L and H).
Assumption: All losses after mixing are identical for the heavy and light isotopes and
$p^{L}_{i_n j} \, p^{M}_{i_n j} = p^{L}_{i_m j} \, p^{M}_{i_m j}$
Gygi et al. Nature Biotech 17 (1999) 994

Quantitation – Labeled Proteins
Samples: Light, Recombinant Proteins (Heavy).
Workflow: Lysis, Fractionation, Digestion, LC-MS, MS (L and H).
Assumption: All losses after mixing are identical for the heavy and light isotopes and
$p^{L}_{i_n j} \, p^{M}_{i_n j} = p^{M}_{i_m j}$

Quantitation – Labeled Chimeric Proteins
Samples: Light, Recombinant Chimeric Proteins (Heavy).
Workflow: Lysis, Fractionation, Digestion, LC-MS, MS (L and H).
Beynon et al. Nature Methods 2 (2005) 587
Anderson & Hunter MCP 5 (2006) 573

Quantitation – Peptide Labeling
Samples: Light, Heavy.
Workflow: Lysis, Fractionation, Digestion, LC-MS, MS (L and H).
Assumption: All losses after mixing are identical for the heavy and light isotopes and
$p^{L}_{i_n j} \, p^{Pr}_{i_n j} \, p^{D}_{i_n jk} \, p^{M}_{i_n k} = p^{L}_{i_m j} \, p^{Pr}_{i_m j} \, p^{D}_{i_m jk} \, p^{M}_{i_m k}$
Gygi et al. Nature Biotech 17 (1999) 994
Mirgorodskaya et al. RCMS 14 (2000) 1226

Quantitation – Labeled Synthetic Peptides
Samples: Light, Synthetic Peptides (Heavy).
Workflow: Lysis, Fractionation, Digestion, Enrichment with peptide antibody, LC-MS, MS (L and H).
Assumption: All losses after mixing are identical for the heavy and light isotopes and
$p^{L}_{i_n j} \, p^{Pr}_{i_n j} \, p^{D}_{i_n jk} \, p^{M}_{i_n k} = p^{M}_{sk}$
Anderson, N.L., et al. Proteomics 3 (2004) 235-44
Gerber et al. PNAS 100 (2003) 6940

Quantitation – Label-Free (MS/MS)
Workflow: Lysis, Fractionation, Digestion, LC-MS, MS, MS/MS (SRM/MRM).

Quantitation – Labeled Synthetic Peptides
Samples: Light, Synthetic Peptides (Heavy).
Workflow: Lysis/Fractionation, Digestion, LC-MS, MS, MS/MS (L and H measured in MS/MS).

Quantitation – Isobaric Peptide Labeling
Samples: Light, Heavy.
Workflow: Lysis, Fractionation, Digestion, LC-MS, MS, MS/MS (L and H measured in MS/MS).
Ross et al. MCP 3 (2004) 1154

Overview of quantitation workflows
• Label-Free (MS)
• Label-Free (MS/MS)
• Label-Free (Standard Curve)
• Metabolic Labeling
• Protein Labeling
• Labeled Chimeric Proteins
• Peptide Labeling
• Isobaric Peptide Labeling
• Labeled Synthetic Peptides

Isotope distributions
[Figure: intensity vs m/z for peptides of mass m = 1035 Da, m = 1878 Da and m = 2234 Da]

Peak Finding
Find maxima of
$S(l) = \sum_{|k-l| \le w/2} I(k)$
The signal in a peak can be estimated with the RMSD, the root-mean-square deviation of the intensities $I(k)$ from their mean $\bar{I}$ over the window $|k-l| \le w/2$, and the signal-to-noise ratio of a peak can be estimated by dividing the signal by the RMSD of the background.

Background subtraction
[Figure: intensity vs m/z]

Estimating peptide quantity
• Peak height
• Curve fitting
• Peak area
[Figure: intensity vs m/z]

Time dimension
[Figure: intensity as a function of time and m/z]

Sampling
[Figure: chromatographic peaks (intensity vs retention time) sampled with 3 points; 5% level marked; acquisition time = 0.05 s]

Sampling
[Figure: thresholds (90%) vs # of points, 1 to 10]

Estimating peptide quantity by spectrum counting
[Figure: time vs m/z]
Liu et al., Anal. Chem. 2004, 76, 4193

What is the best way to estimate quantity?
• Peak height: resistant to interference; poor statistics
• Peak area: better statistics; more sensitive to interference
• Curve fitting: better statistics; needs to know the peak shape; slow
• Spectrum counting: resistant to interference; easy to implement; poor statistics for low-abundance proteins

Examples – qTOF
Examples – Orbitrap
[Figures: intensity ratio vs peptide mass]

Isotope distributions
[Figures: intensity vs time, ratio vs time, and intensity vs m/z for the peptides AADDTWEPFASGK and YVLTQPPSVSVAPGQTAR]

Retention Time Alignment
Mass Calibration
Cox & Mann, Nat. Biotech. 2008

The accuracy of quantitation is dependent on the signal strength
Cox & Mann, Nat. Biotech. 2008

Workflow for quantitation with LC-MS
LC-MS Data
• Standardization: retention time alignment, mass calibration, intensity normalization
• Quality Control: detection of problems with samples and analysis
• Quantitation: peak detection, background subtraction, limits for integration in time and mass, exclusion of interfering peaks
Peptide Quantities

Biomarker discovery
Workflow: Lysis, Fractionation, Digestion, LC-MS, MS.

Reproducibility
Paulovich et al., MCP 2010

Biomarker verification
Samples: Light, Synthetic Peptides (Heavy).
Workflow: Lysis/Fractionation, Digestion, LC-MS, MS, MS/MS (L and H measured in MS/MS).

Reproducibility
CPTAC Verification Work Group Study 7
• 10 peptides
• 3 transitions per peptide
• Conc. 1-500 fmol/μl
• Human plasma background
• 8 laboratories
• 4 repeat analyses per lab
Addona et al., Nat. Biotech. 2009

Reproducibility
Addona et al., Nat. Biotech. 2009

Correction for interference
MRM analysis of low-abundance proteins is sensitive to interference from other components of the sample that have the same precursor and fragment masses as the monitored transitions. During development of MRM assays, care is usually taken to avoid interference, but unanticipated interference can appear when the finished assay is applied to real samples.

Ratios of intensities of transitions
[Figure: measured concentration [fmol/ul] vs actual concentration [fmol/ul] for transitions tr1, tr2 and tr3 of Peptide 1 and Peptide 2; intensity ratios vs concentration (tr2/tr1 and tr3/tr1 for Peptide 1, tr1/tr2 and tr3/tr2 for Peptide 2)]

Detection of interference
Interference is detected by comparing the ratio of the intensities of pairs of transitions with the expected ratio and finding outliers.
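The pairwise comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the workshop's implementation: the function names `interference_z_scores` and `flag_interference`, the dictionary layout for the expected ratios and their noise, and the default threshold of 3 are all assumptions made for the example. It also assumes a signed z-score, (expected ratio minus observed ratio I_j/I_i) divided by the noise in the ratio, so that extra signal in transition i gives a positive score.

```python
from typing import Dict, List, Sequence, Tuple

Pair = Tuple[int, int]  # (j, i): ratio of transition j to transition i


def interference_z_scores(
    intensities: Sequence[float],
    expected_ratio: Dict[Pair, float],
    ratio_noise: Dict[Pair, float],
) -> List[float]:
    """Return one score per transition i: the largest z over all partners j.

    Assumed form (hypothetical, see lead-in):
        z_ji = (r_ji - I_j / I_i) / s_ji
    Interference adds signal to I_i, which pushes I_j / I_i below the
    expected ratio r_ji and makes z_ji positive.
    """
    n = len(intensities)
    if n < 2:
        raise ValueError("need at least two transitions to form a ratio")
    if any(value <= 0 for value in intensities):
        raise ValueError("intensities must be positive")

    scores = []
    for i in range(n):
        z_i = max(
            (expected_ratio[(j, i)] - intensities[j] / intensities[i])
            / ratio_noise[(j, i)]
            for j in range(n)
            if j != i
        )
        scores.append(z_i)
    return scores


def flag_interference(
    intensities: Sequence[float],
    expected_ratio: Dict[Pair, float],
    ratio_noise: Dict[Pair, float],
    z_threshold: float = 3.0,  # illustrative default, not from the slides
) -> List[int]:
    """Indices of transitions whose score exceeds the threshold."""
    scores = interference_z_scores(intensities, expected_ratio, ratio_noise)
    return [i for i, z in enumerate(scores) if z > z_threshold]


# Made-up example: three transitions with interference-free intensities
# 100, 50 and 25, and noise assumed to be 10% of each expected ratio.
clean = [100.0, 50.0, 25.0]
pairs = [(j, i) for j in range(3) for i in range(3) if j != i]
expected = {(j, i): clean[j] / clean[i] for j, i in pairs}
noise = {pair: 0.1 * ratio for pair, ratio in expected.items()}

# Transition 2 picks up 50 extra counts from an interfering component.
observed = [100.0, 50.0, 75.0]
print(flag_interference(observed, expected, noise))
```

Taking the maximum over all partner transitions means a transition is flagged as soon as one clean partner disagrees with it, while a clean transition compared against a contaminated partner gets a negative score and is left alone.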
Transition i has interference if
$z_i > z_{threshold}$
where $z_{threshold}$ is the interference detection threshold;
$z_i = \max_{j \ne i} z_{ji} = \max_{j \ne i} \frac{r_{ji} - I_j / I_i}{s_{ji}}$
$z_{ji}$ is the number of standard deviations by which the ratio between the intensities of transitions j and i deviates from the expected ratio; $I_i$ and $I_j$ are the intensities of transitions i and j; $r_{ji}$ is the expected ratio of the intensities of transitions j and i; and $s_{ji}$ is the noise in the ratio.

Correction for interference in experimental data
[Figure: measured concentration [fmol/ul] vs actual concentration [fmol/ul] for Peptide 1 and Peptide 2, uncorrected and corrected]

Correction for interference in experimental data
[Figure: measured concentration [fmol/ul] vs actual concentration [fmol/ul], and relative error vs actual concentration [fmol/ul], for Peptide 1 and Peptide 2, uncorrected and corrected]

Proteomics Informatics Workshop
Part I: Protein Identification, February 4, 2011
Part II: Protein Characterization, February 18, 2011
Part III: Protein Quantitation, February 25, 2011