Mol. Cell. Proteomics

Mining Clinical Proteomes for
Post-Translational Modifications
David L. Tabb, Ph.D.
david.l.tabb@vanderbilt.edu
Overview
• Analytical chemistry to enhance PTM ID
• Database searching for PTMs
• Sequence tagging for blind PTM searches
• Problem 1: Localization ambiguity
• Problem 2: Higher FDR as a function of PTMs
Discovery Proteomics
Workflow (figure): Protein Mixture → Peptide Mixture → Peptide Fractionation →
Liquid Chromatography → Electrospray Ionization → High-Resolution Mass Spectrometry →
Isolate Ions of Peptide → Collide Ions to Dissociate → Collect Fragments in Tandem MS
Two types of measurements for each peptide: intact m/z
(mass/charge) and a list of fragment m/zs.
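The two measurements can be illustrated with a minimal sketch that computes the intact m/z and singly charged b/y fragment m/z values for an unmodified peptide. The residue masses are standard monoisotopic values; the function names and the example sequence PEPTIDE are assumptions made for illustration and do not come from the slides.

```python
# Monoisotopic residue masses in daltons (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # H2O added to the summed residues of an intact peptide
PROTON = 1.007276   # mass of the charge carrier


def peptide_mass(seq):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


def precursor_mz(seq, charge):
    """Intact m/z of the peptide carrying `charge` protons."""
    return (peptide_mass(seq) + charge * PROTON) / charge


def fragment_mzs(seq):
    """Singly charged b ions (N-terminal) and y ions (C-terminal)."""
    b_ions, y_ions = [], []
    for i in range(1, len(seq)):
        b_ions.append(sum(RESIDUE_MASS[aa] for aa in seq[:i]) + PROTON)
        y_ions.append(sum(RESIDUE_MASS[aa] for aa in seq[i:]) + WATER + PROTON)
    return b_ions, y_ions


if __name__ == "__main__":
    print("precursor m/z (2+):", round(precursor_mz("PEPTIDE", 2), 4))
    b, y = fragment_mzs("PEPTIDE")
    print("b ions:", [round(m, 4) for m in b])
    print("y ions:", [round(m, 4) for m in y])
```

A real search engine also handles modifications, multiple fragment charge states, and neutral losses, which this sketch omits.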
Immobilized metal affinity columns
• Phosphopeptides bear affinity to
metal. Iron, gallium, and titanium
oxides have been used in columns.
• Conducting peptide capture rather
than protein capture can lead to a
proliferation of one-hit wonders.
• The enrichment from “IMAC” is not
perfect, and acidic peptides may
bind just as favorably.
http://www.sigmaaldrich.com, “Phosphopeptide Enrichment Kit”
pTyr antibodies
• Monoclonal antibodies
were raised against
phosphorylated Tyr.
• Technology propelled
investigations of signals
from pTyr, despite the
relatively low frequency
of this residue.
http://www.cellsignal.com/products/7902.html
Rush et al (2005) Nature Biotechnol. 23: 94-101.
Glycan columns
• Lectins are carbohydrate-binding proteins.
• Columns of immobilized lectins can enrich for
proteins with specific sugars.
• Amidases may be used to label sugar sites.
Hirabayashi (2004) Glycoconjugate J. 21: 35-40.
Tateno et al (2007) Nature Protocols 2: 2529-2537.
Electron transfer dissociation (ETD)
• A charge difference draws positively charged
peptides to accept electrons from radical anions.
• The peptide backbone is cleaved between the amide nitrogen
and the alpha carbon of a residue, producing c and z ions.
• This gentle process produces fragments
without disrupting labile PTMs.
• Fragments are measured with high mass
accuracy in an Orbitrap mass analyzer.
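As a rough sketch of what the c/z fragment series looks like numerically, the code below derives singly charged c and z-radical ions from the b and y masses of an unmodified peptide. It assumes the common convention that a c ion is the b ion plus ammonia and that the z-radical (z+1) ion is the y ion minus ammonia plus one hydrogen; the function name and example peptide are hypothetical.

```python
# Monoisotopic residue masses in daltons (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276
AMMONIA = 17.026549
HYDROGEN = 1.007825


def c_z_ions(seq):
    """Singly charged c ions and z-radical ions for an unmodified peptide.

    Assumed convention: c = b + NH3, z-radical = y - NH3 + H.
    """
    c_ions, z_ions = [], []
    for i in range(1, len(seq)):
        b = sum(RESIDUE_MASS[aa] for aa in seq[:i]) + PROTON
        y = sum(RESIDUE_MASS[aa] for aa in seq[i:]) + WATER + PROTON
        c_ions.append(b + AMMONIA)
        z_ions.append(y - AMMONIA + HYDROGEN)
    return c_ions, z_ions


if __name__ == "__main__":
    c, z = c_z_ions("PEPTIDE")
    print("c ions:", [round(m, 4) for m in c])
    print("z ions:", [round(m, 4) for m in z])
```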
Database search overview
Eng et al (1994) J. Amer. Soc. Mass Spectrom. 5: 976-989.
Yates et al (1995) Anal. Chem. 67: 1426-1436.
Dynamic PTMs grow search space
Because multiple PTMs may be in each peptide, adding PTMs to a
search creates an exponential cost. Here, three sites lead to
eight PTM variants.
CASA1_BOVIN
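The "three sites, eight variants" arithmetic is 2^3: each modifiable site is either decorated or not. A minimal sketch of that expansion follows; the sequence ASEGTLYK is a made-up example (not the peptide in the figure), and the `*` marker for a modified residue is an illustrative convention.

```python
from itertools import product


def ptm_variants(seq, modifiable):
    """Enumerate every on/off combination of a dynamic PTM over its candidate sites."""
    sites = [i for i, aa in enumerate(seq) if aa in modifiable]
    variants = []
    for states in product((False, True), repeat=len(sites)):
        chosen = {site for site, on in zip(sites, states) if on}
        variants.append(
            "".join(aa + "*" if i in chosen else aa for i, aa in enumerate(seq))
        )
    return variants


if __name__ == "__main__":
    # Hypothetical peptide with three phosphorylatable residues (S, T, Y).
    for variant in ptm_variants("ASEGTLYK", "STY"):
        print(variant)
```

With n candidate sites the list has 2^n entries, which is why search engines usually cap the number of PTMs allowed per peptide.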
More PTMs, more candidates
Figure panels: "No PTMs" vs. "PTMs on M, S, T, R, K..."
Larger peptides gain disproportionately more comparisons
when PTMs are in play.
Tabb, Anal. Chem. (2005) 77: 2464-2474.
Amino acids differ in frequency
Frequency (%) of each amino acid:
Ala  8.26    Gly  7.08    Pro  4.71
Arg  5.53    His  2.27    Ser  6.58
Asn  4.05    Ile  5.94    Thr  5.34
Asp  5.46    Leu  9.66    Trp  1.09
Cys  1.37    Lys  5.83    Tyr  2.92
Gln  3.93    Met  2.41    Val  6.87
Glu  6.74    Phe  3.86
Adding a mass shift for a common amino acid slows
performance far more than for a rare amino acid.
Swiss-Prot release notes, Jan 9, 2015
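One way to see why common residues cost more: if each position in a peptide of length n independently matches a modifiable residue with probability f, the expected number of decorated variants is (1 + f)^n. The sketch below applies that simplified model to a few of the frequencies listed above. The independence assumption, the absence of a cap on PTMs per peptide, and the function name are all assumptions made for illustration, not the author's calculation.

```python
# Swiss-Prot residue frequencies (percent), copied from the slide.
FREQ_PERCENT = {
    "Leu": 9.66, "Ser": 6.58, "Thr": 5.34,
    "Tyr": 2.92, "Met": 2.41, "Trp": 1.09,
}


def expected_variants(length, residues):
    """Expected variant count per peptide under an independent-site model.

    With k modifiable sites a peptide has 2**k variants; averaging 2**k over
    a binomial number of sites gives (1 + f)**length.
    """
    f = sum(FREQ_PERCENT[r] for r in residues) / 100.0
    return (1.0 + f) ** length


if __name__ == "__main__":
    for label, residues in [("Trp", ["Trp"]), ("Met", ["Met"]),
                            ("Leu", ["Leu"]), ("Ser/Thr/Tyr", ["Ser", "Thr", "Tyr"])]:
        print(label, round(expected_variants(20, residues), 2))
```

Under this model the growth is exponential in peptide length, consistent with longer peptides gaining disproportionately more comparisons.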
Where can I find potential PTMs?
http://www.unimod.org
Search spaces and run-times
Precursor   PTM Set        Trypsin         Variants ⱡ    Comparisons    Time
Tolerance                  Specificity                                  (sec)
20 ppm      (none)         Unconstrained   1689446563    10503186797    1877
10 ppm      (none)         Semi             340208357     1243965215     261
20 ppm      (none)         Semi             340208357     2246604590     392
30 ppm      (none)         Semi             340208357     2943698877     471
40 ppm      (none)         Semi             340208357     3389096687     513
20 ppm      (none)         Fully             21393082      150628871      28
20 ppm      NtermQ-17      Fully             22592311      166565665      31
20 ppm      M+16           Fully             30951288      229212380      44
20 ppm      M+16, STY+80   Fully            476292461      986998364     265
20 ppm      M+16, STY+80   Semi            9501671953    16133509368    4678
ⱡ number of candidates generated after PTM expansion and filtering on basis of mass,
length, invalid residues, excess numbers of PTMs
20141202_LZ_FFPE_standard_1, 16320 HCD MS/MS scans,
MyriMatch 2.1.138, eight Linux CPUs, RefSeq 62 human
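To read the table as relative costs, the sketch below divides selected rows by the fully tryptic, no-PTM search at 20 ppm. The numbers are copied from the table; the choice of baseline and the helper name `fold_changes` are assumptions made for illustration.

```python
# (tolerance, PTM set, trypsin specificity, variants, comparisons, seconds)
ROWS = [
    ("20 ppm", "(none)", "Fully", 21393082, 150628871, 28),
    ("20 ppm", "NtermQ-17", "Fully", 22592311, 166565665, 31),
    ("20 ppm", "M+16", "Fully", 30951288, 229212380, 44),
    ("20 ppm", "M+16, STY+80", "Fully", 476292461, 986998364, 265),
    ("20 ppm", "(none)", "Semi", 340208357, 2246604590, 392),
    ("20 ppm", "M+16, STY+80", "Semi", 9501671953, 16133509368, 4678),
]


def fold_changes(rows, baseline_index=0):
    """Express variants, comparisons, and run time relative to one baseline row."""
    _, _, _, base_var, base_cmp, base_sec = rows[baseline_index]
    result = []
    for tol, ptms, spec, variants, comparisons, seconds in rows:
        result.append({
            "search": f"{ptms} / {spec} / {tol}",
            "variants_fold": variants / base_var,
            "comparisons_fold": comparisons / base_cmp,
            "time_fold": seconds / base_sec,
        })
    return result


if __name__ == "__main__":
    for entry in fold_changes(ROWS):
        print(f"{entry['search']:32s} "
              f"variants x{entry['variants_fold']:.1f}  "
              f"comparisons x{entry['comparisons_fold']:.1f}  "
              f"time x{entry['time_fold']:.1f}")
```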
How do comparisons scale?
Figure panels: "Trypsin Specificity" and "PTMs, fully tryptic".
Area of each circle represents the number of comparisons
between a decorated peptide and an MS/MS.
Sequence tagging uses spectra twice
Dasari et al. (2010) J. Proteome Res. 9: 1716-1726
How should we field mass shifts?
• Dynamic: expand sequences exponentially
• Preferred: combine specified mass shifts
• Mutant: substitute residues for each other
• Blind: allow any shift on any single residue
Dasari et al. (2011) Chem. Res. Tox. 24: 204-216
Blind searches yield mass shift tables
Now in IDPicker 3.1
Dasari, Chem. Res. Tox. (2011) 24: 204-216.
Sprung, Mol. Cell. Proteomics (2009) 8: 1988-1998.
Gaining expertise in blind interpretation
• Boring is more often correct than brilliant.
+22 Da is Sodium, not Asp → His
• Blind PTMs are useful for finding patterns
of mass shifts; do not put faith in
individual peptide-spectrum matches.
• Unusual cleavages can appear as peptide-terminal
mass shifts in blind searches.
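The "boring before brilliant" rule can be sketched as a lookup that lists mundane explanations ahead of exotic ones for an observed mass shift. The delta masses are standard monoisotopic values; the short candidate list, its ordering, and the function name are assumptions made for illustration rather than a feature of any particular tool.

```python
# Candidate explanations, ordered from mundane to exotic (illustrative list).
KNOWN_SHIFTS = [
    ("sodium adduct (Na replaces H)", 21.98194),
    ("oxidation", 15.99491),
    ("phosphorylation", 79.96633),
    ("Asp -> His substitution", 22.03197),
]


def explain_shift(delta, tolerance):
    """Return names of known shifts within `tolerance` Da, mundane ones first."""
    return [name for name, mass in KNOWN_SHIFTS if abs(delta - mass) <= tolerance]


if __name__ == "__main__":
    # A nominal +22 Da shift matches both candidates at low mass accuracy.
    print(explain_shift(22.0, 0.5))
    # An accurately measured shift separates them.
    print(explain_shift(21.982, 0.01))
```

Sodium adduction and the Asp → His substitution differ by about 0.05 Da, so only an accurate delta mass distinguishes them; at nominal mass, the mundane answer should be preferred.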
Blind PTM workflow
Holman et al. Meth. Molec. Biology (2013) 1002: 167
Problem 1: Site localization
• Multiple PTM-decorations of a sequence may
tie in score for a spectrum, or nearly tie.
• One can say this peptide and PTM explain the
spectrum, but is position correct?
• Ascore defines a new score to estimate
probability from differentiating fragments.
• Delta score techniques compare original DB
search scores of variants to assess site error.
Beausoleil. Nature Biotechnology (2006) 24: 1285-1292.
Savitski. Mol. Cell. Proteomics (2011) 10: M110.003830.
Taus. J. Proteome Res. (2011) 10: 5354-62.
Which fragment ions move?
Shifts in PTM
position cause
changes for few
fragments.
These fragments
take on special
importance in
localization.
Absence of these
fragments results
in ambiguity.
CASA1_BOVIN
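A minimal sketch of which b and y ions shift when a single PTM moves between two candidate sites. Positions are 0-based, and the sequence ASEGTLYK is a made-up example, not the peptide shown in the figure; the function name is hypothetical.

```python
def site_determining_ions(seq, site_a, site_b):
    """List b/y ions whose m/z differs between two placements of one PTM.

    b_i spans residues 0..i-1 and y_j spans the last j residues, so an ion
    is informative only if it contains exactly one of the two sites.
    """
    lo, hi = sorted((site_a, site_b))
    n = len(seq)
    b_ions = [f"b{i}" for i in range(lo + 1, hi + 1)]
    y_ions = [f"y{j}" for j in range(n - hi, n - lo)]
    return b_ions, y_ions


if __name__ == "__main__":
    # PTM on S (index 1) versus T (index 4) of a hypothetical peptide.
    print(site_determining_ions("ASEGTLYK", 1, 4))
    # Adjacent candidate sites leave only one b ion and one y ion to decide.
    print(site_determining_ions("ASSGTLYK", 1, 2))
```

When the candidate sites are adjacent, a single missing fragment pair is enough to make the localization ambiguous.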
Problem 2: FDR escalation
• Even though one controls global FDR, errors
are not uniform throughout the collection.
• More PTMs imply more ways for software to
distort a sequence to force it to fit a spectrum.
• More modifiable sites imply more degrees of
freedom for distortion.
• Examine empirical FDRs for peptides
containing different numbers of PTMs.
Real world data
                Peptide Variants
PTMs/Peptide    Target    Decoy    FDR
0                58587      363    1.2%
1               169416     1046    1.2%
2                49911      706    2.8%
3                 9678      321    6.6%
4                 1265       89    14.1%
Search engine: MS-GF+ v9733
Modifications: [STY] 80 [M] 16, max PTMs=4
Data: 468 Q-Exactive LC-MS/MS Experiments
Site: Broad Institute, TCGA Breast Collection
PSM Filters: 0.01 PSM aggregate FDR, 2 spectra per peptide
Result: Identified 3,163,440 spectra to 291,382 peptide variants
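The FDR column can be reproduced from the target and decoy counts. The sketch below assumes the estimate is 2 × decoys / targets; that formula is inferred because it matches every row of the table and is not stated on the slide.

```python
# PTMs per peptide -> (target peptide variants, decoy peptide variants)
COUNTS = {
    0: (58587, 363),
    1: (169416, 1046),
    2: (49911, 706),
    3: (9678, 321),
    4: (1265, 89),
}


def fdr(target, decoy):
    """Empirical FDR, assuming the 2 * decoy / target estimator."""
    return 2.0 * decoy / target


if __name__ == "__main__":
    for n_ptms, (target, decoy) in sorted(COUNTS.items()):
        print(f"{n_ptms} PTMs: {100 * fdr(target, decoy):.1f}% FDR")
    total = sum(t + d for t, d in COUNTS.values())
    print("total peptide variants:", total)
```

The target and decoy counts sum to 291,382, the number of peptide variants reported in the result line, which supports reading the table as counts of peptide variants.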
Summary
• Recognizing PTMs through database search has
been possible since 1995. It is the most common
way that PTM inventories are built.
• Adding even a few PTMs to database search will
greatly reduce its speed (and sensitivity).
• Blind search is appropriate only when you are
determining which patterns of modification are
present; it should not be your final search.
• Software may be of assistance, but your eyes and
critical thinking are your biggest assets.