Proteomics Informatics –
Protein characterization I:
post-translational modifications (Week 10)
Post-translational modification
• Biologically important post-translational modifications (phosphorylation, acetylation, glycosylation, etc.)
• Modifications introduced on purpose during sample preparation (alkylation, iTRAQ, TMT, etc.)
• Side products of sample preparation (oxidation, deamidation, carbamylation, formylation, etc.)
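Each class of modification listed above appears in the data as a characteristic mass shift. A minimal sketch of a lookup follows, assuming monoisotopic mass deltas and assuming that alkylation means carbamidomethylation with iodoacetamide. The helper name `candidate_modifications` is hypothetical and the list is illustrative, not exhaustive.

```python
# Monoisotopic mass shifts in Da for a few of the modifications named above.
# "alkylation" is assumed here to be carbamidomethylation (iodoacetamide).
MASS_SHIFTS = {
    "phosphorylation": 79.96633,
    "acetylation": 42.01057,
    "carbamidomethylation": 57.02146,
    "oxidation": 15.99491,
    "deamidation": 0.98402,
    "carbamylation": 43.00581,
    "formylation": 27.99491,
}

def candidate_modifications(observed_shift, tol=0.01):
    """Return the names whose mass shift lies within tol (Da) of the observed shift."""
    return sorted(name for name, delta in MASS_SHIFTS.items()
                  if abs(delta - observed_shift) <= tol)

print(candidate_modifications(79.97))
print(candidate_modifications(42.5, tol=0.6))
```

With a wide tolerance, acetylation and carbamylation both match the same shift, which is one reason mass accuracy matters for telling modifications apart.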
Post-translational modification
Mann and Jensen, Nature Biotech. 21, 255 (2003)
Phosphorylation examples
Fragment m/z values for the peptide FICVTPTTCSNTIDLPMSPR: unmodified, phosphorylated at S18 (pS18), and phosphorylated at T5 (pT5). Each row gives the b ion ending at that residue and the y ion starting at that residue.

                Unmodified             pS18                   pT5
 #   Residue    b          y           b          y           b          y
 1   F          --         --          --         --          --         --
 2   I          261.1556   2163.024    261.1556   2243.024    261.1556   2243.024
 3   C          421.1862   2049.94     421.1862   2129.94     421.1862   2129.94
 4   V          520.2546   1889.909    520.2546   1969.909    520.2546   1969.909
 5   T          621.3022   1790.841    621.3022   1870.841    701.3022   1870.841
 6   P          718.3549   1689.793    718.3549   1769.793    798.3549   1689.793
 7   T          819.4025   1592.741    819.4025   1672.741    899.4025   1592.741
 8   T          920.4502   1491.693    920.4502   1571.693    1000.45    1491.693
 9   C          1080.481   1390.645    1080.481   1470.645    1160.481   1390.645
10   S          1167.513   1230.615    1167.513   1310.615    1247.513   1230.615
11   N          1281.556   1143.583    1281.556   1223.583    1361.556   1143.583
12   T          1382.603   1029.54     1382.603   1109.54     1462.603   1029.54
13   I          1495.687   928.4923    1495.687   1008.492    1575.687   928.4923
14   D          1610.714   815.4083    1610.714   895.4083    1690.714   815.4083
15   L          1723.798   700.3814    1723.798   780.3814    1803.798   700.3814
16   P          1820.851   587.2974    1820.851   667.2974    1900.851   587.2974
17   M          1951.891   490.2447    1951.891   570.2446    2031.891   490.2447
18   S          2038.923   359.2042    2118.923   439.2042    2118.923   359.2042
19   P          2135.976   272.1722    2215.976   272.1722    2215.976   272.1722
20   R          --         175.1195    --         175.1195    --         175.1195
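The pS18 and pT5 values differ from the unmodified ones only for fragments that contain the phosphorylated residue. The sketch below reproduces that pattern from a few of the unmodified values, assuming singly charged fragments and the nominal shift of 80.000 that the tabulated values show (the monoisotopic mass of HPO3 is 79.96633 Da). The function names are hypothetical.

```python
PHOSPHO_SHIFT = 80.0  # nominal shift seen in the example values; HPO3 is 79.96633 Da

# A few unmodified values copied from the example, keyed by residue position.
B_UNMODIFIED = {4: 520.2546, 5: 621.3022, 17: 1951.891, 18: 2038.923}
Y_UNMODIFIED = {5: 1790.841, 6: 1689.793, 18: 359.2042, 19: 272.1722}

def shift_ladders(b_ions, y_ions, site, delta=PHOSPHO_SHIFT):
    """Shift every fragment that contains the modified residue at `site`."""
    # A b ion ending at position i covers residues 1..i.
    b_mod = {i: mz + delta if i >= site else mz for i, mz in b_ions.items()}
    # A y ion starting at position i covers residues i..end.
    y_mod = {i: mz + delta if i <= site else mz for i, mz in y_ions.items()}
    return b_mod, y_mod

def site_determining_positions(site_a, site_b):
    """Positions whose b or y ion differs between two candidate sites."""
    lo, hi = sorted((site_a, site_b))
    b_positions = list(range(lo, hi))          # b ions ending at lo..hi-1
    y_positions = list(range(lo + 1, hi + 1))  # y ions starting at lo+1..hi
    return b_positions, y_positions

b_pt5, y_pt5 = shift_ladders(B_UNMODIFIED, Y_UNMODIFIED, site=5)
b_ps18, y_ps18 = shift_ladders(B_UNMODIFIED, Y_UNMODIFIED, site=18)
print(round(b_pt5[5], 4), round(y_ps18[18], 4))
print(site_determining_positions(5, 18))
```

Only the fragments returned by `site_determining_positions` can distinguish pT5 from pS18; all other fragments have the same m/z for both forms.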
Potential modifications
Enrichment Strategies for the Detection of Phosphorylated Peptides
• Hydrophilic Interaction Chromatography (HILIC)
• Stationary phase is hydrophilic
• Mobile phase is hydrophobic
• Phosphopeptides elute later than their unphosphorylated counterparts
[Figure: elution order is unphosphorylated, then single phosphorylation, then multiple phosphorylation.]
• Strong Cation Exchange Chromatography (SCX)
• Stationary phase is negatively charged
• Mobile phase is a buffer of increasing pH (a peptide elutes once it becomes neutral)
• Neutral peptides elute early: XXpSxxxxxR/K
• Positively charged peptides elute late: XXXXHXXXXR/K
[Figure: SCX chromatogram versus time (min); neutral peptides elute before basic peptides.]
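The elution argument for SCX can be sketched as a simple charge count. This is a rough illustration and not a retention model: it assumes, as the two sequence patterns do, that only side chains are counted (K, R, H as +1 each, each phosphate as -1) and that termini are ignored. The function name and the example peptides are hypothetical.

```python
def side_chain_charge(sequence, n_phospho=0):
    """Basic side chains (K, R, H) minus one per phosphate group; termini ignored."""
    basic = sum(sequence.count(residue) for residue in "KRH")
    return basic - n_phospho

# Hypothetical peptides following the two patterns above.
peptides = {
    "AASAAAAAR (one phosphate)": side_chain_charge("AASAAAAAR", n_phospho=1),
    "AAAAHAAAAR": side_chain_charge("AAAAHAAAAR"),
}

# Lower charge means weaker retention, so sort ascending for the predicted elution order.
elution_order = sorted(peptides, key=peptides.get)
print(peptides)
print(elution_order)
```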
Several Strategies are often combined
Loss of the phosphate group
Localization of modifications
[Figure series: probability of localization (y axis, 0 to 1.2) versus number of fragment ions (x axis, 0 to 25) for phosphorylation, with m_precursor = 2000 Da, Δm_precursor = 1 Da, Δm_fragment = 0.5 Da. Successive panels add one curve each: phosphopeptide identification (ID), then localization for d_min = 3, d_min = 2, d_min = 1, and d = 1*.]
• d_min >= 3 for 47% of human tryptic peptides
• d_min = 2 for 33% of human tryptic peptides
• d_min = 1 for 20% of human tryptic peptides
Localization of modifications
Peptide with two possible modification sites
[Figure: the MS/MS spectrum (intensity vs. m/z) is matched against the theoretical fragments of each candidate assignment, 1 and 2.]
Which assignment does the data support: 1, 1 or 2, or 1 and 2?
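A minimal sketch of the matching step, assuming a simple count of matched site-determining fragments as the score and the 0.5 Da fragment tolerance used earlier in the deck. The candidate fragment values come from the pT5 and pS18 examples. The observed peak list is hypothetical, and a real scoring function would typically also weigh intensities.

```python
def count_matches(observed_mz, theoretical_mz, tol=0.5):
    """Count theoretical fragments that have an observed peak within tol (Da)."""
    return sum(any(abs(obs - theo) <= tol for obs in observed_mz)
               for theo in theoretical_mz)

# Site-determining fragment m/z values for the two candidate assignments.
candidate_1 = [701.3022, 798.3549, 899.4025, 1689.793]   # pT5: b5, b6, b7, y at residue 6
candidate_2 = [621.3022, 718.3549, 819.4025, 1769.793]   # pS18: the same fragments

# Hypothetical observed peak list, for illustration only.
observed = [520.3, 701.4, 798.2, 1689.9, 2243.1]

scores = {"1": count_matches(observed, candidate_1),
          "2": count_matches(observed, candidate_2)}
print(scores)
```

Peaks shared by both assignments (520.3 and 2243.1 here) carry no localization information, which is why only site-determining fragments are scored.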
Visualization of evidence for localization
[Figure series: evidence for localization shown for two forms of the peptide AAYYQK, followed by panels labeled 1, 2, 3.]
Estimation of global false localization rate using decoy sites
By counting how often the phosphorylation is localized to amino acids that cannot be phosphorylated, we can estimate the false localization rate as a function of amino acid frequency.
[Figure: false localization frequency (0 to 0.02) versus amino acid frequency (0 to 0.15); Y is marked on the plot.]
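One possible reading of the decoy-site idea can be sketched as follows. It assumes that false localizations land on a residue type in proportion to how often that residue occurs, so a line through the origin is fitted to the decoy residues and evaluated at the frequency of a real acceptor residue. All numbers and function names below are hypothetical.

```python
from collections import Counter

def decoy_localization_frequencies(localized_residues, acceptors="STY"):
    """Fraction of all localizations that fall on each non-acceptor (decoy) residue."""
    total = len(localized_residues)
    counts = Counter(r for r in localized_residues if r not in acceptors)
    return {residue: n / total for residue, n in counts.items()}

def fit_slope_through_origin(points):
    """Least-squares slope k for y = k * x from (x, y) pairs."""
    numerator = sum(x * y for x, y in points)
    denominator = sum(x * x for x, _ in points)
    return numerator / denominator

# Hypothetical localization results from a search that also allowed decoy residues.
print(decoy_localization_frequencies(list("SSSTTYSSAG")))

# Hypothetical (amino acid frequency, false localization frequency) pairs for decoys.
decoy_points = [(0.02, 0.002), (0.05, 0.005), (0.10, 0.010)]
slope = fit_slope_through_origin(decoy_points)

# Expected false localization frequency for an acceptor with frequency 0.03 (hypothetical).
expected = slope * 0.03
print(round(slope, 3), round(expected, 4))
```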
How much can we trust a single localization assignment?
If we can generate the distribution of scores for assignment 1 when 2 is the correct assignment, we can estimate the probability of obtaining a given score by chance for a given peptide sequence and MS/MS spectrum assignment.
[Figure and equation: candidate assignments 1 and 2 with measured scores S_1^m and S_2^m; the probability p_1^{(2)} is written as a ratio of two integrals of the score distribution F(S_1) dS_1.]
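A minimal sketch of such a chance probability, assuming that the quantity of interest is the upper tail of the null score distribution (the fraction of scores generated under the competing assignment that are at least as large as the measured score). The null scores below are hypothetical, and how they are generated is not specified here.

```python
def tail_probability(null_scores, measured_score):
    """Fraction of null scores that are at least as large as the measured score."""
    at_least = sum(score >= measured_score for score in null_scores)
    return at_least / len(null_scores)

# Hypothetical scores for assignment 1, generated under "2 is the correct assignment".
null_scores_1_given_2 = [1, 2, 2, 3, 3, 3, 4, 5, 6, 9]

p = tail_probability(null_scores_1_given_2, measured_score=6)
print(p)
```

A small value means the measured score for assignment 1 is unlikely to arise if only assignment 2 were present.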
Is it a mixture or not?
If we can generate the distribution of scores for assignment 2 when 1 is the correct assignment, we can estimate the probability of obtaining a given score by chance for a given peptide sequence and MS/MS spectrum assignment.
[Figure and equation: candidate assignments 1 and 2 with measured scores S_1^m and S_2^m; the probability p_2^{(1)} is written as a ratio of two integrals of the score distribution F(S_2) dS_2.]
Localization of modifications
Peptide with two possible modification sites
[Figure: the MS/MS spectrum (intensity vs. m/z) matched against both candidate assignments.]
Which assignment does the data support: 1, 1 or 2, or 1 and 2?
[Decision scheme: p_1^{(2)} and p_2^{(1)} are each compared with a threshold p_th, together with the measured scores S_1^m and S_2^m. The four combinations lead to the outcomes "1 and 2", "1", "1 or 2", or Ø.]
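Top down / bottom up
[Figure: example spectra (intensity vs. mass/charge) for top-down and bottom-up analysis.]
Charge distribution
[Figure: top-down spectrum with high charge states (27+ and 31+ labeled) next to a bottom-up spectrum with low charge states (1+, 2+, 3+, 4+ labeled); axes are intensity vs. mass/charge.]
Isotope distribution
[Figure: isotope distributions (intensity vs. mass/charge) for top down and bottom up; labeled masses are m = 1878 Da and m = 1035 Da.]
Fragmentation
[Figure: fragmentation in top-down and bottom-up analysis.]
Alternative Splicing
[Figure: exons 1, 2, and 3 as seen by top-down and bottom-up analysis.]
Correlations between modifications
[Figure: correlations between modifications as seen by top-down and bottom-up analysis.]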
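The charge and isotope patterns compared above follow from how mass and charge combine into m/z. A small sketch, assuming positive ions formed by protonation. The 1878 Da mass is one of the labeled values, and the function names are hypothetical.

```python
PROTON_MASS = 1.007276   # Da
ISOTOPE_STEP = 1.003355  # Da, mass difference between 13C and 12C

def mz(neutral_mass, charge):
    """m/z of a positive ion carrying `charge` protons."""
    return (neutral_mass + charge * PROTON_MASS) / charge

def isotope_spacing(charge):
    """Approximate m/z spacing between adjacent isotope peaks."""
    return ISOTOPE_STEP / charge

for z in (1, 2, 3, 4):
    print(z, round(mz(1878.0, z), 3), round(isotope_spacing(z), 3))
```

The isotope spacing shrinks as 1/z, so resolving the isotope peaks of highly charged intact proteins requires much higher resolution than for peptides at 1+ to 4+.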
The Nucleosome Core Complex
[Figure: nucleosome core complex with H3, the H3 ‘tail’, H4, H2A, and H2B labeled.]
Luger et al., Nature, 389, 251-260, 1997
The N-terminal Tails of Histone H3 and H4
H3 1-ARTKQTARKSTGGKAPRKQLATKAARKSAPATGGVKKPHRYRPTVALRE-50
H4 1-SGRGKGGKGLGKGGAKRHRKVLRDNIQGITKPAIRRLARRGGVKRISGLIYE-52
[Figure: M, P, and Ac marks drawn above individual residues of both sequences.]
P = Phosphorylation
M = Methylation: mono-, di-, or trimethylation
Ac = Acetylation
The Histone Code Hypothesis
Specific post-translational modifications (PTMs) of the N-terminal tails of histones function as a scaffold for binding of protein factors, leading to transcriptional activation or inactivation.
Jenuwein, T., Allis, C.D., Science, 293, 2001
Interdependence of Modifications is lost in Standard Mass Spectrometry Analysis
H3 1-ARTKQTARKSTGGKAPRKQLATKAARKSAPATGGVKKPHRYRPTVALRE-50
[Figure: M, P, and Ac marks drawn above residues of the full sequence and of each peptide below.]

Peptide          Residues
TKQTAR           3-8
KSTGGKAPR        9-17
KQLATKAAR        18-26
KSAPATGGVKKPHR   27-40
YRPTVALRE        41-50
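Why the interdependence is lost can be shown with a toy example: two mixtures of intact molecules with opposite co-occurrence of two marks give the same peptide-level picture once the marks end up on different peptides. The marks "K4me" and "K9ac" and both mixtures are hypothetical. Only the peptide ranges 3-8 and 9-17 come from the list above.

```python
from collections import Counter

# Hypothetical marks and the peptide each one falls into after digestion.
PEPTIDE_OF_MARK = {"K4me": "3-8", "K9ac": "9-17"}

# Each molecule is represented by the set of marks it carries.
mixture_a = [frozenset({"K4me", "K9ac"}), frozenset()]    # marks always co-occur
mixture_b = [frozenset({"K4me"}), frozenset({"K9ac"})]    # marks never co-occur

def peptide_level_view(molecules):
    """Tally modified and unmodified copies of each peptide after digestion."""
    tally = Counter()
    for marks in molecules:
        for mark, peptide in PEPTIDE_OF_MARK.items():
            tally[(peptide, mark in marks)] += 1
    return tally

print(peptide_level_view(mixture_a))
print(peptide_level_view(mixture_a) == peptide_level_view(mixture_b))
```

Both mixtures yield one modified and one unmodified copy of each peptide, so the peptide-level data cannot tell them apart.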
Histone Proteins are a Highly Complex Mixture of a Single Protein….
ARTKQTARKSTGAKAPRKQLASKAARKSAPATGGIKKPHRFRPGTVALRE
[Figure: six copies of this sequence, each carrying a different combination of M and Ac marks.]
……………… and many many more!
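The size of this mixture can be gauged with a quick count. The sketch assumes, for illustration only, that each lysine independently takes one of five states (unmodified, mono-, di-, or trimethylated, acetylated) and ignores phosphorylation and any other sites, so the true number of possible forms is larger still.

```python
SEQUENCE = "ARTKQTARKSTGAKAPRKQLASKAARKSAPATGGIKKPHRFRPGTVALRE"

# 1-based positions of all lysines in the sequence.
lysine_positions = [i for i, residue in enumerate(SEQUENCE, start=1) if residue == "K"]

# Assumed states per lysine: unmodified, me1, me2, me3, acetyl.
STATES_PER_LYSINE = 5
n_forms = STATES_PER_LYSINE ** len(lysine_positions)

print(lysine_positions)
print(n_forms)
```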
Protocol
• Instruments: LTQ-ETD/PTR, LTQ-FTMS
• Glu-C generated N-terminal H3 peptide (1-50); positions 4, 9, 14, 18, 23, 27, 36, 37 are marked between N and 50
• Isolate m/z ± 0.5 Da
• 60 ms ETD
• ~3 min acquisition
[Figure: precursor spectrum with charge states +7 to +12; for the +10 charge states, peaks at m/z 544.9, 546.3, 547.6, 549.1, 550.4, 551.9, spaced by Δ 1.4 Da.]
[Figure: fragment spectra (m/z axis) with labeled peaks at 245.2, 288.1, 346.3, 401.8, 479.9, 502.4, 571.3, 630.5, 672.3, 731.5, 802.5, 824.5, 892.5, 958.6, 982.5, 1055.6, 1129.6, 1216.7, 1255.2, 1373.8, 1424.8, 1515.4, 1616.0, 1647.9, 1715.0, 1784.1, 1878.2, 1937.8.]
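The Δ 1.4 Da spacing can be converted into a mass difference. The sketch assumes that the six labeled precursor values all belong to the +10 charge state, so that a spacing in m/z corresponds to ten times that spacing in mass. The monoisotopic shifts for methylation and acetylation are included for comparison.

```python
CHARGE = 10
precursor_mz = [544.9, 546.3, 547.6, 549.1, 550.4, 551.9]  # labeled precursor values

# Convert the m/z spacing between neighbouring peaks into a mass difference in Da.
mass_steps = [round((b - a) * CHARGE, 1)
              for a, b in zip(precursor_mz, precursor_mz[1:])]
print(mass_steps)

METHYL = 14.01565  # Da, one methylation (added CH2)
ACETYL = 42.01057  # Da, one acetylation
print(round(ACETYL / METHYL, 2))
```

Each step is close to 14 Da, one methyl equivalent. An acetyl group weighs almost exactly three methyl equivalents, so at this precursor tolerance an acetylation and a trimethylation fall into the same isolation window and must be told apart by the fragments.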
Group ‘4’: 4 Acetyl Groups
[Figure: MS/MS spectrum, relative abundance (0 to 100) versus m/z (400 to 2000), with c ions (c2 to c17) and z ions (z2 to z16) labeled and further peaks marked *. Below it, three forms of ARTKQTARKSTGAKAPRKQLASKAARKSAPATGGIKKPHRFRPGTVALRE, each with four Ac marks and different placements of M marks.]
Group ‘5’: 5 Acetyl Groups
[Figure: MS/MS spectrum, relative abundance (0 to 100) versus m/z (400 to 2000), with c ions (c2 to c17) and z ions (z2 to z17) labeled and further peaks marked *. Annotation: K4: trimethyl. Below it, one form of ARTKQTARKSTGAKAPRKQLASKAARKSAPATGGIKKPHRFRPGTVALRE with five Ac marks and M marks.]