PROTEORED Multicentric Study
QUANTITATIVE PROTEOMICS METHOD
SPECTRAL COUNT
2010
UNIVERSITY OF BARCELONA
Salamanca 16th March
Objectives:
•Test each laboratory's ability to perform quantitative
proteomic analysis.
•Comparison of methodologies for relative
quantitative analysis of proteomes. The study should
provide data to assess and compare performance of
different methodologies and intra- and inter-lab
reproducibility of these.
•Evaluation of data reporting and data sharing tools
(MIAPE documents, standard formats, public
repositories).
Samples:
•Each participating laboratory will receive two protein mixture samples, labeled A and
B, each containing 100 µg of total protein.
•100 micrograms of each protein mixture A and B dissolved in 6M Urea /1% CHAPS,
at 6 micrograms/microliter concentration.
Samples contain:
A mixture of around 150 E. coli proteins (identical in each sample). This mixture
has been prepared by fractionation of the cytoplasmic proteome of E. coli. It
contains soluble proteins with a wide range of pI and Mw.
Four spiked mammalian proteins:
•CYC_HORSE (Cytochrome C, Mw 12362), added at the ~ 1 pmol / 1 µg total
protein level.
•MYG_HORSE (Apomyoglobin, Mw 16952), at ~ 200 fmol / 1 µg total protein
•ALDOA_RABIT (Aldolase, Mw 39212), at ~ 25 fmol / 1 µg total protein
•ALBU_BOVIN (Serum albumin, Mw 66430), at ~ 1 fmol / 1 µg total protein
These four proteins have been spiked in different amounts in samples A and B,
with ratios ranging from 1.5:1 to 5:1 between the two samples.
Purpose of the analysis:
•The intended purpose of the analysis is to
measure the ratios between samples A and B for
the four spiked proteins. The “matrix” E. coli
proteins, which should be unchanged, will provide
a measure of dispersion for the method used.
•The samples can also be used to test methods for
absolute quantitation, if desired.
•In order to evaluate reproducibility in a
homogeneous dataset, we ask participants to perform a
minimum of 4 replicate analyses of the samples.
(Depending on the method of choice, this would
demand a maximum of 4 + 4 LC-MS runs.)
Methods:
•Sample complexity has been chosen to allow for the analysis of the mixture in single LC-MS
runs. In principle, there is no need for pre-fractionation. A sufficiently long gradient (90-120 min)
is suggested, but this will of course depend strongly on the MS instrument available
for analysis.
•1-2 micrograms of total protein per run should be enough to cover the range of abundances
of the spiked proteins in the samples. Again, this will depend a lot on the instrument used,
and should be adjusted by each Lab. according to their expertise.
•The sample is primarily intended to test non-targeted relative quantitation methodologies.
Both label-based methods (ICPL, iTRAQ, TMT, O18,...) and label-free methods (based on
spectral counts, Hi3, “LCMS Image analysis”...) can be performed and tested to analyze
the samples. Some of them will require 4 + 4 LC-MS runs, while others (e.g. 8-plex iTRAQ)
could require a single run to provide comparable measurements of reproducibility.
Try to choose the number of replicate analyses so that 4 independent measurements of
each A:B ratio are obtained and comparable statistics can be calculated.
•The sample can also be used, if desired, to test targeted methods, such as MRM methods for
relative or absolute quantitation. The concentration of the spiked proteins is probably too
high to provide a real challenge for those methods, but it can still be useful for test purposes
(one can test accuracy, sensitivity on serial dilutions of the sample...)
•The amount of sample provided, as well as the concentration of the spiked proteins, should
also allow a 2D-DIGE analysis of the samples, although this is not the main purpose of the
experiment.
Quantitative Proteomic Approaches
• Label free
– Spectral counting
– Ion current based (Extracted ion chromatograms)
– Other
• Stable isotope labeling
– Stable isotope labeling reagents such as ICAT and iTRAQ
– Metabolic labeling (SILAC, 15N)
– Others
Shotgun Proteomics
• Digestion of proteins and separation of peptides
– Extensive chromatographic separation (one or multiple
dimensional separations, columns,..)
• Data acquisition
– Data-dependent acquisition (Automated acquisition of MS/MS spectra from as
many precursor ions as possible)
• Data analysis
– Automated interpretation of the MS/MS spectra (DB search)
Spectral Counting Summary
• Spectral count correlates well with protein abundance
• Fold change can be calculated and statistically evaluated
• Simple and straightforward implementation
• Sensitive to protein abundance changes – for abundant proteins 2 fold change
easily detected with high confidence
Fu et al, 2006
Limitations
• The response to increasing protein amount is saturable
• Noisy data at low spectral counts – large difference in spectral count necessary to
determine significant change
Spectral count reflects relative abundance of a protein (r² ≥ 0.99)
Issues to address:
- Variability of Spectral counts
- Sensitivity of Spectral count to protein abundance changes
- How to determine relative changes between two samples
Variability of Spectral counting
LCMSMS analysis of replicate SCX fractions of K562 cell lysates, G-test
Old W. et al, MCP 2005
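As a rough illustration of how a G-test can be applied to spectral counts, the sketch below tests whether a protein's share of the spectra differs between two samples. It uses a 2x2 table (spectra for the protein versus all other spectra, in each sample) and a chi-square approximation with 1 degree of freedom. This is one common formulation and an assumption here, not necessarily the exact variant used in the cited study; the function name and example counts are hypothetical.

```python
import math

def g_test(n1, n2, t1, t2):
    """G-test of independence on a 2x2 table.

    n1, n2: spectral counts of the protein in samples 1 and 2
    t1, t2: total spectral counts (sampling depth) in samples 1 and 2
    Returns (G statistic, p-value from chi-square with 1 degree of freedom).
    """
    observed = [[n1, t1 - n1], [n2, t2 - n2]]
    row_sums = [t1, t2]
    col_sums = [n1 + n2, (t1 - n1) + (t2 - n2)]
    total = t1 + t2
    g = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            e = row_sums[i] * col_sums[j] / total  # expected count under independence
            if o > 0:  # a zero observed cell contributes nothing to G
                g += 2.0 * o * math.log(o / e)
    # Survival function of chi-square with 1 df: erfc(sqrt(x / 2))
    p = math.erfc(math.sqrt(max(g, 0.0) / 2.0))
    return g, p

# Hypothetical example: 5 vs 50 spectra at equal sampling depth
g_stat, p_value = g_test(5, 50, 10000, 10000)
print(f"G = {g_stat:.2f}, p = {p_value:.2e}")
```

Identical counts at identical depth give G = 0 and p = 1, which is a quick sanity check for the implementation.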
How to determine relative changes between two samples
Fold change determination
Old W. et al, MCP 2005
• Practical issue – no peptides found in one of the compared samples
• Data discontinuity (spectral count – integers) – not amenable to Student t-test
• Differences in sampling depth
Fold change determination.
RSC = log2[(n2 + f)/(n1 + f)] + log2[(t1 - n1 + f)/(t2 - n2 + f)]
n1, n2 - spectral counts for sample 1 and 2
t1, t2 – total spectral count (sampling depth) for samples 1 and 2
f – correction factor 1.25 (Beissbarth et al – Bioinformatics 2004)
Observed RSC correlates well with expected RSC for standard proteins spiked into
complex samples (Old W. et al, MCP 2005)
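The RSC formula above can be written as a small function. This is a minimal sketch: the function name and the example counts are hypothetical, and only the formula and the correction factor f = 1.25 come from the slide.

```python
import math

def rsc(n1, n2, t1, t2, f=1.25):
    """Log2 spectral-count ratio of sample 2 over sample 1.

    n1, n2: spectral counts of the protein in samples 1 and 2
    t1, t2: total spectral counts (sampling depth) of samples 1 and 2
    f: correction factor that keeps the ratio finite when a count is zero
    """
    return (math.log2((n2 + f) / (n1 + f))
            + math.log2((t1 - n1 + f) / (t2 - n2 + f)))

# Hypothetical example: a protein absent in sample 1 and seen 10 times in sample 2
print(rsc(n1=0, n2=10, t1=10000, t2=10000))
```

Because of the correction factor, a protein with no spectra in one sample still yields a finite ratio, which addresses the "no peptides found in one of the compared samples" issue. Swapping the two samples only changes the sign of the result.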
•100 micrograms of each protein mixture A and B are dissolved
in 6M Urea /1% CHAPS, at 6 micrograms/microliter concentration.
Samples were kept at -20 ºC.
Precipitation with TCA/acetone
Resuspended in 100 µL 0.3 % SDS / 50 mM Tris-HCl pH 8.0 / 200 mM DTT
5 µL (5 µg) of sample digested with trypsin overnight (O/N) at a 1/100 ratio
Separated by nanoHPLC (4 replicates, 1 µL each)
MS/MS on an LTQ Velos Orbitrap
Analysis

Run             Spectra analyzed
Proteored A1    11,976
Proteored A2    12,090
Proteored A3    12,567
Proteored A4    14,889
Proteored B1    14,444
Proteored B2    14,936
Proteored B3    15,115
Proteored B4    15,852
TOTAL           111,869
SEQUEST PARAMS
peptide_mass_tolerance = 0.07
fragment_ion_tolerance = 0.6
diff_search_options = 15.9949 M 0.000 C 0.000 X
LC-MS run summary (Total = Total Sample A-B, Combined AB14 runs**)

Item  LC-MS run                                   A-1    A-2    A-3    A-4    B-1    B-2    B-3    B-4     Total
1     Number of MS/MS spectra acquired            11976  12090  12567  14889  14444  14936  15115  15852   13984
2     Number of total assigned peptides id.       1724   1702   1705   2660   2136   2388   2440   2576    2166
3     Number of unique peptides id.               1226   1183   1190   1559   1389   1447   1499   1660    1394
4     Number of E. coli proteins id. (total)      209    213    211    266    223    250    244    266     235
5     Number of E. coli single-hit proteins id.   19     31     28     28     18     20     18     30      24
6     Number of spiked proteins id.               4      3      3      3      4      4      4      4       3.6
7     FDR*                                        0      0      0      0      0      0      0      0.0038  0.0005

Item  Quantitation                                  Value
8     Total number of proteins quantitated          5
9     Number of proteins quantitated > 3 peptides   2
10    Number of proteins quantitated > 2 peptides   3
11    Number of proteins quantitated 1 peptide
      A/B ratio
12    Average of A/B ratios for E. coli proteins
13    Standard deviation A/B ratios
14    % CV A/B ratios E. coli proteins
The Normalized Spectrum Counts bar chart shows a protein's
relative abundance across different samples.
The y-axis is the normalized count of the spectra matching any of the
peptides in the protein. This count depends upon the protein, peptide,
required mods and search filters set on the Samples page.
Each bar along the x-axis is for a different biological sample.
The bars are color coded by sample category.
The bar chart can be used as a visual confirmation of a differential
expression flagged by the Quantitative Analysis in the Samples view.
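To make the idea of a normalized count concrete, the sketch below rescales each run's spectral count so that all runs share the same total. This is an assumed, simple normalization scheme for illustration; it is not confirmed to be the exact procedure used by the software that produced the chart, and the function name and example numbers are hypothetical.

```python
def normalize_counts(counts, totals):
    """Scale per-run spectral counts to a common sampling depth.

    counts: spectral counts of one protein, one value per run
    totals: total spectral counts of each run, in the same order
    Each count is multiplied by (mean total / total of its run).
    """
    mean_total = sum(totals) / len(totals)
    return [c * mean_total / t for c, t in zip(counts, totals)]

# Hypothetical example: the second run sampled twice as many spectra,
# so equal relative abundance shows up as a doubled raw count.
print(normalize_counts([10, 20], [1000, 2000]))
```

After normalization, differences between runs reflect changes in a protein's share of the spectra rather than differences in how many spectra each run acquired.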
FDR = 0, 242 proteins

Protein       A1   A2   A3   A4   B1   B2   B3   B4   Av A    STD A   Av B    STD B   A/B    B/A
ALDOA_RABIT   16   18   21   18   10   12   10   10   18.25   2.06    10.50   1.00    1.74   0.58
ALBU_BOVIN    2    1    0    1    5    1    1    1    1.00    0.82    2.00    2.00    0.50   2.00
MYG_HORSE     17   21   15   19   9    14   14   12   18.00   2.58    12.25   2.36    1.47   0.68
CYC_HORSE     13   14   12   21   22   27   27   19   15.00   4.08    23.75   3.95    0.63   1.58

[Figure: bar chart of spectral counts for ALDOA_RABIT, MYG_HORSE, CYC_HORSE and ALBU_BOVIN; y-axis from 0 to 30]
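The averages, standard deviations and ratios reported for the spiked proteins can be reproduced from the per-run counts. The sketch below uses the counts from the table (cytochrome c is keyed as CYC_HORSE) and assumes the sample standard deviation (n - 1 denominator), which matches the reported values; the helper name is hypothetical.

```python
from statistics import mean, stdev

# Spectral counts per run: (A1..A4, B1..B4), taken from the table
counts = {
    "ALDOA_RABIT": ([16, 18, 21, 18], [10, 12, 10, 10]),
    "ALBU_BOVIN": ([2, 1, 0, 1], [5, 1, 1, 1]),
    "MYG_HORSE": ([17, 21, 15, 19], [9, 14, 14, 12]),
    "CYC_HORSE": ([13, 14, 12, 21], [22, 27, 27, 19]),  # cytochrome c
}

def summarize(a, b):
    """Mean and sample standard deviation per sample, plus both ratios of means."""
    mean_a, mean_b = mean(a), mean(b)
    return {
        "mean_a": mean_a, "sd_a": stdev(a),
        "mean_b": mean_b, "sd_b": stdev(b),
        "a_over_b": mean_a / mean_b,
        "b_over_a": mean_b / mean_a,
    }

summary = {protein: summarize(a, b) for protein, (a, b) in counts.items()}
for protein, s in summary.items():
    print(f"{protein:12s} A = {s['mean_a']:.2f} ± {s['sd_a']:.2f}  "
          f"B = {s['mean_b']:.2f} ± {s['sd_b']:.2f}  A/B = {s['a_over_b']:.2f}")
```

For ALBU_BOVIN the standard deviation is as large as the mean, which illustrates how noisy the ratio becomes at low spectral counts.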
ALBU_BOVIN (P-value = 0.52)
SAMPLE A2 Xcorr 0.88 DeltaCn 0.46
SAMPLE A1 Xcorr 3.16 DeltaCn 0.43
SAMPLE B2 Xcorr 2.9 DeltaCn 0.57
SAMPLE A1 Xcorr 3.23 DeltaCn 0.64

ALDOA_RABIT (P-value = 0.00053)
SAMPLE A2 Xcorr 2.12 DeltaCn 0.5
SAMPLE A3 Xcorr 5.33 DeltaCn 0.78
SAMPLE B3 Xcorr 5.35 DeltaCn 0.77

CYC_HORSE (P-value = 0.018)
SAMPLE B3 Xcorr 5.46 DeltaCn 0.66
SAMPLE A1 Xcorr 4.78 DeltaCn 0.66
SAMPLE B1 Xcorr 3.87 DeltaCn 0.55

MYG_HORSE (P-value = 0.000019)
SAMPLE A3 Xcorr 4.66 DeltaCn 0.72
[Figure: annotated MS/MS spectrum (relative intensity vs. m/z, b- and y-ion series) of peptide VEADIAGHGQEVLIR, 1,605.85 AMU, +2 H (Parent Error: 3.7 ppm)]
[Figure: annotated MS/MS spectrum (relative intensity vs. m/z, b- and y-ion series) of peptide GHHEAELKPLAQSHATK, 1,852.96 AMU, +2 H (Parent Error: 1.3 ppm)]
[Figure: bar chart for the E. coli matrix proteins (FDR = 0, 219 proteins); y-axis from 0 to 1.2; bar values 0.95, 0.30 and 0.12; labels AV A/B, STD R A, STD R B; legend also lists CYC_HORSE, MYG_HORSE and ALDOA_RABIT]
Conclusions:
•Spectral counting can be an easy way to perform quantitative proteome analysis, but:
•It requires the ability to perform different LC runs with very low dispersion.
•The response to increasing protein amount is saturable.
•Data are noisy at low spectral counts – a large difference in spectral count is necessary to
determine a significant change.
Proteomic Facility
University of Barcelona
M José Fidalgo
Eva Olmedo
Francisco Fernández
Josep M Estanyol
Oriol Bachs
Protein       A1   A2   A3   A4   B1   B2   B3   B4   Mean A   SD A   Mean B   SD B   A/B    B/A
ALDOA_RABIT   23   27   26   29   18   17   15   15   26.25    2.50   16.25    1.50   1.62   0.62
ALBU_BOVIN    2    1    0    1    5    1    1    2    1.00     0.82   2.25     1.89   0.44   2.25
MYG_HORSE     31   28   31   31   18   17   15   18   30.25    1.50   17.00    1.41   1.78   0.56
CYC_HORSE     22   18   17   26   28   33   31   27   20.75    4.11   29.75    2.75   0.70   1.43

[Figure: bar chart of spectral counts for ALDOA_RABIT, MYG_HORSE, CYC_HORSE and ALBU_BOVIN; y-axis from 0 to 35]
Spiked amounts (fmol/microgram E. coli protein)

Protein       MW      A      B      B/A    A/B
ALDOA_RABIT   39212   50     25     0.50   2.00
BSA_BOVIN     66430   1      5      5.00   0.20
MYG_HORSE     16952   520    200    0.38   2.60
CYC_HORSE     12362   1000   1500   1.50   0.67
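One way to judge the accuracy of the method is to compare the observed A/B ratios with the ratios expected from the spiked amounts. The sketch below does this on a log2 scale, so that over- and under-estimation are symmetric. The expected amounts come from the table above; the observed ratios are the A/B values from the first results table (FDR = 0, 242 proteins). The choice of a log2 error as the metric, and the names used, are assumptions made for illustration.

```python
import math

# Spiked amounts in fmol per microgram of E. coli protein: (A, B)
spiked = {
    "ALDOA_RABIT": (50, 25),
    "ALBU_BOVIN": (1, 5),        # listed as BSA_BOVIN in the table
    "MYG_HORSE": (520, 200),
    "CYC_HORSE": (1000, 1500),
}

# Observed A/B ratios of average spectral counts
observed = {
    "ALDOA_RABIT": 1.74,
    "ALBU_BOVIN": 0.50,
    "MYG_HORSE": 1.47,
    "CYC_HORSE": 0.63,
}

def log2_error(observed_ratio, expected_ratio):
    """Signed log2 difference; 0 means the observed ratio equals the expected one."""
    return math.log2(observed_ratio / expected_ratio)

expected = {protein: a / b for protein, (a, b) in spiked.items()}
errors = {protein: log2_error(observed[protein], expected[protein])
          for protein in spiked}

for protein in spiked:
    print(f"{protein:12s} expected A/B = {expected[protein]:.2f}  "
          f"observed A/B = {observed[protein]:.2f}  "
          f"log2 error = {errors[protein]:+.2f}")
```

A negative error for a protein with an expected ratio above 1 (or a positive error for one below 1) means the observed ratio is pulled toward 1:1, which fits the saturation and low-count noise limitations noted in the conclusions.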