Supplementary Material
Methods
Programs were written in Visual Basic .NET using Microsoft Visual Studio 2008. For fast sorting of the selected peptide masses, the Visual Basic List(Of Long) class was used. Masses with five-digit precision and indices containing sequence information (protein number, start position in the sequence, and length of the peptide) were concatenated and converted to a long integer value, and these values were sorted using the Sort method of the class. After sorting, the long integer values were converted back to peptide masses, and the corresponding peptide sequence information was associated with them. The algorithm was implemented in the DXMSMS Match Sorting program, which is freely available for download at http://www.creativemolecules.com/CM_Software.htm.

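The key-packing step described above can be illustrated with a minimal Python sketch (the original programs were written in Visual Basic .NET, and this is not the authors' code). The function names `pack` and `unpack`, the field widths, and the reading of "five-digit precision" as five decimal places are all assumptions made for illustration.

```python
# Sketch only: pack mass and sequence indices into one integer so that a
# plain integer sort orders peptides by mass. Field widths are assumptions.
MASS_SCALE = 100_000   # assumed: five decimal places of mass precision
PROTEIN_BASE = 10_000  # assumed: protein number < 10,000
START_BASE = 10_000    # assumed: start position < 10,000
LENGTH_BASE = 100      # assumed: peptide length < 100


def pack(mass, protein, start, length):
    """Concatenate mass and indices into a single integer sort key."""
    if not (0 <= protein < PROTEIN_BASE and 0 <= start < START_BASE
            and 0 <= length < LENGTH_BASE):
        raise ValueError("index field out of the assumed range")
    mass_int = round(mass * MASS_SCALE)
    key = ((mass_int * PROTEIN_BASE + protein) * START_BASE + start) * LENGTH_BASE + length
    # With these widths, masses up to about 9223 Da fit a signed 64-bit value.
    if not 0 <= key < 2**63:
        raise ValueError("key does not fit a signed 64-bit integer")
    return key


def unpack(key):
    """Recover (mass, protein, start, length) from a packed key."""
    key, length = divmod(key, LENGTH_BASE)
    key, start = divmod(key, START_BASE)
    mass_int, protein = divmod(key, PROTEIN_BASE)
    return mass_int / MASS_SCALE, protein, start, length


# Hypothetical peptides: (mass, protein number, start position, length)
peptides = [(1045.53421, 1, 10, 9), (873.41230, 2, 15, 7), (873.41230, 1, 3, 7)]
keys = sorted(pack(*p) for p in peptides)
sorted_peptides = [unpack(k) for k in keys]
```

Because the mass occupies the most significant digits of the key, sorting the integers sorts the peptides by mass, and the sequence information travels with each mass without a separate lookup table.
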
Experimental crosslinking data were generated as described previously [1, 2]. Briefly, proteins were crosslinked with the isotopically coded, amine-reactive crosslinker CBDPS-H8/D8 (Creative Molecules Inc.) and digested with proteinase K; the crosslinked peptides were enriched using monomeric avidin and analyzed by ESI-LC-MS and MS/MS on an Orbitrap mass spectrometer using the Mass Tags acquisition method.

Mass spectrometric analysis was carried out with a nano-HPLC system (Easy-nLC II, ThermoFisher) coupled to the ESI source of an LTQ Orbitrap Velos mass spectrometer (ThermoFisher Scientific). Samples were injected onto a 100 μm ID, 360 μm OD IntegraFrit trap column (New Objective Inc.) packed with Magic C18AQ (5 μm particle size, 100 Å pore size, Bruker-Michrom) and desalted by washing for 15 min at 300 nl/min with 0.1% (v/v) formic acid. Peptides were subsequently injected onto a 75 μm ID, 360 μm OD IntegraFrit analytical column packed with Magic C18AQ (5 μm particle size, 100 Å pore size), equilibrated with 95% solvent A (2% (v/v) acetonitrile, 98% (v/v) water, 0.1% (v/v) formic acid) and 5% solvent B (90% (v/v) acetonitrile, 10% (v/v) water, 0.1% (v/v) formic acid). Peptides were separated at a flow rate of 300 nl/min using a 70 min gradient (0–60 min: 4–40% solvent B; 60–62 min: 40–80% solvent B; 62–70 min: 80% solvent B).

MS data were acquired with Xcalibur (ver. 2.1.0.1140) with the Mass Tags and Dynamic Exclusion precursor selection methods enabled in the global data-dependent settings. For CBDPS-H8/D8, a mass difference of 8.05824 Da between the light and heavy isotopic forms was used in the Mass Tags setting. Mass Tags and inclusion list runs used the Top 3 method. MS scans (m/z range from 400 to 2000) and MS/MS scans were acquired in the Orbitrap mass analyzer at 60000 and 30000 resolution, respectively. Fragment ions for MS/MS acquisition were produced by collision-induced dissociation (CID) at a normalized collision energy of 35% for 10 ms at activation q = 0.25. Data analysis was performed with DXMSMS Match of ICC-CLASS [3].

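As a small worked example of the Mass Tags setting, the light/heavy mass difference of 8.05824 Da translates into an m/z spacing that depends on the charge state of the precursor. The sketch below assumes a single crosslinker label per species; the function name `doublet_spacing` is hypothetical and is not part of Xcalibur.

```python
# Sketch only: m/z spacing of the light/heavy doublet for a given charge state.
MASS_DIFFERENCE_DA = 8.05824  # light/heavy mass difference stated in the text


def doublet_spacing(charge):
    """Return the m/z spacing between light and heavy forms at this charge."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return MASS_DIFFERENCE_DA / charge


for z in (1, 2, 3, 4):
    print(f"z = {z}: doublet spacing = {doublet_spacing(z):.5f} m/z")
```
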
Computationally, to implement the algorithm, all of the peptide masses are first sorted into a one-dimensional array. The search iterations are performed as two nested loops, with indices starting from opposite ends of the array and incrementing in opposite directions (Figure 1S). The outer loop starts from the top of the array and the inner loop starts from the bottom. The inner loop is set to start from MP2 plus the mass tolerance and to exit after reaching the array element equal to MP2 minus the mass tolerance. Compared to conventional search algorithms, this strategy reduces the number of iterations required to search n peptides from ~0.5 n² to ~6.8 n (Figure 2S).

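The nested-loop search over the sorted array can be sketched in Python as follows. This is an illustrative reconstruction under stated assumptions, not the DXMSMS Match Sorting source: the function name `find_pairs` and its parameters are hypothetical, and the expected mass of the second peptide (MP2) is assumed to be the observed mass minus the crosslinker mass and the first peptide mass, consistent with the match condition shown in Figure 1S.

```python
# Sketch only: two-ended search for peptide pairs whose summed mass plus the
# crosslinker mass matches the observed mass within a tolerance.
def find_pairs(masses, m_obs, m_cl, tol):
    """Return index pairs (i, j), i <= j, with
    |m_obs - (masses[i] + masses[j] + m_cl)| < tol.

    `masses` must be sorted in ascending order.
    """
    pairs = []
    hi = len(masses) - 1  # inner-loop start; only ever moves downward
    for i in range(len(masses)):
        # Assumed definition of MP2 for the current outer-loop mass.
        target = m_obs - m_cl - masses[i]
        # Masses at or above target + tol are too heavy for this i and,
        # because masses[i] only grows, for every later i as well.
        while hi >= 0 and masses[hi] >= target + tol:
            hi -= 1
        if hi < i:
            break  # the two indices have crossed; no pairs remain
        # Scan downward until the mass drops to target - tol or below.
        j = hi
        while j >= i and masses[j] > target - tol:
            pairs.append((i, j))
            j -= 1
    return pairs


# Hypothetical sorted masses, observed mass, and crosslinker mass.
example_masses = [500.0, 700.0, 900.0, 1100.0, 1300.0]
example_pairs = find_pairs(example_masses, m_obs=1900.0, m_cl=100.0, tol=0.01)
```

Because the starting position of the inner loop is carried over from one outer-loop step to the next rather than reset to the end of the array, each outer-loop step examines only the few masses inside the tolerance window.
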
Figure 1S. Illustration of the fast mass matching algorithm. Calculated peptide masses are sorted in ascending order in a one-dimensional array. The sums of possible peptide masses are tested pairwise (Mi + Mj) to see if they fit the experimentally determined mass (Mobs). The search is organized as two nested loops. For each Mi (upper box, outer loop), which corresponds to MP1, only masses in the vicinity of Mj (i.e., masses between Mj+k and Mj-l, the start and end points of the inner loop) are searched (lower box). The inner loop index (j+k) starts from the position where the previous examination of the MP2 values ended, and ends with j-l (i.e., when the element of the array with mass equal to MP2 minus the mass tolerance is reached). As the algorithm proceeds, the peptide mass Mi increases, while the peptide mass Mj decreases (arrows). In this way, for each Mi, only elements from j+k to j-l are searched instead of the entire array (from 1 to n). This results in a significant reduction in the number of iterations needed and a significant improvement in search speed.
[Figure 1S schematic: sorted mass array M1 … Mn, with outer-loop element Mi and inner-loop window Mj-l … Mj … Mj+k; match condition |Mobs − (Mi + Mj + MCL)| < mass tolerance]

Figure 2S. Dependence of the ratio of the number of iterations required for the conventional method to the number of iterations required for the fast sorting algorithm on the size of the protein database. The ratio was calculated as the number of iterations necessary for the conventional algorithm ((n+1)n/2) divided by the number of iterations used in the fast sorting algorithm for the peptide selections shown in Table 1, where n is the number of peptides selected from the protein database in each case. The calculated curve for the conventional method is shown in Figure 2Sa; the number of iterations required is approximately 0.5 n². Linear regression on the data for the fast sorting method gives ~6.8 n iterations (Figure 2Sb). Thus, the number of iterations needed is reduced by a factor of ~0.07 n (i.e., 0.5 n²/6.8 n ≈ 0.07 n) (Figure 2Sc).
[Figure 2S, panel a: number of iterations without fast sort vs. n, number of peptides; fit y = 0.500 n², R² = 1]
[Figure 2S, panel b: number of iterations with fast sort vs. n, number of peptides; fit y = 6.78n + 51786, R² = 0.9603]
[Figure 2S, panel c: ratio of iterations without and with fast sort vs. n, number of peptides; fit y = 0.074n + 252.64, R² = 0.9805]

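The arithmetic behind Figure 2S can be checked with a short Python sketch. It uses the conventional count (n+1)n/2 and the regression line reported for panel b (y = 6.78n + 51786); the function names are hypothetical, and the regression is only an empirical fit over the database sizes tested, so values outside that range are extrapolations.

```python
# Sketch only: compare iteration counts with and without the fast sort.
def conventional_iterations(n):
    """All unordered pairs, including self-pairs: (n + 1) * n / 2."""
    return n * (n + 1) // 2


def fast_sort_iterations(n):
    """Empirical regression reported for Figure 2S, panel b."""
    return 6.78 * n + 51786


def reduction_factor(n):
    """Ratio of conventional iterations to fast-sort iterations."""
    return conventional_iterations(n) / fast_sort_iterations(n)


for n in (100_000, 500_000, 1_000_000):
    print(f"n = {n:>9,}: reduction factor ~ {reduction_factor(n):,.0f}")
```

For large n the constant term of the regression becomes negligible and the ratio approaches 0.5 n²/6.78 n, or about 0.074 n, in line with the slope fitted in panel c.
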
References
[1] Petrotchenko, E. V., Serpa, J. J., Berjanskii, M., Suriyamongkol, B. P., et al., Use of Proteinase K Non-Specific Digestion for Selective and Comprehensive Identification of Interpeptide Crosslinks: Application to Prion Proteins, Mol. Cell. Proteomics 2012, 11, M111.013524.
[2] Petrotchenko, E. V., Serpa, J. J., Borchers, C. H., An Isotopically-coded CID-cleavable biotinylated crosslinker for structural proteomics, Mol. Cell. Proteomics 2011, doi:10.1074/mcp.M110.001420.
[3] Petrotchenko, E. V., Borchers, C. H., Isotopically-Coded Cleavable CrossLinking Analysis Software Suite (ICC-CLASS) for the automated analysis of MALDI- and ESI-LC-MS-MS/MS crosslinking data, BMC Bioinformatics, submitted.