Supplementary figures, text, tables

Supplementary Information
Database searching extant protein sequences for fossil proteins
We searched the moa peptides against databases of extant proteins, including those of closely
related avian taxa, which allowed us to determine large stretches of the collagen sequences. This
approach has been used previously to determine partial protein sequences not present in the
database, such as sequences from >100 proteins from mammoth [1-6]. Our search space allows
us to detect a broad set of peptides from many taxa, and our subsequent alignment of the
sequenced peptides allowed us to combine homologous sequences into a unique total sequence
for moa. Alternatively, de novo sequencing can be used to determine sequences from
extinct taxa; however, this approach requires complete fragmentation of the peptides and/or high
resolution to provide accurate novel protein sequences [7]. These limitations are compounded by
the need for homology comparison to determine both which protein a peptide derives
from and whether the peptide is the result of contamination (e.g., human keratin, bacterial
peptides). This approach remains limited for highly variable proteins (i.e., ones that have very
species-specific sequences) [7], so the majority of protein and peptide sequences determined
from fossil taxa are from highly conserved proteins and/or highly conserved portions of proteins.
Phylogenetic analysis of Moa and other Archosaurian Collagen I
We aligned our moa sequences to collagen I alpha 1 and alpha 2 sequences of other taxa
(Table S9) in Seaview. Mature collagen I sequences of alpha 1 and alpha 2 were generated by
cutting at the following motifs: alpha 1 N-terminus: FAP|QM, alpha 1 C-terminus: RYY|RAD,
alpha 2 N-terminus: FAA|QYD, alpha 2 C-terminus: GPPG|PNGGG. After alignment, the
sequences were analyzed using "Traditional Search" in TNT [8] with the following parameters:
tree bisection and reconnection (TBR), 1000 random seeds, 10,000 replicates, 100 trees stored
per replicate. Tree support was calculated using 10,000 jackknife replicates with 36% removal
probability, and Bremer support was calculated for suboptimal trees within 10 steps of the most
parsimonious tree.
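The motif-based trimming described above can be sketched in Python as follows. This is only an illustrative sketch, not the procedure actually used: the function name `trim_mature` and the dictionary `MOTIFS` are hypothetical, and the sketch assumes that "|" marks the cut position and that the mature chain is the region between the N-terminal and C-terminal cuts.

```python
# Cleavage motifs quoted in the text; "|" is assumed to mark the cut position.
MOTIFS = {
    "alpha1": ("FAP|QM", "RYY|RAD"),
    "alpha2": ("FAA|QYD", "GPPG|PNGGG"),
}


def trim_mature(seq: str, n_motif: str, c_motif: str) -> str:
    """Return the part of seq between the N- and C-terminal cut sites."""
    n_left, n_right = n_motif.split("|")
    c_left, c_right = c_motif.split("|")
    n_hit = seq.find(n_left + n_right)    # first N-terminal motif
    c_hit = seq.rfind(c_left + c_right)   # last C-terminal motif
    if n_hit == -1 or c_hit == -1:
        raise ValueError("cleavage motif not found")
    start = n_hit + len(n_left)           # keep residues after the N-terminal cut
    end = c_hit + len(c_left)             # keep residues before the C-terminal cut
    if start > end:
        raise ValueError("N-terminal cut lies after C-terminal cut")
    return seq[start:end]


# Toy sequences (not real collagen) used only to show the behaviour.
toy_alpha1 = "MKKFAPQMSYGYGPPGRYYRADDAN"
toy_alpha2 = "XXFAAQYDGKGPPGPNGGGZZ"
mature_a1 = trim_mature(toy_alpha1, *MOTIFS["alpha1"])
mature_a2 = trim_mature(toy_alpha2, *MOTIFS["alpha2"])
```

Sequences lacking either motif raise an error rather than being passed through untrimmed, so incomplete database entries would be noticed before alignment.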
Table S8: Protein coverage for Collagen II and Collagen V
Protein               Mascot    Sequest   PEAKS   Combined
Collagen II alpha 1   25.85%    22.64%    15%     37.55%
Collagen V alpha 1    4.24%     N.D.      N.D.
Collagen V alpha 2    4.07%     6.41%     N.D.
Collagen V alpha 3    N.D.      4.64%     N.D.
Collagen II coverage was calculated against mouse collagen II because the chicken sequence is
incomplete. Collagen V alpha 1 and alpha 2 were compared to mouse, and collagen V alpha 3 was
compared to human. N.D. = not detected.
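As a rough illustration of how a per-engine and a combined coverage percentage can be computed, the sketch below marks every reference residue matched by at least one peptide and reports the covered fraction. It is an assumption-laden example, not the calculation performed by Mascot, Sequest, or PEAKS: the function name `percent_coverage` is hypothetical, peptides are matched by exact substring search (ignoring modifications and I/L ambiguity), and the sequences are toy data.

```python
from typing import List


def percent_coverage(reference: str, peptides: List[str]) -> float:
    """Percentage of reference residues covered by at least one peptide."""
    if not reference:
        return 0.0
    covered = [False] * len(reference)
    for pep in peptides:
        if not pep:
            continue
        start = reference.find(pep)
        while start != -1:                      # mark every occurrence
            for i in range(start, start + len(pep)):
                covered[i] = True
            start = reference.find(pep, start + 1)
    return 100.0 * sum(covered) / len(reference)


# Toy reference and peptide lists (hypothetical, for illustration only).
reference = "GPPGAPGQRGERGFPGLPGP"
engine_a = ["GPPGAPGQR"]
engine_b = ["GQRGER"]

cov_a = percent_coverage(reference, engine_a)
cov_b = percent_coverage(reference, engine_b)
# Combined coverage is the union of covered residues, not the sum of percentages.
cov_combined = percent_coverage(reference, engine_a + engine_b)
```

Because overlapping peptides cover the same residues only once, the combined value can be smaller than the sum of the individual values, as in the Collagen II alpha 1 row.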
Table S9: Collagen I sequences for phylogeny from UniProt (e.g., G1NB83) or NCBI nr (e.g.,
gi|557280805|ref|XP_006023387.1|).
Species
Gallus gallus
Haliaeetus leucocephalus
Acanthisitta chloris
Pseudopodoces humilis
Falco cherrug
Falco peregrinus
Corvus brachyrhynchos
Serinus canaria
Colius striatus
Aquila chrysaetos canadensis
Mesitornis unicolor
Caprimulgus carolinensis
Fulmarus glacialis
Nestor notabilis
Tinamus guttatus
Picoides pubescens
Nipponia nippon
Calypte anna
Cuculus canorus
Tauraco erythrolophus
Pelecanus crispus
Opisthocomus hoazin
Merops nubicus
Egretta garzetta
Corvus cornix cornix
Anas platyrhynchos
Meleagris gallopavo
Alligator mississippiensis
Alligator sinensis
Collagen (I) alpha 1
P02457
gi|729725683|ref|XP_010583768.1|
gi|678004726|ref|XP_009078835.1|
gi|543378683|ref|XP_005532542.1|
gi|541971469|ref|XP_005439463.1|
gi|529432211|ref|XP_005235901.1|
gi|669287747|ref|XP_008636727.1|
gi|683934078|ref|XP_009096333.1|
gi|706139306|ref|XP_010202685.1|
gi|706121071|ref|XP_010196347.1|
gi|768393217|ref|XP_011592259.1|
gi|704579485|ref|XP_010188005.1|
gi|704549976|ref|XP_010179386.1|
gi|704320308|ref|XP_010168851.1|
gi|697032266|ref|XP_009578273.1|
gi|701314007|ref|XP_010017878.1|
gi|719764165|ref|XP_010216927.1|
gi|699701676|ref|XP_009907392.1|
gi|694843991|ref|XP_009462891.1|
gi|663256038|ref|XP_008490435.1|
gi|696962633|ref|XP_009569448.1|
gi|678130795|gb|KFU99981.1|
gi|694654154|ref|XP_009483316.1|
gi|700386166|ref|XP_009933168.1|
gi|675615678|ref|XP_008936670.1|
gi|726995379|ref|XP_010412372.1|
gi|514767400|ref|XP_005024678.1|
gi|733929245|ref|XP_010725420.1|
gi|564240577|ref|XP_006277120.1|
gi|557280805|ref|XP_006023387.1|
Collagen (I) alpha 2
P02467
gi|729719813|ref|XP_010568320.1|
gi|677971594|ref|XP_009071196.1|
gi|543346326|ref|XP_005518863.1|
gi|541950754|ref|XP_005432180.1|
gi|529417920|ref|XP_005228841.1|
gi|669272225|ref|XP_008628130.1|
gi|683900718|ref|XP_009084209.1|
gi|706114711|ref|XP_010194131.1|
gi|768337876|ref|XP_011583521.1|
gi|704555715|ref|XP_010181131.1|
gi|704291102|ref|XP_010176198.1|
gi|697023504|ref|XP_009573423.1|
gi|701308887|ref|XP_010015343.1|
gi|719730974|ref|XP_010210602.1|
gi|699630750|ref|XP_009905768.1|
gi|694847576|ref|XP_009464876.1|
gi|663262068|ref|XP_008492371.1|
gi|696975215|ref|XP_009557161.1|
gi|701313377|ref|XP_009983928.1|
gi|694628038|ref|XP_009490310.1|
gi|700388462|ref|XP_009934436.1|
gi|675608533|ref|XP_008948217.1|
gi|697818148|ref|XP_009640166.1|
gi|727017062|ref|XP_010397702.1|
gi|514708862|ref|XP_005010992.1|
G1NB83
gi|564228122|ref|XP_006258514.1|
gi|557286414|ref|XP_006025957.1|
Figure S1: Collagen II alpha 1 sequence coverage determined by Sequest, Mascot, and PEAKS.
Figure S2: Collagen I alpha 2 peptide showing dimethylation of asparagine.
Figure S3: Collagen I alpha 1 peptide showing hydroxylation of proline and acetylation of lysine.
Figure S4: Collagen I alpha 2 peptide showing acetylation of alanine. Additionally, this peptide
was detected with (bottom) and without (top) deamidation of asparagine.
Figure S5: Collagen II alpha 1 peptide showing fucose on serine.
Figure S6: Collagen I alpha 1 peptide showing hydroxylation of proline and carboxymethyllysine
on the C-terminal lysine residue. The position of this CML residue could represent a backbone
cleavage because other CML peptides showed missed cleavages at the modified lysine residue.
Figure S7: TNT-based parsimony phylogeny derived from Collagen I sequences. Values above
branches are jackknife values and values below branches are Bremer support.
Orbitrap XL Parameters
API SOURCE
Source Voltage (kV): 2.02
Source Current (uA): 0.28
Capillary Voltage (V): 47.01
Capillary Temp (C): 150.01
Tube Lens Voltage (V): 99.97
VACUUM
Ion Gauge (E-5 Torr): 1.80
Convectron Gauge (Torr): 0.93
FT VACUUM
FT Penning Gauge (E-10 Torr): 0.42
FT Pirani Gauge 1 (Torr): 0.85
FT Pirani Gauge 2 (Torr): 0.00
ION OPTICS
Multipole 00 Offset (V): -5.50
Lens 0 (V): -5.90
Multipole 0 Offset (V): -5.76
Lens 1 (V): -10.01
Gate Lens (V): -31.99
Multipole 1 Offset (V): -15.37
Multipole RF Amplitude (Vp-p 401.02
Front Lens (V): -6.21
Front Section (V): -9.00
Center Section (V): -12.03
Back Section (V): -6.99
Back Lens (V): 0.00
Trap Eject Offset (V): 6.00
FT Transfer Multipole Offset 4.36
FT Transfer Multipole Amplit 500.00
FT Gate Lens Offset (V): 6.40
FT Trap Lens Offset (V): 7.88
FT Storage Multipole Offset 8.55
FT Storage Multipole Amplitu 500.00
FT Reflect Lens Offset (V): 18.31
FT Main RF Amplitude (Vp-p): 2305.30
FT Main RF Current (A): 0.31
FT Main RF Frequency (kHz): 3090.31
FT HV Ion Energy (V): 1076.41
FT HV Lens 1 (V): 197.11
FT HV Lens 2 (V): -0.09
FT HV Lens 3 (V): -176.61
FT HV Lens 4 (V): 0.09
FT HV Push Voltage (V): 176.61
FT HV Pull Voltage (V): -215.33
ION DETECTION SYSTEM
Dynode Voltage (kV): -14.88
Multiplier 1 (V): -1000.00
Multiplier 2 (V): -973.47
FT Analyzer
FT CE Measure Voltage (V): -3453.72
FT CE Inject Voltage (V): -2467.06
FT Deflector Measure Voltage 315.71
FT Deflector Inject Voltage 21.64
FT Analyzer Temp. (°C): 25.99
FT Analyzer TEC Voltage: 1.16
FT Analyzer TEC Current: 0.20
FT Analyzer TEC Temp. (°C): 27.14
FT CE Electronics Temp. (°C) 33.13
FT CE Electronics TEC Temp. 29.92
Reagent Ion Source
Status: Standby
Filament: Off
Emission Current (uA): 2.69
CI Gas Pressure (psi): 19.96
Source Temp (°C): 160.12
Vial 1 Temp (°C): 67.41
Restrictor Temp (°C): 160.18
Transfer Line Temp (°C): 159.85
Reagent Ion Optics
Reagent Ion Lens 1 (V): 42.03
Reagent Ion Lens 2 (V): 21.95
Reagent Ion Lens 3 (V): 17.99
Reagent Vacuum
Ion Gauge (E-5 Torr): 20.11
Convectron Pressure OK: Yes
Convectron Pressure (Torr): 0.04
MS Detector Settings:
Experiment Type: Nth Order Double Play
Tune Method: JK_14Apr2011_tune
Scan Event Details:
1: FTMS + p norm o(375.0-2000.0)
CV = 0.0V
2: ITMS + p norm Dep MS/MS Most intense ion from (1)
Activation Type: CID
Min. Signal Required: 1000.0
Isolation Width: 2.00
Normalized Coll. Energy: 35.0
Default Charge State: 2
Activation Q: 0.250
Activation Time: 30.000
CV = 0.0V
Scan Event 2 repeated for top 5 peaks.
Data Dependent Settings:
Use separate polarity settings disabled
Parent Mass List: (none)
Reject Mass List: (none)
Neutral Loss Mass List: (none)
Product Mass List: (none)
Neutral loss in top: 3
Product in top: 3
Most intense if no parent masses found not enabled
Add/subtract mass not enabled
FT master scan preview mode enabled
Charge state screening enabled
Charge state dependent ETD time not enabled
Monoisotopic precursor selection enabled
Non-peptide monoisotopic recognition not enabled
Charge state rejection enabled
Unassigned charge states : rejected
Charge state 1 : rejected
Charge state 2 : not rejected
Charge state 3 : not rejected
Charge states 4+ : not rejected
Chromatography mode is disabled
Global Data Dependent Settings:
Use global parent and reject mass lists not enabled
Exclude parent mass from data dependent selection not enabled
Exclusion mass width relative to mass
Exclusion mass width relative to low (ppm): 5.000
Exclusion mass width relative to high (ppm): 5.000
Parent mass width relative to mass
Parent mass width relative to low (ppm): 5.000
Parent mass width relative to high (ppm): 5.000
Reject mass width relative to mass
Reject mass width relative to low (ppm): 5.000
Reject mass width relative to high (ppm): 5.000
Zoom/UltraZoom scan mass width by mass
Zoom/UltraZoom scan mass width low: 5.00
Zoom/UltraZoom scan mass width high: 5.00
FT SIM scan mass width low: 5.00
FT SIM scan mass width high: 5.00
Neutral Loss candidates processed by decreasing intensity
Neutral Loss mass width by mass
Neutral Loss mass width low: 0.50000
Neutral Loss mass width high: 0.50000
Product candidates processed by decreasing intensity
Product mass width by mass
Product mass width low: 0.50000
Product mass width high: 0.50000
MS mass range: 0.00-1000000.00
MSn mass range by mass
MSn mass range: 0.00-1000000.00
Use m/z values as masses not enabled
Analog UV data dep. not enabled
Dynamic exclusion enabled
Repeat Count: 2
Repeat Duration: 30.00
Exclusion List Size: 500
Exclusion Duration: 10.00
Exclusion mass width relative to mass
Exclusion mass width relative to low (ppm): 5.000
Exclusion mass width relative to high (ppm): 5.000
Expiration: disabled
Isotopic data dependence not enabled
Custom Data Dependent Settings: Not enabled
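To make the relative mass widths and charge state rules listed above concrete, the following sketch shows how a symmetric 5 ppm window around a mass can be computed and how precursors might be filtered. It is a simplified illustration under stated assumptions, not the instrument's acquisition logic: the function names `ppm_window`, `charge_accepted`, and `is_excluded` are hypothetical, and timing behaviour (repeat count, repeat duration, exclusion duration) is not modelled.

```python
from typing import Iterable, Optional, Tuple


def ppm_window(mass: float, ppm_low: float = 5.0,
               ppm_high: float = 5.0) -> Tuple[float, float]:
    """Lower and upper bounds of a relative (ppm) window around mass."""
    low = mass * (1.0 - ppm_low * 1e-6)
    high = mass * (1.0 + ppm_high * 1e-6)
    return low, high


def charge_accepted(charge: Optional[int]) -> bool:
    """Mirror the listed rules: unassigned and 1+ rejected, 2+ and above kept."""
    return charge is not None and charge >= 2


def is_excluded(mass: float, excluded: Iterable[float],
                ppm_low: float = 5.0, ppm_high: float = 5.0) -> bool:
    """True if mass falls inside the ppm window of any excluded entry."""
    for ref in excluded:
        low, high = ppm_window(ref, ppm_low, ppm_high)
        if low <= mass <= high:
            return True
    return False


# At mass 1000, a 5 ppm width corresponds to +/- 0.005 mass units.
low, high = ppm_window(1000.0)
```

A relative width scales with mass, so the same 5 ppm setting gives a window twice as wide at mass 2000 as at mass 1000.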
Tune File Values
Source Type: NSI
Capillary Temp (C): 150.00
APCI Vaporizer Temp (C): 0.00
Sheath Gas Flow (): 0.00
Aux Gas Flow (): 0.00
Sweep Gas Flow (): 0.00
Injection Waveforms: Off
Ion Trap Zoom AGC Target: 3000.00
Ion Trap Full AGC Target: 30000.00
Ion Trap SIM AGC Target: 10000.00
Ion Trap MSn AGC Target: 10000.00
FTMS Injection Waveforms: Off
FTMS Full AGC Target: 500000.00
FTMS SIM AGC Target: 50000.00
FTMS MSn AGC Target: 200000.00
Reagent Ion Source Polarity: Negative
Reagent Ion Source Temp (C): 160.00
Reagent Ion Source Emission Current (uA): 50.00
Reagent Ion Source Electron Energy (V): -70.00
Reagent Ion Source CI Pressure (psi): 20.00
Reagent Vial 1 Ion Time: 100.00
Reagent Vial 1 AGC Target: 400000.00
Reagent Vial 2 Ion Time: 50.00
Reagent Vial 2 AGC Target: 100000.00
Supplemental Activation Energy: 15.00
POSITIVE POLARITY
Source Voltage (kV): 2.00
Source Current (uA): 100.00
Capillary Voltage (V): 47.00
Tube Lens (V): 100.00
Skimmer Offset (V): 0.00
Multipole RF Amplifier (Vp-p): 400.00
Multipole 00 Offset (V): -5.50
Lens 0 Voltage (V): -6.00
Multipole 0 Offset (V): -5.75
Lens 1 Voltage (V): -10.00
Gate Lens Offset (V): -32.00
Multipole 1 Offset (V): -15.50
Front Lens (V): -6.25
Ion Trap Zoom Micro Scans: 1
Ion Trap Zoom Max Ion Time (ms): 25.00
Ion Trap Full Micro Scans: 1
Ion Trap Full Max Ion Time (ms): 100.00
Ion Trap SIM Micro Scans: 1
Ion Trap SIM Max Ion Time (ms): 25.00
Ion Trap MSn Micro Scans: 1
Ion Trap MSn Max Ion Time (ms): 25.00
FTMS Full Micro Scans: 1
FTMS Full Max Ion Time (ms): 500.00
FTMS SIM Micro Scans: 1
FTMS SIM Max Ion Time (ms): 50.00
FTMS MSn Micro Scans: 1
FTMS MSn Max Ion Time (ms): 100.00
Reagent Ion Lens 1 (V): -20.00
Reagent Ion Gate Lens (V): -120.00
Reagent Ion Lens 2 (V): -15.00
Reagent Ion Lens 3 (V): -15.00
Reagent Ion Back Lens Offset (V): -6.50
Reagent Ion Back Multipole Offset (V): -7.00
Calibration File Values
Multiple RF Frequency: 2522.800000
Main RF Frequency: 1184.500000
QMSlope0: 32.183829
QMSlope1: 32.186907
QMSlope2: 31.931027
QMSlope3: 0.000000
QMSlope4: 0.000000
QMInt0: -33.766301
QMInt1: 0.000000
QMInt2: -31.779899
QMInt3: 0.000000
QMInt4: 0.000000
End Section Slope: 0.000000
End Section Int: 12.000000
PQD CE Factor: 12.615609
IsoW Slope: 0.000419
IsoW Int: 0.146691
Reagent MP Slope: 5.965347
Reagent MP Int: -2.863759
Tickle Amp. Slope0: 0.000054
Tickle Amp. Int0: 0.001125
Tickle Amp. Slope1: 0.002000
Tickle Amp. Int1: 0.400000
Tickle Amp. Slope2: 0.002000
Tickle Amp. Int2: 0.400000
Tickle Amp. Slope3: 0.002000
Tickle Amp. Int3: 0.400000
Multiplier 1 Normal Gain (pos): -990.000000
Multiplier 1 High Gain (pos): -1110.000000
Multiplier 2 Normal Gain (pos): -965.000000
Multiplier 2 High Gain (pos): -1080.000000
Multiplier 1 Normal Gain (neg): -775.000000
Multiplier 1 High Gain (neg): -870.000000
Multiplier 2 Normal Gain (neg): -770.000000
Multiplier 2 High Gain (neg): -860.000000
Normal Res. Eject Slope: 0.011245
Normal Res. Eject Intercept: 7.582153
Zoom Res. Eject Slope: 0.005004
Zoom Res. Eject Intercept: 2.034130
Turbo Res. Eject Slope: 0.069200
Turbo Res. Eject Intercept: 35.000000
AGC Res. Eject Slope: 0.069200
AGC Res. Eject Intercept: 17.300000
UltraZoom Res. Eject Slope: 0.001200
UltraZoom Res. Eject Intercept: 0.642694
Normal Mass Slope: 28.233332
Normal Mass Intercept: -35.169076
Zoom Mass Slope: 26.658070
Zoom Mass Intercept: -41.694562
Turbo Mass Slope: 27.889913
Turbo Mass Intercept: 133.764542
AGC Mass Slope: 27.889913
AGC Mass Intercept: 133.764542
UltraZoom Mass Slope: 26.704217
UltraZoom Mass Intercept: -39.008584
Vernier Fine Mass Slope: 421.986199
Vernier Fine Mass Intercept: 0.000000
Vernier Coarse Mass Slope: 0.000000
Vernier Coarse Mass Intercept: 0.000000
Cap. Device Min (V): -139.809167
Cap. Device Max (V): 139.706461
Tube Lens Device Min (V): 259.660557
Tube Lens Device Max (V): -257.960396
Skimmer Device Min (V): -139.385391
Skimmer Device Max (V): 139.269295
Multipole 00 Device Min (V): -140.659337
Multipole 00 Device Max (V): 140.202135
Lens 0 Device Min (V): -140.349642
Lens 0 Device Max (V): 140.041140
Gate Lens Device Min (V): -136.758808
Gate Lens Device Max (V): 136.192680
Split Gate Device Min (V): 0.133214
Split Gate Device Max (V): 0.149519
Multipole 0 Device Min (V): -139.939776
Multipole 0 Device Max (V): 139.401846
Lens 1 Device Min (V): -140.051652
Lens 1 Device Max (V): 139.488103
Multipole 1 Device Min (V): -140.038065
Multipole 1 Device Max (V): 139.874486
Front Lens Device Min (V): -139.968421
Front Lens Device Max (V): 139.513134
Front Section Device Min (V): -142.778426
Front Section Device Max (V): 142.375895
Center Section Device Min (V): -141.345987
Center Section Device Max (V): 140.935342
Back Section Device Min (V): -141.806801
Back Section Device Max (V): 141.426282
Back Lens Device Min (V): -142.903688
Back Lens Device Max (V): 142.584530
Reagent Lens 1 Device Min (V): -142.051170
Reagent Lens 1 Device Max (V): 141.776483
Reagent Gate Lens Min (V): -132.319901
Reagent Gate Lens Max (V): 132.171194
Reagent Lens 2 Device Min (V): -141.875420
Reagent Lens 2 Device Max (V): 141.580896
Reagent Lens 3 Device Min (V): -143.387012
Reagent Lens 3 Device Max (V): 143.118333
Reagent Electron Lens Device Min (V): -0.255970
Reagent Electron Lens Device Max (V): 150.339771
1. Cappellini E., Jensen L.J., Szklarczyk D., Ginolhac A., da Fonseca R.A.R., Stafford T.W.,
Holen S.R., Collins M.J., Orlando L., Willerslev E., et al. 2012 Proteomic Analysis of a
Pleistocene Mammoth Femur Reveals More than One Hundred Ancient Bone Proteins.
Journal of Proteome Research 11(2), 917-926. (doi:10.1021/pr200721u).
2. Wadsworth C., Buckley M. 2014 Proteome degradation in fossils: investigating the
longevity of protein survival in ancient bone. Rapid Communications in Mass Spectrometry
28(6), 605-615. (doi:10.1002/rcm.6821).
3. Orlando L., Ginolhac A., Zhang G., Froese D., Albrechtsen A., Stiller M., Schubert M.,
Cappellini E., Petersen B., Moltke I., et al. 2013 Recalibrating Equus evolution using the
genome sequence of an early Middle Pleistocene horse. Nature 499(7456), 74-78.
(doi:10.1038/nature12323).
4. Asara J.M., Schweitzer M.H., Freimark L.M., Phillips M., Cantley L.C. 2007 Protein
Sequences from Mastodon and Tyrannosaurus Rex Revealed by Mass Spectrometry. Science
316(5822), 280-285. (doi:10.1126/science.1137614).
5. Schweitzer M.H., Zheng W., Organ C.L., Avci R., Suo Z., Freimark L.M., Lebleu V.S.,
Duncan M.B., Vander Heiden M.G., Neveu J.M., et al. 2009 Biomolecular Characterization
and Protein Sequences of the Campanian Hadrosaur B. canadensis. Science 324(5927),
626-631. (doi:10.1126/science.1165069).
6. Schweitzer M.H., Zheng W., Cleland T.P., Bern M. 2013 Molecular analyses of
dinosaur osteocytes support the presence of endogenous molecules. Bone 52, 414-423.
(doi:10.1016/j.bone.2012.10.010).
7. Medzihradszky K.F., Chalkley R.J. 2013 Lessons in de novo peptide sequencing by
tandem mass spectrometry. Mass Spectrometry Reviews, n/a-n/a.
(doi:10.1002/mas.21406).
8. Goloboff P.A., Farris J.S., Nixon K.C. 2008 TNT, a free program for phylogenetic
analysis. Cladistics 24, 774-786.