Top-down Characterization of Proteins in Bacteria

Top-down characterization of proteins in bacteria with unsequenced genomes
Nathan Edwards
Georgetown University Medical Center

Microorganism Identification
- Homeland-security/defense applications
- Clinical applications in strain identification:
  - Selection of treatment and/or antibiotics
- New applications in microbiome analysis:
  - Bacterial colonies in gut, ....
  - Chronic wound infections
- Long history of fingerprinting approaches
- Compete with genomic approaches?
  - PCR, Next-gen sequencing
- Primary sales pitch is speed.

Microorganism Identifications
- Match spectra with proteome (or genome) sequence for (species) identity
  - Provides robust match with respect to instrumentation and sample prep
- Many bacteria will never be sequenced or "finished"...
  - Pathogen simulants, for example
- ...but many have: about 2500 to date.
- Can we use the available sequence to identify proteins from unknown, unsequenced bacteria?
  - Yes, for some proteins in some organisms!

Intact protein LC-MS/MS
- Crude cell lysate
- Capillary HPLC
  - C8 column
- LTQ-Orbitrap XL
  - Precursor scan: 30,000 resolution @ 400 m/z
  - Data-dependent precursor selection:
    - 5 most abundant ions
    - 10 second dynamic exclusion
    - Charge-state +3 or greater
  - CAD product ion scan: 15,000 resolution @ 400 m/z

CID Protein Fragmentation Spectrum from Y. rohdei
[Figure: chromatogram (yr_inclusion, time axis 19.5 to 25.0 min, peaks at 21.03 and 21.46 min) above the averaged product ion spectrum (scans 1937-2437, RT 19.45-24.36, AV 21, NL 4.80E4, FTMS + p ESI d Full ms2 756.70@cid35.00 [195.00-2000.00], m/z axis 200 to 2000). Precursor: m/z 756.70, +8, MW 6044.11.]

Enterobacteriaceae Protein Sequences
- Exhaustive set of all Enterobacteriaceae family protein sequences from Swiss-Prot, TrEMBL, RefSeq, Genbank, and [CMR]
- ...plus Glimmer3 predictions on RefSeq Enterobacteriaceae genomes
  - Primary and alternative translation start-sites
- Filter for intact mass in range 1 kDa to 20 kDa
- 253,626 distinct protein sequences, 256 species
- Derived from "Rapid Microorganism Identification Database" (RMIDb.org) infrastructure.

ProSightPC 2.0
- Product ion scan decharging
  - Enabled by high-resolution fragment ion measurements
  - THRASH algorithm implementation
- Absolute mass search mode
  - 15 ppm fragment ion match tolerance
  - 250 Da precursor ion match tolerance
- "Single-click" analysis of entire LC-MS/MS datafile.

Other tools
- Explored using standard search engines:
  - Decharge and format as charge +1 spectrum
  - X!Tandem scoring plugin (ProSight, delta M)
  - OMSSA, Mascot, etc.
- MS-Tools:
  - MS-Deconv, MS-TopDown, MS-Align, MS-Align+, MS-Align-E!

CID Protein Fragmentation Spectrum from Y. rohdei
[Figure: the same spectrum as above (precursor m/z 756.70, +8, MW 6044.11), annotated "Match to Y. pestis 50S Ribosomal Protein L32".]

Exact match sequence…
[Figure not recovered.]

Phylogeny: Protein vs DNA
[Figure: two trees, labeled "Protein Sequence" and "16S-rRNA Sequence".]

What about mixtures?

Shared Small Ribosomal Proteins
[Figure not recovered.]

Identified E. herbicola proteins
- 30S Ribosomal Protein S19
  - m/z 686.39, z 15+, E-value 1.96e-16, Δ 0.007
- Six proteins identified with |Δ| < 0.02

Identified E. herbicola proteins
- DNA-binding protein HU-alpha
  - m/z 732.71, z 13+, E-value 7.5e-26, Δ -14.128
- Eight proteins identified with "large" |Δ|

Identified E. herbicola proteins
- DNA-binding protein HU-alpha
  - m/z 732.71, z 13+, E-value 1.91e-58
- Use "Sequence Gazer" to find mass shift
- ΔM mode can "tolerate" one shift for free!

ProSightPC: ΔM mode
- Match a single "blind" mass-shift for free!
- [Figure labels: Protein Sequence; Experimental Precursor; ΔM; b- and y-ions; b'- and y'-ions.]
- Also: PIITA - Tsai et al. 2009

Identified E. herbicola proteins
- DNA-binding protein HU-alpha
  - m/z 732.71, z 13+, E-value 7.5e-26, Δ -14.128
- Extract N- and C-terminus sequence supported by at least 3 b- or y-ions

E. herbicola protein sequences
[Figure not recovered.]

E. herbicola sequences found in other species
[Figure not recovered.]

Phylogenetic placement of E. herbicola
[Figure: Phylogram and Cladogram; phylogeny.fr, "One-Click".]

Genome annotation errors
- UniProt: E. coli Cell division protein ZapB
  - MQFRRGMTMSLEVFEKLEAKVQQAIDTITL…
  - [Figure annotations along the sequence: 3 (204), 17 (166), 0 (2), 22 (371) E. coli strains.]
- Need ±1500 Da precursor tolerance…

Conclusions
- Protein identification for unsequenced organisms.
- Identification and localization for sequence mutations and post-translational modifications.
- Extraction of confidently established sequence suitable for phylogenetic analysis.
- Genome annotation correction.
- New paradigm for phylogenetic analysis?

Acknowledgements
- Dr. Catherine Fenselau
  - Avantika Dhabaria, Joe Cannon*, Colin Wynne*
  - University of Maryland Biochemistry
- Dr. Yan Wang
  - University of Maryland Proteomics Core
- Dr. Art Delcher
  - University of Maryland CBCB
- Funding: NIH/NCI
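
Worked example: mass arithmetic behind the search settings

The slides quote precursors as m/z and charge (for example m/z 686.39, z 15+), a 15 ppm fragment ion tolerance, a 1 kDa to 20 kDa intact-mass filter, and a precursor mass difference Δ. The Python sketch below shows one way that arithmetic could be written. It is an illustration only, not ProSightPC's implementation: the function names are hypothetical, masses are treated as monoisotopic and protonated (the slides do not state which convention was used), and Δ is assumed to mean observed minus theoretical precursor mass.

```python
"""Mass arithmetic for top-down searches (illustrative sketch only)."""

PROTON = 1.007276   # proton mass, Da
WATER = 18.010565   # monoisotopic mass of H2O, Da

# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE = {
    "G": 57.021464, "A": 71.037114, "S": 87.032028, "P": 97.052764,
    "V": 99.068414, "T": 101.047679, "C": 103.009185, "L": 113.084064,
    "I": 113.084064, "N": 114.042927, "D": 115.026943, "Q": 128.058578,
    "K": 128.094963, "E": 129.042593, "M": 131.040485, "H": 137.058912,
    "F": 147.068414, "R": 156.101111, "Y": 163.063329, "W": 186.079313,
}


def neutral_mass(mz, z):
    """Decharge: neutral mass of an ion observed at m/z with z protons."""
    return (mz - PROTON) * z


def sequence_mass(seq):
    """Theoretical monoisotopic mass of an unmodified protein sequence."""
    return sum(RESIDUE[aa] for aa in seq) + WATER


def within_ppm(observed, theoretical, tol_ppm=15.0):
    """True if two masses agree within a relative tolerance in ppm."""
    return abs(observed - theoretical) / theoretical * 1e6 <= tol_ppm


def in_mass_range(mass, low=1000.0, high=20000.0):
    """Intact-mass filter; defaults mirror the 1 kDa to 20 kDa range."""
    return low <= mass <= high


def precursor_delta(mz, z, seq):
    """Assumed definition of delta: observed minus theoretical mass (Da)."""
    return neutral_mass(mz, z) - sequence_mass(seq)


# Usage with the 30S Ribosomal Protein S19 precursor quoted in the slides.
s19_mass = neutral_mass(686.39, 15)
print(round(s19_mass, 2), in_mass_range(s19_mass))
```

A large Δ such as the -14.128 reported for HU-alpha would fall within the 250 Da precursor tolerance but not within a ppm-scale window, which is the situation the ΔM mode slides address.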
