PPT - HUPO Proteomics Standards Initiative

PSI Mass Spectrometry Standards
Working Group Summary
HUPO PSI MS Standards Working Group
PSI MSS & PI WGs Joint Session
2:00 – 5:30
10 minutes each:
• mzIdentML
• mzQuantML (incl. support for SRM)
• mzTab (incl. support for SRM)
• TraML
• mzML & imzML & compression
• metabolomics & ion mobility & SWATH-MS
• PEFF
• MIAPE: MS & MSI & Quant
• Controlled vocabulary
• Cross-linking
• Protein grouping
• PTM localization
• RNA-seq assisted proteomics
TraML
• Discussed ongoing implementation and use efforts
• Identified some outdated information in documentation. TODO: update
• Online validator has outdated CV. TODO: update
• No schema or CV updates needed at this time
• Discussed adding Waters format to jTraML converter
mzML
• Schema remains stable with no needed changes
• Many updates to the CV continue
• imzML continues to be aligned with mzML
• mz5 format was recently published as an mzML clone that uses HDF5 instead of XML
• Discussions are ongoing among metabolomics groups about how mzML could meet the needs of the metabolomics community. Some terms already added. Expect some more proposals soon.
• File size continues to be a significant problem…
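As background for the file-size concern: mzML stores each binary data array as base64 text, optionally zlib-compressed first. The sketch below illustrates that encoding path on a synthetic m/z axis. The function names and the test data are hypothetical, and 64-bit little-endian floats are assumed; it is not taken from any mzML library.

```python
import base64
import struct
import zlib


def encode_binary_array(values, compress=True):
    """Pack values as little-endian 64-bit floats, optionally zlib-compress, then base64-encode."""
    raw = struct.pack("<%dd" % len(values), *values)
    if compress:
        raw = zlib.compress(raw)
    return base64.b64encode(raw).decode("ascii")


def decode_binary_array(text, compressed=True):
    """Reverse of encode_binary_array."""
    raw = base64.b64decode(text)
    if compressed:
        raw = zlib.decompress(raw)
    return list(struct.unpack("<%dd" % (len(raw) // 8), raw))


# Synthetic, hypothetical m/z axis (not real instrument data)
mz_values = [400.0 + 0.005 * i for i in range(20000)]

plain = encode_binary_array(mz_values, compress=False)
packed = encode_binary_array(mz_values, compress=True)
print("base64 only:", len(plain), "characters")
print("zlib + base64:", len(packed), "characters")
```

Base64 alone inflates the payload by about one third, which is part of why uncompressed XML files grow so large; how much zlib recovers depends heavily on the data.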
mzML (or other) compression
• Andy Dowsey & Faviel Gonzalez Galarza presented their work on compression
• mz5 claims 50%+ space savings by using HDF5
• Implementation discussed vs. alternate HDF5 implementations
• But significant space savings were achieved via tricks that could also work in mzML
• Discussed other work and proposed work in mass-spec aware compression
• Possibility: an alternative to the currently supported zlib internal compression could be a mass-spec aware “mszip” compression scheme, provided as a simple, open-source routine available in many languages
• Possibility: develop a variant of zlib that creates files that can be uncompressed normally, but that also allow indexing into the compressed file
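One possible way to get the "uncompressed normally, but indexable" behaviour with stock zlib is to issue a full flush after each spectrum: the output stays a single valid zlib stream, while every flush point becomes a byte-aligned position from which raw inflation can restart. The sketch below only illustrates that idea; it is not the variant discussed in the session, and the function names and sample data are hypothetical.

```python
import zlib

ZLIB_HEADER_LEN = 2  # a zlib stream without a preset dictionary starts with a 2-byte header


def compress_indexed(chunks):
    """Compress non-empty chunks into one zlib stream and record where each chunk's data starts."""
    comp = zlib.compressobj()
    out = bytearray()
    starts = []
    for chunk in chunks:
        # The first chunk's deflate data begins right after the zlib header.
        starts.append(max(len(out), ZLIB_HEADER_LEN))
        out += comp.compress(chunk)
        # Full flush: byte-align the output and reset the compression history,
        # so the next chunk never refers back to earlier data.
        out += comp.flush(zlib.Z_FULL_FLUSH)
    out += comp.flush()  # finish the stream (final block plus checksum)
    return bytes(out), starts


def read_chunk(blob, starts, i):
    """Inflate only chunk i, using the recorded offsets."""
    end = starts[i + 1] if i + 1 < len(starts) else len(blob)
    inflater = zlib.decompressobj(-zlib.MAX_WBITS)  # raw deflate, no header expected
    return inflater.decompress(blob[starts[i]:end])


# Hypothetical stand-ins for serialized spectra
spectra = [("spectrum %d;" % i).encode("ascii") * 50 for i in range(5)]
blob, starts = compress_indexed(spectra)

print(zlib.decompress(blob) == b"".join(spectra))  # ordinary decompression still works
print(read_chunk(blob, starts, 3) == spectra[3])   # random access to one chunk
```

The trade-off is compression ratio: resetting the history at every flush point prevents the compressor from reusing patterns across spectra, so finer-grained indexing costs more space.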
File compression results
• ~50% compression using mz5
[Bar chart: Orbitrap profile-mode spectra. File size in MB (axis 0 to 3500) for File 1 to File 4 in each format: RAW, mzXML, mzML, mzXML+zlib, mzML+zlib, mzML.gz, mz5, .fi, .fi.gz. April 2013.]
Compressing mzML
[Bar chart: SYNAPT data. Axis 0.0 to 1600.0 (units not shown); bars labeled 57.1%, 50.6%, 45.1%, 43.3%, 36.8%. April 2013.]
Other presentations
• Shin presented on the use of RDF & TogoDB
• Mathias presented about qcML
Ion mobility MS & SWATH-MS
• Discussed with Waters their ion mobility data
• Discussed required terms and practices for encoding raw IMS and peak-picked IMS data. Proposal to be publicized on lists for further comment
• No schema change necessary
RNA-seq assisted proteomics
• Good discussion of the state of the field on this workflow
• Discussed using/promoting the PEFF format as a useful mechanism for encoding some of the RNA-seq results for use by proteomics searches
• Discussed possible need to update MIAPE documents to capture information about what is done in this type of workflow
Controlled Vocabulary
• It is time to get the vendors to update their instrument and software terms again. Gerhard will repeat the effort done by Luisa years ago.
• Worked to get rid of purgatory branch in CV
• Discussed what to do with multiple SoftwareName:specificTerm entries that are effectively the same concept. Start by grouping similar terms under a common parent
• Discussed constraining some terms with an is_a relationship to a concept like “value between 0 and 1 inclusive”
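The is_a constraint idea can be sketched as follows: a validator walks a term's is_a ancestors and applies any range check attached to a constraint concept it finds. All term identifiers and the data layout below are hypothetical placeholders for illustration, not entries from the PSI-MS CV or the behaviour of any existing validator.

```python
# Hypothetical mini-vocabulary: term id -> list of parent ids (is_a relationships)
IS_A = {
    "EX:probability_score": ["EX:value_between_0_and_1_inclusive"],
    "EX:localization_probability": ["EX:probability_score"],
    "EX:retention_time": [],
}

# Hypothetical constraint concepts and the check each one implies
CONSTRAINTS = {
    "EX:value_between_0_and_1_inclusive": lambda v: 0.0 <= v <= 1.0,
}


def ancestors(term):
    """Return every concept reachable from term by following is_a links."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in IS_A.get(current, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def value_is_valid(term, value):
    """A value is valid if it passes every constraint inherited through is_a."""
    inherited = ancestors(term)
    return all(check(value) for concept, check in CONSTRAINTS.items() if concept in inherited)


print(value_is_valid("EX:localization_probability", 0.5))
print(value_is_valid("EX:localization_probability", 1.5))
print(value_is_valid("EX:retention_time", 1234.5))
```

Because the constraint is inherited, grouping similar terms under a common parent would let one constraint concept cover all of them.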
PEFF (PSI Extended Fasta Format)
• Interest in finalising the format specification and making it available
• Cannot expect that (most of the) DB providers will produce it in addition to their existing format
• Cannot expect that (most of the) search engines will fully take advantage of its structure (variants, PTMs, …) in the identification jobs
• A converter («source»-to-PEFF) and a reader already exist. Could be a reference implementation
=> A few minor open issues to be resolved before finalising the recommendation
MIAPE-MSI
• Document is updated
• Mapping to mzIdentML is validated
• Collection of up-to-date example instance documents ongoing
• Semantic validator for mzIdentML ongoing
=> Prepare submission of version 1.2 to PSI doc process