PSI Mass Spectrometry Standards Working Group Summary
HUPO PSI MS Standards Working Group

PSI MSS & PI WGs Joint Session 2:00 – 5:30
10 minutes each:
• mzIdentML
• mzQuantML (incl. support for SRM)
• mzTab (incl. support for SRM)
• TraML
• mzML & imzML & compression
• metabolomics & ion mobility & SWATH-MS
• PEFF
• MIAPE: MS & MSI & Quant
• Controlled vocabulary
• Cross-linking
• Protein grouping
• PTM localization
• RNA-seq assisted proteomics

TraML
• Discussed ongoing implementation and use efforts
• Identified some outdated information in documentation. TODO: update
• Online validator has outdated CV. TODO: update
• No schema or CV updates needed at this time
• Discussed adding Waters format to jTraML converter

mzML
• Schema remains stable with no needed changes
• Many updates to the CV continue
• imzML continues to be aligned with mzML
• mz5 format was recently published as an mzML clone using HDF5 instead of XML
• Discussions ongoing by metabolomics groups about how mzML would meet the needs of the metabolomics community. Some terms already added. Expect some more proposals soon.
• File size continues to be a significant problem…

mzML (or other) compression
• Andy Dowsey & Faviel Gonzalez Galarza presented their work on compression
• mz5 claims 50%+ space savings by using HDF5
• Implementation discussed vs. alternate HDF5 implementations
• But significant space savings were achieved via tricks that could work in mzML
• Discussed other work and proposed work in mass-spec aware compression
• Possibility: alternative to zlib internal compression (currently supported) could be a mass-spec aware “mszip” compression scheme.
  Provided as a simple, open-source routine available in many languages
• Possibility: Develop a variant of zlib that creates files that can be uncompressed normally, but allow indexing into the compressed file

File compression results
~50% compression using mz5
Orbitrap profile-mode spectra
[Chart: file size in MB for File 1 to File 4, comparing RAW, mzXML, mzML, mzXML+zlib, mzML+zlib, mzML.gz, mz5, .fi, .fi.gz. April 2013]

Compressing mzML
SYNAPT
[Chart: data labels 57.1%, 50.6%, 45.1%, 43.3%, 36.8%. April 2013]

Other presentations
• Shin presented on the use of RDF & TogoDB
• Mathias presented about qcML

Ion mobility MS & SWATH-MS
• Discussed with Waters their ion mobility data
• Discussed required terms and practices for encoding raw IMS and peak-picked IMS data. Proposal to be publicized on lists for further comment
• No schema change necessary

RNA-seq assisted proteomics
• Good discussion of the state of the field on this workflow
• Discussed using/promoting the PEFF format as a useful mechanism for encoding some of the RNA-seq results for use by proteomics searches
• Discussed possible need to update MIAPE documents to capture information about what is done in this type of a workflow

Controlled Vocabulary
• It is time to get the vendors to update their instrument and software terms again. Gerhard will repeat the effort done by Luisa years ago.
• Worked to get rid of purgatory branch in CV
• Discussed what to do with multiple SoftwareName:specificTerm entries that are effectively the same concept.
  Start by grouping similar terms under a common parent
• Discussed constraining some terms with an is_a relationship to a concept like “value between 0 and 1 inclusive”

PEFF (PSI Extended Fasta Format)
• Interest in finalising the format specification and making it available
• Cannot expect that (most of the) DB providers will produce it in addition to their existing format
• Cannot expect that (most of the) search engines will fully take advantage of its structure (variants, PTMs, …) in the identification jobs
• A converter («source»-to-PEFF) and a reader already exist. Could be a reference implementation
=> Resolve a few minor open issues and finalise the recommendation

MIAPE-MSI
• Document is updated
• Mapping to mzIdentML is validated
• Collection of up-to-date example instance documents ongoing
• Semantic validator for mzIdentML ongoing
=> Prepare submission of version 1.2 to PSI doc process
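
Note on the compression discussion
The mzML compression slide refers to the zlib internal compression that mzML currently supports and to possible mass-spec aware alternatives. The sketch below illustrates the first part: a numeric array stored as little-endian 64-bit floats, zlib-compressed, then base64-encoded, as mzML binary data arrays are. The `delta_transform` step is purely an assumption added for illustration of what a "mass-spec aware" pre-processing step might look like; it is not the proposed "mszip" scheme, and all function names here are hypothetical.

```python
import base64
import zlib

import numpy as np


def encode_binary_array(values, compress=True):
    """Encode values as little-endian float64, optionally zlib, then base64."""
    raw = np.asarray(values, dtype="<f8").tobytes()
    if compress:
        raw = zlib.compress(raw)
    return base64.b64encode(raw).decode("ascii")


def decode_binary_array(text, compressed=True):
    """Reverse encode_binary_array and return a float64 array."""
    raw = base64.b64decode(text)
    if compressed:
        raw = zlib.decompress(raw)
    return np.frombuffer(raw, dtype="<f8")


def delta_transform(mz):
    """Hypothetical lossless pre-compression step (an assumption, not "mszip").

    Reinterprets each float64 as its int64 bit pattern and stores the first
    pattern followed by successive differences. Expects a non-empty array.
    """
    bits = np.asarray(mz, dtype="<f8").view("<i8")
    return np.concatenate(([bits[0]], np.diff(bits)))


def inverse_delta_transform(deltas):
    """Undo delta_transform exactly by cumulative integer summation."""
    return np.cumsum(np.asarray(deltas, dtype="<i8")).view("<f8")


if __name__ == "__main__":
    # Synthetic, evenly spaced m/z axis standing in for a profile-mode spectrum.
    mz = np.linspace(400.0, 2000.0, 10000)
    plain = zlib.compress(mz.astype("<f8").tobytes())
    delta = zlib.compress(delta_transform(mz).tobytes())
    print("raw bytes:          ", mz.nbytes)
    print("zlib only:          ", len(plain))
    print("delta + zlib:       ", len(delta))
```

Running the script prints the byte counts for the synthetic array so the two approaches can be compared; results on real spectra will depend on the instrument and acquisition mode.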