(CompOmics).

CoLIMS progress
Computational Omics and Systems Biology
(CompOmics) Group
Niels Hulstaert
niels.hulstaert@ugent.be
outline
• predecessor: ms-lims
• database schema
• architecture
• status
• in the pipeline
• bumpy road
• demo
ms-lims lifetime growth
[Figure: millions of spectra (axis 0 to 140) and identification ratio (axis 0% to 100%) over time, 08-2003 to 02-2013]
ms-lims usage
[Diagram: instruments (Agilent HPLC, Micromass Q-TOF I, Bruker Ultraflex, Bruker Esquire HCT, Applied 4X00) perform MS or MS/MS analysis and deliver spectra in formats A to D; MySQL DB; identification with Matrix Science Mascot; results interpretation by consumers 1 to 3]
time for an update
• Mascot-centric
• no MaxQuant support
• database schema limitations
• hard-to-maintain legacy code
    memory issues
    cyclic dependencies
• minimalist GUI
ms-lims-X -> CoLIMS
• take the good things (and start from scratch)
    rich client
    straightforward installation
    lightweight
• PeptideShaker support
• MaxQuant support
• ProteomeXchange/PRIDE support
• more mature database schema
    unique protein sequences
    unique modifications
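A minimal sketch of what "unique protein sequences" could mean at schema level: each sequence is stored once, and every accession points to it. The table names, column names and the `get_or_create_protein` helper are hypothetical, not the CoLIMS schema; SQLite is used only to keep the example self-contained.

```python
import sqlite3


def create_schema(conn):
    # Hypothetical tables: one row per distinct sequence,
    # any number of accessions pointing to that row.
    conn.executescript("""
        CREATE TABLE protein (
            id INTEGER PRIMARY KEY,
            sequence TEXT NOT NULL UNIQUE
        );
        CREATE TABLE protein_accession (
            id INTEGER PRIMARY KEY,
            protein_id INTEGER NOT NULL REFERENCES protein(id),
            accession TEXT NOT NULL,
            UNIQUE (protein_id, accession)
        );
    """)


def get_or_create_protein(conn, sequence, accession):
    """Return the id of the row holding `sequence`, inserting it if missing."""
    # The UNIQUE constraint makes the insert a no-op for a known sequence.
    conn.execute("INSERT OR IGNORE INTO protein (sequence) VALUES (?)", (sequence,))
    (protein_id,) = conn.execute(
        "SELECT id FROM protein WHERE sequence = ?", (sequence,)
    ).fetchone()
    conn.execute(
        "INSERT OR IGNORE INTO protein_accession (protein_id, accession) VALUES (?, ?)",
        (protein_id, accession),
    )
    return protein_id
```

Two accessions that share a sequence then resolve to the same protein row, so identification results from different searches can reference a single record.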
database schema
metadata
search input
identification results
quantification
user management
architecture
[Diagram: in-house client (colims-client, colims-distributed, colims-core, colims-repository, colims-model); storage task server (ActiveMQ); storage engine (colims-distributed, colims-core, colims-repository, colims-model); database server (colims DB)]
• JMS and JMX Java technologies
• ActiveMQ is widely used and has proven to be a stable component in distributed architectures
• loose coupling of clients and storage engine
• sequential storing: unique protein and modification tables
• transactional and retry mechanism
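The bullets above can be sketched as follows: clients only put tasks on a queue, and a single consumer stores them one at a time, each in its own transaction, retrying on failure. This is an illustration under stated assumptions, not the CoLIMS implementation: an in-process `queue.Queue` stands in for the ActiveMQ broker, SQLite for the database server, and `MAX_ATTEMPTS`, `open_db`, `store_task` and `run_storage_engine` are hypothetical names.

```python
import queue
import sqlite3

MAX_ATTEMPTS = 3  # hypothetical retry limit


def open_db():
    # In-memory stand-in for the database server.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE protein (sequence TEXT NOT NULL UNIQUE)")
    return conn


def store_task(conn, task):
    """Store one task atomically: commit on success, roll back on error."""
    with conn:  # sqlite3 connection context manager commits or rolls back
        for sequence in task["sequences"]:
            conn.execute(
                "INSERT OR IGNORE INTO protein (sequence) VALUES (?)", (sequence,)
            )


def run_storage_engine(conn, tasks, store=store_task):
    """Consume queued tasks one at a time; return the tasks that never succeeded."""
    failed = []
    while True:
        try:
            task = tasks.get_nowait()
        except queue.Empty:
            break
        for attempt in range(1, MAX_ATTEMPTS + 1):
            try:
                store(conn, task)
                break
            except sqlite3.Error:
                if attempt == MAX_ATTEMPTS:
                    failed.append(task)
    return failed
```

Because only one consumer writes at a time, two tasks that contain the same sequence cannot race to insert it, which is one way to keep the unique tables consistent.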
quantification status
• in progress: MaxQuant import functionality
    need for validator
• in the pipeline
    Mascot quant support
    first: mzTab support
    later: mzQuantML support
supported search engines
• MaxQuant
• In the pipeline: native Mascot support
• PeptideShaker: MS-GF+, OMSSA, X!Tandem, MS Amanda and Mascot
ProteomeXchange export
• PRIDE XML
• mzIdentML
• PeptideShaker imported data in ProteomeXchange/PRIDE
93 submissions, comprising 11 408 817 spectra
50 submissions are public, containing 3 774 937 spectra
122 675 spectra on average per PeptideShaker project
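The figures above can be checked with a few lines of arithmetic. The sketch assumes each submission corresponds to one PeptideShaker project; the variable names are illustrative.

```python
total_submissions = 93
total_spectra = 11_408_817
public_submissions = 50
public_spectra = 3_774_937

# Average number of spectra per submission (assumed: one project per submission).
average_spectra = total_spectra / total_submissions

# Share of submissions and of spectra that are public.
public_submission_share = public_submissions / total_submissions
public_spectra_share = public_spectra / total_spectra

print(f"average spectra per project: {average_spectra:,.0f}")
print(f"public submissions: {public_submission_share:.1%}")
print(f"public spectra: {public_spectra_share:.1%}")
```

The average reproduces the 122 675 spectra per project quoted on the slide; just over half of the submissions are public, but they hold about a third of the spectra.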
in the pipeline
• PeptideShaker-like data viewer
• data query tool
• native ProteomeXchange/PRIDE export (mzML, mzIdentML, mzTab)
• built-in distributed search architecture and identification interpretation (SearchGUI/PeptideShaker)
• improve client – storage task server interaction
• replace ms-lims and import existing data
• web interface for third-party access
design bumps
• ActiveMQ instead of in-house solution
• various database schema changes
• auditing issues
• unique protein accession -> unique sequence
adapting to PeptideShaker
• fast release cycles
• PSI-MOD -> UNIMOD modifications (multiple search engines)
• protein inference strategy (protein tree)
adapting to MaxQuant
• no access to used FASTA
• spectral matching across searches
• black box
DEMO
http://colims.googlecode.com