Beating Mass Spectrometry Output Data into Shape: Data Processing, Pathway Analysis & Visualisation
Dr. Mia Jüllig

How can I help?

Metabolomics and proteomics projects generate copious amounts of data in formats that are often difficult to manage for the individual researcher. This talk presents the services available for:
1) Processing and converting raw outputs into organised, ready-to-publish tables listing proteins or metabolites with basic summary statistics
2) Pathway analysis with a range of software, as well as functional grouping
3) Visualisation of the results to maximise the impact of the key findings for publication

Data Analysis • Pathway Analysis • Visualisation

GC or LC data (ASAS Mass Spec Centre)
Alternative means of initial analysis (e.g. Statistics Consulting Centre)
Other data…
Client-provided processed data…
Client-provided idea

“Metabolomics and proteomics projects generate copious amounts of data in formats that are often difficult to manage for the individual researcher.”
Oh really?

Searched LC-MS/MS output (proteomics/iTRAQ projects):
• Relative abundance of all detected proteins
• Functional grouping of proteins: biological context
• Processing of quantitative data
• 95% confidence intervals (Prism)
• Statistical likelihood of difference between treatment groups (Prism)
(50,000-1,000,000 rows…)
Publication-quality table

There is nothing impossible about this, but getting the hang of it takes time…

Software: Excel • Prism
J. Xu et al., Modelling atherosclerosis by proteomics: Molecular changes in the ascending aortas of cholesterol-fed rabbits, Atherosclerosis (2015), http://dx.doi.org/10.1016/j.atherosclerosis.2015.07.001

Validation of proteomic findings the old-fashioned way:
Petrak et al., Proteome Science 2011, 9:69

Validation of proteomic findings through MRM:
(MS Centre acquires the data, I process and present.)
(Martin acquires the data, I process and present.)
(J. Xu et al., Atherosclerosis (2015))

Quantification and preliminary IDs (MS) of metabolomics data: MZmine
Final IDs (MS/MS) using NIST MS Search and suitable libraries, e.g. LipidBlast (for lipids)
Publication-quality table

The problem with tables…

Highlighting of relevant pathways and functions using:
• MetaboAnalyst (proteins and/or metabolites)
• GO enrichment analysis (proteins)
• STRING analysis (proteins)
• Manual methods

MetaboAnalyst (http://www.metaboanalyst.ca/faces/ModuleView.xhtml)
Panels: metabolites; proteins
Limited to proteins and metabolites present in metabolic pathways…

STRING (http://string-db.org/newstring_cgi/)
Limited to proteins, but includes most proteins, not just those in metabolic pathways
(J. Xu et al., Atherosclerosis (2015))

GO enrichment analysis (http://geneontology.org/)
• Returns GO terms for specific gene products
• Returns enriched GO terms for lists of gene products:
  - Biological process
  - Molecular function
  - Cellular component
Limited to proteins

GO enrichment (HP)
Group labels on the slide: Oxidative stress; Signalling; DNA & protein
Enriched terms, in slide order:
• regulation of complement activation
• regulation of humoral immune response
• regulation of acute inflammatory response
• blood coagulation, fibrin clot formation
• regulation of response to oxidative stress
• enzyme linked receptor protein signalling pathway
• positive regulation of signalling
• negative regulation of extrinsic apoptotic signalling pathway
• regulation of protein activation cascade
• positive regulation of cell communication
• regulation of response to stress
• regulation of response to external stimulus
• response to lipid
• response to other organism
• response to biotic stimulus
• response to external biotic stimulus
• response to alcohol
• response to selenium ion
• DNA conformation change
• DNA packaging
• chromatin organization
• protein-DNA complex assembly
• 'de novo' protein folding
Limited to proteins, and no real graphical output either

GO Enrichment Analysis, MetaboAnalyst, STRING Analysis:
• Publication-quality tables
• Comprehensive report
• Any automated graphical output

Most automated outputs are appallingly ugly…
(Your reviewer)

I can help turn that frown upside down.
(Your reviewer)

Visualisation:
• Highlighting of pathways
• Outlining workflows
• Experimental setup
• Refinement of automated outputs
• Schematics
• Illustration of novel hypotheses
(J. Xu et al., Atherosclerosis (2015))

Typical proteomics/metabolomics project (raw output to publication-quality table): 4-5 days, $2,000-$2,600
Typical proteomics/metabolomics project (3 types of analysis): 2-3 days, $1,000-$1,500
Simple image: 0.5-1 day, $260-$520
More complex images: 2-3 days…
$65 per hour, to earn my keep

Whatever you need of the above, I am keen as a bean…
Email is a good place to start: m.jullig@auckland.ac.nz
You can also submit a request in iLab.
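
A footnote for the curious on what "enriched GO terms" means in practice. The sketch below is not taken from the talk. It assumes one common formulation of over-representation analysis, a one-sided hypergeometric test, and the function name, parameter names and all numbers are hypothetical. The tools named in the talk may use a different test and will normally also correct for multiple testing.

```python
from math import comb


def enrichment_p_value(background: int, annotated: int, study: int, hits: int) -> float:
    """One-sided hypergeometric p-value, P(X >= hits).

    background: number of proteins in the reference set (assumed known)
    annotated:  number of background proteins carrying the GO term
    study:      number of proteins in the list of interest
    hits:       number of study proteins carrying the GO term
    """
    if not (0 <= annotated <= background and 0 <= study <= background):
        raise ValueError("annotated and study must lie between 0 and background")
    max_hits = min(annotated, study)
    if hits > max_hits:
        return 0.0  # more hits than is possible for these set sizes
    total = comb(background, study)  # all ways to draw the study list
    # Sum the probability of every outcome at least as extreme as the observed one.
    tail = sum(
        comb(annotated, k) * comb(background - annotated, study - k)
        for k in range(max(hits, 0), max_hits + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical numbers: 2000 background proteins, 40 of them annotated
    # with the term, 100 proteins in the study list, 8 of those annotated.
    p = enrichment_p_value(background=2000, annotated=40, study=100, hits=8)
    print(f"p = {p:.3g}")
```

A small p-value says the term turns up in the list more often than chance alone would suggest; it says nothing about how the result should look on a figure, which is where the visualisation service comes in.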