Enrichment Network Analysis and
Visualization (ENViz)
Cytoscape plugin for integrative statistical
analysis and visualization of multiple sample
matched data sets
Anya Tsalenko
Agilent Laboratories
December 14, 2012
Why ENViz?
Many high throughput data sets measured in the same set of
samples:
- ‘omics’
- proteomics
- metabolomics
Rich databases with systematic annotations:
- GO
- pathways
- drug targets
How do we analyze this data together to get deeper biological
insights into studied phenotype?
Integrated Biology
[Figure: Integrated Biology workflow diagram. Labels: Primary Analysis; NMR; LC/MS; GC/MS; Microarrays; DNA / RNA Target Enrichment; miRNA; Microfluidics; Proteins; Metabolites; Integrated Biology Informatics; Genomic Workbench; MassHunter Workstation; GeneSpring; Network Biology; Integrated Analysis; Genome Browser; Public Data; Hypothesis, experiment, model]
Example: breast cancer study
“miRNA-mRNA integrated analysis
reveals roles for miRNAs in primary
breast tumors”
Enerly et al, PLoS One 2011
• Cancer dataset from Anne-Lise
Børresen-Dale Lab in Norwegian
Radium Hospital, Oslo
• 100 breast tumor samples with
various characteristics
• Matched miRNA and mRNA data,
Agilent microarrays
Correlation of miRNA and mRNA
expression, miR-150
Sorted expression of miR-150
Genes sorted by correlation to miR-150
across 100 breast cancer samples
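The ranking step on this slide can be sketched as follows. This is a minimal illustration, not the ENViz implementation: the slides do not say which correlation measure is used, so Pearson correlation is an assumption here, and the gene names and values are toy data.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # constant profile: treat as uncorrelated
    return cov / (sx * sy)

def rank_genes_by_correlation(pivot_profile, gene_profiles):
    """Return (gene, r) pairs, most positively correlated first."""
    scored = [(g, pearson(pivot_profile, p)) for g, p in gene_profiles.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy data: one miRNA profile and three gene profiles over four samples.
mir150 = [1.0, 2.0, 3.0, 4.0]
genes = {
    "GENE_A": [2.0, 4.0, 6.0, 8.0],   # rises with the miRNA
    "GENE_B": [4.0, 3.0, 2.0, 1.0],   # falls as the miRNA rises
    "GENE_C": [1.0, 3.0, 2.0, 4.0],   # partially correlated
}
ranked = rank_genes_by_correlation(mir150, genes)
```

The resulting ranked list is the input to the enrichment step: annotation terms are tested for over-representation near the top of the list.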
Enrichment analysis of ranked list of
genes correlated to miR-150
mHG p-value < 1E-147
GO term enrichment analysis at the top of the list of genes ordered by
correlation to miR-150, based on minimum hypergeometric (mHG) statistics
(Eden et al., PLoS Comput Biol 2007)
Analysis and visualization in GOrilla software
http://cbl-gorilla.cs.technion.ac.il/
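The idea behind the minimum hypergeometric score can be sketched in a few lines. The ranked gene list is turned into a 0/1 vector (1 = gene carries the GO term), and the hypergeometric tail probability is minimized over all cutoffs from the top of the list. This sketch computes only the raw mHG score. Because the cutoff is chosen to minimize the score, the score is not itself a p-value; Eden et al. describe an exact p-value computation that is omitted here. Function names are illustrative.

```python
from math import comb

def hypergeom_tail(N, B, n, b):
    """P(X >= b) when drawing n items from N, of which B are annotated."""
    total = comb(N, n)
    hits = sum(comb(B, k) * comb(N - B, n - k) for k in range(b, min(n, B) + 1))
    return hits / total

def mhg_score(labels):
    """Minimum hypergeometric tail over all prefixes of a ranked 0/1 list.

    Returns (score, cutoff), where cutoff is the prefix length that
    attains the minimum (0 if the list contains no annotated gene).
    """
    N, B = len(labels), sum(labels)
    best, best_n, b = 1.0, 0, 0
    for n, label in enumerate(labels, start=1):
        b += label
        if label:  # the tail can only reach a new minimum right after a hit
            p = hypergeom_tail(N, B, n, b)
            if p < best:
                best, best_n = p, n
    return best, best_n

# Toy ranked list: all three annotated genes sit at the very top.
score, cutoff = mhg_score([1, 1, 1, 0, 0, 0, 0, 0, 0, 0])
```

For realistic list sizes the exact integer arithmetic of `math.comb` stays correct but becomes slow; a production implementation would update the tail incrementally.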
Biological validation
Association between miR-19a
and the cell-cycle module was
substantiated as an association
to proliferation.
Further validated using high-throughput transfection assays,
where transfection of miR-19a
into MCF7 cells resulted in
increased proliferation.
[Figure: GO enrichment for genes correlated to miR-19a]
Generic 3 matrices enrichment
analysis
Two different types of measurements in the same set of samples:
• mRNA and miRNA expression (or other non-coding RNAs)
• mRNA expression and quantitative clinical phenotypes
• mRNA expression and metabolite levels
• mRNA expression and copy number
Analysis is based on statistical enrichment of annotation elements in
lists ranked by correlation.
Enrichment can be calculated based on any annotation such as GO, pathway, disease
ontology or other custom primary data categories.
[Figure (Roy Navon): three matrices. Primary Data (genes x samples); Pivot Data (miRNAs/other x samples); Annotation (genes x Pathways/GO/other); Correlations; Enrichments]
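The three-matrix scheme can be sketched as one loop: for each pivot element, rank the primary data elements by correlation with it, then score every annotation term against that ranking. The data layout (plain dictionaries), the function names, and the two stand-in scoring functions below are assumptions made for illustration; in practice a correlation coefficient and an enrichment statistic such as mHG would be plugged in.

```python
def enrichment_table(primary, pivot, annotation, correlate, enrich):
    """Score every (pivot, annotation term) pair.

    primary:    {gene: [value per sample]}
    pivot:      {pivot_id: [value per sample]}, same sample order as primary
    annotation: {term: set of genes}
    correlate:  function(profile_a, profile_b) -> float
    enrich:     function(ranked 0/1 labels) -> score
    """
    table = {}
    for pivot_id, profile in pivot.items():
        # Genes most positively correlated with this pivot come first.
        ranked = sorted(primary, key=lambda g: correlate(profile, primary[g]),
                        reverse=True)
        for term, members in annotation.items():
            labels = [1 if g in members else 0 for g in ranked]
            table[(pivot_id, term)] = enrich(labels)
    return table

# Stand-in scoring functions (placeholders, not the statistics used by ENViz).
def covariance(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / n

def top_half_hits(labels):
    return sum(labels[: len(labels) // 2])

# Toy data: four genes, one pivot, three samples.
primary = {"g1": [1, 2, 3], "g2": [3, 2, 1], "g3": [1, 3, 2], "g4": [2, 3, 1]}
pivot = {"miR-x": [1, 2, 3]}
annotation = {"termA": {"g1", "g3"}, "termB": {"g2"}}
table = enrichment_table(primary, pivot, annotation, covariance, top_half_hits)
```

Sorting in ascending order instead would be one way to test enrichment among negatively correlated genes.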
ENViz: what it is
Enrichment Network Visualization (ENViz): a Cytoscape plugin
for integrative statistical analysis and visualization of multiple sample matched
data sets
Control Panel
Use the main control panel to:
• Input primary data, pivot, and
annotation files
• Run analysis
• Set thresholds that control the size of
the enrichment network to visualize
• Run the visualization
Separate sub-panels can be collapsed or
expanded by clicking on their handles
(collapsible subpanels, Bader Lab, U Toronto)
Interactive Legend:
• graphical overview of the workflow.
• click on labeled boxes for file prompt.
• drag and drop a file reference onto a
labeled box.
Enrichment Network
Enrichment network built from mRNA and miRNA data from Enerly et al, using WikiPathway
annotation.
Results are represented as a bipartite graph: nodes = pathways (yellow -> red) and miRNAs (grey).
An edge represents enrichment of the pathway node in the set of genes whose expression correlates with the
expression pattern of the miRNA node; red = positive correlation, blue = negative correlation.
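Building the bipartite network from a table of enrichment scores can be sketched as a filter followed by two set constructions. The input layout, the threshold parameter, and all names and values below are hypothetical; the sketch assumes that smaller scores mean stronger enrichment.

```python
def build_enrichment_network(scores, max_score):
    """Keep edges at or below the score threshold; derive both node sets.

    scores: {(pivot, term, sign): score}, sign = +1 for positive
            correlation and -1 for negative correlation.
    """
    edges = [
        {"pivot": pivot, "term": term, "score": score,
         "color": "red" if sign > 0 else "blue"}
        for (pivot, term, sign), score in scores.items()
        if score <= max_score
    ]
    pivots = {edge["pivot"] for edge in edges}   # miRNA (grey) nodes
    terms = {edge["term"] for edge in edges}     # pathway nodes
    return pivots, terms, edges

# Toy scores, not taken from the study.
scores = {
    ("miR-x", "pathway_A", +1): 1e-6,
    ("miR-x", "pathway_B", -1): 1e-2,   # filtered out by the threshold
    ("miR-y", "pathway_B", -1): 1e-4,
}
pivots, terms, edges = build_enrichment_network(scores, max_score=1e-3)
```

Tightening `max_score` shrinks the network, which corresponds to the threshold controls in the control panel.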
Enrichment Network Zoom:
• Zoom in to see details around selected nodes and edges
• See zoomed-in network in the context of the whole network on the bottom left
Pathway visualization in
WikiPathways
• Clicking on a selected edge loads and shows the corresponding WikiPathway
• All gene nodes in the mRNA processing pathway that map to primary data elements
are color coded (blue -> red) for correlation score between the primary data element
(mRNA) and the pivot data element for the clicked edge (hsa-miR-92a)
• thick borders and high opacity show genes above
correlation threshold that were included in the gene set
used for enrichment analysis.
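The node styling described above can be sketched as a mapping from a correlation score to visual attributes. The linear blue-to-red scale, the specific border widths and opacities, and the use of absolute correlation for the threshold test are all assumptions; the slides do not specify them.

```python
def gene_node_style(r, threshold):
    """Map a correlation r in [-1, 1] to fill color and emphasis."""
    r = max(-1.0, min(1.0, r))           # clamp out-of-range input
    t = (r + 1.0) / 2.0                  # 0 = blue end, 1 = red end
    fill = (round(255 * t), 0, round(255 * (1 - t)))
    in_gene_set = abs(r) >= threshold    # assumed: threshold on |r|
    return {
        "fill_rgb": fill,
        "border_width": 4 if in_gene_set else 1,
        "opacity": 1.0 if in_gene_set else 0.4,
    }

strong = gene_node_style(1.0, threshold=0.5)
weak = gene_node_style(0.2, threshold=0.5)
negative = gene_node_style(-1.0, threshold=0.5)
```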
Tiling Pathway views
• Double-click on a Pathway Node to load multiple WikiPathways, each one colored by
correlation with the specific pivot datum for an Edge connected to the Node, up to a user-configurable limit
• Network views are tiled in a small multiples view that accentuates contrasts between
correlations for different pivot data.
Gene Ontology enrichment and
visualization
• Enrichment network built from Enerly et al. mRNA and miRNA data, and Gene Ontology
annotation.
• left = bi-partite graph for GO terms (yellow -> red scale) and miRNA (grey)
• edge is enrichment of the GO term in the set of genes most correlated with the miRNA.
• right = GO summary network for GO terms in the left enrichment network. Each GO node is
color-coded by cumulative enrichment score for its set of pivot nodes.
• parent terms are added, to complete the GO hierarchy view.
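Constructing the GO summary network can be sketched in two steps: accumulate a score per GO term over its pivot nodes, then add ancestor terms so that the hierarchy is connected. How ENViz defines the cumulative score is not stated in the slides; summing -log10 of the per-edge scores is an assumption here, and the term identifiers are placeholders rather than real GO IDs.

```python
import math

def go_summary(edges, parents):
    """Build a GO summary network.

    edges:   list of (pivot, go_term, score) with score > 0
    parents: {go_term: [parent terms]}
    Returns (cumulative scores, all nodes, child-parent hierarchy edges).
    """
    cumulative = {}
    for _pivot, term, score in edges:
        cumulative[term] = cumulative.get(term, 0.0) - math.log10(score)

    # Walk upward from every enriched term, adding ancestors not yet seen.
    nodes = set(cumulative)
    stack = list(cumulative)
    while stack:
        term = stack.pop()
        for parent in parents.get(term, []):
            if parent not in nodes:
                nodes.add(parent)
                stack.append(parent)

    hierarchy = sorted((child, parent)
                       for child in nodes
                       for parent in parents.get(child, []))
    return cumulative, nodes, hierarchy

# Toy input with placeholder term names.
edges = [("miR-a", "GO:child", 1e-3), ("miR-b", "GO:child", 1e-2),
         ("miR-a", "GO:other", 1e-4)]
parents = {"GO:child": ["GO:mid"], "GO:mid": ["GO:root"], "GO:other": ["GO:root"]}
cumulative, nodes, hierarchy = go_summary(edges, parents)
```

Ancestor terms added this way carry no cumulative score of their own; they appear only to complete the hierarchy view.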
miR-150 - oriented GO Terms
• Double-click on a pivot node in the enrichment network to show GO terms in the GO Summary
network that have significant enrichment values for selected pivot.
• GO Summary network on the right is color-coded by enrichment of genes correlated to miR-150
Summary: key features of ENViz
• Enrichment of annotation elements among primary
data most correlated to secondary (pivot) data
across a set of samples for each pivot and each
annotation node
• Representation of results as bi-partite graph
(network)
• Pathway and GO enrichment analysis with
customized visualization
• Zoom-in into results in the context of WikiPathways
• Interactive and intuitive data loading and analysis
• Power of network analysis in Cytoscape
Next steps
• Beta-release for collaborators
email: anya_tsalenko@agilent.com
• Working on performance, completeness, robustness for Cytoscape plugin
release
• Extend support for other organisms beyond Homo sapiens, Mus
musculus, Mycobacterium tuberculosis
• Extend the range of database id mappings
• Possible future features: heatmap view, sample grouping, more built-in
annotation types (TFs, disease ontologies)
Acknowledgements
• Agilent Team
– Allan Kuchinsky, Roy Navon, Zohar Yakhini, Michael Creech
• Technion
– Israel Steinfeld
• Collaborators
– Norwegian Radium Hospital, Oslo: Espen Enerly, Kristine
Kleivi, Vessela N. Kristensen, Anne-Lise Børresen-Dale
– UCSF/Gladstone: Alex Pico, Nathan Salomonis, Kristina
Hanspers, Bruce Conklin, Scooter Morris
– Maastricht University: Thomas Kelder, Martijn van Iersel, Chris
Evelo
– Cytoscape core developers and PIs: Trey Ideker, Chris Sander,
Gary Bader, Benno Schwikowski, Mike Smoot, Peng Liang, Kei Ono,
Leroy Hood, Ben Gross, Ethan Cerami