Oncomine Database
www.oncomine.org
Lauren Smalls-Mantey
Georgia Institute of Technology
June 19, 2006
Note: This presentation contains animation

What is Oncomine?
• A unifying bioinformatic resource of analyzed cancer transcriptome data.
• A resource that lets clinicians view other groups' research data to aid in the discovery of gene interactions.

Oncomine Database
Data Pipeline
• Includes studies from published literature and other databases.
• Screens microarray data so that only studies meeting the Oncomine standards (clinically important) are included.
Annotation Data Warehouse
• Links 14 databases.
• Includes automatic updates from the database sites.

Log-In Screen
This is the first screen seen after logging in to www.oncomine.org.
Search by:
• Gene Name
• Gene Symbol
• UniGene ID
• Entrez Gene ID
• Affy Probeset #
• Image Clone ID
• Accession #
Search by:
• Keyword
• Author_Tissue
• Tissue Type
• Cancer Type
• Clinical/Pathological Parameters

Search for Breast Cancer
• Author_type of cancer
• Description of cancer
• Class: # of patients in study
• Measured: # of reporters analyzed
• Up: # of overexpressed genes
• Down: # of underexpressed genes
• Diff: # of differentially expressed genes
1. Clicking on the author gives a description of the study analyzed and a copy of the paper.
2. A thesaurus is attached to many of the words, which creates a standard language for all users.

Oncomine Database
Automated Data Analysis
• Performs logical differential expression analyses, cluster analyses, and concept analyses.
• Permits testing of other hypotheses not explored in the paper.
Sample Facts Standardization
• Standardizes and curates sample information.

Simple Analysis of Data
To perform a simple analysis on the data in the published work, click on the highlighted regions:
• Heat Map
• Gene List
• Box plot of gene
• Analysis Modules

Advanced Analysis
Clicking on the highlighted icon computes an advanced analysis (AA) of the data. This encompasses all of the analysis modules.
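
To make the Measured / Up / Down / Diff counts on the results page more concrete, here is a minimal sketch of a two-class differential expression summary. It is an illustration only: the slides do not state which statistic Oncomine uses, so Welch's t statistic, the cutoff `t_cut`, the gene names, the expression values, and the assumption that Diff = Up + Down are all hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two independent samples (unequal variances)."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


def summarize(expr, classes, case, control, t_cut=2.0):
    """Count reporters over- (Up) or under-expressed (Down) in case vs control."""
    up, down = [], []
    for gene, values in expr.items():
        a = [v for v, c in zip(values, classes) if c == case]
        b = [v for v, c in zip(values, classes) if c == control]
        t = welch_t(a, b)
        if t >= t_cut:
            up.append(gene)
        elif t <= -t_cut:
            down.append(gene)
    counts = {
        "Measured": len(expr),          # reporters analyzed
        "Up": len(up),                  # overexpressed in the case class
        "Down": len(down),              # underexpressed in the case class
        "Diff": len(up) + len(down),    # assumed here to be Up + Down
    }
    return counts, up, down


# Hypothetical toy data: 4 tumor and 4 normal samples, 3 reporters.
classes = ["tumor"] * 4 + ["normal"] * 4
expr = {
    "GENE_A": [8.1, 7.9, 8.4, 8.0, 5.0, 5.2, 4.9, 5.1],
    "GENE_B": [3.0, 3.2, 2.9, 3.1, 6.0, 6.1, 5.8, 6.2],
    "GENE_C": [5.0, 5.3, 4.8, 5.1, 5.1, 4.9, 5.2, 5.0],
}
counts, up, down = summarize(expr, classes, case="tumor", control="normal")
print(counts, up, down)
```

A real analysis would convert the statistic to a p-value and correct for multiple testing; the fixed cutoff here only keeps the sketch short.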
Advanced Analysis
AA: Differential Expression
• Clicking on any gene gives the Gene Annotation.
• Box plots of each gene are available by clicking on the icon.

Enrichment
This module provides links to the databases in the data warehouse. Below is a list of some of the advanced analyses it provides, each with its source database:
• GO Cellular Component - Gene Ontology Database (gene product descriptions)
• KEGG Pathway - KEGG (gene-gene interaction)
• Literature-defined Concepts - PubMed
• InterPro - InterPro (protein database)
• Chromosome Subregion - NCBI Map Viewer (organism genome search)
• Oncomine Gene Expression Signatures - Oncomine (where the gene is found in other publications)
• Chromosome Arm - NCBI Map Viewer (organism genome search)
• Conserved Promoter Motif - PubMed (publications where the motif can be found)
• PicTar predicted miRNA target genes - PicTar (miRNA constructs)

Enrichment Module

Interaction Networks
Clicking on this button provides a gene interaction pathway that includes the genes listed.

Co-Expression
Co-expression plots of the two genes are also available.

Meta Analysis
Meta analysis between different experiments allows results to be validated and their accuracy assessed. It compares only the statistical measurements, because preparation methods differ between experiments. It also attempts to eliminate artifacts and cross-hybridization.
1. At the first results page, click the experiments to be compared.

Meta Analysis Cont…
2. Click Advanced Analysis at the bottom of the screen. The screen that appears is similar to the results page.
3. At the bottom, select the expression type and any filters.

Meta Analysis Cont…
4. Click on Metamap or Gene List to display the module. Box plots are also available.

Using Oncomine
Advantages
• Web-based program
• Meta analysis between different experiments
• Easy-to-use interface across the spectrum of researchers
• Bridges the gap between clinicians (usable when tumor samples are scarce, because the database holds a wide variety of samples).
• High-level analysis
• All analyzed data standardized
• Co-Expression Analysis: identifies genes that are similarly expressed across several tissue samples within various experiments
Disadvantages
• Only includes cancer-specific data
• Raw data must be obtained through external sites, so it is not in a standardized format.
• Requires login access (free for academic users, not free for non-academic users)
• The links to the databases do not lead directly to the page for the specific gene; users must search the database themselves.

Help is no problem
At the upper right of each module is a question mark icon. Clicking it opens a help page for that module.

Data Pipeline
Databases:
• Gene Expression Omnibus (GEO)
• Stanford Microarray Database (SMD)
• ArrayExpress

Gene Annotation Data Warehouse:
From: www.oncomine.org
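
The co-expression analysis listed among the advantages (genes that are similarly expressed across tissue samples) can be illustrated with a small sketch. It assumes Pearson correlation as the similarity measure, which the slides do not specify. The gene names, values, and the `min_r` threshold are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def coexpressed(expr, query, min_r=0.8):
    """Return (gene, r) pairs whose profile correlates with the query gene,
    keeping only r >= min_r and sorting from most to least correlated."""
    ref = expr[query]
    hits = [(g, pearson(ref, v)) for g, v in expr.items() if g != query]
    return sorted((h for h in hits if h[1] >= min_r), key=lambda h: -h[1])


# Hypothetical expression profiles across five samples.
expr = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "GENE_B": [2.0, 4.0, 6.0, 8.0, 10.5],  # rises with GENE_A
    "GENE_C": [5.0, 4.0, 3.0, 2.0, 1.0],   # falls as GENE_A rises
    "GENE_D": [3.0, 1.0, 4.0, 1.0, 5.0],   # weakly related
}
hits = coexpressed(expr, "GENE_A")
print(hits)
```

Only positively correlated genes are kept here; an anti-correlated gene such as GENE_C would need a separate threshold on negative r.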