GOSlimViewer

GO-based tools for
functional modeling
TAMU GO
Workshop
17 May 2010
Functional Modeling
1. Grouping by function
• GO Slim sets
• GO browser tools
• GOSlimViewer
2. Expression analysis
• DAVID
• EasyGO/agriGO
• Onto-Express
• Funcassociate 2.0
3. Pathway & network analysis
Workshop Part 2 contains some functional
modeling tutorials that use some of these tools.
1. Grouping by function
GO Slim Sets




• slim sets are abbreviated versions of the GO
• they contain broader functional terms
• they are made by different GO Consortium groups for different
purposes (e.g. plant, yeast)
• you need to cite which one you used!
More information about GO terms for each slim set can be
found at EBI QuickGO:
http://www.ebi.ac.uk/QuickGO/
GO Slim and Subset Guide
http://www.geneontology.org/GO.slims.shtml
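The idea of grouping detailed terms under the broader terms of a slim set can be sketched as follows. This is a minimal illustration, not how QuickGO or GOSlimViewer are implemented: the ontology is a toy with made-up IDs (not real GO terms), and the function name `slim_ancestors` is hypothetical.

```python
def slim_ancestors(term, parents, slim_terms):
    """Walk up the parent links from `term` and collect every slim term reached."""
    found, stack, seen = set(), [term], set()
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # a term can be reached by more than one path
        seen.add(current)
        if current in slim_terms:
            found.add(current)
        stack.extend(parents.get(current, []))
    return found

# Toy ontology with made-up IDs (assumption, not real GO terms): child -> parents.
toy_parents = {
    "T:0004": ["T:0002"],
    "T:0002": ["T:0001"],
    "T:0003": ["T:0001"],
    "T:0001": [],
}
toy_slim = {"T:0002", "T:0003"}  # the broader terms chosen for the slim

result = slim_ancestors("T:0004", toy_parents, toy_slim)
print(result)
```

Because the result depends on which slim terms were chosen, the same annotation can group differently under different slim sets, which is one reason to cite the slim you used.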
QuickGO: Create your own
subset/slim of GO terms



http://www.ebi.ac.uk/QuickGO/
GO slims tutorial available
This tutorial will describe GO slims, what they are
used for and how to use QuickGO for:
* creating a custom GO slim
* using a pre-defined GO slim
* obtaining GO annotations to a GO slim
* customising a set of slimmed annotations
* using statistics calculated by QuickGO to
generate graphical representations of the data
AmiGO: GO Slimmer

http://amigo.geneontology.org/cgi-bin/amigo/slimmer?session_id=4878amigo1273279396
Note – AmiGO browser does not include
IEA annotations.
GOSlimViewer input file
Input is a text file containing 3 tab-separated columns:
1. accession
2. GO:ID
3. aspect (P, F or C)
• file provided by GORetriever
• can manually add to it from the GOanna Excel file; this allows you
to include your additional GO annotations in the analysis
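A file in the three-column layout described above can be read and checked with a few lines of Python before it is uploaded. This is only a sketch under stated assumptions: the function name `read_goslimviewer_input` is hypothetical, the example accessions are invented, and the validation rules (exactly three columns, an ID starting with `GO:`, an aspect of P, F or C) are inferred from the slide rather than taken from the tool itself.

```python
import csv
import io
from collections import Counter

def read_goslimviewer_input(handle):
    """Parse tab-separated rows of accession, GO ID and aspect (P, F or C)."""
    records = []
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) != 3:
            continue  # skip blank or malformed lines
        accession, go_id, aspect = (field.strip() for field in row)
        if not go_id.startswith("GO:") or aspect not in {"P", "F", "C"}:
            continue  # skip rows that do not match the expected layout
        records.append((accession, go_id, aspect))
    return records

# Invented example rows (assumption): accessions are placeholders.
example = (
    "ACC0001\tGO:0005634\tC\n"
    "ACC0001\tGO:0003677\tF\n"
    "ACC0002\tGO:0006412\tP\n"
)
records = read_goslimviewer_input(io.StringIO(example))
aspect_counts = Counter(aspect for _, _, aspect in records)
print(aspect_counts)
```

Rows added by hand from the GOanna file can be appended to the same text file as long as they keep the same three columns.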
GOSlimViewer output
2. Expression analysis
Determining which classes of gene products
are over-represented or under-represented.
http://www.geneontology.org/
However…
• many of these tools do not support
agricultural species
• the tools have different computing
requirements
A list of these tools that can be used for
agricultural species is available on the
workshop website at the “Summary of Tools
for gene expression analysis” link.
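To make "over-represented or under-represented" concrete, here is a minimal sketch of one common way to score a GO category: a hypergeometric test. The slides do not say which statistic each tool uses, so treating it as hypergeometric is an assumption, and the function names and example counts below are illustrative only.

```python
from math import comb

def hypergeom_pmf(k, N, K, n):
    """Probability of exactly k annotated genes in a list of n drawn from N genes,
    of which K carry the annotation."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

def representation_p_values(k, N, K, n):
    """Return (p_over, p_under): P(X >= k) and P(X <= k) under random sampling."""
    lowest = max(0, n - (N - K))   # fewest annotated genes the list can contain
    highest = min(n, K)            # most annotated genes the list can contain
    p_over = sum(hypergeom_pmf(i, N, K, n) for i in range(k, highest + 1))
    p_under = sum(hypergeom_pmf(i, N, K, n) for i in range(lowest, k + 1))
    return p_over, p_under

# Illustrative numbers (assumption): 1000 genes on the array, 50 annotated to a
# GO term, 100 differentially expressed genes, 15 of which carry the term.
p_over, p_under = representation_p_values(k=15, N=1000, K=50, n=100)
print(p_over, p_under)
```

With many GO categories tested at once, a multiple-testing correction is normally applied on top of such p-values; how a tool does this is worth checking when you evaluate it.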
Evaluating GO tools
Some criteria for evaluating GO Tools:
1. Does it include my species of interest (or do I have to
“humanize” my list)?
2. What does it require to set up (computer usage/online)?
3. What was the source for the GO (primary or secondary)
and when was it last updated?
4. Does it report the GO evidence codes (and is IEA
included)?
5. Does it report which of my gene products has no GO?
6. Does it report both over- and under-represented GO groups,
and how does it evaluate this?
7. Does it allow me to add my own GO annotations?
8. Does it represent my results in a way that facilitates
discovery?
Some useful expression analysis tools:
Database for Annotation, Visualization and
Integrated Discovery (DAVID)
http://david.abcc.ncifcrf.gov/
agriGO: GO Analysis Toolkit and Database for
Agricultural Community
http://bioinfo.cau.edu.cn/agriGO/
• used to be EasyGO
• chicken, cow, pig, mouse, cereals, dicots
• includes Plant Ontology (PO) analysis
Onto-Express
http://vortex.cs.wayne.edu/projects.htm#Onto-Express
• can provide your own gene association file
Funcassociate 2.0: The Gene Set Functionator
http://llama.med.harvard.edu/funcassociate/
• can provide your own gene association file
DAVID
http://david.abcc.ncifcrf.gov/
• functional grouping – including GO,
pathways, gene-disease association
• ID conversion
• search functionally related genes
• regular updates
• online support & publications
http://bioinformatics.cau.edu.cn/easygo/

May 2010: EasyGO replaced by agriGO
http://bioinfo.cau.edu.cn/agriGO/
• enrichment analysis using either GO or
Plant Ontology (PO)
• 40 species: chicken, cow, pig, mouse,
cereals, poplar, fruits
• GenBank, EMBL, UniProt
• Affymetrix, Operon, Agilent arrays
Onto-Express
http://vortex.cs.wayne.edu/projects.htm
Onto-Express analysis instructions are
available in onto-express.ppt
[Figure: species represented in Onto-Express]
Can upload your own annotations using OE2GO
http://llama.med.harvard.edu/funcassociate/
3. Pathway & network analysis
GO, Pathway, Network Analysis
Many GO analysis tools also include
pathway & network analysis
• Ingenuity Pathways Analysis (IPA) and
Pathway Studio – commercial software
• DAVID – includes multiple functional
categories
• Onto-Tools – includes the Pathway-Express
tool

Pathways & Networks
• A network is a collection of interactions
• Pathways are a subset of networks: networks of interacting
proteins that carry out biological functions such as metabolism
and signal transduction
• All pathways are networks of interactions
• Not all networks are pathways
Pathways Resources
• KEGG http://www.genome.jp/kegg/pathway.html/
• BioCyc http://www.biocyc.org/
• Reactome http://www.reactome.org/
• GenMAPP http://www.genmapp.org/
• BioCarta http://www.biocarta.com/
Pathguide – the pathway resource list
http://www.pathguide.org/
Biological Networks
Networks are often represented as graphs
• Nodes represent proteins or genes that
code for proteins
• Edges represent the functional links
between nodes (e.g. regulation)
• Small changes in a graph’s
topology/architecture can result in the
emergence of novel properties
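The node/edge description above maps directly onto a simple data structure. The sketch below stores a small network as an adjacency dictionary and counts each node's connections; the gene names and link types are invented, and treating every link as undirected is a simplifying assumption (regulation is usually directional).

```python
from collections import defaultdict

def build_graph(edges):
    """Build an undirected graph: node -> {neighbour: type of functional link}."""
    graph = defaultdict(dict)
    for source, target, link in edges:
        graph[source][target] = link
        graph[target][source] = link
    return graph

# Invented example edges (assumption): names and link types are placeholders.
edges = [
    ("geneA", "geneB", "regulation"),
    ("geneB", "geneC", "binding"),
    ("geneC", "geneD", "binding"),
]
graph = build_graph(edges)

# Degree = number of neighbours; a basic measure of a node's place in the topology.
degree = {node: len(neighbours) for node, neighbours in graph.items()}
print(degree)
```

Adding or removing a single edge in `edges` changes the degrees and the paths between nodes, which is a small-scale picture of how topology changes alter a network's properties.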

Types of interactions
• protein (enzyme) – metabolite (ligand): metabolic pathways
• protein – protein: cell signaling pathways, protein complexes
• protein – gene: genetic networks
Network example: STRING Database
http://string.embl.de/
[Figure: STRING network for Sod1, Mus musculus]
Database/URL/FTP
Source: PLoS Computational Biology, March 2007, Volume 3, e42
•DIP http://dip.doe-mbi.ucla.edu
•BIND http://bind.ca
•MPact/MIPS http://mips.gsf.de/services/ppi
•STRING http://string.embl.de
•MINT http://mint.bio.uniroma2.it/mint
•IntAct http://www.ebi.ac.uk/intact
•BioGRID http://www.thebiogrid.org
•HPRD http://www.hprd.org
•ProtCom http://www.ces.clemson.edu/compbio/ProtCom
•3did, Interprets http://gatealoy.pcb.ub.es/3did/
•Pibase, Modbase http://alto.compbio.ucsf.edu/pibase
•CBM ftp://ftp.ncbi.nlm.nih.gov/pub/cbm
•SCOPPI http://www.scoppi.org/
•iPfam http://www.sanger.ac.uk/Software/Pfam/iPfam
•InterDom http://interdom.lit.org.sg
•DIMA http://mips.gsf.de/genre/proj/dima/index.html
•Prolinks http://prolinks.doe-mbi.ucla.edu/cgi-bin/functionator/pronav/
•Predictome http://predictome.bu.edu/
Retrieval of interaction datasets
• Check whether PPI resources such as
Predictome or Prolinks include your
species of interest
• If they do not, find orthologous proteins
in related species that have interactions!
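The ortholog-based fallback can be sketched as a simple lookup: an interaction known in a related species is carried over when both partners have an ortholog in your species. This is an illustration only. The function name `transfer_interactions` and all protein identifiers are hypothetical, and interactions predicted this way are hypotheses that still need supporting evidence.

```python
def transfer_interactions(interactions, orthologs):
    """Map known interactions from a related species onto the species of interest.

    interactions: iterable of (protein_a, protein_b) pairs in the related species
    orthologs: dict mapping a related-species protein to its ortholog in the
               species of interest
    """
    predicted = set()
    for protein_a, protein_b in interactions:
        # Keep the pair only if both partners have a known ortholog.
        if protein_a in orthologs and protein_b in orthologs:
            predicted.add(frozenset((orthologs[protein_a], orthologs[protein_b])))
    return predicted

# Invented identifiers (assumption): "rel_" = related species, "tgt_" = target species.
known = [("rel_P1", "rel_P2"), ("rel_P2", "rel_P3")]
ortholog_map = {"rel_P1": "tgt_P1", "rel_P2": "tgt_P2"}  # rel_P3 has no ortholog

predicted = transfer_interactions(known, ortholog_map)
print(predicted)
```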
4. Hypothesis testing using GO
http://www.genetools.microarray.ntnu.no/common/intro.php

• eGOn v2.0 can test statistical hypotheses
of association between gene reporter lists:
  – Master-Target
  – Mutually Exclusive Target-Target
  – Intersecting Target-Target situation
• statistical hypothesis testing using the GO
• allows addition of extra GO annotation

http://gdm.fmrp.usp.br/cgi-bin/gc/upload/upload.pl
• visualization for mapping SAGE data onto
GO
• graphical visualization of the percentage
of SAGE tags in each GO category, along
with confidence intervals and hypothesis
testing

http://www.agbase.msstate.edu/cgi-bin/tools/GOModeler.pl

• takes a user-generated list of hypothesis/GO
term statements and tests the
quantitative effect of gene expression
values on these statements
Some comments on analysis tools:
• > 68 GO-based analysis tools listed on the
GO Consortium website (not a
comprehensive list!)
• several tools combine GO, pathway and
network functional analysis
• many different ways of visualizing the
results
• expanding the species supported by
analysis tools – check with tool developers
• check for last updates & user support
information

HOMEWORK !
For those in the afternoon session who have
sequence files (e.g. RNA-Seq data, EST
data, etc):
Please prepare a sample (approx 100
sequences) and send it through GOanna
so that you can get your results emailed
back for this afternoon.
- please try to use a species-specific
database to improve run times
- see me if you have questions