Using Gene Ontology (GO) to Characterise Key Players in Parkinson’s Disease

Rebecca E. Foulger (1), Paul Denny (1), Claire O’Donovan (3), John Hardy (2) and Ruth C. Lovering (1)
1. Centre for Cardiovascular Genetics, Institute of Cardiovascular Science, University College London, Rayne Building, 5 University Street, London, WC1E 6JF
2. Department of Molecular Neuroscience, Institute of Neurology, University College London, Queen Square, London, WC1N 3BG
3. European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD
Introduction to GO
•  The Gene Ontology (GO) project is a collaborative effort to provide consistent descriptions of
gene products across all kingdoms of life, and is a key resource for researchers wishing to
understand the biological role of a gene product.
•  GO contains three structured controlled vocabularies (ontologies) that describe gene products in
terms of their associated biological processes, cellular locations and molecular functions, in
a species-independent manner.
•  Originally developed in 1998, the ontologies have grown to include nearly 40,000 terms
describing a wide range of concepts to differing levels of specificity.
Anatomy of a GO annotation
•  GO annotation is the practice of capturing information about a gene product using terms from the
Gene Ontology.
•  GO terms are assigned to proteins based on different types of evidence:
   IDA = inferred from direct assay
   IMP = inferred from mutant phenotype
   TAS = traceable author statement
   IC = inferred by curator
•  Each annotation is attached to a reference for traceability.
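The parts of an annotation described above (gene product, GO term, evidence code, reference) can be sketched as a small record type. This is only an illustrative sketch, not the GO Consortium's file format: the class and field names are assumptions, and the example record uses a placeholder reference and an arbitrarily chosen evidence code rather than a real curated annotation.

```python
from dataclasses import dataclass
from enum import Enum


class EvidenceCode(Enum):
    """The four evidence codes listed on this poster."""
    IDA = "inferred from direct assay"
    IMP = "inferred from mutant phenotype"
    TAS = "traceable author statement"
    IC = "inferred by curator"


@dataclass(frozen=True)
class GOAnnotation:
    """One annotation: a gene product linked to a GO term, with evidence."""
    gene_symbol: str
    go_id: str
    go_name: str
    evidence: EvidenceCode
    reference: str  # e.g. a PubMed identifier

    def __post_init__(self):
        # Each annotation must be attached to a reference for traceability.
        if not self.reference:
            raise ValueError("an annotation needs a reference")


# Hypothetical record: the evidence code and reference are placeholders,
# not a real curated annotation.
example = GOAnnotation(
    gene_symbol="PARK7",
    go_id="GO:0043524",
    go_name="negative regulation of neuron apoptotic process",
    evidence=EvidenceCode.IDA,
    reference="PMID:0000000",
)
print(example.gene_symbol, example.go_id, example.evidence.name)
```

Making the reference mandatory mirrors the rule that every annotation is traceable to its source.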
Figure 1: Placement of ‘negative regulation of neuron apoptotic process’ (GO:0043524) in the Gene
Ontology. Blue arrows represent is_a relationships between GO terms. Purple arrows represent
regulates and negatively_regulates relationships between GO terms. Marked terms are GO terms
assigned by this project. Image taken from OBO-Edit, version 2.3.1.
Each GO term has a unique ID, name and definition. A GO term may also contain one or more
synonyms to aid searching.
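The term structure and is_a relationships described for Figure 1 can be sketched as a small graph walk. The sketch below is an assumption-laden toy: the `GOTerm` class and `is_a_ancestors` function are hypothetical names, the parent links are a hand-written, simplified subset around GO:0043524 that should be checked against the live ontology, and definitions and synonyms are left empty rather than invented.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class GOTerm:
    go_id: str
    name: str
    definition: str = ""  # omitted in this toy subset
    synonyms: List[str] = field(default_factory=list)
    is_a: List[str] = field(default_factory=list)  # parent term IDs
    negatively_regulates: List[str] = field(default_factory=list)


# Simplified, hand-written subset for illustration only.
TERMS = [
    GOTerm("GO:0043524", "negative regulation of neuron apoptotic process",
           is_a=["GO:0043523", "GO:0043066"],
           negatively_regulates=["GO:0051402"]),
    GOTerm("GO:0043523", "regulation of neuron apoptotic process",
           is_a=["GO:0042981"]),
    GOTerm("GO:0043066", "negative regulation of apoptotic process",
           is_a=["GO:0042981"]),
    GOTerm("GO:0042981", "regulation of apoptotic process"),
    GOTerm("GO:0051402", "neuron apoptotic process"),
]
ONTOLOGY: Dict[str, GOTerm] = {term.go_id: term for term in TERMS}


def is_a_ancestors(term_id: str, ontology: Dict[str, GOTerm]) -> Set[str]:
    """Collect every term reachable from term_id by following is_a links."""
    seen: Set[str] = set()
    stack = list(ontology[term_id].is_a)
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(ontology[current].is_a)
    return seen


for ancestor_id in sorted(is_a_ancestors("GO:0043524", ONTOLOGY)):
    print(ancestor_id, ONTOLOGY[ancestor_id].name)
```

Because a term can have more than one parent, the walk tracks visited IDs so that shared ancestors are reported once.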
Figure 2: Anatomy of an annotation: a subset of biological process GO annotations for human PARK7
(PARKIN-7, DJ-1). Displayed in the EBI GO browser (www.ebi.ac.uk/QuickGO).
Gene Ontology and Parkinson’s Disease
•  The discovery of genes linked to familial forms of Parkinson’s Disease, including SNCA
(α-synuclein), PARK2 (parkin), LRRK2 (PARK8), PARK7 (DJ-1) and PINK1 (PARK6), has yielded
important insights into the pathogenesis of Parkinson’s Disease. Further elucidating the roles of
these genes will help identify the cellular mechanisms and machinery underlying disease risk,
onset and progression.
Project aims, priorities and progress
•  Started in January 2014, the Parkinson’s UK GO annotation project is a collaboration
between University College London (UCL) and the European Bioinformatics Institute (EMBL-EBI),
and is funded by Parkinson’s UK. Our aim is to extend GO annotation into neurological
areas and provide high-quality GO annotations for the products of genes relevant to Parkinson’s
Disease.
•  Previous annotation projects have demonstrated the effectiveness of topic-based GO curation
(Alam-Faruque et al., 2011 and 2014; Khodiyar et al., 2013). This is the first annotation effort to
focus on a neurological disease, and working at UCL has enabled us to establish collaborations
with local neurological researchers to guide and verify our annotations.
•  Our work focuses on elucidating the ‘normal’ function of a Parkinson’s-associated gene product,
which poses an additional challenge for a disease-related project.
•  We extract data from primary papers and reviews to attach GO terms to Parkinson’s-relevant
proteins. Our primary focus is human, but we also capture information from model organisms
including fly, rat and mouse.
•  Housing these annotations in the GO database allows researchers to find out more about their
gene of interest, search for common processes within a gene list, or perform more complex
queries on their data set. Our annotation efforts will therefore improve the analysis of
high-throughput datasets, which rely on large numbers of high-quality annotations for correct
interpretation.
•  We have used multiple approaches to select our annotation priorities, including:
   •  Parkinson’s risk genes: compiled from PDGene (database for Parkinson’s Disease
   genetic association studies, www.pdgene.org) and reviews on Parkinson’s Disease.
   •  Interactors of risk genes: In collaboration with the IntAct Parkinson’s project funded by
   the MJ Fox Foundation, we are initially prioritising interactors of three proteins: LRRK2
   (PARK8), α-SYNUCLEIN (SNCA/PARK1) and TAU (MAPT).
   •  Processes that are often disrupted in cases of Parkinson’s: In consultation with UCL
   researchers, we have identified a set of processes that are of great interest in Parkinson’s
   research:
      •  Mitophagy
      •  Mitochondrial fusion and fission
      •  Ubiquitination & protein degradation
      •  Vesicular transport
      •  Regulation of neuron death
      •  Lysosomal pathways
      •  Autophagy
      •  Synaptic transmission
      •  Unfolded protein response
      •  Oxidative stress response
      •  Dopamine transport
      •  Wnt signaling
•  Our curation feeds back into development of the Gene Ontology itself as we expand and
improve areas of the ontology relevant to Parkinson’s Disease, such as vesicle trafficking,
regulation of neuron death and mitophagy. In addition to revision of existing terms, the project
has so far led to the creation of over 100 new GO terms including ‘L-dopa decarboxylase
activity’ (GO:0036468), ‘synaptic vesicle recycling’ (GO:0036465) and ‘negative regulation of
oxidative stress-induced neuron death’ (GO:1903204).
•  We have so far created 1113 annotations to 274 distinct proteins (including 171 human
proteins) from 94 papers (statistics correct as of September 2nd 2014). We aim to curate 520
Parkinson’s-relevant proteins by the end of 2015 and 800 in total by the end of 2016.
•  To follow our progress, please ask to be added to our quarterly newsletter, or visit our project at
www.ucl.ac.uk/functional-gene-annotation/neurological.
References and further reading
•  Ten quick tips for using the Gene Ontology. Blake J.A. PLoS Comput Biol. 9(11):e1003343
(2013). PMID 24244145
•  The Gene Ontology: enhancements for 2011. The Gene Ontology Consortium. Nucleic Acids
Res. 40, D559-564 (2012). PMID 22102568
•  The IntAct molecular interaction database in 2012. Kerrien et al. Nucleic Acids Res. 40,
D841-846 (2012). PMID 22121220
•  Representing kidney development using the Gene Ontology. Alam-Faruque et al. PLoS One.
9(6):e99864 (2014). PMID 24941002
•  From zebrafish heart jogging genes to mouse and human orthologs: Using Gene Ontology
to investigate mammalian heart development. Khodiyar et al. F1000Res (2013). PMID 24627794
•  The impact of focused Gene Ontology curation of specific mammalian systems. Alam-Faruque
et al. PLoS One. 6(12):e27541 (2011). PMID 22174742
How YOU can help
•  We are keen to hear from you about the genes and processes YOU think we should be
annotating. Please speak to us or email rebecca.foulger@ucl.ac.uk or p.denny@ucl.ac.uk.
•  Search the GO annotations associated with your favourite Parkinson’s gene - let us know if
you think any annotations are missing.
•  Send us your Parkinson’s-relevant papers to be annotated.
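Searching annotations for processes shared across a gene list, one of the uses of GO mentioned on this poster, can be sketched with a simple count. This is a minimal sketch under stated assumptions: the gene names and their annotations are hypothetical placeholders, and `common_processes` is an illustrative helper, not part of any GO tool. A real analysis would load annotations from the GO database and apply a proper enrichment statistic.

```python
from collections import Counter

# Hypothetical toy annotation table: placeholder genes, not curated data.
ANNOTATIONS = {
    "GENE_A": {"mitophagy", "autophagy"},
    "GENE_B": {"mitophagy", "regulation of neuron death"},
    "GENE_C": {"mitophagy", "autophagy", "dopamine transport"},
}


def common_processes(genes, annotations, min_genes=2):
    """Return (term, gene count) pairs for terms shared by at least min_genes genes."""
    counts = Counter()
    for gene in genes:
        # Genes without annotations contribute nothing.
        counts.update(annotations.get(gene, set()))
    shared = [(term, n) for term, n in counts.items() if n >= min_genes]
    # Most widely shared terms first, ties broken alphabetically.
    return sorted(shared, key=lambda item: (-item[1], item[0]))


print(common_processes(["GENE_A", "GENE_B", "GENE_C"], ANNOTATIONS))
```

Counting only works as well as the underlying annotations, which is why gaps in annotation coverage are worth reporting.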
www.ucl.ac.uk/functional-gene-annotation/neurological
www.geneontology.org
The Parkinson’s UK annotation project is funded by Parkinson’s UK, grant G-1307. Project members
are part of the GO Consortium.