Using Gene Ontology (GO) to Annotate Key Players in Parkinson’s Disease

Paul Denny¹, Rebecca E. Foulger¹, Maria J. Martin³, John Hardy² and Ruth C. Lovering¹

1. Centre for Cardiovascular Genetics, Institute of Cardiovascular Science, University College London, Rayne Building, 5 University Street, London, WC1E 6JF
2. Department of Molecular Neuroscience, Institute of Neurology, University College London, Queen Square, London, WC1N 3BG
3. European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD

Introduction to GO and GO annotation
• The Gene Ontology (GO) project is a collaborative effort to provide consistent descriptions of gene products across all kingdoms of life.
• GO contains three structured controlled vocabularies (ontologies) that describe gene products in terms of their associated biological processes, cellular locations and molecular functions, in a species-independent manner.
• Originally developed in 1998, the ontologies have grown to include over 40,000 terms describing a wide range of concepts at differing levels of specificity.

Figure legend: blue arrows denote is_a relationships; purple arrows denote regulates relationships.

Background to Parkinson’s Disease
• Parkinson’s Disease is a progressive neurological condition resulting from the loss of dopamine-producing neurons in the substantia nigra, a region of the brain controlling balance and movement.
• One in every 500 people develops Parkinson’s Disease, and this complex condition can affect people in different ways:
  • Motor symptoms of Parkinson’s Disease include tremor (shaking), slowed movement (bradykinesia) or loss of movement (akinesia), and rigidity/stiffness.
  • Non-motor symptoms also affect the day-to-day life of sufferers and include problems with the bladder, bowel and eyes, sleep disruption, loss of cognition, depression, dementia and other mental health effects.
• There is currently no cure; however, some drugs are available that help manage the symptoms of Parkinson’s.
• The discovery of genes linked to familial forms of Parkinson’s Disease, including SNCA (α-synuclein), PARK2 (parkin), LRRK2 (PARK8), PARK7 (DJ-1) and PINK1 (PARK6), has yielded important insights into the pathogenesis of Parkinson’s Disease. Further insight into the roles of these genes will help identify the cellular mechanisms and machinery underlying disease risk, onset and progression.

ID: GO:0000422
Name: mitochondrion degradation
Definition: The autophagic process in which mitochondria are delivered to the vacuole and degraded in response to changing cellular conditions.
Exact synonym: mitophagy
Image from OBO-Edit, version 2.3.1.

GO terms are associated with a gene product in a GO annotation:
• Each annotation is attached to a reference for traceability.
• Evidence codes give an indication of the underlying assay supporting the annotation, for example IMP = inferred from mutant phenotype and IGI = inferred from genetic interaction.

Figure: A subset of proteins annotated to ‘mitophagy’ and descendants during our project, displayed in the EBI GO browser (www.ebi.ac.uk/QuickGO).

The UK Parkinson’s GO annotation project: aims, priorities and progress
• Our project is a collaboration between University College London (UCL) and the European Bioinformatics Institute (EMBL-EBI), and is funded by Parkinson’s UK. Our aim is to extend GO annotation into neurological areas and provide high-quality GO annotations for the products of genes relevant to Parkinson’s Disease.
• This is the first annotation effort to focus on Parkinson’s Disease, and working at UCL has enabled us to establish collaborations with local neurological researchers to guide and verify our annotations.
• We extract data from primary papers and reviews to attach GO terms to Parkinson’s-relevant proteins.
Our primary focus is human, but we also capture information from model organisms including fly, rat and mouse.
[Figure legend: a marker denotes GO terms assigned by our project.]
• Our work focuses on elucidating the ‘normal’ function of a Parkinson’s-associated gene product, which poses an additional challenge for a disease-related project.
• Our annotation priorities include:
  • Parkinson’s risk genes: compiled from PDGene (database for Parkinson’s Disease genetic association studies, www.pdgene.org) and reviews on Parkinson’s Disease.
  • Interactors of risk genes: in collaboration with the IntAct Parkinson’s project funded by the MJ Fox Foundation, we are initially prioritising interactors of three proteins: LRRK2 (PARK8), α-SYNUCLEIN (SNCA/PARK1) and TAU (MAPT).
  • Publications of Parkinson’s UK-funded research: focused on high-throughput analyses, as well as genes with only a few annotations.
  • Processes that are often disrupted in cases of Parkinson’s: in consultation with UCL researchers, we have identified a set of processes that are of great interest in Parkinson’s research (* denotes current or completed topics):
    • Mitophagy*
    • Mitochondrial fusion and fission
    • Ubiquitination & protein degradation
    • Vesicular transport
    • Regulation of neuron death*
    • Lysosomal pathways
    • Autophagy*
    • Synaptic transmission*
    • ER stress response*
    • Oxidative stress response*
    • Dopamine transport
    • Wnt signaling
• We have so far created over 3800 annotations to more than 900 distinct proteins (including over 600 human proteins) from over 260 papers (statistics correct as of May 23rd 2015). We aim to curate 800 Parkinson’s-relevant human proteins by the end of 2016.

[Figure: Number of annotations / proteins annotated. Series: Annotations: all species; Annotations: human; Proteins annotated: all species; Proteins annotated: human.]

Improving the Gene Ontology: a focus on autophagy
• Our curation feeds back into development of the Gene Ontology itself, as we expand and improve areas of the ontology relevant to Parkinson’s Disease.
• In addition to revision of existing terms, our project has so far led to the creation of over 200 new GO terms, including ‘L-dopa decarboxylase activity’ (GO:0036468), ‘synaptic vesicle recycling’ (GO:0036465) and ‘negative regulation of oxidative stress-induced neuron death’ (GO:1903204).
• Through curation, it has become clear that the ‘autophagy’ node of GO requires improvements.
• We are holding consultations with researchers, other curators and GO editors to expand and improve the representation of autophagy in GO.
• Initial discussions have already led to the rearrangement of existing terms, and the creation of new terms such as ‘chaperone-mediated autophagy’ (GO:0061684).

We are keen to work with YOU to improve the arrangement of autophagy and other PD-relevant processes in GO, and to hear about your Parkinson’s priorities: please get in touch with p.denny@ucl.ac.uk

To follow our progress, ask to receive our quarterly newsletter, email us (goannotation@ucl.ac.uk), or follow us on Twitter (@UCLgene).

Further reading
• Ten quick tips for using the Gene Ontology. Blake J.A. PLoS Comput Biol. 9(11):e1003343 (2013). PMID 24244145
• Gene Ontology Consortium: going forward. The Gene Ontology Consortium. Nucleic Acids Res. 43, D1049-56 (2015). PMID 25428369
• The impact of focused Gene Ontology curation of specific mammalian systems. Alam-Faruque et al. PLoS One. 6(12):e27541 (2011). PMID 22174742

www.ucl.ac.uk/functional-gene-annotation/neurological
www.geneontology.org

The Parkinson’s UK annotation project is funded by Parkinson’s UK, grant G-1307. Project members are part of the GO Consortium.
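
Illustrative sketch: a GO annotation as a data structure
As a rough illustration of the annotation model outlined in the introduction (a GO term attached to a gene product, with an evidence code and a reference), the sketch below represents both as plain Python data classes. Only the GO:0000422 term details and the two evidence codes come from this poster. The class names, the `describe` helper, the gene product name and the reference are hypothetical placeholders, not real annotation data and not an official GO or QuickGO API.

```python
from dataclasses import dataclass

# Evidence codes named on the poster; GO defines others that are not listed here.
EVIDENCE_CODES = {
    "IMP": "inferred from mutant phenotype",
    "IGI": "inferred from genetic interaction",
}


@dataclass(frozen=True)
class GOTerm:
    """Hypothetical container for the fields shown for a GO term."""
    go_id: str
    name: str
    definition: str
    exact_synonyms: tuple = ()


@dataclass(frozen=True)
class Annotation:
    """Hypothetical container linking a gene product to a GO term."""
    gene_product: str
    term: GOTerm
    evidence: str
    reference: str

    def __post_init__(self):
        # An annotation needs a known evidence code and a traceable reference.
        if self.evidence not in EVIDENCE_CODES:
            raise ValueError(f"unknown evidence code: {self.evidence}")
        if not self.reference:
            raise ValueError("an annotation must cite a reference")


# Term details as given on the poster.
MITOPHAGY = GOTerm(
    go_id="GO:0000422",
    name="mitochondrion degradation",
    definition=(
        "The autophagic process in which mitochondria are delivered to the "
        "vacuole and degraded in response to changing cellular conditions."
    ),
    exact_synonyms=("mitophagy",),
)


def describe(annotation: Annotation) -> str:
    """Render an annotation as a single human-readable line."""
    term = annotation.term
    meaning = EVIDENCE_CODES[annotation.evidence]
    return (
        f"{annotation.gene_product} -> {term.name} ({term.go_id}), "
        f"{annotation.evidence}: {meaning} [{annotation.reference}]"
    )


# Placeholder gene product and reference, for illustration only.
example = Annotation("HYPOTHETICAL_PROTEIN", MITOPHAGY, "IMP", "PMID:PLACEHOLDER")
print(describe(example))
```

Rejecting an unknown evidence code or an empty reference at construction time mirrors the two properties highlighted for annotations: traceability to a reference and an evidence code indicating the supporting assay.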