Minutes from the first meeting of the Rosaceae Gene Naming Standardisation committee, held on July 3rd 2013

Attendees: Jung Sook, Chris Dardick, Lee Meisel, Douglas Bielenberg, Carole Bassett, Janet Slovin
Apologies: Michela Troggio, Dorrie Maine

Goal: To come up with a Rosaceae gene naming principle and a mechanism for enforcement (TGG and other journals mandate authors to submit gene names to GDR before manuscript submission?)

Agenda

Rosaceae gene naming principle
1. Two or three letter abbreviation of species name as a prefix to gene names?
2. How about gene names that already exist in NCBI (with and without prefix)?
3. How do we treat homeologous genes (in the duplicated apple genome)?
4. Do we try to link names across Rosaceae species (so MdARF1 = PpARF1)? This is a challenge when you get clade expansions, but would be of considerable use.

How to enforce?
1. Submission of gene name to GDR before manuscript submission to TGG and other journals
2. Publicise in GDR

These minutes try to capture themes rather than record progress through the meeting.

Theme 1 Gene naming convention

All agreed that it made sense to follow the standard Arabidopsis/bacteria protein naming convention. The standard convention is:
PROTEIN: CAPS NON ITALIC
GENE: CAPS ITALIC
mutant: lower case italic

Carole brought up the issue of excessive abbreviations, especially in the RUBISCO gene nomenclature, highlighting the importance of being brief.

Theme 2 Species abbreviations

There was considerable discussion around the abbreviation of species. Many people use 2 letter abbreviations (e.g. Md = Malus x domestica). However, this has major limitations even within Rosaceae (e.g. Pp = Prunus persica or Pyrus pyrifolia?). There is a strong need to clarify this. After some discussion we were faced with the fact that there are 3100 Rosaceae species: how do we differentiate these?
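
The two-letter ambiguity can be illustrated with a short Python sketch. The function names and the three-species list are hypothetical, chosen only for illustration. The sketch builds a prefix from the leading letters of the genus and species epithet and reports which prefixes are shared by more than one species. Taking more letters from each part is one way to reduce such clashes, although it does not guarantee uniqueness across all Rosaceae.

```python
from collections import defaultdict

# Hypothetical example list; not a list agreed by the committee.
SPECIES = ["Malus x domestica", "Prunus persica", "Pyrus pyrifolia"]


def abbreviate(binomial, genus_len, species_len):
    """Build a prefix from the leading letters of genus and species epithet."""
    # Drop the hybrid marker ("x" or the multiplication sign) if present.
    parts = [p for p in binomial.split() if p.lower() not in ("x", "\u00d7")]
    genus, species = parts[0], parts[1]
    return (genus[:genus_len] + species[:species_len]).capitalize()


def find_collisions(species_names, genus_len, species_len):
    """Return the prefixes that more than one species maps to."""
    groups = defaultdict(list)
    for name in species_names:
        groups[abbreviate(name, genus_len, species_len)].append(name)
    return {abbr: names for abbr, names in groups.items() if len(names) > 1}


two_letter = find_collisions(SPECIES, 1, 1)   # one letter each, e.g. "Md"
five_letter = find_collisions(SPECIES, 3, 2)  # three plus two letters

print(two_letter)   # {'Pp': ['Prunus persica', 'Pyrus pyrifolia']}
print(five_letter)  # {}
```

A check of this kind could be run over any candidate species table before a prefix scheme is adopted.
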
It was pointed out that there is a good website, UniProt (http://www.uniprot.org/docs/speclist), which gives a 5 letter abbreviation for many of the taxa: 3 letters for the genus name and 2 for the species, giving Maldo and Prupe. Using these removes a lot of ambiguity, and it was decided that we should promote them.

Tasks
Need to find a reference for the UniProt website
Need to draw up a list of abbreviations for a table of the most commonly used species
Need to create a Taxa page in GDR with naming conventions

Point to note: should we define whether it should be Maldo, MALDO, or MALDO?

Theme 3 Gene names

Unfortunately, due to time constraints, no firm decisions were made on the naming. However, a number of issues were raised.

Issues:
1. We do not want the same name to cover 2 genes (really bad)
2. We do not want the same gene with different names (just bad)
3. NCBI does not allow researchers to add gene names to a sequence that is composed of more than one DNA molecule
4. Calling a gene LFY just because of homology does not say anything about function
5. Naming to the closest Arabidopsis gene (e.g. AtARF9 == PrupeARF9)
6. Linking gene names with gene model numbers

1/2/3/6. We all agreed that we should try to get unique names for genes.

Tasks
Set up a list in GDR of gene names that are currently in the literature, linked to individual gene models
Create an upload area for researchers to put new gene names into GDR (this should be passive; we do not want to spend time assessing gene names)

4. The actual name. Currently there are no functional tests of many of the genes in Rosaceae species, so should the gene be called "-like"? For example, the best homologue to LFY would be LFY-like. There are good guidelines for choosing a name in TAIR (http://www.arabidopsis.org/portals/nomenclature/guidelines.jsp). These guidelines point out the danger of calling a gene family XX-like when the members are so diverse at the sequence level that they cannot possibly have any similar function.
Unfortunately, if you condense the "-like" convention down to a 3 letter abbreviation it becomes LFL, with almost all the gene names ending in "L". Certainly for many of the Rosaceae species, the function of the genes is unlikely to be tested in the host species. It was mentioned that the Arabidopsis naming convention was deficient and that the rice community was trying to address this.

Tasks
Read gene nomenclature conventions in TAIR
Set a cutoff for the identity needed to call a gene by the Arabidopsis name or "-like"
Critically (or not too much) read the short communication put together by Robert

5. Co-naming with Arabidopsis is virtually impossible due to clade expansions in the 2 species. Within Rosaceae it is more feasible; however, there are still issues. We will recommend using similar names when possible.

Theme 4 How do we enforce/advertise

It was agreed that we should try to put an article together for publication (like the Lizard community: http://www.biomedcentral.com/content/pdf/1471-2164-12-554.pdf). They published in BMC Genomics; it was discussed whether we should try to reach the more molecular types by targeting BMC Plant Biology rather than TGG. This will be advertised on GDR. We will use linkages to disseminate the information to reviewers and editors.

Tasks
The paper will be put on Google Docs for editing by the wider team

Action points: