Minutes from the first meeting of the Rosaceae Gene Naming Standardisation committee held on
July 3rd 2013
Attendees: Jung Sook, Chris Dardick, Lee Meisel, Douglas Bielenberg, Carole Bassett Slovin, Janet
Apologies: Michela Troggio, Dorrie Maine
Goal:
To agree on Rosaceae gene naming principles and a mechanism for enforcing them (e.g. should TGG and
other journals mandate that authors submit gene names to GDR before manuscript submission?)
Agenda
Rosaceae gene naming principle
1. Two or three letter abbreviation of species name as a prefix to gene names?
2. What about gene names that already exist in NCBI (with and without prefix)?
3. How do we treat homeologous genes (in the duplicated apple genome)?
4. Do we try to link names across Rosaceae species (so MdARF1 = PpARF1)? This is a challenge when
you get clade expansions but would be of considerable use.
How to enforce?
1. Submission of gene name to GDR before manuscript submission to TGG and other journals
2. Publicise in GDR
These minutes try to capture themes rather than record progress through the meeting.
Theme 1 Gene naming convention
All agreed that it made sense to follow the standard Arabidopsis/Bacteria gene/protein naming
convention. The standard convention is:



PROTEIN: CAPS NON ITALIC
GENE: CAPS ITALIC
mutant: lower case italic
Carole brought up the issue of excessive abbreviations, especially in the RUBISCO gene
nomenclature, highlighting the importance of being brief.
Theme 2 Species abbreviations
There was considerable discussion around the abbreviation of species. Many people use 2-letter
abbreviations (e.g. Md = Malus x domestica). However, this has major limitations even within
Rosaceae (e.g. Pp = Prunus persica or Pyrus pyrifolia?). There is a strong need to clarify this. After
some discussion we were faced with the fact that there are 3100 Rosaceae species; how do we
differentiate these?
It was pointed out that the UNIPROT website ( http://www.uniprot.org/docs/speclist ) gives a
5-letter abbreviation for many of the taxa, 3 letters for the genus name and 2 for the species,
giving Maldo and Prupe. Using these removes a lot of ambiguity, and it was decided that we should
promote them.
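The difference between the two abbreviation schemes can be sketched in a few lines of Python. This is only an illustration of the 3 + 2 letter pattern described above: the function names are made up for this sketch, the "Maldo" capitalisation is one of the options still to be decided, and the real UNIPROT codes are curated, so the published list should always be checked rather than derived.

```python
def _split_binomial(binomial):
    """Return (genus, species), ignoring a hybrid marker such as 'x' or '×'."""
    parts = [p for p in binomial.split() if p.lower() not in {"x", "×"}]
    if len(parts) < 2:
        raise ValueError(f"expected 'Genus species', got {binomial!r}")
    return parts[0], parts[1]


def two_letter_code(binomial):
    """Commonly used 2-letter prefix: first letter of genus + first of species."""
    genus, species = _split_binomial(binomial)
    return genus[0].upper() + species[0].lower()


def five_letter_code(binomial):
    """Sketch of the 3 + 2 pattern: 3 letters of genus + 2 letters of species.

    Assumption: this simple rule only mirrors the pattern; it is not a
    replacement for looking the code up in the UNIPROT species list.
    """
    genus, species = _split_binomial(binomial)
    return genus[:3].capitalize() + species[:2].lower()


if __name__ == "__main__":
    for name in ("Malus x domestica", "Prunus persica", "Pyrus pyrifolia"):
        print(name, two_letter_code(name), five_letter_code(name))
```

With the 2-letter scheme Prunus persica and Pyrus pyrifolia both become Pp, whereas the 5-letter scheme keeps them apart.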



Task: need to find a reference for the UNIPROT website
Need to draw up a list of abbreviations for a table of most commonly used species
Need to create a Taxa page in GDR with naming conventions
Point to note: should we define whether it should be Maldo, MALDO, or MALDO?
Theme 3 Gene names
Unfortunately, due to time constraints, no firm decisions were made on the naming; however, a
number of issues were raised.
Issues:
1. We do not want the same name to cover 2 genes (really bad)
2. We do not want the same gene with different names (just bad)
3. NCBI does not allow researchers to add gene names to a sequence that is composed of more
than one DNA molecule
4. Naming a gene LFY just because of homology says nothing about its function
5. Naming after the closest Arabidopsis gene (e.g. AtARF9 == PrupeARF9)
6. Linking gene names with gene model numbers
1/2/3/6. We all agreed that we should try to get unique names for genes.
Tasks


Set up a list of gene names that are currently in the literature in GDR linked to individual
gene models
Create an upload area for researchers to put new gene names into GDR (this should be
passive; we do not want to spend time assessing gene names)
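Issues 1 and 2 above (one name covering two genes, one gene carrying two names) could be caught automatically at upload time without anyone assessing the names. The following is a minimal sketch of such a check, assuming a simple in-memory registry; the class name, method name, and the gene model identifiers are hypothetical and do not describe how GDR actually stores names.

```python
class GeneNameRegistry:
    """Keeps a one-to-one link between gene names and gene model identifiers."""

    def __init__(self):
        self._model_by_name = {}  # gene name -> gene model identifier
        self._name_by_model = {}  # gene model identifier -> gene name

    def register(self, name, gene_model):
        """Record a name for a gene model, rejecting either kind of clash."""
        existing_model = self._model_by_name.get(name)
        if existing_model is not None and existing_model != gene_model:
            # Issue 1: the same name would cover two genes.
            raise ValueError(f"{name} is already used for {existing_model}")

        existing_name = self._name_by_model.get(gene_model)
        if existing_name is not None and existing_name != name:
            # Issue 2: the same gene would have two different names.
            raise ValueError(f"{gene_model} is already named {existing_name}")

        self._model_by_name[name] = gene_model
        self._name_by_model[gene_model] = name

    def model_for(self, name):
        """Return the gene model linked to a name, or None if unknown."""
        return self._model_by_name.get(name)


if __name__ == "__main__":
    registry = GeneNameRegistry()
    # "model_0001" is a placeholder, not a real gene model number.
    registry.register("PrupeARF9", "model_0001")
    print(registry.model_for("PrupeARF9"))
```

Submitting the same name and gene model twice is accepted silently, so the upload stays passive for researchers who resubmit.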
4. The actual name. Currently many of the genes in Rosaceae species have no functional tests, so
should such a gene be called “-like” (e.g. should the best homologue of LFY be LFY-like)? There
are good guidelines for choosing a name in TAIR (http://www.arabidopsis.org/portals/nomenclature/guidelines.jsp).
These guidelines point out the danger of calling a gene family XX-like when its members are so
diverse at the sequence level that they cannot possibly have any similar function.
Unfortunately, if you condense the “-like” convention down to a 3-letter abbreviation, it becomes
LFL, with almost all the gene names ending in “L”. Certainly for many of the Rosaceae species the
function of the genes is unlikely to be tested in the host species.
It was mentioned that the Arabidopsis naming convention was deficient and that the rice community
was trying to address this.



Tasks: Read gene nomenclature conventions in TAIR
Set a cutoff for the identity needed to give a gene the Arabidopsis name or “-like”
Critically (or not too critically) read the short communication put together by Robert
5. Co-naming with Arabidopsis is virtually impossible due to clade expansions in the 2 species.
Within Rosaceae it is more feasible; however, there are still issues. We will recommend using
similar names when possible.
Theme 4 How do we enforce/advertise
It was agreed that we should try to put an article together for publication (like the Lizard community:
http://www.biomedcentral.com/content/pdf/1471-2164-12-554.pdf ). They published in BMC Genomics; it
was discussed whether we should try to reach the more molecular types by targeting BMC Plant
Biology rather than TGG.
This will be advertised on GDR. We will use linkages to disseminate the information to reviewers
and editors.
Tasks
The paper will be put on Google docs for editing by the wider team
Action points: