release-notes

9709
The updated database contains information about nearly 50,000 alleles of nearly 15,000 genes.
FlyBase contains descriptions of over 15,000 chromosomal aberrations, as well as molecular
maps and information about more than 2,300 molecular constructs, including more than 1,200
transposons, and information about 1,350 different transposon insertions. The FlyBase
bibliography of publications about Drosophila now contains more than 89,000 listings, many
with links to the genes, aberrations, and molecular constructs they discuss.
9612
The updated database contains information about more than 38,000 alleles of more than 11,000
genes. More than 650 of the gene reports now include links to reports about expression patterns
and other information for proteins and transcripts associated with the gene. FlyBase contains
descriptions of over 13,300 chromosomal aberrations, as well as molecular maps and information
about more than 2,000 molecular constructs, including more than 1,000 transposons. The FlyBase
bibliography of publications about Drosophila now contains more than 81,000 listings, many
with links to the genes and aberrations they discuss.
9603
The genes file is an updated file with the addition of new genes and alleles. The total
number of 'genes' in this release is nearly 10,000; that of alleles is over 30,000. The number of
aberrations is over 12,500. The bibliography includes over 76,000 records, of which over 4000
are theses.
The following are the more important structural changes:
1. The cross links to genes of other species (e.g. mouse, human, yeast) that are reported to be
'similar' to Drosophila genes have been extensively updated and now include gene symbols (and,
where available, unique identifier numbers) from the mouse (MGD), human (OMIM and GDB),
yeast and other single organism databases. Nearly 1500 links to genes in other species now exist.
Over half of these links are to human genes.
2. The nucleic acid sequence data libraries now attach identifier numbers to individual coding
regions within a record. This means that in a sequence record which includes information from
more than one gene, each gene's coding region has its own identifier number (known as a PID).
For this reason nucleic acid sequence accession number cross references now have the syntax:
accession_number; <PID_number>
e.g.: X12345; g9876543
3. The controlled vocabulary is now strictly hierarchical in organization.
4. FlyBase now distinguishes several different classes of genetic element and codes these in a
special field (*t in star-coded output). These include foreign and honorary genes (i.e. genes from
organisms other than drosophilids that may be used in constructs), fusion genes, transposons
(this includes all repetitive elements and sequences, regardless of whether they are known to
transpose), mitochondrial encoded genes, viruses, pathogens and symbionts, and structural
elements. See flybase/docs/genes.doc for further information. Most of these classes are available
as individual files.
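The cross-reference syntax described in item 2 can be split mechanically. The following Python sketch is illustrative only: the class and function names are hypothetical and not part of FlyBase, and it assumes that a single semicolon separates the accession number from the PID and that a reference without a semicolon carries no PID.

```python
from typing import NamedTuple, Optional


class SequenceXref(NamedTuple):
    """One nucleic acid cross reference (hypothetical helper type)."""
    accession: str
    pid: Optional[str]  # identifier of one coding region, if given


def parse_sequence_xref(text: str) -> SequenceXref:
    """Parse a reference written as 'accession_number; PID_number'.

    Assumption: a reference without a semicolon carries no PID.
    """
    accession, _, pid = text.partition(";")
    accession = accession.strip()
    pid = pid.strip()
    if not accession:
        raise ValueError(f"no accession number in {text!r}")
    return SequenceXref(accession, pid or None)


# The example from the release note above.
print(parse_sequence_xref("X12345; g9876543"))
```

Keeping the PID separate from the accession number lets a program tell apart the coding regions of several genes that share one sequence record.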
9507
Cross references to the G protein-coupled receptor database (GCRDb) and to the TRANSFAC
database have now been added.
A file of wild stocks and chromosomes has been added to FlyBase (see
flybase/wild_stocks/wild_stocks.doc).
The genes and aberrations files are updated files with the addition of new records and data.
The total number of 'genes' in this release is 9,012, that of alleles is 25,998 and that of
aberrations 11,699.
The bibliography includes 73,186 records.
9502
The genes file is an updated file with the addition of new genes and alleles. The total number of
'genes' in this release is 8,732 (7,473 for D. melanogaster); that of alleles is 24,656 (24,163 D.
melanogaster). The number of aberrations is 11,188 (11,181 D. melanogaster). The bibliography
includes 70,617 records.
Over 200 alleles have been removed from the genes file as they are due to position effect
variegation. We now use a controlled syntax for describing these in the aberrations file. See the
section describing the *V field in the Aberrations file documentation.
In previous releases of the genes file the five HSP70 encoding genes were grouped as two genes,
Hsp70A and Hsp70B, despite the fact that the former locus usually has two genes, and the latter
three. Each of these five genes now has its own record.
In previous releases of the genes file there was a single record for the mitochondrial DNA
(mtDNA). This has now been split into records for each of the different 37 mitochondrial genes.
Each of these has the symbol prefix mt:. For example the symbol for the gene encoding the small
rRNA from the mitochondrial genome is mt:srRNA, that for subunit II of the cytochrome oxidase
is mt:CoII. In addition there is a 'gene' record of the gene-less A+T rich mitochondrial origin
(mt:ori) and a dustbin record for all general mitochondrial references (MT:DNA).
Records for species other than for D. melanogaster are now considered to be different genes (as
they indeed are). Their gene symbols are identical to those of their homologs in D. melanogaster
but these symbols are prefixed Nnnn\, where N is the initial letter of the genus (e.g., D for
Drosophila) and nnn is a three-letter code, normally the first three letters of the specific name.
See flybase/nomenclature/species-abbreviations.txt for the abbreviations used. Although, in
this release, all of the genes from species other than D. melanogaster have been given separate
identities in the genes file, the process of splitting the bibliographic records (in particular) is not
yet complete.
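The prefix rule above can be sketched in Python as follows. This is a minimal illustration, not FlyBase code: the function names are hypothetical, and the sketch applies only the default rule (genus initial plus the first three letters of the specific name). Because the rule is described as holding "normally", the species abbreviations file remains the authority for any exceptions. The example symbol is chosen for illustration only.

```python
from typing import Optional, Tuple


def species_prefix(genus: str, specific_name: str) -> str:
    """Build the 'Nnnn' prefix under the default rule (assumed, see lead-in)."""
    return genus[0].upper() + specific_name[:3].lower()


def prefixed_symbol(genus: str, specific_name: str, symbol: str) -> str:
    """Join prefix and gene symbol with a backslash, e.g. Dvir\\Adh."""
    return species_prefix(genus, specific_name) + "\\" + symbol


def split_symbol(symbol: str) -> Tuple[Optional[str], str]:
    """Return (prefix, base symbol); the prefix is None for unprefixed symbols."""
    prefix, sep, base = symbol.partition("\\")
    return (prefix, base) if sep else (None, symbol)


# Illustrative use with a hypothetical D. virilis record.
print(prefixed_symbol("Drosophila", "virilis", "Adh"))
print(split_symbol("Adh"))
```

Splitting on the first backslash recovers the symbol shared with the D. melanogaster homolog, which is convenient when grouping records across species.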
The 'function' fields in the genes file have been rationalized to some extent. For enzymes with
Enzyme Commission numbers synonymous names are no longer included. The only name is that
in the DE line of the ENZYME DATA BANK. Synonyms can be obtained from this data bank
(AN lines). In previous releases some entries in the function field (e.g., 'nuclear protein',
'mitochondrial protein') were inappropriate and belonged in a field describing the cellular
localization of the gene's product. They have now been moved to such a field (*f in the star-coded
format). The terms used in this field are to be found in the controlled vocabulary. Another
former inappropriate use of the function field was for terms that described some aspect of the
DNA sequence itself (e.g., 'repetitive-sequence', 'CAX-(opa)-repeat'). These have now been
moved to a new field, *v in the star-coded format. Finally, the function (*d) field has been
rationalized with respect to the use of synonyms. The ideal, toward which we are working, is that
this field will not include synonyms at all; these will be represented only in the controlled vocabulary.
Over 1000 new theses, largely European and Australian, have been added to the bibliography.
9412
The genes file is an updated file with the addition of new genes and alleles. The total number of
'genes' in this release is 7,009; that of alleles is 19,542. The number of aberrations is 10,286.
Revisions have been made to 3,104 gene records. The bibliography includes 68,161 records.
All records are now date stamped. All gene records existing on May 16 1994 have the date 16
May 94 in a *H field (or displayed in the pretty text versions). The date of any subsequent
updates, or the date of entry of a new record, is also shown in this field.
Previous versions of the genes file included a number of records that were not genes, in any
obvious sense of the word, e.g., EST sequences and vectors. These have now been removed to the
new clones and transposons directories.
Cross-references between FlyBase and the PDB and NRL_3D protein structure databases have
now been made. Cross-references to NCBI gi and gibbsq numbers have now been removed from
FlyBase.
Three new directories have been created: flybase/docs/personal-communications/ to archive
personal communications to FlyBase; flybase/nomenclature/ for documents and tables
concerning the nomenclature of genes, alleles, aberrations and transposons and
flybase/transposons/ for information on transposons and vectors. None is yet fully implemented,
but all will be in the coming months. For details, see the relevant sections of the Reference Manual.
The subdirectory structure of the allied-data directory has been rationalized.
Files of scanning electron microscope images of embryos (from Rudi Turner) have been added
(flybase/allied-data/images/embryo-kaufman-turner).
9404
The two major changes to FlyBase since the last release have been the integration of the genes
file with the files of molecular data from UCLA and the integration of the ABERRATIONS of
Lindsley and Zimm (1992) with Ashburner's original aberrations file.
The genes table of this release of FlyBase includes 6786 'genes' and 18,954 alleles, an increase of
more than 1000 'genes' since release 9309. The aberrations table includes over 10,000 records.
Since genes93 there has been a major edit of the material incorporated from Lindsley and Zimm
(1992). This has included the correction of errors, the standardization of many terms and phrases,
for example in the 'origin' field and in the way normal cytology is described, the removal of
some material that is redundant for a database, and reciprocally, the duplication of material
required to make the database records more self contained, and the use of a rigorous syntax for
describing transposable elements.
The synonym and function tables are now output from the same source as the genes table.
There have been major additions to the bibliography and further rationalization of abbreviations
of journal titles etc. The bibliography now includes 62,516 records and the crosslinking between
the bibliography and other tables has been implemented via unique reference identifier numbers.
Bibliographic records from the EMIC database have been incorporated.
The People tables have been considerably updated and Gopher+ clients now allow users to
update or add entries directly.
Several new databases have been released as allied-data bases. These include tables of genetic
data on Drosophila ananassae and D. buzzatii, a file of images contributed by the community
and all of the Drosophila records from the Environmental Mutagen Information Center's
database. In addition, FlyBase now includes the P1 clone and P-element data from the LBL
Drosophila Genome Project. The genetic data from the P-element tables has been imported into
the genes tables.
An attempt at a complete catalog of organs, tissues and other structures of Drosophila has been
released; it is in the file flybase/docs/controlled-vocabularies.txt.
One minor change, but important if you have programs using FlyBase tables, is that the symbols
representing super- and subscripts are now [ and ], and [[ and ]] respectively, rather
than {} and {{}}, in all tables.
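Programs that still hold symbols in the older brace notation could convert them along the following lines. This Python sketch is an assumption-laden illustration rather than a FlyBase tool: the function name is hypothetical, the sample symbols are invented, and it assumes that braces occur in a symbol only as super- or subscript markers.

```python
# Map each brace to the matching square bracket; doubled braces
# therefore become doubled brackets automatically.
_BRACE_TO_BRACKET = str.maketrans("{}", "[]")


def convert_script_markup(symbol: str) -> str:
    """Rewrite {..} as [..] (superscript) and {{..}} as [[..]] (subscript)."""
    return symbol.translate(_BRACE_TO_BRACKET)


# Invented sample symbols, for illustration only.
print(convert_script_markup("w{a}"))
print(convert_script_markup("x{{2}}"))
```

A character-for-character translation is enough here because the new notation keeps the single versus double distinction of the old one.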
The genes9405 file differs from genes9404 only in that the newer file fully implements the format of
gene and allele identifier numbers described in this documentation.
9309
Release 9309 sees three very major changes to FlyBase. The first is the merger of the GENES
section of Lindsley and Zimm (1992) with Ashburner's September 1992 files and the subsequent
revision of the merged data. The file flybase/genes/genes93.doc describes the changes made; the
text is in flybase/genes/genes93.txt. This file not only corrects the errors found in Lindsley and
Zimm but includes considerable new material, including information on 1835 new loci. The total
number of genetic loci in genes93 is 5684.
The second major change is the release of flybase/refs, a unified bibliography of publications
concerning Drosophila. Available in three different formats, flybase/refs includes over 58,000
records, including all of those from the published bibliographies of Herskowitz, from MEDLINE
and from BIOSIS (by permission).
The third major change is a unified directory of Drosophila workers, with over 4500 entries.
New files describing P1 clones of both D. melanogaster and D. virilis are now available on
FlyBase.
Many other FlyBase files have been updated for this release, and several FlyBase 'working
papers' suggesting controlled vocabularies for the description of Drosophila genes are published.
The documentation has been thoroughly revised and is now broken into two manuals, called the
FlyBase User Manual and the FlyBase Reference Manual. The Reference Manual is available as
flybase/docs/Reference-manual.text or flybase/docs/Reference-manual.ps, in text and
PostScript format respectively. The User Manual is available in flybase/docs/User-manual.text
or flybase/docs/User-manual.doc.
9301
Release 9301 of FlyBase is the first by the Consortium. It is released as a temporary measure to
make the data that are already in FlyBase available to the community. In large part 9301 is
simply a restructuring of data that had been previously available from IUBio and other sources. It
includes the tables from the 9209 release of Michael Ashburner (see
flybase/news/oldnews/1992.txt). These tables include 5321 loci, 4218 entries in the genetic
map, 11940 aberrations and 3120 references. Note that of the 5321 loci, about 800 are not in
Lindsley and Zimm (1992). The great majority of the references are also subsequent to Lindsley
and Zimm. This release includes tables from the 4/29/1993 release of John Merriam's clone lists.