9709
The updated database contains information about nearly 50,000 alleles of nearly 15,000 genes. FlyBase contains descriptions of over 15,000 chromosomal aberrations, as well as molecular maps and information about more than 2,300 molecular constructs, including more than 1,200 transposons, and information about 1,350 different transposon insertions. The FlyBase bibliography of publications about Drosophila now contains more than 89,000 listings, many with links to the genes, aberrations, and molecular constructs they discuss.

9612
The updated database contains information about more than 38,000 alleles of more than 11,000 genes. More than 650 of the gene reports now include links to reports about expression patterns and other information for proteins and transcripts associated with the gene. FlyBase contains descriptions of over 13,300 chromosomal aberrations, as well as molecular maps and information about more than 2,000 molecular constructs, including more than 1,000 transposons. The FlyBase bibliography of publications about Drosophila now contains more than 81,000 listings, many with links to the genes and aberrations they discuss.

9603
The genes file is an updated file with the addition of new genes and alleles. The total number of 'genes' in this release is nearly 10,000; that of alleles is over 30,000. The number of aberrations is over 12,500. The bibliography includes over 76,000 records, of which over 4,000 are theses.

The following are the more important structural changes:

1. The cross links to genes of other species (e.g. mouse, human, yeast) that are reported to be 'similar' to Drosophila genes have been extensively updated and now include gene symbols (and, where available, unique identifier numbers) from the mouse (MGD), human (OMIM and GDB), yeast and other single-organism databases. Nearly 1500 links to genes in other species now exist. Over half of these links are to human genes.

2. The nucleic acid sequence data libraries now attach identifier numbers to individual coding regions within a record. This means that in a sequence record which includes information from more than one gene, each gene's coding region has its own identifier number (known as a PID). For this reason nucleic acid sequence accession number cross references now have the syntax:

   accession_number; <PID_number>

   e.g.: X12345; g9876543

3. The controlled vocabulary is now strictly hierarchical in organization.

4. FlyBase now distinguishes several different classes of genetic element and codes these in a special field (*t in star-coded output). These include foreign and honorary genes (i.e. genes from organisms other than drosophilids that may be used in constructs), fusion genes, transposons (this includes all repetitive elements and sequences, regardless of whether they are known to transpose), mitochondrially encoded genes, viruses, pathogens and symbionts, and structural elements. See flybase/docs/genes.doc for further information. Most of these classes are available as individual files.

9507
Cross references to the G protein-coupled receptor database (GCRDb) and to the TRANSFAC database have now been added. A file of wild stocks and chromosomes has been added to FlyBase (see flybase/wild_stocks/wild_stocks.doc). The genes and aberrations files are updated files with the addition of new records and data. The total number of 'genes' in this release is 9,012, that of alleles is 25,998 and that of aberrations 11,699. The bibliography includes 73,186 records.

9502
The genes file is an updated file with the addition of new genes and alleles. The total number of 'genes' in this release is 8,732 (7,473 for D. melanogaster); that of alleles is 24,656 (24,163 for D. melanogaster). The number of aberrations is 11,188 (11,181 for D. melanogaster). The bibliography includes 70,617 records.

Over 200 alleles have been removed from the genes file as they are due to position effect variegation.
We now use a controlled syntax for describing these in the aberrations file. See the section describing the *V field in the Aberrations file documentation.

In previous releases of the genes file the five HSP70 encoding genes were grouped as two genes, Hsp70A and Hsp70B, despite the fact that the former locus usually has two genes, and the latter three. Each of these five genes now has its own record.

In previous releases of the genes file there was a single record for the mitochondrial DNA (mtDNA). This has now been split into records for each of the 37 different mitochondrial genes. Each of these has the symbol prefix mt:. For example, the symbol for the gene encoding the small rRNA from the mitochondrial genome is mt:srRNA, and that for subunit II of the cytochrome oxidase is mt:CoII. In addition there is a 'gene' record for the gene-less A+T-rich mitochondrial origin (mt:ori) and a dustbin record for all general mitochondrial references (MT:DNA).

Records for species other than D. melanogaster are now considered to be different genes (as they indeed are). Their gene symbols are identical to those of their homologs in D. melanogaster but these symbols are prefixed Nnnn\, where N is the initial letter of the genus (e.g., D for Drosophila) and nnn is a three-letter code, normally the first three letters of the specific name. See flybase/nomenclature/species-abbreviations.txt for the abbreviations used. Although, in this release, all of the genes from species other than D. melanogaster have been given separate identities in the genes file, the process of splitting the bibliographic records (in particular) is not yet complete.

The 'function' fields in the genes file have been rationalized to some extent. For enzymes with Enzyme Commission numbers, synonymous names are no longer included. The only name is that in the DE line of the ENZYME DATA BANK. Synonyms can be obtained from this data bank (AN lines).
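For programs that read gene symbols from the genes file, the Nnnn\ species prefix and the mt: prefix introduced in this release can be separated mechanically. The following is a minimal sketch only, not FlyBase code: the names split_symbol, GeneSymbol and is_mitochondrial are hypothetical, the example symbol Dvir\Adh is illustrative, and the pattern assumes the species code is always exactly three lowercase letters (the text above says only that it is "normally" the first three letters of the specific name).

```python
import re
from typing import NamedTuple, Optional

# Assumed shape of the prefix: one uppercase genus initial, three lowercase
# letters for the species code, then a backslash, then the gene symbol.
_SPECIES_PREFIX = re.compile(r"^([A-Z])([a-z]{3})\\(.+)$")


class GeneSymbol(NamedTuple):
    genus_initial: Optional[str]  # None when the symbol carries no prefix
    species_code: Optional[str]   # None when the symbol carries no prefix
    symbol: str                   # the symbol with any species prefix removed


def split_symbol(raw: str) -> GeneSymbol:
    """Split a symbol such as 'Dvir\\Adh' into its prefix parts and bare symbol.

    Symbols without a prefix are returned unchanged with both prefix
    fields set to None.
    """
    match = _SPECIES_PREFIX.match(raw)
    if match is None:
        return GeneSymbol(None, None, raw)
    return GeneSymbol(match.group(1), match.group(2), match.group(3))


def is_mitochondrial(sym: GeneSymbol) -> bool:
    """True if the bare symbol starts with 'mt:' (compared case-insensitively,
    so that the general record written as MT:DNA is also recognized)."""
    return sym.symbol.lower().startswith("mt:")
```

With these assumptions, split_symbol("mt:CoII") leaves the symbol untouched, and is_mitochondrial reports it as a mitochondrial gene; a prefixed symbol yields the genus initial and species code as separate fields.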
In previous releases some entries in the function field (e.g., 'nuclear protein', 'mitochondrial protein') were inappropriate and belonged in a field describing the cellular localization of the gene's product. They have now been moved to such a field (*f in the star-coded format). The terms used in this field are to be found in the controlled vocabulary. Another inappropriate former use of the function field was for terms that described some aspect of the DNA sequence itself (e.g., 'repetitive-sequence', 'CAX-(opa)-repeat'). These have now been moved to a new field, *v in the star-coded format. Finally, the function (*d) field has been rationalized with respect to the use of synonyms. The ideal, toward which we are working, is that this field will not include synonyms at all; these will be represented only in the controlled vocabulary.

Over 1000 new theses, largely European and Australian, have been added to the bibliography.

9412
The genes file is an updated file with the addition of new genes and alleles. The total number of 'genes' in this release is 7,009; that of alleles is 19,542. The number of aberrations is 10,286. Revisions have been made to 3,104 gene records. The bibliography includes 68,161 records.

All records are now date stamped. All gene records existing on May 16 1994 have the date 16 May 94 in a *H field (or displayed in the pretty text versions). The date of any subsequent updates, or the date of entry of a new record, is also shown in this field.

Previous versions of the genes file included a number of records that were not genes in any obvious sense of the word, i.e., EST sequences, vectors etc. These have now been moved to the new clones and transposons directories.

Cross-references between FlyBase and the PDB and NRL_3D protein structure databases have now been made. Cross-references to NCBI gi and gibbsq numbers have now been removed from FlyBase.
Three new directories have been created: flybase/docs/personal-communications/ to archive personal communications to FlyBase; flybase/nomenclature/ for documents and tables concerning the nomenclature of genes, alleles, aberrations and transposons; and flybase/transposons/ for information on transposons and vectors. None is yet fully implemented, but all will be in the coming months. For details, see the relevant sections of the Reference Manual.

The subdirectory structure of the allied-data directory has been rationalized. Files of scanning electron microscope images of embryos (from Rudi Turner) have been added (flybase/allied-data/images/embryo-kaufman-turner).

9404
The two major changes to FlyBase since the last release have been the integration of the genes file with the files of molecular data from UCLA and the integration of the ABERRATIONS of Lindsley and Zimm (1992) with Ashburner's original aberrations file. The genes table of this release of FlyBase includes 6786 'genes' and 18,954 alleles, an increase of more than 1000 'genes' since release 9309. The aberrations table includes over 10,000 records.

Since genes93 there has been a major edit of the material incorporated from Lindsley and Zimm (1992). This has included the correction of errors; the standardization of many terms and phrases, for example in the 'origin' field and in the way normal cytology is described; the removal of some material that is redundant for a database and, reciprocally, the duplication of material required to make the database records more self-contained; and the use of a rigorous syntax for describing transposable elements. The synonym and function tables are now output from the same source as the genes table.

There have been major additions to the bibliography and further rationalization of abbreviations of journal titles etc.
The bibliography now includes 62,516 records and the crosslinking between the bibliography and other tables has been implemented via unique reference identifier numbers. Bibliographic records from the EMIC database have been incorporated.

The People tables have been considerably updated and Gopher+ clients now allow users to update or add entries directly.

Several new databases have been released as allied-data bases. These include tables of genetic data on Drosophila ananassae and D. buzzatii, a file of images contributed by the community and all of the Drosophila records from the Environmental Mutagen Information Center's database. In addition, FlyBase now includes the P1 clone and P-element data from the LBL Drosophila Genome Project. The genetic data from the P-element tables have been imported into the genes tables.

An attempt at a complete catalog of organs, tissues and other structures of Drosophila has been released; it is in the file flybase/docs/controlled-vocabularies.txt.

One minor change, but important if you have programs using FlyBase tables, is that the symbols representing super- and subscripts are now [ and ], and [[ and ]], respectively, rather than {} and {{}}, in all tables.

The genes9405 file differs from genes9404 only in that the newer file fully implements the format of gene and allele identifier numbers described in this documentation.

9309
Release 9309 sees three very major changes to FlyBase. The first is the merger of the GENES section of Lindsley and Zimm (1992) with Ashburner's September 1992 files and the subsequent revision of the merged data. The file flybase/genes/genes93.doc describes the changes made; the text is in flybase/genes/genes93.txt. This file not only corrects the errors found in Lindsley and Zimm but includes considerable new material, including information on 1835 new loci. The total number of genetic loci in genes93 is 5684.
The second major change is the release of flybase/refs, a unified bibliography of publications concerning Drosophila. Available in three different formats, flybase/refs includes over 58,000 records, including all of those from the published bibliographies of Herskowitz, from MEDLINE and from BIOSIS (by permission). The third major change is a unified directory of Drosophila workers, with over 4500 entries.

New files describing P1 clones of both D. melanogaster and D. virilis are now available on FlyBase. Many other FlyBase files have been updated for this release, and several FlyBase 'working papers' suggesting controlled vocabularies for the description of Drosophila genes are published.

The documentation has been thoroughly revised and is now broken into two manuals, called the FlyBase User Manual and the FlyBase Reference Manual. The Reference Manual is available as flybase/docs/Reference-manual.text or flybase/docs/Reference-manual.ps, in text and PostScript format respectively. The User Manual is available in flybase/docs/User-manual.text or flybase/docs/User-manual.doc.

9301
Release 9301 of FlyBase is the first by the Consortium. It is released as a temporary measure to make the data that are already in FlyBase available to the community. In large part 9301 is simply a restructuring of data that had been previously available from IUBio and other sources. It includes the tables from the 9209 release of Michael Ashburner (see flybase/news/oldnews/1992.txt). These tables include 5321 loci, 4218 entries in the genetic map, 11940 aberrations and 3120 references. Note that of the 5321 loci, about 800 are not in Lindsley and Zimm (1992). The great majority of the references are also subsequent to Lindsley and Zimm. This release includes tables from the 4/29/1993 release of John Merriam's clone lists.