WormBase: A Resource for the Biology & Genome of C. elegans
Lincoln D. Stein
WormBase Web Site
WormBase is a MOD
‹ Model Organism Database
‹ Repository for reagents
– Genetic stocks, vectors, clones
‹ Genetic maps
‹ Large-scale data sets
– Genome, EST sets, microarrays, interactions
‹ Literature
‹ Meetings, announcements, etc.
Other MODs
‹ FlyBase (Drosophila)
‹ WormBase (Caenorhabditis)
‹ SGD (Saccharomyces)
‹ TAIR (Arabidopsis)
‹ MGD (Mus)
‹ PlasmoDB (Plasmodium)
‹ RatDB (Rattus)
C. elegans Fun Facts
‹ 1.5 mm length
‹ 2 week life span
‹ 959 cells
‹ 302 neurons
‹ 6 chromosomes
‹ 100,258,171 bp (95 Ns)
‹ 19,000 genes
‹ 2,000 mutant strains
WormBase Fun Facts
‹ 402,076 Sequences
‹ 121,671 Proteins
‹ 143,708 Clones
‹ 24,728 Primer pairs
‹ 15,022 Papers
‹ 12,552 Loci
‹ 2,944 Cells
‹ 14 Maps
‹ 7,200 RNAi results
‹ 332 Transgenes
‹ 19,713 Expression Patterns
WormBase Tour: Looking for MAP Kinase Kinase
Found a Genetic Locus: mek-2
mek-2 Studies
RNAi Phenotype & Expr Pattern
mek-2 RNAi Phenotype
mek-2 Sequence View
mek-2 Protein View
mek-2 Genome View
mek-2 PCR Assays
mek-2 Bibliography
mek-2 Citation
VB1 Neuron
VB1 Synapses
VBx Neuroanatomy
Advanced Searches (1)
Advanced Searches (2)
Advanced Searches (3)
Ad Hoc Queries
Bulk FTP Downloads
‹ Genomic sequence
– DNA (fasta)
– Feature files (GFF)
– C. briggsae DNA
‹ ESTs (fasta)
‹ WormPep
‹ Non-coding RNAs
‹ All the software (Open Source)
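The GFF feature files listed above are tab-delimited text, one feature per line. As a rough sketch of how a downloaded file might be summarized, the following counts features by source and type. The sample records and the function name `count_features` are hypothetical and are not taken from a WormBase release.

```python
import io
from collections import Counter


def count_features(handle):
    """Count (source, type) pairs in tab-delimited GFF text."""
    counts = Counter()
    for line in handle:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment/directive lines
        fields = line.split("\t")
        if len(fields) < 8:
            continue  # too few columns to be a feature line
        source, feature_type = fields[1], fields[2]
        counts[(source, feature_type)] += 1
    return counts


# Hypothetical records for illustration only.
sample_gff = io.StringIO(
    "##gff-version 2\n"
    'CHROMOSOME_I\tcurated\texon\t1000\t1200\t.\t+\t.\tSequence "toy.1"\n'
    'CHROMOSOME_I\tcurated\texon\t1500\t1800\t.\t+\t.\tSequence "toy.1"\n'
    'CHROMOSOME_I\test_match\tsimilarity\t900\t1250\t95\t+\t.\tTarget "toy_est"\n'
)
feature_counts = count_features(sample_gff)
for (source, feature_type), n in sorted(feature_counts.items()):
    print(source, feature_type, n)
```

To run this on a real download, pass an open file handle instead of the in-memory sample.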
Recently Added: C. briggsae
‹ C. elegans sequencing consortium (WashU + Sanger Center)
‹ Whole genome shotgun + 12 Mb previously-finished BACs from WashU
‹ 142 scaffolds
‹ N50 = 1,450 kb
‹ 21,000 predicted genes
‹ 11,000 genes orthologous to elegans
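The N50 quoted above is the scaffold length at which scaffolds of that length or longer hold at least half of the assembled bases. A minimal sketch of the calculation follows. It uses made-up scaffold lengths, not the real briggsae assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths.

    Sort the lengths from longest to shortest and walk down the list until
    the running total reaches at least half of the total assembly size.
    """
    if not lengths:
        raise ValueError("need at least one scaffold length")
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length


# Hypothetical scaffold lengths in kb (illustrative only).
toy_scaffolds = [2000, 1500, 1000, 500, 300, 200]
toy_n50 = n50(toy_scaffolds)
print(toy_n50)  # 1500: the two longest scaffolds cover 3500 of 5500 kb
```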
Accessing briggsae via elegans
Corresponding region in briggsae
Synteny/Orthology Display
WormBase Usage
[Chart: Total Hits and Data Requests, each with a fitted curve, plotted from May-00 to Mar-02; vertical axis 0 to 900,000]
WormBase Hits by Domain
other, 14%
ca, 5%
uk, 5%
edu, 48%
de, 6%
jp, 6%
net, 7%
com, 9%
Major Referrers
elegans.bcgsc.bc.ca, 5837
wormbase.sanger.ac.uk, 5848
volvox, 6173
vermicelli.caltech.edu, 6181
elegans.swmed.edu, 37336
google.yahoo.com, 7221
stein.cshl.org, 9560
www.proteome.com, 14682
www.sanger.ac.uk, 35639
www.google.com, 19086
bookmarks, 30747
Top Pages
[Pie chart of top pages]
Slice values: 3%, 3%, 2%, 2%, 2%, 2%, 3%, 3%, 40%, 5%, 7%, 7%, 9%, 12%
Legend: Sequence, Locus, Genome Browser, Tree, Blast, Picture, Clone, Aligner, RNAi, Protein, Paper, Biblio, XML, Expr Profile
How WormBase Works
Images, Movies
Web server
Perl scripts
You
Database access library
Genomic Data
ACeDB
MySQL
WormBase Information Workflow
[Diagram: CalTech, Sanger, WashU, NCBI and CGC each supply .ace files → Sanger → CSHL (www.wormbase.org) and CalTech (Caltech.wormbase.org)]
Curating a Paper
Clipping Service
Domain Expert
Gene Record
Database Entry
Cell Record
Mutant Record
.ACE Files
CalTechAce
.ACE File
Curating the Genome (1)
>CHROMOSOME_I
gcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagc
ctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcct
aagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaa
gcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagc
ctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcct
aagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaa
gcctaag…
Gene Prediction
Repeat Finding
EST Alignment
List of Features
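The excerpt above is FASTA: a ">" header line followed by wrapped sequence lines. As a toy illustration of the repeat-finding step, the sketch below reads a FASTA record and reports the longest run of back-to-back copies of a repeat unit. The function names are hypothetical, the sample record is only modeled on the slide's excerpt, and a real annotation pipeline would rely on dedicated repeat-finding software rather than this loop.

```python
import io


def read_fasta(handle):
    """Yield (name, sequence) pairs from FASTA-formatted text."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:].split()[0], []
        elif name is not None:
            chunks.append(line)  # wrapped sequence lines are joined later
    if name is not None:
        yield name, "".join(chunks)


def longest_tandem_run(sequence, unit):
    """Return the largest number of consecutive copies of unit in sequence."""
    if not unit:
        raise ValueError("repeat unit must not be empty")
    best = 0
    step = len(unit)
    for start in range(len(sequence) - step + 1):
        run, pos = 0, start
        while sequence.startswith(unit, pos):
            run += 1
            pos += step
        best = max(best, run)
    return best


# Toy record modeled on the slide: eight copies of "gcctaa" plus two bases,
# wrapped over two lines to show that wrapped lines are joined.
toy_fasta = io.StringIO(
    ">CHROMOSOME_I\n" + "gcctaa" * 4 + "\n" + "gcctaa" * 4 + "gc\n"
)
records = list(read_fasta(toy_fasta))
name, seq = records[0]
copies = longest_tandem_run(seq, "gcctaa")
print(name, len(seq), copies)
```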
Curating the Genome (2)
List of Features
CamAce
StlAce
ACeDB Sequence Editor
Curating Other Data Sets
Knockout Consortium
GO Consortium
C. elegans Microarray Consortium
RNAi Labs
ORFeome Project
CSHLAce
Build Process
CSHLAce
StlAce
CamAce
integrate
reconcile
BuildAce
WormBase
CalTechAce
The GMOD Project
‹ Generic Model Organism Database
‹ Generic MOD web site
‹ Database schemas
‹ Standard operating procedures
‹ Annotation tools
‹ Analysis tools
‹ Visualization tools
http://www.gmod.org
Released Modules
‹ Apollo genome annotation editor
‹ GBrowse generic genome browser
‹ PubSearch literature curation system
‹ LabDoc SOP editor
‹ CMap comparative map viewer
‹ GOET ontology editor
‹ Chado modular database schema
GBrowse
Zoomed Way In
Zoomed Way Way In
Zoomed Way Way Out
Keyword Search
Sequence Search
Third Party Annotations
Links to 3rd Party Web Sites
Upload Your Own Annotations
Sequence dumps & other reports
Extensively Customizable
‹ End-user
– Turn tracks on and off, change order, change packing & labeling attributes (stored in cookie)
‹ Data provider
– Change fonts, colors, text.
– Change overview: genetic map, contigs, coverage, karyotype.
– Define new tracks using simple config file.
– Tinker with track appearance to heart's content.
Adding a New Track
(a) Create a GFF file named “deletions.gff”
Chr1 targeted deletion 1293224 1294901 . . . Deletion d101k2
Chr1 targeted deletion 8239811 8241116 . . . Deletion d680k2
Chr2 targeted deletion 5866382 5866500 . . . Deletion d007k2
(b) Run the load_gff.pl script
> load_gff.pl -d example_database deletions.gff
Loading features…
Done. 3 features loaded.
(c) Add a new track “stanza” to the gbrowse configuration file
[Knockout]
feature = deletion
glyph = span
fgcolor = red
key = Knockouts
link = http://example.org/cgi-bin/knockout_details?$name
citation = These are deletion knockouts produced by the
    example knockout consortium (http://example.org/knockouts.html)
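Step (a) can also be scripted. The sketch below writes the three deletions from the slide as nine-column, tab-delimited GFF lines and shows what the stanza's link template might expand to. It rests on two assumptions: that "targeted" is the source column and "deletion" the feature type (which matches `feature = deletion` in the stanza), and that `$name` is replaced by the feature's name. The helper names are hypothetical.

```python
import io

# Rows copied from the slide: (reference, source, type, start, end, group).
DELETIONS = [
    ("Chr1", "targeted", "deletion", 1293224, 1294901, "Deletion d101k2"),
    ("Chr1", "targeted", "deletion", 8239811, 8241116, "Deletion d680k2"),
    ("Chr2", "targeted", "deletion", 5866382, 5866500, "Deletion d007k2"),
]


def format_gff_line(ref, source, feature_type, start, end, group):
    """Build one GFF line; score, strand and phase are left as '.'."""
    if start > end:
        raise ValueError("start must not exceed end")
    columns = [ref, source, feature_type, str(start), str(end),
               ".", ".", ".", group]
    return "\t".join(columns)


def write_gff(rows, handle):
    """Write one formatted GFF line per row to an open text handle."""
    for row in rows:
        handle.write(format_gff_line(*row) + "\n")


def expand_link(template, name):
    """Assumed behavior: substitute the feature name for $name."""
    return template.replace("$name", name)


buffer = io.StringIO()  # use open("deletions.gff", "w") to write a real file
write_gff(DELETIONS, buffer)
gff_text = buffer.getvalue()
print(gff_text, end="")

link = expand_link("http://example.org/cgi-bin/knockout_details?$name", "d101k2")
print(link)
```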
Extensively Extensible
Plugins
gbrowse CGI script
Apache Web Server
Glyphs
Oracle adaptor (alpha test)
Bio::Graphics library
BioPerl library
Bio::DB::GFF
adaptor
Oracle
MySQL
Flat File adaptor
Chado adaptor
Flat Files
GBrowse on GenBank!
GenBank?
Plugins
gbrowse CGI script
Apache Web Server
Glyphs
Bio::Graphics library
BioPerl library
Bio::DB::GFF
GenBank
adaptor
Proxy
Adaptor
MySQL
GenBank
B. burgdorferi via GenBank proxy
WormBase People
CalTech: Paul Sternberg, Erich Schwarz, Raymond Lee, Wen Xiao
Cold Spring Harbor: Lincoln Stein, Todd Harris, Nansheng Chen, Fiona Cunningham
Sanger Center: Richard Durbin, Daniel Lawson, Keith Bradnam
Washington University: John Spieth