Ensembl Developers Workshop Bert Overduin Daniel Rios

Ensembl
Developers Workshop
Bert Overduin
Daniel Rios
Stephen Fitzgerald
Edinburgh, 24 & 25 February 2009
EBI is an Outstation of the European Molecular Biology Laboratory.
Workshop schedule
Tue 24 February 2009
• Introduction: Bert Overduin
• Core API: Bert Overduin
• Variation API: Daniel Rios
Wed 25 February 2009
• Variation API: Daniel Rios
• Compara API: Stephen Fitzgerald
Ensembl - Goal
• Provide automatic annotation of genomes
• Integrate this annotation with other biological data
• Make all these data available to all
Ensembl - Organisation
• Joint project between the European Bioinformatics
Institute (EMBL-EBI) and the Wellcome Trust Sanger
Institute (WTSI)
• Started in 1999 for the Human Genome Project
• Funded primarily by the Wellcome Trust, with additional
funding by EMBL, the EU, NHGRI, NIH-NIAID, MRC and
BBSRC
• Team of ca. 50 people, led by Ewan Birney (EBI) and Tim
Hubbard (Sanger)
[Photo: Wellcome Trust Genome Campus, Hinxton, Cambridge. © John Freebrey (www.thedigitaldarkcloth.com)]
Ensembl - Species
• 45 chordates, ranging from human to ‘primitive’ chordates
• 3 key eukaryote model organisms:
  fruitfly (Drosophila melanogaster)
  nematode (Caenorhabditis elegans)
  yeast (Saccharomyces cerevisiae)
• 2 insect pathogen vectors:
  malaria mosquito (Anopheles gambiae)
  yellow fever / dengue mosquito (Aedes aegypti)
Ensembl - Data
• Genomic sequence
• Gene/transcript/protein models
• External references
• Mapped cDNAs, proteins, micro array probes, BAC clones, cytogenetic bands, markers, repeats etc.
• Comparative data: orthologs and paralogs, protein
families, whole genome alignments, syntenic regions
• Variation data: SNPs
• Regulatory data: “best guess” set of regulatory elements
Ensembl - Databases
• MySQL
• Species-specific databases:
  core: genomic sequences and most annotation
  variation: genetic variation
  funcgen: regulatory elements
• Cross-species database:
  compara: all comparative data
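The split between species-specific and cross-species databases can be sketched as a small naming helper. This is an illustration only, not part of Ensembl: the function names, the `<species>_<type>_<release>_<assembly>` pattern, the `ensembl_compara_<release>` pattern, and the release and assembly values are all assumptions that should be checked against the database list on the server.

```python
# Hypothetical helper; the naming patterns below are assumptions,
# not something stated on the slides.
SPECIES_SPECIFIC_TYPES = ("core", "variation", "funcgen")


def species_db_name(species: str, db_type: str, release: int, assembly: str) -> str:
    """Compose a name for a species-specific database (core, variation, funcgen)."""
    if db_type not in SPECIES_SPECIFIC_TYPES:
        raise ValueError(f"not a species-specific database type: {db_type!r}")
    # Lower-case the binomial name and join its words with underscores.
    species_key = "_".join(species.lower().split())
    return f"{species_key}_{db_type}_{release}_{assembly}"


def compara_db_name(release: int) -> str:
    """Compose a name for the single cross-species compara database."""
    return f"ensembl_compara_{release}"


if __name__ == "__main__":
    # Release and assembly values are placeholders for illustration.
    print(species_db_name("Homo sapiens", "core", 52, "36n"))
    print(compara_db_name(52))
```

The helper rejects "compara" as a species-specific type on purpose, mirroring the slide: there is one compara database across species rather than one per species.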
Ensembl - Access to data
• Release web site: http://www.ensembl.org
• Pre-Release: http://pre.ensembl.org
• Archive: http://archive.ensembl.org
• BioMart: http://www.ensembl.org/biomart/martview
  http://www.biomart.org/biomart/martview
• FTP site: ftp://ftp.ensembl.org
• Amazon Web Services: http://aws.amazon.com/publicdatasets
• MySQL interface: ensembldb.ensembl.org
• Perl API: http://www.ensembl.org/info/data/api.html
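As a rough sketch of the MySQL interface route, the snippet below only assembles the argument list for a one-off query with the standard `mysql` command-line client; it does not connect to anything. The host name comes from the list above. The user name `anonymous`, the port `5306`, and the helper name `mysql_command` are assumptions and should be verified against the current Ensembl documentation.

```python
import shlex

# Host name is taken from the slide; user and port are assumptions.
ENSEMBL_HOST = "ensembldb.ensembl.org"
ASSUMED_USER = "anonymous"
ASSUMED_PORT = 5306


def mysql_command(query, host=ENSEMBL_HOST, user=ASSUMED_USER, port=ASSUMED_PORT):
    """Return the argument list for running one query with the mysql client."""
    return [
        "mysql",
        f"--host={host}",
        f"--user={user}",
        f"--port={port}",
        f"--execute={query}",
    ]


if __name__ == "__main__":
    argv = mysql_command("SHOW DATABASES")
    # shlex.join quotes arguments so the line can be pasted into a shell.
    print(shlex.join(argv))
```

Building the command as a list rather than a single string keeps the query as one argument, so it can be passed to `subprocess.run` without shell quoting problems.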
Application Programming Interface
“An Application Programming Interface (API) is a set of
definitions of the ways in which one piece of computer
software communicates with another. It is a method of
achieving abstraction, usually (but not necessarily) between
lower-level and higher-level software. One of the primary
purposes of an API is to provide a set of commonly-used
functions (…). Programmers can then take advantage of the
API by making use of its functionality, saving them the task of
programming everything from scratch.”
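The idea of abstraction between lower-level and higher-level software can be illustrated with a toy example. The class below is hypothetical and is not the Ensembl API: it wraps an in-memory SQLite table (standing in for the lower-level layer) behind two methods, so that calling code never writes SQL or needs to know the table layout. The class name, method names, schema, and identifiers are all made up for illustration.

```python
import sqlite3


class GeneStore:
    """Hypothetical higher-level API that hides the SQL and table layout."""

    def __init__(self):
        # In-memory toy database standing in for the lower-level layer.
        self._conn = sqlite3.connect(":memory:")
        self._conn.execute(
            "CREATE TABLE gene (stable_id TEXT PRIMARY KEY, name TEXT, chromosome TEXT)"
        )

    def add_gene(self, stable_id, name, chromosome):
        """Store one gene record."""
        self._conn.execute(
            "INSERT INTO gene VALUES (?, ?, ?)", (stable_id, name, chromosome)
        )

    def fetch_by_stable_id(self, stable_id):
        """Return a gene as a dict, or None if the identifier is unknown."""
        row = self._conn.execute(
            "SELECT stable_id, name, chromosome FROM gene WHERE stable_id = ?",
            (stable_id,),
        ).fetchone()
        if row is None:
            return None
        return {"stable_id": row[0], "name": row[1], "chromosome": row[2]}


if __name__ == "__main__":
    store = GeneStore()
    store.add_gene("GENE0001", "example_gene", "1")  # made-up record
    print(store.fetch_by_stable_id("GENE0001"))
```

If the table layout changed, only the methods of `GeneStore` would need updating; code that calls `fetch_by_stable_id` would keep working, which is the saving the quotation describes.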