Ensembl Developers Workshop
Bert Overduin, Daniel Rios, Stephen Fitzgerald
Edinburgh, 24 & 25 February 2009
EBI is an Outstation of the European Molecular Biology Laboratory.

Workshop schedule
Tue 24 February 2009
• Introduction (Bert Overduin)
• Core API (Bert Overduin)
• Variation API (Daniel Rios)
Wed 25 February 2009
• Variation API (Daniel Rios)
• Compara API (Stephen Fitzgerald)

Ensembl - Goal
• Provide automatic annotation of genomes
• Integrate this annotation with other biological data
• Make all these data available to all

Ensembl - Organisation
• Joint project between the European Bioinformatics Institute (EMBL-EBI) and the Wellcome Trust Sanger Institute (WTSI)
• Started in 1999 for the Human Genome Project
• Funded primarily by the Wellcome Trust, with additional funding by EMBL, the EU, NHGRI, NIH-NIAID, MRC and BBSRC
• Team of ca. 50 people, led by Ewan Birney (EBI) and Tim Hubbard (Sanger)

Wellcome Trust Genome Campus, Hinxton, Cambridge
© John Freebrey (www.thedigitaldarkcloth.com)

Ensembl - Species
• 45 chordates, ranging from human to ‘primitive’ chordates
• 3 key eukaryote model organisms:
    fruitfly (Drosophila melanogaster)
    nematode (Caenorhabditis elegans)
    yeast (Saccharomyces cerevisiae)
• 2 insect pathogen vectors:
    malaria mosquito (Anopheles gambiae)
    yellow fever / dengue mosquito (Aedes aegypti)

Ensembl - Data
• Genomic sequence
• Gene/transcript/protein models
• External references
• Mapped cDNAs, proteins, micro array probes, BAC clones, cytogenetic bands, markers, repeats etc.
• Comparative data: orthologs and paralogs, protein families, whole genome alignments, syntenic regions
• Variation data: SNPs
• Regulatory data: “best guess” set of regulatory elements

Ensembl - Databases
• MySQL
• Species-specific databases:
    core: genomic sequences and most annotation
    variation: genetic variation
    funcgen: regulatory elements
• Cross-species database:
    compara: all comparative data

Ensembl - Access to data
• Release web site: http://www.ensembl.org
• Pre-Release: http://pre.ensembl.org
• Archive: http://archive.ensembl.org
• BioMart: http://www.ensembl.org/biomart/martview and http://www.biomart.org/biomart/martview
• FTP site: ftp://ftp.ensembl.org
• Amazon Web Services: http://aws.amazon.com/publicdatasets
• MySQL interface: ensembldb.ensembl.org
• Perl API: http://www.ensembl.org/info/data/api.html

Application Programming Interface
“An Application Programming Interface (API) is a set of definitions of the ways in which one piece of computer software communicates with another. It is a method of achieving abstraction, usually (but not necessarily) between lower-level and higher-level software. One of the primary purposes of an API is to provide a set of commonly-used functions (…). Programmers can then take advantage of the API by making use of its functionality, saving them the task of programming everything from scratch.”
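
Example (illustrative sketch only)
The abstraction described in the definition above can be shown with a toy example. The sketch below is in Python with an in-memory SQLite database so that it runs anywhere; it is not the Ensembl Perl API and does not use the Ensembl schema. The table toy_gene, the class ToyGeneStore, the method fetch_by_name and the record "geneA" are all invented for illustration. The point is only that the caller works with objects and methods while the SQL stays hidden in the lower layer.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass
class Gene:
    """Plain object handed to the caller instead of a raw database row."""
    name: str
    chromosome: str
    start: int
    end: int

    def length(self) -> int:
        # Coordinates are treated as 1-based and inclusive in this toy model.
        return self.end - self.start + 1


class ToyGeneStore:
    """Hypothetical higher-level API that hides the SQL underneath."""

    def __init__(self, connection: sqlite3.Connection) -> None:
        self._conn = connection

    def fetch_by_name(self, name: str) -> Optional[Gene]:
        # Lower-level detail: the caller never sees this query.
        row = self._conn.execute(
            "SELECT name, chromosome, seq_start, seq_end "
            "FROM toy_gene WHERE name = ?",
            (name,),
        ).fetchone()
        return Gene(*row) if row else None


def build_demo_connection() -> sqlite3.Connection:
    """Create an in-memory database with one made-up record."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE toy_gene ("
        "name TEXT PRIMARY KEY, chromosome TEXT, "
        "seq_start INTEGER, seq_end INTEGER)"
    )
    conn.execute(
        "INSERT INTO toy_gene VALUES (?, ?, ?, ?)",
        ("geneA", "1", 1000, 1999),  # placeholder values, not real data
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    store = ToyGeneStore(build_demo_connection())
    gene = store.fetch_by_name("geneA")
    if gene is not None:
        print(gene.name, gene.chromosome, gene.length())
```

If the table layout changed, only fetch_by_name would need to be updated; code written against ToyGeneStore would keep working, which is the saving the definition refers to.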