Introduction to databases at the EBI

Small Molecules
EBI Bioinformatics Roadshow
Gareth Owen, ChEBI group
The Jackson Laboratory
October 18th 2012
Services | Research | Training |
Industry
Course Objectives
In this course you will learn…
• How small molecules are stored in databases.
• How data related to small molecules is stored in ChEBI
and ChEMBL and how to query these databases
• How the ChEBI ontology works
• How to access and query enzyme resources at the EBI,
using the Enzyme Portal, with a closer look at individual
resources such as IntEnz and Rhea
• How the MetaboLights database can be used for storing
information about metabolomics experiments
Exercises
• Separate exercise sheets for each resource discussed.
• Help reinforce learning.
• Work alone or in teams.
• Solutions will be shown in a run-through before the start of the next session.
Questions
• Please feel free to ask questions at any time.
• If you are confused, you are probably not alone.
• I am happy to answer all questions, provided you will allow the following responses:
• “We’ll be discussing that later.”
• “I don’t know.”
• Please do not deal with emails, etc. during the sessions.
• Please turn off mobiles, or set them to vibrate.
EBI Metabolomics and Bioinformatics Resources training workshop
The Jackson Laboratory
Thursday 18th October 2012
Time         Subject
09.00-09.30  Introduction to EBI and EBI search
09.30-10.30  Introduction to ChEBI; Exercises
10.30-11.00  Tea & Coffee break
11.00-12.30  ChEBI: Searching and the ChEBI Ontology; Exercises
12.30-13.15  Lunch
13.15-14.30  The Enzyme Portal, IntEnz and Rhea; Exercises
14.30-15.00  Introduction to MetaboLights
15.00-15.30  Tea & Coffee break
15.30-16.00  Small molecules and PDBe
16.00-17.00  Introduction to ChEMBL; Exercises; Course Feedback
The EMBL-European Bioinformatics Institute
A whistlestop tour
What is bioinformatics?
• The science of storing, retrieving and analysing large
amounts of biological information
• An interdisciplinary science, involving biologists,
computer scientists and mathematicians
• At the heart of modern biology
Biology is changing
• Data explosion
• New types of data
• High-throughput biology
• Growth of applied biology
  • molecular medicine
  • agriculture
  • food
  • environmental sciences…
• Emphasis on systems, not reductionism
[Figure: Growth of raw storage at EMBL-EBI (in terabytes). Axes: Year; Disks (TB)]
New types of data
Literature
Genomes
Protein sequence
Proteomes
Nucleotide sequence
Protein structure
Gene expression
Protein families, domains and motifs
Chemical entities
Protein-protein interactions
Pathways
Systems
What is EMBL-EBI?
• Bioinformatics research and services institute
• Non-profit organisation
• ~ 500 staff
• Part of the European Molecular Biology Laboratory
The five branches of EMBL
• Heidelberg: basic research in molecular biology; administration; EMBO
• Hamburg: structural biology
• Grenoble: structural biology
• Hinxton: bioinformatics
• Monterotondo: mouse biology
1500 staff, >60 nationalities
EMBL member states
Austria, Belgium, Croatia,
Denmark, Finland, France,
Germany, Greece, Iceland, Ireland,
Israel, Italy, Luxembourg, the
Netherlands, Norway, Portugal,
Spain, Sweden, Switzerland and
the United Kingdom
Associate member state: Australia
The Wellcome Trust Genome Campus
[Photo of the campus with labels: Sanger Institute; Sulston building; Data centre; Sanger labs / informatics; Cairns Pavilion (shared); EMBL-EBI. © John Freebury]
EMBL-EBI’s Mission
• To provide freely available data and bioinformatics services to all facets of the scientific community in ways that promote scientific progress
• To contribute to the advancement of biology through basic investigator-driven research in bioinformatics
• To provide advanced bioinformatics training to scientists at all levels, from PhD students to independent investigators
• To help disseminate cutting-edge technologies to industry
• To coordinate biological data provision across Europe
EMBL-EBI external funding
Sources of external funding for the year as of December 2010. The
Wellcome Trust also supports us through provision of our buildings.
The UK’s Biotechnology and Biological
Sciences Research Council (BBSRC)
awarded a further €11.4m in August 2009
in support of EMBL-EBI’s planned role as
the central hub of ELIXIR.
Services
www.ebi.ac.uk/services
Key facts about services
• European node for globally coordinated data collection
and dissemination projects
• Core databases produced in collaboration with other
world leaders, including NCBI (US), National Institute of
Genetics (Japan), Swiss Institute of Bioinformatics, Cold
Spring Harbor Laboratory (US)
• The world’s most comprehensive collection of molecular
databases
Principles of service provision
• Accessibility – all data and tools freely available without restriction, apart from information that could be used to identify individuals
• Compatibility – we develop and promote the use of standards in bioinformatics
• Comprehensive data sets – agreements with other data providers ensure that our resources contain comprehensive and up-to-date data; agreements with publishers ensure that published data are placed in a public repository at the earliest opportunity
• Portability – data and software can be downloaded and installed locally
• Quality – our databases are enhanced through annotation and cross-referencing
Databases: molecules to systems
• Genomes: Ensembl, Ensembl Genomes, EGA
• Nucleotide sequence: ENA
• Functional genomics: ArrayExpress, Expression Atlas
• Literature and ontologies: CiteXplore, GO
• Protein families, motifs and domains: InterPro
• Macromolecular: PDBe
• Protein activity: IntAct, PRIDE
• Pathways: Reactome
• Protein sequences: UniProt
• Chemical entities: ChEBI
• Chemogenomics: ChEMBL
• Systems: BioModels, BioSamples
Database collaborations
Standards development – international collaborations
• Genomics Standards Consortium (GSC): http://gensc.org
• Genome annotation: www.geneontology.org
• Protein sequence: www.uniprot.org
• Nucleotide sequence: www.insdc.org
• Functional Genomics Data Society: www.fged.org
• Cheminformatics: www.ebi.ac.uk/chebi
• HUPO Proteomics Standards Initiative (PSI): www.psidev.info/
• Pathways: www.reactome.org, www.biopax.org
• Metabolomics Standards Initiative (MSI): www.metabolomicssociety.org
• Protein structure: www.wwpdb.org
• Systems modelling standards: www.sbml.org
New search service
• Access from the EBI’s homepage
• Species selector allows for easy comparison
• Data organised according to: gene, expression, protein, structure, literature
• Explore data, return easily to your results
Goals of the new EBI Search
• Relevant to ‘wet-lab’ biologists
• Organises information based around a single gene
(or a small number of genes)
• User-expectation centric (not database centric)
• Smooth transition to the detailed information in
many of EBI’s core databases
• NOT for bioinformaticians:
does not provide programmatic access
User support
• E-mail support – www.ebi.ac.uk/support
• Online help pages – www.ebi.ac.uk/help
• eLearning Portal – coming soon
Research
www.ebi.ac.uk/groups
Key facts about research at EMBL-EBI
• A unique environment for bioinformatics research
• Nine dedicated research groups
• Seven services teams also carry out R&D
• Research and services are mutually supportive
Training
www.ebi.ac.uk/training
Pre- and postdocs at EMBL-EBI
• EMBL International PhD Programme
• Postdoctoral fellowships:
• EIPOD – EMBL-sponsored interdisciplinary fellowships
• ESPOD – EBI–Sanger combined experimental and
computational fellowships
For further information go to:
http://www.ebi.ac.uk/Information/Brochures/
EBI in a Nutshell
Guide to data resources
Research at a Glance 2012