GeneGrid: Using OGSA-DAI in Bioinformatics

Noel Kelly
Belfast e-Science Centre
www.qub.ac.uk/escience
The Queen’s University of Belfast
GeneGrid Background
• Bioinformatics - Commercially Driven
• Develop specialist tissue-specific datasets
• Large volumes of data
• Multiple sites - little collaboration
• No dedicated HPC, low bandwidth
• Lack of in-house expertise
GeneGrid Objectives
• Grid Based Framework for Bioinformatics
• Integration of Existing Technologies & Data Sets
• Gene Study in Silico
• Develop Specialist Data Sets
• Grid Services for Commercial or 3rd Party Use
• Institute of Bioinformatics R&D
GeneGrid Architecture
[Diagram: the GeneGrid Environment. Labels, in reading order: GeneGrid Environment Interface; GeneGrid Application & Resource Registry; Workflow Manager; Factory; GAM; GeneGrid Data Manager Registry; Process Manager; Factory; GAM; Database Factory (two instances).]
GeneGrid Architecture
[Diagram: the GeneGrid Environment, reduced to the GeneGrid Data Manager Registry and the two Database Factories.]
Data Access, Integration & Storage – OGSA-DAI
[Diagram: a DAI Service Group Registry, two Grid Data Service Factories, two Grid Data Services, and two databases labelled Status and SwissProt.]
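The relationship between the components on this slide can be sketched as a toy model. This is only an illustration of the registry, factory and service roles, assuming one factory and one service per database; the class and method names below are hypothetical and are not the OGSA-DAI API.

```python
from dataclasses import dataclass, field


@dataclass
class GridDataService:
    """Toy stand-in for a Grid Data Service bound to one database."""
    database: str

    def describe(self) -> str:
        return f"Grid Data Service for {self.database}"


@dataclass
class GridDataServiceFactory:
    """Toy stand-in for a factory that creates services for one database."""
    database: str

    def create_service(self) -> GridDataService:
        return GridDataService(self.database)


@dataclass
class ServiceGroupRegistry:
    """Toy stand-in for the registry in which factories are registered."""
    factories: dict = field(default_factory=dict)

    def register(self, factory: GridDataServiceFactory) -> None:
        self.factories[factory.database] = factory

    def find_factory(self, database: str) -> GridDataServiceFactory:
        return self.factories[database]


# The two databases named on the slide.
registry = ServiceGroupRegistry()
for name in ("Status", "SwissProt"):
    registry.register(GridDataServiceFactory(name))

# A client looks up the factory in the registry, then asks it for a service.
service = registry.find_factory("SwissProt").create_service()
print(service.describe())
```

A client in this sketch never holds a database handle directly; it goes through the registry and the factory first, which is the indirection the diagram suggests.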
Databases in GeneGrid
[Diagram: the Grid Environment and OGSA-DAI, with three groups of databases: GeneGrid Databases, Proprietary Databases, and Public Databases.]
Proprietary Databases
Oracle Database
T.B.C.
GeneGrid Databases
• Results (Xindice / eXist)
• Workflow Definition (Xindice)
• Workflow Status (Xindice)
Public Biological Databases
• EMBL (File)
• SwissProt (File)
• ENSEMBL (MySQL)
• trEMBL (File)
• trEMBL_new (File)
What OGSA-DAI Has Done for GeneGrid
• “Ready to Go” Solution
• Easy Implementation
• Good Documentation
• Helpful & Useful Support
Current Issues with OGSA-DAI in GeneGrid
• No Support for Flat File Databases
• Service Discovery
• CDATA wrappers
• Perform Documents
• Service Re-Registration
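The slides do not say what the problem with CDATA wrappers was. As background only, the sketch below shows what a CDATA wrapper does: it lets query text containing characters such as "<" sit inside an XML document without escaping. The element name "query" and the table "genes" are hypothetical and are not taken from the OGSA-DAI perform document schema.

```python
from xml.dom.minidom import Document, parseString


def wrap_query(query: str) -> str:
    """Return a small XML fragment with the query text inside a CDATA section."""
    doc = Document()
    root = doc.createElement("query")  # hypothetical element name
    doc.appendChild(root)
    root.appendChild(doc.createCDATASection(query))
    return root.toxml()


# A query containing "<", which would otherwise need escaping as "&lt;".
query_text = "SELECT id FROM genes WHERE length < 500"
xml_text = wrap_query(query_text)
print(xml_text)

# Parsing the fragment returns the original query text unchanged.
recovered = parseString(xml_text).documentElement.firstChild.data
print(recovered == query_text)
```

Any component that reads such a document has to unwrap the CDATA section before using the text, which is one place where a wrapper can cause trouble.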
Dealing with the Issues I
• Service Discovery
– Waiting for later release
• Perform Documents
– Upgrade to Incorporate new APIs
• Service Re-Registration
– T.B.D.
Dealing with the Issues II
• CDATA wrappers
– Is this an OGSA-DAI issue?
• Flat File Databases
– Implemented Perl scripts in place of XML:DB / JDBC drivers
– Extensible support requires Perl module development
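The flat-file workaround amounts to choosing an access route by storage type. The sketch below illustrates that choice using the storage types listed on the database slides. The route labels and the function name are hypothetical, and mapping MySQL to a JDBC driver and Xindice to an XML:DB driver is an assumption based on the usual role of those drivers, not something the slides spell out.

```python
# Storage types as listed on the "Public Biological Databases" and
# "GeneGrid Databases" slides.
STORAGE = {
    "EMBL": "File",
    "SwissProt": "File",
    "ENSEMBL": "MySQL",
    "trEMBL": "File",
    "trEMBL_new": "File",
    "Workflow Definition": "Xindice",
    "Workflow Status": "Xindice",
}


def access_route(database: str) -> str:
    """Pick an access route for a database from its storage type (illustrative)."""
    storage = STORAGE[database]
    if storage == "File":
        # Flat files: handled by a Perl script instead of a driver.
        return "Perl script"
    if storage == "MySQL":
        # Relational database: assumed to go through a JDBC driver.
        return "JDBC driver"
    # XML database: assumed to go through an XML:DB driver.
    return "XML:DB driver"


for name in STORAGE:
    print(f"{name}: {access_route(name)}")
```

Four of the five public databases are flat files, so in this sketch most public data would take the Perl route, and each new flat-file format would need its own script.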
Misc. Contacts
• Dr. Paul Donachy – Project Supervisor
– p.donachy@qub.ac.uk
• Noel Kelly – Software Engineer
– n.kelly@qub.ac.uk
• GeneGrid web site
– www.qub.ac.uk/escience/projects.php
• Encyclopaedia of Life
– eol.sdsc.edu