GeneGrid: Using OGSA-DAI in Bioinformatics
Noel Kelly
Belfast e-Science Centre
The Queen’s University of Belfast
www.qub.ac.uk/escience

GeneGrid Background
• Bioinformatics - Commercially Driven
• Develop specialist tissue-specific datasets
• Large volumes of data
• Multiple sites - little collaboration
• No dedicated HPC, low bandwidth
• Lack of in-house expertise

GeneGrid Objectives
• Grid Based Framework for Bioinformatics
• Integration of Existing Technologies & Data Sets
• Gene Study in Silico
• Develop Specialist Data Sets
• Grid Services for Commercial or 3rd Party Use
• Institute of Bioinformatics R&D

GeneGrid Architecture
[Diagram labels: GeneGrid Environment; GeneGrid Environment Interface; GeneGrid Application & Resource Registry; Workflow Manager Factory; GAM; GeneGrid Data Manager Registry; Process Manager Factory; GAM; Database Factory; Database Factory]

GeneGrid Architecture
[Diagram labels: GeneGrid Environment; GeneGrid Data Manager Registry; Database Factory; Database Factory]

Data Access, Integration & Storage – OGSA-DAI
[Diagram labels: DAI Service Group Registry; Grid Data Service Factory (×2); Grid Data Service (×2); Database (×2); Status; SwissProt]

Databases in GeneGrid
[Diagram labels: Grid Environment; GeneGrid Databases; OGSA-DAI; Proprietary Databases; Public Databases]

Proprietary Databases
• Oracle Database
• T.B.C.

GeneGrid Databases
• Results (Xindice / eXist)
• Workflow Definition (Xindice)
• Workflow Status (Xindice)

Public Biological Databases
• EMBL (File)
• SwissProt (File)
• ENSEMBL (MySQL)
• trEMBL (File)
• trEMBL_new (File)

What OGSA-DAI Has Done for GeneGrid…
• “Ready to Go” Solution
• Easy Implementation
• Good Documentation
• Helpful & Useful Support

Current Issues with OGSA-DAI in GeneGrid
• No Support for Flat File Databases
• Service Discovery
• CDATA wrappers
• Perform Documents
• Service Re-Registration

Dealing with the Issues I
• Service Discovery
  – Waiting for later release
• Perform Documents
  – Upgrade to incorporate new APIs
• Service Re-Registration
  – T.B.D.

Dealing with the Issues II
• CDATA wrappers
  – Is this an OGSA-DAI issue?
• Flat File Databases
  – Implemented Perl scripts in place of XML:DB / JDBC drivers
  – Extensible support requires Perl module development

Misc. Contacts
• Dr. Paul Donachy – Project Supervisor – p.donachy@qub.ac.uk
• Noel Kelly – Software Engineer – n.kelly@qub.ac.uk
• GeneGrid web site – www.qub.ac.uk/escience/projects.php
• Encyclopaedia of Life – eol.sdsc.edu
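
Appendix: Flat File Access (illustrative sketch)

The slides note that the flat file databases (EMBL, SwissProt, trEMBL) are reached through scripts rather than XML:DB or JDBC drivers. The sketch below illustrates the kind of work such a script has to do: read a SwissProt-style flat file, where each line starts with a two-letter code and each record ends with `//`, and look up an entry by accession number. It is written in Python, not Perl, and it is not the GeneGrid code. The function names (`read_records`, `accessions`, `find_by_accession`) and the two sample entries are made up for illustration.

```python
import io
from typing import Dict, Iterator, List, Optional, TextIO

# Two made-up entries in SwissProt-style flat file layout (not real data).
SAMPLE = """\
ID   TEST1_HUMAN             Reviewed;          10 AA.
AC   P00001; Q00001;
DE   Made-up example protein 1.
SQ   SEQUENCE   10 AA;
     MKTAYIAKQR
//
ID   TEST2_HUMAN             Reviewed;           8 AA.
AC   P00002;
DE   Made-up example protein 2.
SQ   SEQUENCE   8 AA;
     MVLSPADK
//
"""


def read_records(handle: TextIO) -> Iterator[Dict[str, List[str]]]:
    """Yield one dict per record, mapping each line code to its values."""
    record: Dict[str, List[str]] = {}
    for raw in handle:
        line = raw.rstrip("\n")
        if line.startswith("//"):
            # "//" terminates the current record.
            if record:
                yield record
            record = {}
        elif line.startswith("  "):
            # Indented lines carry sequence data; drop the spacing.
            record.setdefault("SEQ", []).append(line.replace(" ", ""))
        elif line.strip():
            # Two-letter line code, then the value from column 6 onwards.
            record.setdefault(line[:2], []).append(line[5:].strip())


def accessions(record: Dict[str, List[str]]) -> List[str]:
    """Split the semicolon-separated AC lines into accession numbers."""
    found: List[str] = []
    for value in record.get("AC", []):
        found.extend(a.strip() for a in value.split(";") if a.strip())
    return found


def find_by_accession(handle: TextIO, accession: str) -> Optional[Dict[str, str]]:
    """Return a small summary of the first record with this accession."""
    for record in read_records(handle):
        if accession in accessions(record):
            return {
                "id": record.get("ID", ["?"])[0].split()[0],
                "description": " ".join(record.get("DE", [])),
                "sequence": "".join(record.get("SEQ", [])),
            }
    return None


if __name__ == "__main__":
    print(find_by_accession(io.StringIO(SAMPLE), "Q00001"))
```

A linear scan like this is only practical for small files. For databases of the size listed on the Public Biological Databases slide, a real script would likely need an index over accession numbers, which this sketch leaves out.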