MyGrid and Gold: their use of OGSA-DAI
Arijit Mukherjee
School of Computing Science
University of Newcastle
Newcastle Upon Tyne

MyGrid Components
• Workflow enactor
• Service Discovery
• Registry
• Bio Services
• Information Repository
• …
• OGSA-DQP – uses and extends OGSA-DAI

Goals for DQP in MyGrid
• To benefit from homogeneous access to heterogeneous data sources [OGSA-DAI].
• To benefit from Grid abstractions for on-demand allocation of resources required for a task [OGSA/OGSI/GT3].
• To provide transparent, implicit support for parallelism and distribution [Polar*].
• To orchestrate the composition of data retrieval and analysis services.
• To expose this orchestration capability as a Grid data service.

An example
• Given two DBMSs and one analysis tool (e.g., a WS):
  – proteinTerm to a GO Gene Ontology running as a remote MySQL DB,
  – protein to a GIMS Genome Warehouse running as a remote ODMG-compliant DB,
  – Blast (sequence alignment scoring);
• We can obtain alignment scores for a sequence against proteins of a certain kind:
  select p.proteinId, Blast(p.sequence)
  from protein p, proteinTerm t
  where t.termId = 'GO:0005942' and p.proteinId = t.proteinId
• Then, OGSA-DQP acts as an enactor of a declarative orchestration of services on the Grid:
  [Figure: query plan with numbered partitions. Operators shown: table_scan (protein), index_scan termId=GO:0005942 (proteinTerm), reduce, exchange, hash_join (proteinId), op_call(Blast).]

User Experience about OGSA-DAI
• Upside
  – Uniform access to heterogeneous data resources
  – Wraps JDBC/XMLDB details
  – Extensible
  – Installation process is not difficult
  – Excellent user manual
• Downside
  – Still very slow, possibly attributable to OGSI
  – MetaDataExtractor is only for MySQL – needs extension
  – High initialization cost
  – Performance worries for large data sets
  – Possible bugs in XMLUtilities
  – Need customizable streaming (cursor-like features: get me N rows, get me next N rows)
  – Still contains hard-coded port numbers in the configuration files

Use of OGSA-DQP/OGSA-DAI in MyGrid
• Lack of stability in OGSI and the recent debate about WS-RF are partially responsible for the limited use of OGSA-DAI and OGSA-DQP in MyGrid
• OGSA-DQP has a web-service wrapper for MyGrid
• A stable WS-I based implementation of OGSA-DAI would make it easier for MyGrid components to use it – the rest of MyGrid is WS-I

Gold
• Gold is a new £2M e-Science Pilot Project
  – Newcastle & Lancaster
• Designing a generic infrastructure for Virtual Organisations
  – workflow, security, audit, service matching
  – information management
• Using Chemical Engineering as exemplar

Information Model: MyGrid & Gold
[Diagram, layers from top to bottom:]
  – Domain Dependent Access Services: Chem Eng, Construction
  – Domain Independent Access Services: Provenance, Organisation, Update Notification
  – Data Models: Relational, XML, RDF
  – Schema Independent Access Services: Security, Query, Naming & Location
  – Schema: Gold + Domain; Metadata
  – Data Storage (Distributed): DBMS, DBMS, DBMS

Potential use of OGSA-DAI
• We would like to use OGSA-DAI for access to databases, and for federation
• Gold has chosen to build on WS-I – it needs a stable platform to support users (some in industry)
• Therefore, we are currently building our own basic query interface on WS-I
• We would like to use a WS-I version of OGSA-DAI if and when it becomes available – the sooner the better
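
Illustration: cursor-like streaming
The customizable streaming requested under the OGSA-DAI downsides above ("get me N rows, get me next N rows") can be sketched in a few lines of Python. This is only a minimal illustration of the access pattern, not OGSA-DAI's API: it uses the standard-library sqlite3 module and its fetchmany call as a stand-in for a remote data service, and the in-memory protein table, its sample rows, and the iter_batches helper are all hypothetical.

```python
import sqlite3
from typing import Iterator, List, Tuple


def iter_batches(cursor: sqlite3.Cursor, n: int) -> Iterator[List[Tuple]]:
    """Yield successive batches of at most n rows from an executed cursor."""
    if n <= 0:
        raise ValueError("batch size must be positive")
    while True:
        rows = cursor.fetchmany(n)  # "get me the next N rows"
        if not rows:                # empty list means the result set is exhausted
            return
        yield rows


def demo() -> List[List[Tuple]]:
    # Hypothetical stand-in for a remote database: an in-memory table of 7 rows.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE protein (proteinId INTEGER PRIMARY KEY, sequence TEXT)"
    )
    conn.executemany(
        "INSERT INTO protein VALUES (?, ?)",
        [(i, "SEQ%d" % i) for i in range(1, 8)],
    )
    cur = conn.execute(
        "SELECT proteinId, sequence FROM protein ORDER BY proteinId"
    )
    batches = list(iter_batches(cur, 3))  # consume in batches of 3 rows
    conn.close()
    return batches
```

A client would normally loop over iter_batches and process each batch as it arrives, so that only N rows are held in memory at a time; demo collects them into a list only to make the batch sizes easy to inspect.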