MyGrid and Gold: their use of OGSA-DAI
Arijit Mukherjee
School of Computing Science
University of Newcastle
Newcastle Upon Tyne
MyGrid Components
• Workflow enactor
• Service Discovery
• Registry
• Bio Services
• Information Repository
• …
• OGSA-DQP – uses and extends OGSA-DAI
Goals for DQP in MyGrid
• To benefit from homogeneous access to heterogeneous
data sources [OGSA-DAI].
• To benefit from Grid abstractions for on-demand
allocation of resources required for a task
[OGSA/OGSI/GT3].
• To provide transparent, implicit support for parallelism
and distribution. [Polar*]
• To orchestrate the composition of data retrieval and
analysis services.
• To expose this orchestration capability as a Grid data
service.
An example
• Given two DBMSs and one analysis tool (e.g., a WS):
– proteinTerm to a GO Gene Ontology running as a remote mySQL DB,
– protein to a GIMS Genome Warehouse running as a remote ODMG-compliant DB,
– Blast (sequence alignment scoring);
We can obtain alignment scores for a sequence against proteins of a certain kind:
select p.proteinId, Blast(p.sequence)
from protein p, proteinTerm t
where t.termId = 'GO:0005942' and
p.proteinId = t.proteinId
• Then, OGSA-DQP acts as an enactor of a declarative orchestration of services on the Grid:
[Figure: distributed query plan. Operators, from the root down: reduce; op_call(Blast); exchange; hash_join(proteinId); two input branches, each exchange over reduce, over table_scan(protein) and index_scan(proteinTerm, termId=GO:0005942) respectively. Numeric labels 1, 2, "3,4" and 5 annotate parts of the plan.]
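The operator pipeline in the example can be sketched in a few lines of Python. This is only an illustration, assuming a single process and in-memory tables: the sample rows, the `fake_blast` scoring function and all function names are hypothetical, and the exchange operators (which pass tuples between machines in the real plan) are left out.

```python
# Hypothetical in-memory stand-ins for the two remote databases.
PROTEIN = [
    {"proteinId": "P1", "sequence": "MKTAYIAK"},
    {"proteinId": "P2", "sequence": "MKV"},
    {"proteinId": "P3", "sequence": "GAVLI"},
]
PROTEIN_TERM = [
    {"proteinId": "P1", "termId": "GO:0005942"},
    {"proteinId": "P2", "termId": "GO:0000001"},
    {"proteinId": "P3", "termId": "GO:0005942"},
]


def fake_blast(sequence):
    """Placeholder for the Blast service call; NOT a real alignment score."""
    return len(sequence)


def table_scan(rows):
    """Yield every row of a table."""
    return iter(rows)


def index_scan(rows, term_id):
    """Yield only rows whose termId matches (a real index would avoid the full pass)."""
    return (r for r in rows if r["termId"] == term_id)


def hash_join(build, probe, key):
    """Build a hash table on one input, then probe it with the other."""
    table = {}
    for row in build:
        table.setdefault(row[key], []).append(row)
    for row in probe:
        for match in table.get(row[key], []):
            yield {**match, **row}


def op_call(rows, fn, arg, out):
    """Call an external operation once per row and attach its result."""
    for row in rows:
        yield {**row, out: fn(row[arg])}


def reduce_(rows, columns):
    """Project each row onto the requested columns."""
    for row in rows:
        yield tuple(row[c] for c in columns)


def run_query(term_id="GO:0005942"):
    terms = index_scan(PROTEIN_TERM, term_id)
    proteins = table_scan(PROTEIN)
    joined = hash_join(terms, proteins, "proteinId")
    scored = op_call(joined, fake_blast, "sequence", "score")
    return list(reduce_(scored, ["proteinId", "score"]))
```

Calling `run_query()` returns one `(proteinId, score)` pair per protein annotated with the requested term, which mirrors the select list of the query above.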
User Experience with OGSA-DAI
• Upside
– Uniform access to heterogeneous data resources
– Wraps JDBC/XMLDB details
– Extensible
– Installation process is not too difficult
– Excellent user manual
• Downside
– Still very slow, possibly attributable to OGSI
– MetaDataExtractor is only for MySql – needs extension
– High initialization cost
– Performance worries for large data sets
– Possible bugs in XMLUtilities
– Need customizable streaming (cursor-like features: get me N rows, get me next N rows)
– Still contains hard-coded port numbers in the configuration files
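The cursor-like streaming asked for above ("get me N rows, get me next N rows") can be illustrated with a standard database cursor. The sketch below uses Python's built-in `sqlite3` module purely as a stand-in; it is not the OGSA-DAI API, and the `protein` table and the helper names are assumptions made for the example.

```python
import sqlite3


def fetch_in_batches(cursor, n):
    """Yield successive batches of at most n rows until the cursor is exhausted."""
    while True:
        batch = cursor.fetchmany(n)
        if not batch:
            return
        yield batch


def demo():
    # Hypothetical table with seven rows, held in memory.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE protein (proteinId INTEGER PRIMARY KEY)")
    conn.executemany("INSERT INTO protein VALUES (?)", [(i,) for i in range(1, 8)])
    cur = conn.execute("SELECT proteinId FROM protein ORDER BY proteinId")
    sizes = [len(batch) for batch in fetch_in_batches(cur, 3)]
    conn.close()
    return sizes
```

With seven rows and a batch size of three, the client receives two full batches and one final partial batch, and it never has to hold the whole result set at once.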
Use of OGSA-DQP/OGSA-DAI in MyGrid
• Lack of stability in OGSI and the recent debate about
WS-RF are partially responsible for the limited use of OGSA-DAI
and OGSA-DQP in MyGrid
• OGSA-DQP has a web-service wrapper for MyGrid
• A stable WS-I based implementation of OGSA-DAI
would make it easier for MyGrid components to use it
– rest of MyGrid is WS-I
Gold
• Gold is a new £2M e-Science Pilot Project
– Newcastle & Lancaster
• Designing a Generic infrastructure for
Virtual Organisations
– workflow, security, audit, service matching
– information management
• Using Chemical Engineering as exemplar
Information Model: MyGrid & Gold
[Figure: layered information model. Labels in the order they appear:
Domain Dependent Access Services: Chem Eng, Construction
Domain Independent Access Services: Provenance, Organisation, Update, Notification
Data Models: Relational, XML, RDF
Security, Query, Naming & Location
Schema Independent Access Services
Schema: Gold + Domain
Metadata
Data Storage (Distributed): DBMS, DBMS, DBMS]
Potential use of OGSA-DAI
• We would like to use OGSA-DAI for access to
databases, and federation
• Gold has chosen to build on WS-I
– needs stable platform to support users (some in
industry)
• Therefore, currently building our own basic
query interface from WS-I
• We would like to use a WS-I version of OGSA-DAI if and when it becomes available
– the sooner the better