Declarative Grid Service Orchestration with OGSA-DQP Service-Based Distributed Query Processing on the Grid

Service-Based Distributed Query
Processing on the Grid
Declarative Grid Service Orchestration
with OGSA-DQP
Alvaro A A Fernandes
Department of Computer Science
University of Manchester
16-17 October 2003
Grids and Applied Language Theory: Declarative Grid Service Orchestration with OGSA-DQP (A A A Fernandes)
places, people, funding, projects
Manchester
M Nedim Alpdemir
Anastasios Gounaris
Norman W Paton
Alvaro A A Fernandes
Rizos Sakellariou
Newcastle upon Tyne
Arijit Mukherjee
Jim Smith
Paul Watson
motivation
• Pull by applications:
– overwhelming amounts of semantically complex data in
– very diverse, structurally dissimilar, and autonomous, geographically dispersed data sources
– requiring computationally demanding analysis.
• Push from context and infrastructure:
– Web service impetus combined with
– Grid abstractions and protocols that enable,
– not just dynamic resource discovery but also,
– dynamic resource allocation and use.
context
1. High-level data access and integration services
are needed if applications that have data with
complex structure and complex semantics are
to benefit from the Grid.
2. Standards for data access are emerging, and
middleware products that are reference
implementations of such standards are already
available.
3. Distributed query processing technology is one
approach to delivering (1.) given the availability
of (2.).
4. Declarative service orchestration falls out.
OGSA-DQP
approach
[Figure: a Query goes in to OGSA-DQP and Results come back; OGSA-DQP sits on top of two OGSA-DAI services, each in front of a DBMS and its data.]
• OGSA-DQP uses a middleware approach.
• It can be seen as a mediator over OGSA-DAI wrappers.
• It promises bottom lines regarding:
– efficiency: “leave it to OGSA-DQP to schedule in parallel”;
– effectiveness: “leave it to OGSA-DQP to orchestrate your services”;
– usability: “use it as a Grid data service”.
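The mediator-over-wrappers idea above can be illustrated with a minimal sketch. This is not OGSA-DQP or OGSA-DAI code: the class names `Wrapper` and `Mediator`, their methods, and the in-memory tables are all hypothetical, chosen only to show a single access point routing requests to whichever wrapped source holds the data.

```python
from typing import Dict, Iterable, Iterator, List

Row = Dict[str, object]


class Wrapper:
    """Hypothetical stand-in for a wrapper in front of one data source."""

    def __init__(self, tables: Dict[str, List[Row]]) -> None:
        self._tables = tables

    def schema(self) -> Dict[str, List[str]]:
        # Report each table with its column names (metadata for the mediator).
        return {
            name: sorted(rows[0]) if rows else []
            for name, rows in self._tables.items()
        }

    def scan(self, table: str) -> Iterator[Row]:
        # Stream the rows of one table.
        return iter(self._tables[table])


class Mediator:
    """Hypothetical mediator: one global view over several wrappers."""

    def __init__(self, wrappers: Iterable[Wrapper]) -> None:
        # Remember which wrapper is the home of each table.
        self._home: Dict[str, Wrapper] = {}
        for wrapper in wrappers:
            for table in wrapper.schema():
                self._home[table] = wrapper

    def tables(self) -> List[str]:
        # The union of all tables exposed by the wrappers.
        return sorted(self._home)

    def scan(self, table: str) -> Iterator[Row]:
        # Route the request to the wrapper that holds the table.
        return self._home[table].scan(table)
```

A client of such a mediator asks for `protein` or `proteinTerm` without knowing which source holds which table; in the sketch that knowledge lives only in the mediator's routing map.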
OGSA-DQP
example
• Given two DBMSs and one analysis tool (e.g., a WS):
– proteinTerm to a GO Gene Ontology running as a remote MySQL DB,
– protein to a GIMS Genome Warehouse running as a remote ODMG-compliant DB,
– Blast (sequence alignment scoring);
• We can obtain alignment scores for a sequence against proteins of a certain kind:
select p.proteinId, Blast(p.sequence)
from protein p, proteinTerm t
where t.termId = 'GO:0005942' and
p.proteinId = t.proteinId
• Then, OGSA-DQP acts as an enactor of a declarative orchestration of services on the Grid:
[Figure: query plan, from root to leaves: reduce; op_call(Blast); exchange; hash_join (proteinId); two exchange operators, each fed by a reduce; the leaves are table_scan (protein) and index_scan termId=GO:0005942 (proteinTerm). The numeric labels 1, 2, 3,4 and 5 annotate parts of the plan.]
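To make the operators in the plan concrete, here is a small single-process sketch that evaluates the same query shape over in-memory rows. It is an illustration under stated assumptions, not OGSA-DQP code: the function names mirror the plan's operator names, `reduce_op` is taken to be a projection, the exchange operators are omitted because nothing is distributed here, and `toy_blast` is a hypothetical placeholder score that has nothing to do with the real BLAST algorithm.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, object]


def index_scan(rows: Iterable[Row], column: str, value: object) -> Iterator[Row]:
    # Stand-in for an index scan: a plain filter (a real index avoids the full pass).
    return (r for r in rows if r[column] == value)


def reduce_op(rows: Iterable[Row], columns: List[str]) -> Iterator[Row]:
    # Projection: keep only the named columns.
    return ({c: r[c] for c in columns} for r in rows)


def hash_join(build: Iterable[Row], probe: Iterable[Row], key: str) -> Iterator[Row]:
    # Build a hash table on one input, then probe it with the other.
    table: Dict[object, List[Row]] = {}
    for b in build:
        table.setdefault(b[key], []).append(b)
    for p in probe:
        for b in table.get(p[key], []):
            yield {**b, **p}


def op_call(rows: Iterable[Row], func: Callable, arg: str, out: str) -> Iterator[Row]:
    # Call an external operation once per row and attach its result.
    for r in rows:
        yield {**r, out: func(r[arg])}


def toy_blast(sequence: str) -> int:
    # Hypothetical placeholder, NOT BLAST: positional matches against a fixed target.
    target = "MKV"
    return sum(1 for a, b in zip(sequence, target) if a == b)


def run_plan(protein: List[Row], protein_term: List[Row],
             term_id: str, blast: Callable[[str], object]) -> List[Row]:
    terms = reduce_op(index_scan(protein_term, "termId", term_id), ["proteinId"])
    proteins = reduce_op(protein, ["proteinId", "sequence"])  # table scan of protein
    joined = hash_join(terms, proteins, "proteinId")
    scored = op_call(joined, blast, "sequence", "blast")
    return list(reduce_op(scored, ["proteinId", "blast"]))
```

Every operator is a generator, so rows stream through the plan one at a time; this is the property that lets a distributed engine cut such a pipeline at exchange points and run the pieces on different hosts.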
OGSA-DQP
extends/depends on
extends
• Leonidas Fegaras’s λ-DB system and OPTGEN optimiser generator. [1997-2000]
• Polar: a parallel query processing engine. [1998-2001]
• Polar*: an MPICH-G distributed extension of Polar. [2002]
• Leonidas Fegaras and David Maier’s work on a formal semantics for OQL. [TODS 25(4), 2000]
depends on
• OGSA/OGSI/GT3 Grid Services (GSs).
• OGSA-DAI Grid Data Services (GDSs).
OGSA-DQP
manages/provides
provides
• Grid Distributed Query Services (GDQSs) that:
– interact with clients;
– find and retrieve service descriptions;
– parse, compile, partition and schedule the query execution over a union of distributed data sources.
• The query plan is an orchestration of GQESs.
manages
• Grid Query Evaluation Services (GQESs) that:
– implement the physical query algebra;
– implement the query execution model and semantics;
– run a partition of a query execution plan generated by a GDQS;
– interact with other GQESs/GDSs/WSs but not with clients.
OGSA-DQP
a brief tour (1)
• It builds upon GDSs which build upon GSs.
• A GDS is a leaf in a query execution plan, up from which data ultimately flows.
• Data resources are, thereby, virtualised.
• Since they are GSs, they can be dynamically created by dynamically discovered factories and then disposed of.
• A GDQS is a GDS capable of integration and distributed retrieval and analysis of data.
• To perform a request, a GDQS spawns as many GQESs in as many hosts as the partitioning and scheduling policies of the GDQS recommend for that request.
OGSA-DQP
a brief tour (2)
• To obtain an execution plan, a GDQS:
– Interacts with registries to fetch information
about the data and computational services
deemed of interest by the requestor;
– Interacts with GDSs and (in future) Index
Services to acquire relevant metadata;
– Compiles, optimises, partitions and schedules
the query execution.
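The partitioning and scheduling step can be sketched as follows. This is a minimal illustration, assuming that exchange operators mark partition boundaries and that partitions are placed round-robin; the `Op` class, the `partition` and `schedule` functions and the host names are hypothetical, and a real scheduler would presumably use metadata and resource information rather than round-robin.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Op:
    """One node of a query plan tree (hypothetical representation)."""
    name: str
    children: List["Op"] = field(default_factory=list)


def partition(root: Op) -> List[List[str]]:
    """Cut the plan at every exchange; return one operator list per partition."""
    partitions: List[List[str]] = []

    def visit(op: Op, current: List[str]) -> None:
        if op.name == "exchange":
            # Partition boundary: whatever lies below starts a new partition.
            for child in op.children:
                fresh: List[str] = []
                partitions.append(fresh)
                visit(child, fresh)
        else:
            current.append(op.name)
            for child in op.children:
                visit(child, current)

    top: List[str] = []
    partitions.append(top)
    visit(root, top)
    return partitions


def schedule(partitions: List[List[str]], hosts: List[str]) -> Dict[int, str]:
    # Assumed policy for illustration only: round-robin placement over the hosts.
    return {i: hosts[i % len(hosts)] for i in range(len(partitions))}


def example_plan() -> Op:
    # Plan shape taken from the example query: scans, join, Blast call, projection.
    left = Op("exchange", [Op("reduce", [Op("table_scan(protein)")])])
    right = Op("exchange", [Op("reduce", [Op("index_scan(proteinTerm)")])])
    join = Op("exchange", [Op("hash_join(proteinId)", [left, right])])
    return Op("reduce", [Op("op_call(Blast)", [join])])
```

Cutting at exchanges keeps each partition a self-contained pipeline, so the only cross-host traffic is the row stream that an exchange carries between partitions.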
OGSA-DQP
a brief tour (3)
• Given a distributed query plan, a GDQS:
– Interacts with GDS factories to create the leaf
services in the plan;
– Interacts with WSs that front-end analysis
capabilities;
– Commands the creation of GQESs as
stipulated by the partitioning and scheduling
decided on by the compiler;
– Coordinates the GQESs into executing the
plan.
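The create, perform and dispose sequence described above can be sketched as below. The classes `GQES` and `GQESFactory`, the `enact` function and the log messages are hypothetical names for illustration; they are not the OGSA-DQP or OGSI interfaces. The sketch also runs sub-plans one after another, whereas real evaluators would presumably run concurrently and stream data to each other.

```python
from typing import Callable, Dict, List

SubPlan = Callable[[], object]


class GQES:
    """Hypothetical evaluator instance that runs one sub-plan on one host."""

    def __init__(self, host: str, log: List[str]) -> None:
        self.host = host
        self._log = log
        self._log.append(f"create GQES on {host}")

    def perform(self, sub_plan: SubPlan) -> object:
        # Execute the sub-plan handed over by the coordinator.
        self._log.append(f"perform on {self.host}")
        return sub_plan()

    def destroy(self) -> None:
        # Instances are transient: they are disposed of after the request.
        self._log.append(f"destroy GQES on {self.host}")


class GQESFactory:
    """Hypothetical factory that creates evaluator instances on request."""

    def __init__(self, log: List[str]) -> None:
        self._log = log

    def create_service(self, host: str) -> GQES:
        return GQES(host, self._log)


def enact(schedule: Dict[str, SubPlan], factory: GQESFactory) -> Dict[str, object]:
    """Create one evaluator per scheduled host, run its sub-plan, then dispose."""
    evaluators = {host: factory.create_service(host) for host in schedule}
    try:
        return {host: evaluators[host].perform(plan)
                for host, plan in schedule.items()}
    finally:
        # Dispose of every evaluator even if a sub-plan fails.
        for evaluator in evaluators.values():
            evaluator.destroy()
```

Creating the evaluators per request and destroying them in a `finally` block mirrors the transient, factory-created nature of the services: nothing outlives the query that needed it.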
[Figure: what is going on behind the scenes (1). Interaction diagram among a Client, a Registry (GDSR), a Factory (GDQSF), a GDQS, GDS instances, and GQES 1 to GQES n, with port types GS, GDS, GDT and GDQ. Labelled calls: registerService, findServiceData, createService, importSchema, findServiceData(DBSchema), perform(query), perform(gqes_query), perform(querySubPlan).]
[Figure: what is going on behind the scenes (2). Components: Client, GDQS, GQES factories (GQESF), GQES 1 to GQES 3, GDSs, and Web Services (BLAST), on nodes N0 to N4. Labelled calls: perform(Query), createService, perform(QuerySubplan), results. Operators shown: sequential_scan; sequential_scan (term=8372); reduce (proteinID, sequence); reduce (proteinID); hash_join (p.proteinID=t.proteinID); operation_call blast(p.sequence); reduce (p.proteinID, blast).]
the Khalaf-Leymann taxonomy
for web services aggregation
[Figure: the taxonomy as a tree rooted at aggregation. Node labels: unconstrained, constrained, grouping, recursive, wiring, choreography, service domains, agreements.]
OGSA-DQP
various kinds of service aggregation
• There is interface
inheritance from GSs
and GDSs.
• The execution plan can be seen as encapsulating a wiring of GQESs,
• but constrained, and constructed on-the-fly, as in an orchestration.
• As in service domains, there is competition of GQESs for a role to play in the orchestration.
• As in agreements, the orchestration is opportunistic, responsive to the prevailing resource levels and short-lived.
summary
• OGSA-DQP is a service-based distributed
query processor for the Grid that is:
– Exposed as a service;
– Implemented as an orchestration of services.
• OGSA-DQP is an enactor of declarative
Grid service orchestrations that:
– Improves on Grid portals when only retrieval and analysis are involved;
– Fills the gap left by the lack of a service
orchestration framework in the OGSA.
where to find out more: papers
1. M N Alpdemir, A Mukherjee, A Gounaris, A A A Fernandes, N W Paton, P Watson, J Smith. An Experience Report on Designing and Building OGSA-DQP: A Service Based Distributed Query Processor for the Grid. GGF9 Workshop on Designing and Building Grid Services, 2003.
2. M N Alpdemir, A Mukherjee, A Gounaris, N W Paton, P Watson, A A A Fernandes, J Smith. Service-Based Distributed Querying on the Grid. 1st Int. Conf. on Service Oriented Computing, 2003. LNCS, to appear.
3. M N Alpdemir, A Mukherjee, A Gounaris, N W Paton, P Watson, A A A Fernandes, J Smith. OGSA-DQP: A Service-Based Distributed Query Processor for the Grid. 2nd UK e-Science All Hands Meeting, 2003.
4. J Smith, A Gounaris, P Watson, N W Paton, A A A Fernandes, R Sakellariou. Distributed Query Processing on the Grid. GRID 2002, LNCS 2536.
(papers available from http://www.cs.man.ac.uk/~alvaro/publications.html )
where to find out more: software
OGSA-DQP
Grid middleware to query distributed data sources
www.ogsadai.org.uk/dqp
OGSA-DAI
Grid middleware to interface with data(bases)
www.ogsadai.org.uk/
Globus Toolkit
Open-source implementation of OGSA/OGSI
www.globustoolkit.org/