MAGIK-I: Managing Grids Containing Information
and Knowledge that are Incomplete
Scenario: a Semantic Grid for Astronomers
Imagine a Semantic Grid that is used by astronomers around the world
to pool their data and to run computations on that data.
Underpinning their Grid is a “semantic layer”: an interface that allows
both astronomers and Grid components to locate and query
information that has been published.
Publishing and Querying Information
• Databases register a description of the data that they publish.
• Satellites and telescopes advertise data streams.
• Results of computations run on a Grid can be stored.
• Users query the semantic layer to find relevant information.
Figure 1: Scenario: An astronomer’s Grid (diagram labels: Data, Grid,
Semantic Layer, Metadata, Results)
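The publish-and-query cycle listed above can be sketched as a small registry. This is only an illustration under stated assumptions: the class names, the source names and the attribute names are all hypothetical and are not taken from any MAGIK-I or Grid software.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SourceDescription:
    """Metadata a source registers with the semantic layer (hypothetical shape)."""
    name: str
    kind: str              # "database", "stream" or "result"
    attributes: frozenset  # names of the attributes the source publishes


@dataclass
class SemanticLayer:
    """Toy registry: stores descriptions and answers 'who publishes X?' lookups."""
    sources: list = field(default_factory=list)

    def register(self, description: SourceDescription) -> None:
        self.sources.append(description)

    def find(self, *wanted: str) -> list:
        """Return the names of sources that publish every wanted attribute."""
        needed = set(wanted)
        return [s.name for s in self.sources if needed <= s.attributes]


layer = SemanticLayer()
# A database, a telescope data stream and a stored computation result.
layer.register(SourceDescription("optical_db", "database",
                                 frozenset({"position", "colour"})))
layer.register(SourceDescription("xray_stream", "stream",
                                 frozenset({"position", "xray_flux"})))
layer.register(SourceDescription("crossmatch_run", "result",
                                 frozenset({"position", "colour", "xray_flux"})))

print(layer.find("xray_flux"))
print(layer.find("colour", "xray_flux"))
```

A user looking for x-ray fluxes would be pointed at both the stream and the stored result; asking for colour and x-ray flux together narrows the lookup to the stored result.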
Information in a Grid may be Incomplete
Attribute Level
Query:
Give me the x-ray flux of all optical sources that are redder than X.
Problem:
The precision of the stored data is not detailed enough.
Details:
The x-ray fluxes may be found by:
• Extracting sources redder than X from an optical database.
• Finding counterparts in an x-ray database.
If a red source is faint in the x-ray image, it may not appear in the
x-ray database. An astronomer with access to the raw x-ray image would
be able to estimate an upper bound for the flux.
Requirement:
Semantic data should contain details of precision.
Answer Level
Query:
Give me the x-ray flux of all objects within distance D of position P
of the sky.
Problem:
Some data sources may be temporarily unavailable.
Details:
To obtain an answer to the query, all databases that contain x-ray flux
information about the region with radius D, centred on position P of
the sky, will be contacted. One of these databases may be unavailable
at the time of the query, but still registered.
Requirement:
Return the answer currently available, along with details of the source
currently unavailable.
Global Level
Query:
Give me all galaxies that are not detected in the radio range.
Problem:
Sources may be incomplete.
Details:
Say we have two databases:
1. Galaxy database
2. Radio emitting objects database
The radio database is incomplete. Some galaxies emitting radio signals
are missing from database 2. An answer returned by taking the galaxies
from database 1 and removing objects found in database 2 could be wrong.
Requirement:
A mechanism to deal with incomplete sources in query answering.
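The galaxy example (galaxies that are not detected in the radio range) can be made concrete with a small sketch. It assumes, purely for illustration, that the galaxy database is complete, that the radio database is known to be complete for some sky regions only, and that every object carries a region label. The identifiers, the regions and the idea of a per-region completeness statement are hypothetical; they show one possible way to separate certain answers from merely possible ones, not the mechanism MAGIK-I defines.

```python
# Database 1: galaxies, mapped to the sky region they lie in (made-up data).
galaxies = {"G1": "north", "G2": "north", "G3": "south", "G4": "south"}

# Database 2: radio emitting objects. It may miss some radio galaxies.
radio_db = {"G1", "G4"}

# Assumed completeness statement: database 2 lists every radio source
# in these regions; nothing is promised for other regions.
radio_complete_regions = {"north"}


def radio_quiet(galaxies, radio_db, complete_regions):
    """Split 'galaxies not in the radio database' into certain and possible."""
    certain, possible = set(), set()
    for galaxy_id, region in galaxies.items():
        if galaxy_id in radio_db:
            continue  # detected in radio, so never part of the answer
        if region in complete_regions:
            certain.add(galaxy_id)   # absence from a complete region is reliable
        else:
            possible.add(galaxy_id)  # may simply be missing from database 2
    return certain, possible


certain, possible = radio_quiet(galaxies, radio_db, radio_complete_regions)
print(sorted(certain), sorted(possible))
```

A plain set difference would return both G2 and G3. With the completeness statement only G2 is a certain answer, while G3 stays possible because the radio database makes no promise about the south region.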
Aim and Objectives of MAGIK-I
MAGIK-I aims to develop a logical model for integrating and querying
incomplete information that is published on a Grid.
Problems addressed will include:
• How to describe incompleteness in data? New constructs are needed
for query languages and data descriptions.
• How to express an answer that takes account of incompleteness?
• What is the meaning of an incomplete answer?
• What algorithms can be devised for integrating data from different
sources, and computing (possibly incomplete) answers to a query?
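As one possible reading of the second question, an answer could carry its own incompleteness: values flagged as upper bounds, plus a list of registered sources that did not respond. The sketch below assumes exactly that. The `Flux` and `Answer` types, the source names and the flux values are hypothetical and purely illustrative.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Flux:
    value: float
    is_upper_bound: bool = False  # True: the real flux is at most `value`


@dataclass
class Answer:
    rows: Dict[str, Flux] = field(default_factory=dict)
    unavailable: List[str] = field(default_factory=list)

    @property
    def complete(self) -> bool:
        """An answer is complete only if every registered source responded."""
        return not self.unavailable


def query_all(sources: Dict[str, Callable[[], Dict[str, Flux]]]) -> Answer:
    """Contact every registered source and record the ones that fail."""
    answer = Answer()
    for name, fetch in sources.items():
        try:
            answer.rows.update(fetch())
        except ConnectionError:
            answer.unavailable.append(name)
    return answer


def survey_a() -> Dict[str, Flux]:
    # Illustrative values; obj2 was too faint, so only a bound is known.
    return {"obj1": Flux(3.2e-13), "obj2": Flux(1.0e-14, is_upper_bound=True)}


def survey_b() -> Dict[str, Flux]:
    raise ConnectionError("survey_b is registered but not responding")


answer = query_all({"survey_a": survey_a, "survey_b": survey_b})
print(answer.complete, answer.unavailable)
```

The consumer still receives the rows from `survey_a`, can see that `obj2` is only bounded from above, and knows that `survey_b` has to be asked again before the answer can be treated as complete.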
The Framework
R-GMA is a Grid information system built by the DataGrid project.
R-GMA has a producer/consumer architecture (figure 2), and includes a
mediator that can answer queries posed against a global relational
schema.
We plan to extend R-GMA’s query execution engine in order to evaluate
the concepts and algorithms developed in MAGIK-I.
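The producer/consumer pattern with a registry and a mediator can be illustrated with a toy model. This is not the R-GMA API: every class, method and table name below is hypothetical, and the sketch only shows the general idea of producers registering for a table of a global schema while a mediator answers a consumer's query by collecting rows from the registered producers.

```python
class Producer:
    """Publishes rows for one table of the global schema."""

    def __init__(self, name, table, rows):
        self.name = name
        self.table = table
        self.rows = rows

    def select(self, condition):
        return [row for row in self.rows if condition(row)]


class Registry:
    """Remembers which producers have registered for which table."""

    def __init__(self):
        self._by_table = {}

    def register(self, producer):
        self._by_table.setdefault(producer.table, []).append(producer)

    def producers_for(self, table):
        return list(self._by_table.get(table, []))


class Mediator:
    """Answers a query on a global table by asking every registered producer."""

    def __init__(self, registry):
        self.registry = registry

    def answer(self, table, condition):
        rows = []
        for producer in self.registry.producers_for(table):
            rows.extend(producer.select(condition))
        return rows


registry = Registry()
registry.register(Producer("site_a", "xray_flux",
                           [{"object": "obj1", "flux": 2.0},
                            {"object": "obj2", "flux": 0.5}]))
registry.register(Producer("site_b", "xray_flux",
                           [{"object": "obj3", "flux": 4.0}]))

# The consumer only talks to the mediator, never to producers directly.
mediator = Mediator(registry)
bright = mediator.answer("xray_flux", lambda row: row["flux"] > 1.0)
print([row["object"] for row in bright])
```

In this toy model the mediator silently assumes that every producer answers and holds complete data, which is exactly the assumption MAGIK-I sets out to relax.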
Project Team
Werner Nutt
Howard Williams
Andy Cooke
Collaborators
Steve Fisher (RAL)
Robert Mann (ROE)
Figure 2: The architecture of R-GMA (diagram labels: Producer, Consumer,
Republisher, Registry, Schema; Register View, Register Query)
Website: http://www.macs.hw.ac.uk/magik-i
Email: magik-i@macs.hw.ac.uk