The status of the EU DataGrid's R-GMA system

WP3
Steve Fisher / RAL
24/4/2003
<s.m.fisher@rl.ac.uk>
Who we are
• Heriot-Watt, Edinburgh
– Andrew Cooke, Werner Nutt
• IBM-UK
– James Magowan, (Manfred Oevers), Paul Taylor
• INFN
– Roberto Barbera, Giuseppe Save, Gennaro Tortone
• Queen Mary, University of London
– Roney Cordenonsi, (Ari Datta)
• CCLRC/PPARC
– Rob Byrom, Laurence Field, Steve Hicks, Manish Soni, Antony Wilson, (Xiaomei Zhu), Jason Leake
– Linda Cornwall, Abdeslem Djaoui, Steve Fisher, Robin Middleton
• SZTAKI, Hungary
– Peter Kacsuk, Norbert Podhorszki
• Trinity College Dublin
– Brian Coghlan, Stuart Kenny, David O’Callaghan, (John Ryan)
R-GMA
Steve Fisher/RAL - 24/4/2003
2
GMA
• From GGF
• Very simple model
• Does not define:
– Data model
– How data are moved from Producer to Consumer
– What registry looks like
[Diagram: Producer, Consumer and Registry, with an arrow labelled "execute or stream" between Producer and Consumer]
R-GMA
• Use the GMA from GGF
• A relational implementation
– Powerful data model and query language
• Applied to both information and monitoring
• Creates impression that you have one RDBMS per VO
[Diagram: Producer, Consumer and Registry, with an arrow labelled "execute or stream" between Producer and Consumer]
Relational Data Model
• Not a general distributed RDBMS system, but a way to use the relational model in a distributed environment where global consistency is not important.
• Producers announce: SQL “CREATE TABLE”; publish: SQL “INSERT”
• Consumers collect: SQL “SELECT”
• Some producers, the Registry and Schema make use of RDBMS as appropriate – but what is central is the relational model.
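As a rough illustration of the announce / publish / collect vocabulary, the sketch below maps the three verbs onto plain SQL statements. It uses Python's built-in `sqlite3` as a stand-in; this is not the R-GMA API, and the column names of `cpuLoad` are assumed from the CPULoad example used elsewhere in these slides.

```python
import sqlite3


def demo_relational_model():
    # A single in-memory SQLite database stands in for the relational view.
    # (Assumption: R-GMA itself is distributed; this only shows the SQL verbs.)
    db = sqlite3.connect(":memory:")

    # Producer "announces" a table: SQL CREATE TABLE
    db.execute(
        "CREATE TABLE cpuLoad "
        "(country TEXT, site TEXT, facility TEXT, load REAL, timestamp TEXT)"
    )

    # Producer "publishes" tuples: SQL INSERT
    db.executemany(
        "INSERT INTO cpuLoad VALUES (?, ?, ?, ?, ?)",
        [
            ("UK", "RAL", "CDF", 0.3, "19055711022002"),
            ("UK", "GLA", "CDF", 0.4, "19055811022002"),
            ("CH", "CERN", "CDF", 0.6, "19055511022002"),
        ],
    )

    # Consumer "collects" tuples: SQL SELECT
    rows = db.execute(
        "SELECT site, facility, load FROM cpuLoad "
        "WHERE country = 'UK' ORDER BY site, facility"
    ).fetchall()
    db.close()
    return rows
```

Calling `demo_relational_model()` returns only the UK rows, ordered by site.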
Producer 
Consumer
• Consumer can issue one-off queries
– Similar to normal database query
• Consumer can also start a continuous query
– Requests all data published which matches the query
• Can be seen as an alert mechanism
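The difference between the two query styles can be sketched with a toy in-process producer. The class `ToyProducer` and its method names are hypothetical and are not part of the R-GMA API; the point is only that a one-off query returns what is already stored, while a continuous query is a standing subscription that fires on each matching publish, which is why it can act as an alert.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]
Predicate = Callable[[Row], bool]
Callback = Callable[[Row], None]


class ToyProducer:
    """Hypothetical stand-in for a producer; not the R-GMA API."""

    def __init__(self) -> None:
        self.rows: List[Row] = []
        self.subscriptions: List[Tuple[Predicate, Callback]] = []

    def query_once(self, predicate: Predicate) -> List[Row]:
        # One-off query: evaluate against what has been published so far.
        return [row for row in self.rows if predicate(row)]

    def query_continuous(self, predicate: Predicate, callback: Callback) -> None:
        # Continuous query: remember the predicate and notify on future tuples.
        self.subscriptions.append((predicate, callback))

    def publish(self, row: Row) -> None:
        self.rows.append(row)
        for predicate, callback in self.subscriptions:
            if predicate(row):
                callback(row)
```

For example, subscribing with `lambda r: r["load"] > 1.0` would deliver only the high-load tuples as they are published.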
Registry choices
[Diagram: Registry (of Producers and Consumers) and Schema (descriptions of tables)]
• Decided early to keep them separate
• In fact they have different requirements for distribution/replication
• Each implemented with one RDBMS per instance
Virtual RDBMS
• Creates impression that you have one RDBMS per VO
– This makes it very easy to use
– 1 integrated system
– 1 query language
• Users like it
• But how will it fit in with GridServices?
Producers
• DataBaseProducer – Supports History Queries
– Information not lost
– Supports joins
– Clean up strategy
• StreamProducer – Supports Continuous Queries
– In memory data structure
– Can define minimum retention period
• ResilientStreamProducer – Supports Continuous Queries
– Like the StreamProducer but won’t lose data if system crashes
– So slightly slower
• LatestProducer – Supports Latest Queries
– Just holds the latest information for any “primaryish” key
– Supports joins
• CanonicalProducer – Supports anything
– Offers anything as relations
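To make the "latest information for any key" idea concrete, here is a minimal sketch of a store that keeps one tuple per key. The class `ToyLatestStore`, the choice of key columns, and the integer `timestamp` field are all assumptions for illustration; the slides do not say how the LatestProducer is implemented.

```python
from typing import Dict, Iterable, List, Tuple

Row = Dict[str, object]


class ToyLatestStore:
    """Hypothetical sketch: keep only the newest tuple per key."""

    def __init__(self, key_columns: Iterable[str]) -> None:
        # The "primaryish" key is assumed to be a fixed tuple of column names.
        self.key_columns: Tuple[str, ...] = tuple(key_columns)
        self.latest: Dict[tuple, Row] = {}

    def insert(self, row: Row) -> None:
        key = tuple(row[column] for column in self.key_columns)
        current = self.latest.get(key)
        # Replace the stored tuple only if the new one is at least as recent.
        if current is None or row["timestamp"] >= current["timestamp"]:
            self.latest[key] = row

    def select_all(self) -> List[Row]:
        return list(self.latest.values())
```

A StreamProducer-style store would instead keep every tuple for at least its retention period; that variant is not shown here.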
Archiver (Re-publisher)
• It is a combined Consumer-Producer
• You just have to tell it what to collect and it does so on your behalf
• Re-publishes to any kind of “Insertable” (i.e. not to the CanonicalProducer)
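A minimal sketch of the consumer-producer idea, assuming an "Insertable" is anything with an `insert` method. `ListInsertable` and `ToyArchiver` are hypothetical names; the real Archiver is configured through R-GMA and is not shown in these slides.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


class ListInsertable:
    """Hypothetical target: anything that accepts inserted tuples."""

    def __init__(self) -> None:
        self.rows: List[Row] = []

    def insert(self, row: Row) -> None:
        self.rows.append(row)


class ToyArchiver:
    """Consumes tuples matching a predicate and re-publishes them."""

    def __init__(self, predicate: Callable[[Row], bool], target: ListInsertable) -> None:
        self.predicate = predicate
        self.target = target

    def on_tuple(self, row: Row) -> None:
        # Consumer side: receive a tuple. Producer side: re-publish a copy.
        if self.predicate(row):
            self.target.insert(dict(row))
```

The archiver itself holds no data; it only forwards what it was told to collect.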
Canonical Producer
• Allows user defined code to be invoked to respond to SQL query
• Developed in collaboration with CrossGrid
[Diagram: User Code, CP API, Canonical Producer Servlet. Labels: CreateTable, Port, Protocol, Security, SQL Support, Multiple Query Support; Security, Insert, Query, Register, Port, Files]
Functionality - mediator
• Queries posed against a virtual database
• The Mediator must:
– find the right Producers
– combine information from them
• Hidden component – but vital to R-GMA
• Can now merge information from several producers
• The final mediator will take “any” SQL statement and do the right thing
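The two Mediator duties listed above (find the right Producers, combine their information) can be sketched as follows. The registry layout, with each producer advertising a table name and a set of fixed column values, is an assumption made for illustration; the functions `find_producers` and `mediate` are hypothetical and handle only simple equality conditions, far short of "any" SQL.

```python
from typing import Dict, List

Row = Dict[str, object]


def find_producers(registry: List[dict], table: str, wanted: Dict[str, object]) -> List[dict]:
    """Return registry entries whose advertised fixed values do not contradict the query."""
    matches = []
    for entry in registry:
        if entry["table"] != table:
            continue
        # A column the producer does not fix cannot rule the producer out.
        if all(entry["fixed"].get(col, val) == val for col, val in wanted.items()):
            matches.append(entry)
    return matches


def mediate(registry: List[dict], table: str, wanted: Dict[str, object]) -> List[Row]:
    """Merge matching rows from every relevant producer."""
    result: List[Row] = []
    for entry in find_producers(registry, table, wanted):
        result.extend(
            row for row in entry["rows"]
            if all(row.get(col) == val for col, val in wanted.items())
        )
    return result
```

With three producers fixed to (UK, RAL), (UK, GLA) and (CH, CERN), a query for `country = 'UK'` would contact only the first two.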
Topologies
• Normally publish via SP
• Archivers instantiated with a Producer and a Predicate
• Must avoid cycles in the graph
[Diagram: a graph of producers and Archivers; node labels SP, A, LP, HP]
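One way to check the "must avoid cycles" rule is a depth-first search over the publish/archive graph. The sketch below is a generic cycle test, not something taken from R-GMA; the graph encoding (a dict from node name to the nodes it feeds) and the node names used in any example are assumptions.

```python
from typing import Dict, Iterable


def has_cycle(edges: Dict[str, Iterable[str]]) -> bool:
    """Return True if the directed graph contains a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    colour: Dict[str, int] = {}

    def visit(node: str) -> bool:
        colour[node] = GREY
        for nxt in edges.get(node, ()):
            state = colour.get(nxt, WHITE)
            if state == GREY:
                # Reached a node that is still on the current path: a cycle.
                return True
            if state == WHITE and visit(nxt):
                return True
        colour[node] = BLACK
        return False

    return any(colour.get(n, WHITE) == WHITE and visit(n) for n in list(edges))
```

A chain such as `{"SP1": ["A1"], "A1": ["LP"]}` is accepted, while an Archiver that ends up feeding its own source is rejected.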
Schema & Contributions
CPULoad (Global Schema)
Country  Site  Facility  Load  Timestamp
UK       RAL   CDF       0.3   19055711022002
UK       RAL   ATLAS     1.6   19055611022002
UK       GLA   CDF       0.4   19055811022002
UK       GLA   ALICE     0.5   19055611022002
CH       CERN  ALICE     0.9   19055611022002
CH       CERN  CDF       0.6   19055511022002

CPULoad (Producer 1)
UK       RAL   CDF       0.3   19055711022002
UK       RAL   ATLAS     1.6   19055611022002

CPULoad (Producer 2)
UK       GLA   CDF       0.4   19055811022002
UK       GLA   ALICE     0.5   19055611022002

CPULoad (Producer 3)
CH       CERN  ATLAS     1.6   19055611022002
CH       CERN  CDF       0.6   19055511022002
Contributions are Views
CPULoad (Producer 1)
UK       RAL   CDF       0.3   19055711022002
UK       RAL   ATLAS     1.6   19055611022002
SELECT * FROM cpuLoad WHERE country = 'UK' AND site = 'RAL'

CPULoad (Producer 2)
UK       GLA   CDF       0.4   19055811022002
UK       GLA   ALICE     0.5   19055611022002
SELECT * FROM cpuLoad WHERE country = 'UK' AND site = 'GLA'
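The two SELECT statements above can be tried out as ordinary SQL views. The sketch below uses Python's built-in `sqlite3` and a real central table, which R-GMA does not have (the global table is virtual); it is meant only to show that each producer's contribution is exactly the rows selected by its predicate. The view names `producer1` and `producer2` are made up for the example.

```python
import sqlite3


def contributions_as_views():
    db = sqlite3.connect(":memory:")
    # Stand-in for the virtual global table (assumption: column names as on the slide).
    db.execute(
        "CREATE TABLE cpuLoad "
        "(country TEXT, site TEXT, facility TEXT, load REAL, timestamp TEXT)"
    )
    db.executemany(
        "INSERT INTO cpuLoad VALUES (?, ?, ?, ?, ?)",
        [
            ("UK", "RAL", "CDF", 0.3, "19055711022002"),
            ("UK", "RAL", "ATLAS", 1.6, "19055611022002"),
            ("UK", "GLA", "CDF", 0.4, "19055811022002"),
            ("UK", "GLA", "ALICE", 0.5, "19055611022002"),
            ("CH", "CERN", "CDF", 0.6, "19055511022002"),
        ],
    )
    # Each contribution is the view defined by the producer's predicate.
    db.execute(
        "CREATE VIEW producer1 AS "
        "SELECT * FROM cpuLoad WHERE country = 'UK' AND site = 'RAL'"
    )
    db.execute(
        "CREATE VIEW producer2 AS "
        "SELECT * FROM cpuLoad WHERE country = 'UK' AND site = 'GLA'"
    )
    p1 = db.execute("SELECT facility, load FROM producer1 ORDER BY facility").fetchall()
    p2 = db.execute("SELECT facility, load FROM producer2 ORDER BY facility").fetchall()
    db.close()
    return p1, p2
```

The function returns the RAL rows for the first view and the GLA rows for the second, matching the two contribution tables.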
R-GMA Tools
• R-GMA CLI
– Command Line Interface (similar to MySQL)
– Supports single query and interactive modes
– Can perform simple operations with Consumers, Producers and Archivers
• R-GMA Browser
– JSP application dynamically generating web pages
– Supports pre-defined and user-defined queries
• Pulse
– R-GMA Java client-based GUI
– Supports streaming and simple graphical displays
GIN and GOUT
(Gadget IN and Gadget OUT)
[Diagram: GIN and GOUT data flow. Labels: LDAP InfoProvider, GLUE Schema, GIN, CircularBuffer Producer, R-GMA, Archiver with Consumers (CE, SE, SiteInfo), ConsumerAPI, DataBase Producer, RDBMS, GOUT, R-GMA Consumers, LDAP Server]
R-GMA – How?
• Currently based on servlet technology
– Behind every API there is a Servlet
– Multiple hand crafted APIs
• Java, C++, C, Python and Perl
– Tomcat
– Soft state registration
– Uniform exception handling
• To ensure that useful messages and stack traces are preserved.
OGSIfication
• Have recently started the migration to web and grid services
– Apache Axis
– WSDL generated APIs
– Will provide a wrapper for backwards compatibility
OGSIfied R-GMA
[Diagram labels: Application, Consumer API, Consumer Factory, Consumer Instance, Producer Instance, Producer Factory, Producer API, Sensor, Registry, Schema]
• All Grid Services
• OGSA Factories, GSH, GSR
• Registry includes HandleMapper
• SQL as Service Data Element Query Language
OGSIfication issues
• Consider XML as internal representation of service data elements
– Depends on other developments
• Consider XQuery as service data elements query language
– Depends on how XQuery develops
• X-GMA ??
– Will this be distinguishable from what is in GT3?
Resilience - Registry
[Diagram: Producer1 and Producer2 with three registry instances]
Registry1: Info mastered by Registry1; Copy of info from Registry2; Copy of info from Registry3
Registry2: Info mastered by Registry2; Copy of info from Registry1; Copy of info from Registry3
Registry3: Info mastered by Registry3; Copy of info from Registry1; Copy of info from Registry2
• Will have one logical registry and schema per VO
• Each logical registry will have multiple physical “copies”
• Each entry in registry has 3 possible states
• Transmit new records and deleted records and checksum after records deleted locally
• Self healing even supports new registry instances
• Consumer uses any instance
• Fail over mechanism not yet implemented
• Schema more tricky
Soft-state Registration and the Registry
• Registry records existence of Producers and Consumers
• Registry holds last contact time and ‘expiry’ time
• Producers and Consumers periodically refresh their time stamps
• Producer and Consumer servlets avoid unnecessary traffic to Registry
• Scheduled removal of entries that have timed out
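The soft-state scheme described above can be sketched as a small in-memory table of last-contact and expiry times. `SoftStateRegistry`, its method names, and the fixed lifetime parameter are hypothetical; the real Registry is backed by an RDBMS and its timing rules are not given in these slides. Times are passed in explicitly so the behaviour is easy to follow.

```python
from typing import Dict, List, Tuple


class SoftStateRegistry:
    """Hypothetical sketch of soft-state registration; not the R-GMA Registry."""

    def __init__(self, lifetime_seconds: float) -> None:
        self.lifetime = lifetime_seconds
        # name -> (last contact time, expiry time)
        self.entries: Dict[str, Tuple[float, float]] = {}

    def refresh(self, name: str, now: float) -> None:
        # Registering and refreshing are the same operation in a soft-state scheme.
        self.entries[name] = (now, now + self.lifetime)

    def purge(self, now: float) -> List[str]:
        # Scheduled clean-up: drop every entry whose expiry time has passed.
        expired = [n for n, (_, expiry) in self.entries.items() if expiry <= now]
        for name in expired:
            del self.entries[name]
        return sorted(expired)

    def live(self) -> List[str]:
        return sorted(self.entries)
```

An entry that stops refreshing simply disappears at the next purge, so no explicit unregister call is needed when a component crashes.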
Resilience Testing
• Taking 7 components
– Schema
– 2 registry instances
– Producer API
– Consumer API
– Producer Servlet with other APIs
– Consumer Servlet with other APIs
• Consider each component in turn
– Break the network and bring it back
– Close the component down and bring it back
– Crash the component and bring it back
• Will also consider real life scenarios
Performance
• By design:
– Very flexible - to avoid bottlenecks
– Powerful queries allow a single query to be made
• Performance and Optimisation
– Will use NetLogger and profiling tools to identify possible bottlenecks
• Internally not high speed because of XML etc.
Summary
• R-GMA is a combined Grid information and monitoring system
• Supports notion of Virtual Database
• Recently deployed in the EDG development testbed
• Now focusing on reliability, stability and performance
http://hepunx.rl.ac.uk/edg/wp3/
Thanks to the EU and our national funding agencies for their support of this work
And finally GGF8…
• RGIS-RG
– Two short sessions will be held:
• Session 1: Database use cases and best practices in the grid environment (outside the traditional data areas)
» Using databases to store application metadata
» Using databases to store monitoring information
» Using databases as a grid registry
» Creating grid registries for locating relational and XML databases
• Session 2: Data discovery in the grid environment
– We will also discuss our milestones and future directions (e.g. should we include XML as well as Relational models?)
– See http://hepunx.rl.ac.uk/ggf/rgis-rg
• A GMA BOF is planned for GGF8