Discovery and Monitoring of Services using R-GMA Abdeslem DJAOUI / RAL

NeSc Middleware Workshop July 22-23 2004
www.eu-egee.org
EGEE is a project funded by the European Union under contract IST-2003-508833
Content
• Background on R-GMA
• Service discovery and Monitoring using R-GMA
• Service interface characteristics
GMA separates information source discovery from transfer to a sink
• GMA: see GGF document GFD-1.7
• Replicated Directory: holds Producer metadata and Consumer metadata
• Producer: sends messages
• Consumer: receives messages
• Combined Consumer/Producer (C/P)
• Messages go directly from Producer (P) to Consumer (C)
R-GMA
• A relational implementation of GMA
  – Producers announce: SQL “CREATE TABLE”
  – Producers publish: SQL “INSERT”
  – Consumers collect: SQL “SELECT”
  – Powerful data model and query language
    • All data modeled as tables
    • SQL can express most queries in one expression
    • Users don’t have to construct complex SQL; front ends that automate the process can be used
• Creates impression that you have one RDBMS per VO
  – Ability to issue global queries across all information
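The three SQL verbs on this slide can be tried out with a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in for the "one RDBMS per VO" impression; this is an illustration only, not the R-GMA producer or consumer API, and the CPULoad columns are illustrative.

```python
import sqlite3

# In-memory SQLite stands in for the virtual database (assumption for illustration).
conn = sqlite3.connect(":memory:")

# Producer announces: SQL CREATE TABLE
conn.execute(
    "CREATE TABLE CPULoad (Country TEXT, Site TEXT, Facility TEXT, Load REAL)"
)

# Producer publishes: SQL INSERT
conn.executemany(
    "INSERT INTO CPULoad VALUES (?, ?, ?, ?)",
    [("UK", "RAL", "CDF", 0.3), ("CH", "CERN", "CDF", 0.6)],
)

# Consumer collects: SQL SELECT
uk_rows = conn.execute(
    "SELECT Site, Facility, Load FROM CPULoad WHERE Country = 'UK'"
).fetchall()
print(uk_rows)
```

In R-GMA itself the INSERT and the SELECT are issued by different parties on different machines; the single connection here only shows the division of roles.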
R-GMA: producers, consumers, registry and schema
• Registry has two main tables:
  – Producer
    • Table name
    • Predicate
    • Location
  – Consumer
    • Query
    • Location
• Schema holds description of tables
  – Column names and types of each table
• Registry predicate defines subset of “global” table
[Diagram: user services. The Producer stores its location in the Registry and its table description in the Schema; the Consumer looks up the location in the Registry, then executes or streams its query against the Producer.]
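The two registry tables can be pictured as simple records. The sketch below is a hypothetical in-memory model: the class names, method names and example.org locations are invented for illustration and are not the R-GMA registry interface.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ProducerEntry:
    table_name: str
    predicate: str  # SQL WHERE clause defining the published subset
    location: str   # where the producer can be contacted (hypothetical URL)


@dataclass(frozen=True)
class ConsumerEntry:
    query: str
    location: str


@dataclass
class Registry:
    producers: list = field(default_factory=list)
    consumers: list = field(default_factory=list)

    def store_producer(self, entry):
        self.producers.append(entry)

    def store_consumer(self, entry):
        self.consumers.append(entry)

    def lookup(self, table_name):
        # Return every producer that contributes to the named table.
        return [p for p in self.producers if p.table_name == table_name]


registry = Registry()
registry.store_producer(ProducerEntry(
    "CPULoad", "WHERE country = 'UK' AND site = 'RAL'", "https://ral.example.org/producer"))
registry.store_producer(ProducerEntry(
    "CPULoad", "WHERE country = 'CH' AND site = 'CERN'", "https://cern.example.org/producer"))
registry.store_consumer(ConsumerEntry(
    "SELECT * FROM CPULoad WHERE country = 'UK'", "https://user.example.org/consumer"))

locations = [p.location for p in registry.lookup("CPULoad")]
print(locations)
```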
Contributions to the “global” table
CPULoad (Global Schema)
Country  Site  Facility  Load  Timestamp
UK       RAL   CDF       0.3   19055711022002
UK       RAL   ATLAS     1.6   19055611022002
UK       GLA   CDF       0.4   19055811022002
UK       GLA   ALICE     0.5   19055611022002
CH       CERN  ALICE     0.9   19055611022002
CH       CERN  CDF       0.6   19055511022002
CPULoad (Producer 1): WHERE country = 'UK' AND site = 'RAL'
UK       RAL   CDF       0.3   19055711022002
UK       RAL   ATLAS     1.6   19055611022002
CPULoad (Producer 2)
UK       GLA   CDF       0.4   19055811022002
UK       GLA   ALICE     0.5   19055611022002
CPULoad (Producer 3): WHERE country = 'CH' AND site = 'CERN'
CH       CERN  ATLAS     1.6   19055611022002
CH       CERN  CDF       0.6   19055511022002
Queries over “global” table – merging streams
SELECT * FROM CPULoad WHERE country = 'UK'
• Mediator handles P/C matchmaking and merging information from multiple producers for queries on one table
CPULoad (Consumer)
Country  Site  Facility  Load  Timestamp
UK       RAL   CDF       0.3   19055711022002
UK       RAL   ATLAS     1.6   19055611022002
UK       GLA   CDF       0.4   19055811022002
UK       GLA   ALICE     0.5   19055611022002
CPULoad (Producer 1)
UK       RAL   CDF       0.3   19055711022002
UK       RAL   ATLAS     1.6   19055611022002
CPULoad (Producer 2)
UK       GLA   CDF       0.4   19055811022002
UK       GLA   ALICE     0.5   19055611022002
CPULoad (Producer 3)
CH       CERN  ATLAS     1.6   19055611022002
CH       CERN  CDF       0.6   19055511022002
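The matchmaking and merging step can be sketched as follows. This is a simplified illustration, not the R-GMA mediator: predicates are modelled as dictionaries of fixed column values rather than SQL WHERE clauses, the producer numbering follows the slide loosely, and the predicate shown for Producer 2 is an assumption because the slide does not state it.

```python
def row(country, site, facility, load):
    return {"country": country, "site": site, "facility": facility, "load": load}


# name -> (predicate as fixed column values, published rows)
PRODUCERS = {
    "Producer 1": ({"country": "UK", "site": "RAL"},
                   [row("UK", "RAL", "CDF", 0.3), row("UK", "RAL", "ATLAS", 1.6)]),
    # Predicate for Producer 2 is assumed; it is not shown on the slide.
    "Producer 2": ({"country": "UK", "site": "GLA"},
                   [row("UK", "GLA", "CDF", 0.4), row("UK", "GLA", "ALICE", 0.5)]),
    "Producer 3": ({"country": "CH", "site": "CERN"},
                   [row("CH", "CERN", "ATLAS", 1.6), row("CH", "CERN", "CDF", 0.6)]),
}


def matches(fixed, condition):
    # True unless a column is fixed to a value the query condition excludes.
    return all(fixed.get(col, want) == want for col, want in condition.items())


def mediate(condition):
    merged = []
    for predicate, rows in PRODUCERS.values():
        if not matches(predicate, condition):
            continue  # matchmaking: skip producers that cannot contribute
        merged.extend(r for r in rows if matches(r, condition))  # merging
    return merged


# Equivalent of: SELECT * FROM CPULoad WHERE country = 'UK'
uk_rows = mediate({"country": "UK"})
print([(r["site"], r["facility"], r["load"]) for r in uk_rows])
```

Producer 3 is never contacted for this query because its predicate fixes country to CH, which is the point of registering predicates alongside table names.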
Service Discovery: Service table
• Service table column definitions
  – Endpoint: VARCHAR(255) - URI to contact the service
  – Type: VARCHAR(50) - Type of service
  – MajorVersion: INT - Major version number
  – MinorVersion: INT - Minor version number
  – PatchVersion: INT - Patch version number
  – Site_Name: VARCHAR(100) - Name of the site
  – WSDL: VARCHAR(255) - URL of WSDL describing the service
  – Semantics: VARCHAR(255) - URL of detailed description
Service monitoring: ServiceStatus table
• ServiceStatus table column definitions:
  – Endpoint: VARCHAR(255) - URI to contact the service
  – Status: INT - Status code, 0 means the service is up
  – Message: VARCHAR(255) - Human readable indication of the service status
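The Service and ServiceStatus column definitions translate directly into CREATE TABLE statements. The sketch below runs them in Python's built-in sqlite3 module purely to check that the declarations are well formed; using SQLite here is an assumption for illustration, not how R-GMA stores its schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Service (
    Endpoint     VARCHAR(255),  -- URI to contact the service
    Type         VARCHAR(50),   -- Type of service
    MajorVersion INT,           -- Major version number
    MinorVersion INT,           -- Minor version number
    PatchVersion INT,           -- Patch version number
    Site_Name    VARCHAR(100),  -- Name of the site
    WSDL         VARCHAR(255),  -- URL of WSDL describing the service
    Semantics    VARCHAR(255)   -- URL of detailed description
);
CREATE TABLE ServiceStatus (
    Endpoint VARCHAR(255),      -- URI to contact the service
    Status   INT,               -- Status code, 0 means the service is up
    Message  VARCHAR(255)       -- Human readable status
);
""")


def column_names(table):
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    return [info[1] for info in conn.execute(f"PRAGMA table_info({table})")]


service_columns = column_names("Service")
status_columns = column_names("ServiceStatus")
print(service_columns)
print(status_columns)
```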
Queries over “global” table – joining tables
SELECT S.Endpoint, S.Site_Name
FROM Service S, ServiceStatus SS
WHERE S.Endpoint = SS.Endpoint AND SS.Status = 0
Service/ServiceStatus (Consumer)
Endpoint     Site_Name
gppse02      RAL
Service
Endpoint     Type  Site_Name  …
gppse01      SE    RAL        …
gppse02      SE    RAL        …
lxshare0404  SE    CERN       …
ServiceStatus
Endpoint     Status  …  Message
gppse01      …
gppse02      0       …  Service is up
lxshare0404  …
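The join on this slide can be checked with a small sketch, again assuming Python's sqlite3 module as a stand-in for the virtual database. Only the columns used by the query are declared. The non-zero status codes and messages for gppse01 and lxshare0404 are invented for the example, since the slide does not show them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Service (Endpoint VARCHAR(255), Type VARCHAR(50), Site_Name VARCHAR(100));
CREATE TABLE ServiceStatus (Endpoint VARCHAR(255), Status INT, Message VARCHAR(255));
""")

conn.executemany("INSERT INTO Service VALUES (?, ?, ?)", [
    ("gppse01", "SE", "RAL"),
    ("gppse02", "SE", "RAL"),
    ("lxshare0404", "SE", "CERN"),
])
conn.executemany("INSERT INTO ServiceStatus VALUES (?, ?, ?)", [
    ("gppse01", 1, "Service is down"),      # assumed value, not on the slide
    ("gppse02", 0, "Service is up"),
    ("lxshare0404", 1, "Service is down"),  # assumed value, not on the slide
])

# Services that are currently up: join on Endpoint, keep Status = 0.
up_services = conn.execute("""
    SELECT S.Endpoint, S.Site_Name
    FROM Service S, ServiceStatus SS
    WHERE S.Endpoint = SS.Endpoint AND SS.Status = 0
""").fetchall()
print(up_services)
```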
R-GMA services for EGEE users
• Producer services
  – Used for publishing information
  – Advertise the type of information by declaring a table
  – A predicate (SQL WHERE clause) can define the precise subset of the global table to publish
  – 3 types
    • Primary
    • Secondary
    • On-Demand
• Consumer service
  – Used as a sink for information
  – Defined by a single query (SQL SELECT statement)
  – Query types
    • Continuous
    • One-time
      – History
      – Latest
Producers
• Primary and Secondary Producers support
  – History queries
    • over time-sequenced data
  – Latest queries
    • correspond to the intuitive idea of current information
  – Continuous queries
    • as soon as new data becomes available it is broadcast to all interested parties
• On-Demand Producers
  – Static queries (similar to a standard database query)
Producer Properties
• Primary or Secondary Producer may use:
  – Memory
    • Gives best performance for continuous queries
  – File
    • Data has a good chance of being recovered after machine crash
    • Fair performance
  – Database
    • Poor performance for inserts and continuous queries
    • Best chance of data recovery after machine crash
    • Best performance for joins
Three Kinds of Query
[Diagram: tuples are inserted into a Producer and selected from it by a Continuous Query, a History Query and a Latest Query, each returning a different set of tuples.]
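The difference between the three query kinds can be sketched with a toy in-memory producer. The class and method names below are hypothetical and are not the R-GMA API; retention periods and SQL parsing are left out. A history query returns the time-sequenced tuples, a latest query returns the most recent tuple per key, and a continuous query is pushed each new tuple as it is inserted.

```python
class ToyProducer:
    """Hypothetical in-memory producer; not the R-GMA API."""

    def __init__(self):
        self.tuples = []       # time-ordered store of (key, value, timestamp)
        self.subscribers = []  # callbacks registered by continuous queries

    def insert(self, key, value, timestamp):
        new_tuple = (key, value, timestamp)
        self.tuples.append(new_tuple)
        for callback in self.subscribers:
            callback(new_tuple)  # broadcast to all interested parties

    def continuous(self, callback):
        # Continuous query: receive every tuple inserted from now on.
        self.subscribers.append(callback)

    def history(self):
        # History query: all stored tuples, in time sequence.
        return list(self.tuples)

    def latest(self):
        # Latest query: the most recently inserted tuple for each key.
        newest = {}
        for stored in self.tuples:
            newest[stored[0]] = stored  # later inserts overwrite earlier ones
        return sorted(newest.values())


producer = ToyProducer()
streamed = []

producer.insert("RAL", 0.3, 1)
producer.continuous(streamed.append)  # subscribe after the first insert
producer.insert("RAL", 0.5, 2)
producer.insert("GLA", 0.4, 3)

print(producer.history())
print(producer.latest())
print(streamed)
```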
Service: PrimaryProducer
• A client uses a PrimaryProducer to publish information into R-GMA
• Availability of code
  – URL - http://hepunx.rl.ac.uk/egee/jra1-uk/
  – License - EGEE
  – Support - Through EGEE (OMII in the future)
• SOA Model: WS
  – WS-I compliant WSDL
PrimaryProducer Service Operations
• For the latest Java API see
  – http://hepunx.rl.ac.uk/egee/jra1-uk/
Service: Consumer
• A client uses a Consumer to retrieve information from one or more producers
• Availability of code
  – URL - http://hepunx.rl.ac.uk/egee/jra1-uk/
  – License - EGEE
  – Support - Through EGEE (OMII in the future)
• SOA Model: WS
  – WS-I compliant WSDL
Consumer Service Operations
• For the latest Java API see
  – http://hepunx.rl.ac.uk/egee/jra1-uk/
What do you use to build your service?
(i.e. How ‘standard’ is your service?)
NB: A low score means less risk & more mainstream
• Widely Implemented Standard Specification (1pt) - 1
  – Demonstrable multiple implementations, e.g. SOAP, WSDL
• Implemented draft specification (2pt)
  – Specification in standards body and supported by most/many companies. One/few implementations exist (e.g., WS-Security, BPEL)
• Implemented draft specification (3pt)
  – Specification in standards body but alternatives exist. Industry is divided. One/few implementations exist (e.g., Transactions, coordination, notification, etc.)
• Implemented proposal (4pt)
  – An implementation of an idea, a proposal but not submitted to standards body yet (e.g., WS-Addressing, WS-Trust, etc.)
• Non-implemented proposal (5pt)
  – An idea that exists as a white paper, but no code and no specification details
• Concept (6pt)
  – An idea that exists only as PowerPoint slides!!
TOTAL: List specs and add up!
Service Dependencies
• What else does your service depend on (i.e. external dependencies)?
  – RDBMS (e.g. service persistence): MySQL, DB2
  – Other services (name them): Registry and Schema
• What does your implementation depend on?
  – Languages (Java)
  – Container type: Apache Tomcat, Axis, …
AAA & Security
• What authentication mechanism do you use?
  – EGEE Authentication
• What authorisation mechanism do you use?
  – Fine Grained Authorisation
    • The authorisation rules are defined in a TableAuthorization object that is passed into the createTable method. (View : AllowedCredentials)
    • For example, to make a row of the Job table available only to the owner of the job, i.e. if the DN matches:
      SELECT * from Job where Owner=[DN] : DN=[DN];
    • If you match the allowed credentials you will have read access to the data defined in that view.
• What accounting mechanism do you use?
  – Is interaction audited? Not now
  – Is usage run against quotas? Not now
• Does service interaction need to be encrypted?
  – This is a requirement from Bioinformatics
• If these are not used now, will they be in the future?
  – They could be
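The effect of the "View : AllowedCredentials" rule for the Job table can be sketched as a row filter. The code below is an illustration only: it does not use TableAuthorization or createTable, the Job columns and the DNs are hypothetical, and SQLite via Python's sqlite3 module stands in for the producer's storage. The [DN] placeholder in the view is modelled as a bound query parameter holding the caller's DN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Job (JobId TEXT, Owner TEXT)")

ALICE_DN = "/C=UK/O=eScience/CN=alice"  # hypothetical DN
BOB_DN = "/C=UK/O=eScience/CN=bob"      # hypothetical DN

conn.executemany("INSERT INTO Job VALUES (?, ?)", [
    ("job-1", ALICE_DN),
    ("job-2", BOB_DN),
    ("job-3", ALICE_DN),
])


def visible_jobs(caller_dn):
    # View: SELECT * FROM Job WHERE Owner = [DN], with [DN] bound to the caller's DN.
    # Binding the value as a parameter avoids building SQL from the credential string.
    return conn.execute(
        "SELECT JobId, Owner FROM Job WHERE Owner = ? ORDER BY JobId",
        (caller_dn,),
    ).fetchall()


alice_jobs = visible_jobs(ALICE_DN)
print(alice_jobs)
```

A caller whose DN owns no rows simply gets an empty result, which matches the idea that read access covers only the data defined in the view.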
Exploiting the Service Architecture
• What features from your ‘plumbing’ do you use in your service?
  – Factory pattern
  – Java Logging
  – Event notification (streaming)
  – Meta-data
  – Registry discovery/advertisement
  – Other WSRF/WS characteristics?
Service Activity
• Multiple interaction or single user? Both
• Throughput (1 per day or 100 per second?) Both
• Typical data volume moved in
• Typical data volume moved out
Service Failure
• Required Reliability
  – Failure semantics?
    • Positive ack - in future
    • Submit & forget - current
• Required Persistence
  – Work never lost? Optional
• Required Availability
  – One of many or unique requirement
Required Service Management
• Remote access to:
  – Performance
  – Progress
  – Diagnostic and repair interfaces:
    • Nagios monitoring