Summary report - OSG Document Database

Grid Service Information Discovery
Copenhagen, Denmark
December 12-14, 2006
Laurence Field
Steve Fisher
Antony Wilson
Arumugam Paventhan
Markus Schulz
Steven Timm
OSG / Fermilab
Shaowen Wang
OSG / U. of Iowa
Sergio Andreozzi
Weijian Fang
OMII / U. of Southampton
Oxana Smirnova
Nordugrid / U. Lund
Balazs Konya
ARC / Lund / NorduGrid
Laura Pearlman
Globus / ISI
Anda Iamnitchi
U. of South Florida
Lydia Priefo
U. of South Florida
The main purpose of this meeting was to discuss a common way to do Service Discovery
across multiple grid infrastructures. It is of critical importance that the solution scale
to the future requirements of the expanding grid infrastructures. The meeting was split
into three main sections: the past, the present and the future. First, the existing systems
used to discover services in the different grid infrastructures were described in order to
understand the similarities and differences between them and the pros and cons of each.
Next, a common method of service discovery was discussed, and finally a road map of how
to achieve the end results was devised. An agenda was provided as a guide for discussion.
The expected workshop deliverables were a summary of the service requirements to
achieve interoperability between OSG and EGEE at different levels: information systems,
information schema, job submission, job description, brokering, security, file transfer and
resource management. It is important to clarify that the purpose of the meeting was not
interoperability itself, but the design of a new information system that is highly scalable.
We tried to identify the interoperability problems in the current systems and to devise a
new solution that avoids or diminishes them.
To give an idea of the challenge, EGEE currently has 200 production sites, each with at
least a compute element, a storage element, a BDII, and monitoring services. They
may run other services as well. Currently there are 60 Virtual Organizations (VOs).
In 3 years EGEE is expected to have over 1000 sites, 20 different services, 200 VOs,
10 roles and groups apiece, and 40 x 10^6 pieces of metadata.
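The figure of 40 million pieces of metadata quoted above happens to equal the product of the other projected quantities. The report does not say how the figure was derived, so the multiplication below is only an assumption, shown as a quick consistency check.

```python
# Projected EGEE scale in three years, using the numbers quoted in the text.
sites = 1000
services_per_site = 20          # "20 different services"
virtual_organizations = 200
roles_and_groups_per_vo = 10    # "10 roles and groups apiece"

# Assumption (not stated in the report): one piece of metadata for every
# (site, service, VO, role/group) combination.
metadata_items = (
    sites * services_per_site * virtual_organizations * roles_and_groups_per_vo
)

print(f"{metadata_items:,} pieces of metadata")
```

Under that assumption, every tenfold growth in any one factor multiplies the metadata volume tenfold, which is why scalability was treated as the central design constraint.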
Fermilab’s interests
Part of Steve Timm’s motivation to be at the workshop was to understand the
implications to FermiGrid and also alert the developers to the existence of job-forwarding
services such as the FermiGrid one. FermiGrid has a stake in the outcome of any service
discovery protocol since we would likely have to write an emulation for it, given the
implementation of our site Globus job gateway.
The meeting was split into three sections: the past, the present and the future. First, the
existing systems used to discover services in the different grid infrastructures were
described in order to understand the similarities and differences between them and to
understand the pros and cons of each. Second, the discussion focused on the present:
how the systems are actually being used. A common definition of service discovery,
its differences from service selection (query) and a common method of service discovery
were described. Third, a road map of how to solve current problems to achieve the end
result in the future was devised, pointing out implementation issues and how to migrate
from existing to new systems.
Existing systems and use cases
The following existing systems were discussed, along with the pros and cons of each:
NorduGrid use of MDS2
Service Discovery in gLite
Service Discovery in OSG
NAREGI Cell Domain
One of the main outcomes of this discussion was how similar the systems are. Most
use an index which contains information about site-level interfaces. The system
then pulls the information from the site level. From a conceptual point of view, a site
can be represented by a database and an interface. One of the main differences between
the systems is how VO-level or grid-level caching is handled. Some common use cases
of the systems were walked through to demonstrate the kind of information that needs to
be found and how frequently.
As far as security is concerned, most of the above systems have the capacity to use GSI
authentication to retrieve the information, but do not do so because of the extra overhead
that this places on the information process. All seem to think it is desirable and that it
will eventually have to be re-implemented.
The pros of MDS2 include an inherently distributed GRIS, the use of standard LDAP,
and a tree-like hierarchy. The cons are various: in general, the implementation is not
stable; it uses only one GIIS, and the GRIS must be reconfigured each time a change
happens; caching is done on demand and in several layers; the system requires
too many queries, having to traverse the whole system to generate the results;
finally, scalability is a problem: the size of the query grows linearly with the number of
The Globus Monitoring and Discovery Service (MDS2) has been used in LCG, OSG, and
NorduGrid. It is now deprecated by the Globus Alliance. The initial implementation used
the GRIS (Grid Resource Information Service) and the GIIS (Grid Index Information
Service). This was a distributed service with one GRIS per resource and one GIIS per
site. In all its various implementations it suffered from scalability and stability problems.
The GIIS was used to cache the information, but the total information is huge: currently
19MB for EGEE and 35MB for NorduGrid. NorduGrid had an intelligent client for
MDS2 which had a timeout for misbehaving sites. This is part of the ARC (Advanced
Resource Connector) middleware.
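The idea of a client that gives up on misbehaving sites can be sketched as follows. This is not the ARC client: query_site is a hypothetical stand-in for a real LDAP query, the site names are invented, and the sleep simply simulates a slow site.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout
from typing import Dict, Optional


def query_site(name: str, delay: float) -> str:
    """Hypothetical stand-in for querying one site's information service."""
    time.sleep(delay)  # simulates network and server latency
    return f"{name}: ok"


def query_all(sites: Dict[str, float], timeout: float) -> Dict[str, Optional[str]]:
    """Query all sites in parallel; a site that misses the deadline yields None."""
    results: Dict[str, Optional[str]] = {}
    with ThreadPoolExecutor(max_workers=len(sites)) as pool:
        futures = {n: pool.submit(query_site, n, d) for n, d in sites.items()}
        deadline = time.monotonic() + timeout
        for name, future in futures.items():
            remaining = max(0.0, deadline - time.monotonic())
            try:
                results[name] = future.result(timeout=remaining)
            except FutureTimeout:
                results[name] = None  # treat the site as misbehaving
    # Note: leaving the "with" block still waits for slow threads to finish;
    # a production client would cancel or abandon the connection instead.
    return results


if __name__ == "__main__":
    print(query_all({"fast-site": 0.0, "slow-site": 0.5}, timeout=0.2))
```

The point of the design is that one slow site costs the user at most the timeout, rather than stalling the whole query.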
BDII uses a central cache. Instead of keeping the data at each site, a central database is
used. The pro of this system is that no intelligence is needed in the clients. The system
faces problems from using a single entry point: it overloads when there are too many
queries and has trouble handling large caches.
BDII (Berkeley Database Information Index) is now used as the front end to the
information system, replacing the GIIS and eventually the GRIS as well. It begins with a
registry of sites and caches information every two minutes. In the LCG implementation
there is a BDII at every site. The site BDII grows in size with the number of other BDIIs
that are querying it. OSG will have a single BDII for the whole of OSG and advertise to it
with CEMon (Computing Element Monitor). Again, the caches are large.
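A cache that is refreshed on a fixed two-minute interval, rather than on every query, can be sketched as below. This is an illustration only and not BDII code: the class name, the fetch callback and the injectable clock are all assumptions, and for brevity the refresh is triggered lazily by a read instead of by a background timer.

```python
import time
from typing import Callable, Dict, Optional


class PeriodicCache:
    """Serve queries from a cached snapshot that is refreshed at most once per interval."""

    def __init__(
        self,
        fetch: Callable[[], Dict[str, str]],
        interval: float = 120.0,  # two minutes, as described in the text
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._interval = interval
        self._clock = clock
        self._data: Dict[str, str] = {}
        self._fetched_at: Optional[float] = None

    def get(self) -> Dict[str, str]:
        now = self._clock()
        stale = self._fetched_at is None or now - self._fetched_at >= self._interval
        if stale:
            # Only here do the underlying sites see any load.
            self._data = self._fetch()
            self._fetched_at = now
        return self._data


if __name__ == "__main__":
    cache = PeriodicCache(lambda: {"site-a": "up"})
    print(cache.get())  # first call fetches
    print(cache.get())  # second call is served from the cache
```

However many clients query the cache, the sites behind it are contacted no more often than once per interval; the cost is that answers can be up to one interval old.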
MDS4 switches from LDAP to a web services implementation. The main difference from
MDS2 is that caching is not done on demand. The pros are multiple: it is an XML
database; queries are done using XPath; one index centralizes all other indexes; and an
index server contains pieces of data for each provider. The main con of MDS4 is the size
of the XML database in memory.
Globus MDS4 (web service) doesn’t run information providers on demand, just on a
fixed interval. It has central index servers and it is possible to make custom indexes, with
periodic caching. The data is stored in an XML database and queries are done using
XPath. The one problem that was mentioned is that the XML database takes up a lot of
space in memory. The Globus Toolkit includes Java and C interfaces, a web-based client,
and a command line client. The largest known implementation is the TeraGrid. Globus
developers have written some information providers. Modular Information Providers for
the Open Science Grid have been written in beta version at U. of Iowa as well.
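To illustrate what querying an XML index with XPath looks like, here is a small sketch using Python's standard library. The element and attribute names (Index, Site, Service, type, endpoint) and the example.org hosts are invented for illustration; they are not the MDS4 schema, and this is not the MDS4 client API.

```python
import xml.etree.ElementTree as ET

# Hypothetical index document; the structure is an assumption, not MDS4's.
INDEX_XML = """
<Index>
  <Site name="site-a">
    <Service type="compute" endpoint="https://ce.site-a.example.org/"/>
    <Service type="storage" endpoint="https://se.site-a.example.org/"/>
  </Site>
  <Site name="site-b">
    <Service type="compute" endpoint="https://ce.site-b.example.org/"/>
  </Site>
</Index>
"""

root = ET.fromstring(INDEX_XML)

# ElementTree supports a subset of XPath, including attribute predicates.
compute_endpoints = [
    service.get("endpoint")
    for service in root.findall(".//Service[@type='compute']")
]

print(compute_endpoints)
```

Because the whole document is held in memory while it is queried, the sketch also hints at the memory cost that was raised as the main drawback of this approach.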
R-GMA is a relational database system, currently used by EGEE as infrastructure for
monitoring. The architecture is similar to MDS4. A registry contains a list of site URLs.
Conceptually this system is extremely good. The cons are various: the implementation is
not so good; it is complex and connections time out; the registry gets too much load; and
it has problems when a large number of jobs are in the system simultaneously. Some
improvements have been proposed: allow authorization based on parameterized views;
handle multiple virtualized database management; and allow multiple socket connections
among
R-GMA (Relational Grid Monitoring Architecture) is part of gLite/EGEE and uses a
relational database to store the various information from grid schemas, from which it
can be retrieved via SQL queries. It is possible to authenticate with grid certificates and
configure the views to define what parts of a table any given user can see. However,
there are reports of connection timeouts and too much load on the registry, and it has
difficulty handling large numbers of jobs in the system simultaneously.
Improvements have been proposed to deal with these and other problems.
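The combination of SQL queries and views that restrict what a user can see can be sketched with SQLite standing in for the relational store. The table, columns, view name and VO names below are assumptions made for the example; this is not the R-GMA schema or API, and SQLite has no per-user permissions, so the view only illustrates the restriction itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE service (site TEXT, type TEXT, endpoint TEXT, vo TEXT);
    INSERT INTO service VALUES
      ('site-a', 'compute', 'https://ce.site-a.example.org/', 'vo1'),
      ('site-a', 'storage', 'https://se.site-a.example.org/', 'vo2'),
      ('site-b', 'compute', 'https://ce.site-b.example.org/', 'vo1');

    -- Hypothetical restricted view: members of vo1 only see vo1 rows,
    -- and the vo column itself is hidden.
    CREATE VIEW service_vo1 AS
      SELECT site, type, endpoint FROM service WHERE vo = 'vo1';
    """
)

# A member of vo1 would be pointed at the view, not at the base table.
rows = conn.execute(
    "SELECT site, endpoint FROM service_vo1 WHERE type = 'compute' ORDER BY site"
).fetchall()

print(rows)
```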
Grimoires is a web service deployable in the OMII, GT4 and Tomcat/Axis containers. It is
the middleware for UK e-science projects. It seems to be at the development stage. The
pros are that it is UDDI compatible and that it is a stand-alone service that uses metadata
annotation and discovery.
Grimoires stands for Grid RegIstry with Metadata Oriented Interface: Robustness,
Efficiency, Security. This is an Open Middleware Infrastructure Institute UK (OMII-UK)
funded project from the University of Southampton. Grimoires can be deployed as a web
service within the OMII, GT4, and Tomcat/Axis containers. The service registry is based
on the UDDI (Universal Description, Discovery and Integration) framework but extends it.
SAGA is middleware that provides an interface for grid applications. It has no service
discovery of its own, so it uses the gLite service discovery as an independent
component.
gLite includes Service Discovery plug-ins for the R-GMA service mentioned above as
well as for the BDII. There is a command line interface and also an API. The Simple
API for Grid Applications (SAGA) is currently using this API for service discovery since
it has no native one of its own.
NAREGI is a Japanese project based on the cell domain concept, which grew out of the
needs of many industrial projects, such as those of Fujitsu and other companies. The
functionality is similar to MDS4 except that the interface is OGSA-DAI. The pros of this
system are that no caching is involved and that it uses a standard web services interface
and a Common Information Model Object Manager (CIMOM) to generate the queries. When
a query comes in, it is possible to decide which information provider to work with.
CIMOM requires a schema to work, but it could be any schema.
NAREGI (National Research Grid Initiative) uses CIMOM, the CIM Object Manager,
which distributes information about compute elements based on the Common
Information Model, an emerging industry standard. This is then aggregated into a
relational database and implemented as a grid service by use of the Globus service
OGSA-DAI (Open Grid Services Architecture Data Access and Integration). Each site is
referred to as a Cell Domain.
Use cases for the information systems
Some common use cases of the systems were walked through to demonstrate the kind of
information that actually exists, the information that needs to be found, and how
frequently. The discussion covered the functionality currently provided by the existing
systems and also the functionality desired for the new information system analyzed
during the meeting.
Job Submission
User Interface Use Case
• Find all Resource Brokers (RB) that I can use
• Find a suitable workload management system
• Find my Virtual Organization Membership Service (VOMS) server: it will find
  my proxy with my roles
• Find again the central catalogue to write output data to the central/global
Workflow Management System Use Case
• Find which catalogs are available for my Virtual Organization (VO) [1] and query
  them to learn where data is
• Find what Compute Elements (CE) I can use and what state they are in
• Find all CEs close to my Storage Elements (SE)
• Rank the resources within the information system
• Get back a sorted list of resources, sorted by goodness of resources
• Find all the CEs where I can run the job that meets my criteria [2]
• Find job status [3]. It includes sub-job status
• Publish dynamic job info (e.g. job states, resources consumed by the job, etc.)
• Update information periodically: 30 secs.
• Do job tracking / monitoring [4]
[1] Catalog contains the locations of all copies of the files that you have
[2] Static and dynamic information
[3] The actor of the use case could be a user or a service on his behalf
[4] Includes data transfer monitoring also
Worker Node (WN) Use Case
• Find which catalogs are available
• Find storage capacity
• Discovery of storage elements (SE)
Additional Use Cases
• Find user services
• Perform an analytical query (browsing)
• Provide a restricted view for a user (depending on rights)
• Perform analysis on data: querying about services that are running instead of CEs.
• Perform service monitoring (for troubleshooting) to signal wrong status, e.g. if
  the service is down for longer than a threshold.
• Data mining approaches: calculate correlations, find trends, detect job failure
  patterns, send notifications, etc.
File Transfer Service Use Case
• Functionality is similar to the job submission use case, but it is for moving data
  around. Instead of looking for CEs, it looks for SEs.
• User defines source (logical filename) and destination
• The system runs the queries to discover
The priority of data transfers is important. This can be seen in the existing
implementation in LCG, which manages data flow between the place where the data is
created (CERN) and the major research centers around the world. It is used heavily.
Service Discovery
Before any details could be discussed, the concepts involved, such as service
discovery, service advertising and service selection, needed to be agreed. Service
Discovery is essentially the question asked and the answer which is returned. It was
agreed that the questions which could be asked are anything that is generic for all
services and that the answer is a handle. In fact there are two handles, the Service
Access Point and the Information End Point.
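The agreed shape of Service Discovery, a generic question answered by handles, can be sketched as below. Only the two handle names come from the discussion; the ServiceHandle class, the choice of service type and VO as the generic attributes, and the example.org URLs are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ServiceHandle:
    """Answer returned by discovery: where to use the service and where to ask about it."""

    service_type: str            # generic attribute (assumed)
    vo: str                      # generic attribute (assumed)
    service_access_point: str    # endpoint used to invoke the service
    information_end_point: str   # endpoint used to ask for service-specific details


# Hypothetical registry contents.
REGISTRY = [
    ServiceHandle("compute", "vo1", "https://ce.site-a.example.org/", "https://info.site-a.example.org/ce"),
    ServiceHandle("storage", "vo1", "https://se.site-a.example.org/", "https://info.site-a.example.org/se"),
    ServiceHandle("compute", "vo2", "https://ce.site-b.example.org/", "https://info.site-b.example.org/ce"),
]


def discover(
    registry: List[ServiceHandle],
    service_type: Optional[str] = None,
    vo: Optional[str] = None,
) -> List[ServiceHandle]:
    """Filter only on attributes that every service has; return handles, not details."""
    return [
        handle
        for handle in registry
        if (service_type is None or handle.service_type == service_type)
        and (vo is None or handle.vo == vo)
    ]


if __name__ == "__main__":
    for handle in discover(REGISTRY, service_type="compute", vo="vo1"):
        print(handle.service_access_point, handle.information_end_point)
```

Anything specific to one service type is deliberately absent from the question; it would be obtained afterwards through the Information End Point.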
Closely related to this is the question of what a service is and what a resource is. All
resources are made visible to the grid via services but in general there can be a many to
many relationship between services and resources. For instance, a single Compute
Element resource might have an OSG gatekeeper service and an LCG gatekeeper service.
Likewise you could have a site gateway service which is associated with more than one
Compute Element resource. It follows that resources will have to have unique identifiers
as well, and resource discovery can be viewed as a reverse lookup operation to service
discovery.
History is not stored (although this would be good for monitoring)
A resource is seen as a property of a service. There is a many-to-many relationship
between services and resources; therefore resources also need a unique identifier. Hence
resource discovery can be seen as a reverse lookup.
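The many-to-many relationship and the reverse lookup can be sketched with two dictionaries built from the same set of links. The identifiers below are invented, loosely following the gatekeeper and site gateway examples in the text.

```python
from collections import defaultdict

# Hypothetical (service_id, resource_id) links.
LINKS = [
    ("osg-gatekeeper", "ce-1"),   # one Compute Element, two gatekeeper services
    ("lcg-gatekeeper", "ce-1"),
    ("site-gateway", "ce-1"),     # one gateway service, two Compute Elements
    ("site-gateway", "ce-2"),
]

resources_of = defaultdict(set)   # forward: service  -> resources
services_of = defaultdict(set)    # reverse: resource -> services

for service_id, resource_id in LINKS:
    resources_of[service_id].add(resource_id)
    services_of[resource_id].add(service_id)

# Resource discovery as a reverse lookup: which services expose ce-1?
print(sorted(services_of["ce-1"]))
print(sorted(resources_of["site-gateway"]))
```

The reverse index only works if each resource has its own unique identifier, which is the requirement stated above.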
The need for a common Service Discovery interface was discussed and it was agreed that
this would be needed. There is ongoing work to define APIs for Service Discovery clients
in the SAGA activity within OGF. A plug-in specification is also defined to enable the
APIs to be used within multiple systems. Similar plug-ins were developed as part of the
OGF gin-info activity.
It was agreed that an information provider interface would be useful so that the
developers of underlying systems (e.g. batch systems and storage systems) can maintain
their providers. The proposed idea is that vendors of different systems should write their
own adapters based on the common specification generating the same output. The
interface should be simple. A command that produces XML on standard output might
suffice. This interface can be used both at the provider and at the plug-in level. There is
no need to make this an official standard but it would help if this recommendation was
adopted so that information providers could be shared and possibly developed by the
developers of the underlying system which is being queried. It was suggested that we
may want to set up a community repository for providers (plug-ins).
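An information provider of the kind proposed, a command that writes XML to standard output, could be as small as the sketch below. No output format was agreed at the meeting, so the element names (Provider, Queue, RunningJobs, WaitingJobs) and the hard-coded values are purely hypothetical; a real provider would obtain the numbers from the batch system.

```python
import sys
import xml.etree.ElementTree as ET


def build_provider_output(queue_name: str, running: int, waiting: int) -> str:
    """Serialize one queue's state as XML (hypothetical element names)."""
    root = ET.Element("Provider", name="batch-system")
    queue = ET.SubElement(root, "Queue", name=queue_name)
    ET.SubElement(queue, "RunningJobs").text = str(running)
    ET.SubElement(queue, "WaitingJobs").text = str(waiting)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Values are placeholders standing in for a query to the batch system.
    sys.stdout.write(build_provider_output("short", running=12, waiting=3) + "\n")
```

Because the contract is only "XML on standard output", the same provider can be run by any information system or plug-in that knows how to start a command and parse its output.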
The need for a common schema for each service type was also proposed. An experimental
approach could be tried using a subset of GLUE. The semantics of the schema needs to
be specified for each service type. What information needs to be produced is defined by
the schema and we must not forget about caching to protect against overloading the
underlying resource.
The use case of finding information which is specific to a service type was discussed. The
name given to this use case was "service query". The main difference between service
description and service query is the level of aggregation of the information used. We
have two layers: first the list of handles and second the details. In the first step we can
only find out what services are available; in the second step we can ask whether or not
the service is working properly and ask for further attributes. This latter information can
only be provided by the selected service; this last step is what was called “service query”.
In order for this to be achieved a schema needs to be defined for each service type. Data
must be as generic as possible and the amount of information should be as small as
possible. Static and dynamic attributes were discussed. The conclusion is that all
attributes can change; the main difference is the frequency with which we expect them to
change. If something is static we don't expect it to change more often than every 6 hours;
dynamic attributes change more frequently. Schema design needs to take dynamic values
into consideration so that systems can be made more efficient by moving only the
dynamic values around with a high frequency.
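The static/dynamic distinction amounts to a threshold on how often an attribute is expected to change. In the sketch below only the 6-hour threshold comes from the discussion; the attribute names and their expected change intervals are invented for illustration.

```python
from datetime import timedelta
from typing import Dict, List, Tuple

STATIC_THRESHOLD = timedelta(hours=6)  # threshold quoted in the text

# Hypothetical attributes with the interval at which each is expected to change.
EXPECTED_CHANGE: Dict[str, timedelta] = {
    "endpoint": timedelta(days=30),
    "supported_vos": timedelta(days=1),
    "free_slots": timedelta(seconds=30),
    "waiting_jobs": timedelta(minutes=1),
}


def split_attributes(intervals: Dict[str, timedelta]) -> Tuple[List[str], List[str]]:
    """Return (static, dynamic) attribute names, each sorted alphabetically."""
    static = sorted(k for k, v in intervals.items() if v >= STATIC_THRESHOLD)
    dynamic = sorted(k for k, v in intervals.items() if v < STATIC_THRESHOLD)
    return static, dynamic


static_attrs, dynamic_attrs = split_attributes(EXPECTED_CHANGE)
print("publish rarely:    ", static_attrs)
print("publish frequently:", dynamic_attrs)
```

A schema that keeps the two groups separable lets a system republish only the dynamic group at high frequency, which is the efficiency gain described above.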
Query Interface
A discussion followed on the need for a common query interface. A service query may
involve polling the services themselves or a site-level database such as a BDII. Some
services have their own API (e.g. VOMS, the Virtual Organization Membership Service);
others don’t (e.g. FTS, the File Transfer Service).
Issues involved in service-specific queries are (a) defining a schema for each service type,
(b) identifying which attributes would be considered static and which dynamic, and
(c) determining whether the information returned by the query can be reduced to
key-value pairs. For some services the GLUE schema is sufficient; for others a new schema
will have to be defined. “Static” attributes are considered to be those that stay the same
for 6 hours or more. Schema design needs to plan for moving the dynamic values around
with more frequency. It was thought that some of the service-specific queries cannot be
reduced to key-value pairs.
Three options were presented: extend the Service Discovery API, have a common generic
query interface, or have service-specific APIs. It was agreed that extending the Service
Discovery API would not be an option, as key-value pairs cannot be used to express the
complex structure of some service types. Service-specific APIs would be very sensitive to
schema changes. The conclusion was that a common query interface would be desirable.
However, it was not clear what this should be or how we can agree on one.
Investigating the existing systems showed that each grid has a top-level aggregator. The
end points of these aggregators need to be passed to the configuration of the Service API
so that it can find them. One possibility is that the VO should know these, as
it has already negotiated with the infrastructure to gain access. Grid infrastructures
should look at the VOs they support to understand which grids they need to interoperate
with, and they can get the endpoints from the VO. An agreed format may need to be
worked out.
At the end some additional ideas were discussed. The result is a list of research questions:
• How can query languages such as XPath, XQuery, SQL, LDAP and OGSA-DAI be tested
  to determine how they cope with large-scale queries? (e.g., given the scale of Grids in the
• Is there a performance analysis for MDS4 and for BDII? Would it be worth
  having one?
• How can the information in a grid information system be trusted? Example: latitude
  and longitude for sites (e.g., Ro EGEE sites in the mountains)
The meeting was very fruitful, with many areas being covered. The main outcomes of the
meeting are summarized as follows.
1. We agree that there is a need for a common way to do Service Discovery.
2. A Service Discovery API is needed and this work will be done in the SAGA
working group within OGF.
3. Service Discovery will need a generic description of services, which will be
defined in the GLUE working group within OGF.
4. The gin-info group could help to provide the required plug-ins for connecting to
other grids.
5. There is a need for Service-specific schemas. This will also be defined in the
GLUE working group within OGF.
6. A common information provider interface would be helpful. It was decided that
this would be just a command which returned XML. There is no need to make this
an official standard but it would help if this recommendation was adopted so that
information providers could be shared and possibly developed by the developers
of the underlying system which is being queried.
7. It would be a good idea to set up a community repository which can be used to
share plug-ins etc.
8. In principle, it was agreed that a common query API would be required; however,
it wasn't clear how this would be decided.
9. Scalability has to be taken into consideration in the design; this has to be solved
for each system, based on its own design and problems.
The meeting was very fruitful. The participation of real users from real operating grids
was a key point for its success. The objectives set for the project were all met. As a result
of the meeting, USF wants to collaborate in the analysis of current information system
utilization using existing workloads for job submission and data transfers on the
different existing grid infrastructures.
The meeting provided us with the contacts for getting real usage traces. We have promises
from:
1) Open Science Grid (USA) - Contact person is Shaowen Wang
2) R-GMA (UK) - Contact person is Steve Fisher
3) CERN Lab (Switzerland) - Contact person is Laurence Field
Our objectives from the study of real traces are:
1) Detect usage patterns that characterize user behavior when running processes over
the grid
2) Provide metrics for resource usage and propose statistics on the amount of usage
of computing and storage resources for different grid services
3) Study correlations for different variables such as location, resources, time, user,
experiment and applications, to understand dependencies
4) Find missing data and propose improvements on the information logged from the
5) Predict future usage based on current system characterization
6) Summarize interoperability issues in service discovery for grids from a practical
point of view, understand the causes and possible solutions
7) Understand scalability issues from real user traces and make informed predictions
on future usage in order to better define scalability requirements for information
services in grids
Next Steps
In the Open Grid Forum 19, currently taking place in Chapel Hill, NC, there is a session
devoted to the Service Discovery API within the SAGA framework. Various funding
proposals have been submitted both to European funding agencies and to US funding
agencies. One European group is seeking funding to work on the above-mentioned
Service Discovery API within SAGA; another US group is seeking funding to work on
Modular Information Providers for use in Globus MDS4. The Europeans expected an
answer in January on whether their proposal would be approved.
As mentioned in the trip report filed by the University of South Florida researchers, they
have laid out a plan of getting real usage traces from real user behavior on the grid.
These traces would not be limited to information systems but include all the various
resources that are used in a typical grid use case.
Informal discussions with NorduGrid staff present at the meeting indicated a willingness
to continue exploring ways to collaborate with OSG on information services. It is
unlikely that the next step in this collaboration will be known until some of the work on
the Service Discovery API is actually done within the OGF. It is my understanding that
NorduGrid is planning to base whatever their next software release will be on the
common API when it comes out.
Although the consensus statement above doesn’t explicitly reflect it, most of the
participants agreed that not much could be done on the harder problem of service and
resource selection until the service discovery API is first defined in the OGF. That is
why #8 above refers to a common query API being required but is unclear about
its implementation. This common query API refers to that process, not to service
discovery.
