globus-infn-amsterdam

INFN-GRID Globus evaluation
Massimo Sgaravatto
INFN Padova
for the INFN Globus group
globus@infn.it
http://www.infn.it/globus
Globus activities within INFN


Work package (WP) “Installation and Evaluation of the Globus Toolkit” of the
INFN-GRID Project
Goal: evaluate the Globus toolkit as a GRID framework
providing basic services

Proposed work plan








Globus deployment and installation tools
Evaluation of Globus security services
Evaluation of Grid Information Service
Evaluation of Globus services for resource management
Evaluation of Globus tools for data management
Evaluation of Globus HBM for fault monitoring
Evaluation of Globus GEM for execution environment management
The activities of this WP are complete

A report on this evaluation will soon be available
Globus installation tools



INFN-GRID installation tool: shortens the installation
time of the Globus toolkit, avoids common mistakes, and
supports specific customizations
Options to install optional software, to apply
INFN-specific customizations (INFN CA, configuration of a
hierarchical GIS architecture), and to install and use
INFN-specific tools
Proven successful within INFN (used to set up an
INFN GRID testbed) and also outside (CERN, FNAL,
…)
Globus installation tools

INFN-GRID 1.2
  Binary distribution (binary relocatability), support for RH 6.x and Solaris 2.6,
  support for Condor, LSF and PBS, auxiliary packages (gsincftp, wu-gsiftpd,
  GDMP), upgrade procedure, various patches

INFN-GRID 1.3 (just released)
  Other patches, in particular for job manager memory leaks (already
  available for INFN-GRID 1.2 installations), support for Solaris 7, full support
  for GDMP 1.2, distribution of builds in various Globus flavors, support
  for unattended installations

Future
  RPMs, support for developers, ...
Other info



http://www.pi.infn.it/grid/dist
http://www.infn.it/grid/dist.html (to download the INFN-GRID installation
kit)
infngrid-toolkits@infn.it (mailing list used to inform users about news,
problems, etc. related to the toolkit)
Security

Evaluation of Globus GSI




Some shortcomings, but the GSI security model
seems to satisfy our requirements
INFN CA used to sign certificates
Script to distribute the CRL issued by the
INFN CA
AFS tests

Analysis of what can be done now with the
existing tools (which are quite unfit for any real need)
and of possible ways to address their
shortcomings
Security

Centralized management of the grid-mapfile


Goal: Ease the sharing of the same access policies
(represented by the grid-mapfiles) for groups of hosts with
common purposes
Proposed system

Central repository (LDAP server) to store user certificates
(subjects) and to define groups of users



Certificates published by the CA manager
Group manager responsible for editing group memberships (using
an LDAP client)
Resource owners (Globus administrators) periodically (e.g. via a cron
job) “connect” to this repository, “download” the subjects of the
certificates that meet a specified criterion (e.g. all users of
group X), and produce grid-mapfile entries
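
The periodic step described above could be sketched as follows. This is a minimal illustration, not the INFN tool: the in-memory REPOSITORY stands in for the LDAP server, the subjects and the "cmsuser" local account are made up, and mapping a whole group to one local account is an assumption. The only fixed point is the grid-mapfile line format: a quoted certificate subject followed by a local account name.

```python
# Hypothetical in-memory stand-in for the central LDAP repository.
# Each record holds a certificate subject and the groups the user belongs to.
REPOSITORY = [
    {"subject": "/C=IT/O=INFN/L=Padova/CN=Alice Example", "groups": {"cms"}},
    {"subject": "/C=IT/O=INFN/L=Bologna/CN=Bob Example", "groups": {"cms", "atlas"}},
    {"subject": "/C=IT/O=INFN/L=Pisa/CN=Carol Example", "groups": {"atlas"}},
]


def subjects_in_group(repository, group):
    """Return the certificate subjects of all users in the given group, sorted."""
    return sorted(r["subject"] for r in repository if group in r["groups"])


def gridmap_entries(subjects, local_account):
    """Format one grid-mapfile line per subject: quoted subject, then local account.

    Mapping every subject to the same local account is an assumption
    made for this sketch.
    """
    return [f'"{subject}" {local_account}' for subject in subjects]


if __name__ == "__main__":
    # A cron job would run this and write the lines to the grid-mapfile.
    for line in gridmap_entries(subjects_in_group(REPOSITORY, "cms"), "cmsuser"):
        print(line)
```

In a real deployment the selection would be an LDAP search against the repository and the output would replace the host's grid-mapfile; both are left out here.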
Information Service






Evaluation of Globus GIS
Definition and implementation of a hierarchical architecture of
GIS 1.1.3
Performance and scalability tests
MRTG tools to monitor LDAP (GRIS and GIIS) servers
Web interface for browsing
Various shortcomings must be addressed before the GIS can be used in a
production environment




Mixed push/pull model more suitable than a pull model
Performance
Lack of security
…
INFN GIS Topology
[Figure: hierarchy of GIIS servers with GRIS servers below them]
Top Level INFN GIIS: dc=infn,dc=it,o=grid
INFN CMS GIIS: exp=cms,o=grid
Bologna GIIS: dc=bo,dc=infn,dc=it,o=grid
Padova GIIS: dc=pd,dc=infn,dc=it,o=grid
GRIS
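
The nesting of the LDAP suffixes in this topology can be checked mechanically. The sketch below is only an illustration under one assumption: that a server whose suffix ends with another server's suffix sits below it in the hierarchy. The helper names are hypothetical and nothing here queries a real GIIS.

```python
# Suffixes taken from the topology slide, normalized to lowercase.
SUFFIXES = {
    "Top Level INFN GIIS": "dc=infn,dc=it,o=grid",
    "INFN CMS GIIS": "exp=cms,o=grid",
    "Bologna GIIS": "dc=bo,dc=infn,dc=it,o=grid",
    "Padova GIIS": "dc=pd,dc=infn,dc=it,o=grid",
}


def parse_dn(dn):
    """Split a DN into a tuple of stripped, lowercase components."""
    return tuple(part.strip().lower() for part in dn.split(",") if part.strip())


def parent_of(dn, all_dns):
    """Return the longest DN in all_dns that is a proper suffix of dn, or None."""
    parts = parse_dn(dn)
    best = None
    for candidate in all_dns:
        cparts = parse_dn(candidate)
        is_proper_suffix = 0 < len(cparts) < len(parts) and parts[-len(cparts):] == cparts
        if is_proper_suffix and (best is None or len(cparts) > len(parse_dn(best))):
            best = candidate
    return best


if __name__ == "__main__":
    for name, suffix in SUFFIXES.items():
        print(name, "->", parent_of(suffix, SUFFIXES.values()))
```

Under this assumption the Bologna and Padova suffixes nest under the top-level INFN suffix, while the CMS suffix has no enclosing suffix among those listed.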
Resource Management


Most of these activities were carried out in collaboration with DataGrid WP 1
(Grid Workload Management)
Evaluation of Globus GRAM


Tests with Condor, LSF and PBS as underlying resource
management systems
Lack of “robustness” (needed for real production environments)





Memory leaks in the Globus job manager (fixed)
Scalability (one job manager for each job)
Reliability (the job manager is not persistent)
…
GRAM client API evaluation
Resource Management

Evaluation of GRAM Reporter (“cooperation” between GRAM and GIS),
in particular for farms
  Many useless attributes (at least for our needs), attributes not calculated
  (always defined as 0), some attributes not properly calculated by Globus
  shell scripts (e.g. totalnodes and freenodes for LSF)
  Some important information describing the farms and the submitted jobs
  (necessary for example for a resource broker) missing
  Draft proposal for a possible modification of the default schema

Evaluation of RSL as uniform language to specify resources

Submission of Condor jobs to Globus resources
  Condor-G (useful as a reliable crash-proof job submission service)
  GlideIn

Preliminary evaluation of GARA
  Network reservation

Evaluation of MPICH-G2 vs. MPICH
  Some shortcomings found (lack of support for shared memory, worse
  latency performance than MPICH)
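
To illustrate the RSL item in the list above, a simple conjunctive request can be built as a string of (attribute=value) relations prefixed by "&". The sketch below is a minimal illustration, not part of the evaluation: the helper names are hypothetical, and quoting every value (including numbers) is a simplification chosen here. The attributes executable, arguments and count are standard GRAM RSL attributes.

```python
def rsl_quote(value):
    """Quote a value as an RSL string literal, doubling embedded double quotes."""
    return '"' + str(value).replace('"', '""') + '"'


def build_rsl(attributes):
    """Build a conjunctive RSL request of the form &(attr=value)(attr=value)...

    List or tuple values become several space-separated values
    inside one relation.
    """
    relations = []
    for name, value in attributes.items():
        values = value if isinstance(value, (list, tuple)) else [value]
        joined = " ".join(rsl_quote(v) for v in values)
        relations.append(f"({name}={joined})")
    return "&" + "".join(relations)


if __name__ == "__main__":
    request = build_rsl({
        "executable": "/bin/echo",
        "arguments": ["hello", "grid"],
        "count": 2,
    })
    print(request)
```

The same request text is meant to be usable regardless of whether Condor, LSF or PBS sits behind the gatekeeper, which is the point of evaluating RSL as a uniform language.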
Data management

Tests with GASS



Tests with command line tools and APIs
Problems transferring big files (large drop in
transfer rate)
Tests with Globusftp alpha release 2


Collaboration with the INFN-GRID network WP
Tests of new features



Support for GSI mechanisms
Capability of resuming interrupted file transfers
Throughput tests using parallel data transfers
Other services

Fault Monitoring (HBM)



Evaluation of HBM for fault detection (for “system”
and “user” processes)
… but the HBM package is not under active
development
Execution Environment Management (GEM)


Evaluation of GEM as service for code migration
… but the GEM service currently provides only limited
capabilities (executable staging)
Conclusions



The Globus toolkit can provide basic services
useful to create and deploy usable Grids, but
many shortcomings and issues must be
addressed
… more details in the report (available very
soon)
Other info: http://www.infn.it/globus