Review of the EU DataGrid Project and other EU Grid initiatives

Fabrizio Gagliardi
CERN, Geneva, Switzerland
EU-DataGrid Project Head
F.Gagliardi@cern.ch
Krakow, November 2001

Talk summary

• Introduction
• EU DataGrid background
• Project Status
• Future Plans
• Other EU initiatives
• Conclusions

CERN – The European Organisation for Nuclear Research

• 20 European countries
• 2,500 staff
• 6,000 users
• 27 km of tunnel stuffed with magnets and klystrons

One of the four LHC detectors

• Online system with a multi-level trigger
  – filters out background
  – reduces the data volume

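To make the trigger's data-reduction role concrete, here is a toy sketch in Python. It is not the real LHC trigger: the event fields, thresholds and rejection factors are invented, and it only mirrors the idea of a cascaded filter that rejects background and keeps a small fraction of the events.

```python
# Toy multi-level trigger chain. Event fields, thresholds and rates are
# invented for illustration; only the cascaded-filter idea is the point.
import random
from dataclasses import dataclass

@dataclass
class Event:
    transverse_energy: float   # GeV (toy value)
    n_muon_hits: int           # toy value

def level1(evt: Event) -> bool:
    # Fast, hardware-style cut: keep only events with significant energy.
    return evt.transverse_energy > 20.0

def high_level_trigger(evt: Event) -> bool:
    # Slower, software-style selection: also require some muon activity.
    return evt.n_muon_hits >= 2

def run_trigger(events):
    # An event is recorded only if it passes every trigger level.
    return [e for e in events if level1(e) and high_level_trigger(e)]

if __name__ == "__main__":
    random.seed(0)
    events = [Event(random.expovariate(1 / 10.0), random.randint(0, 3))
              for _ in range(100_000)]
    kept = run_trigger(events)
    print(f"kept {len(kept)} of {len(events)} events "
          f"({100 * len(kept) / len(events):.1f}%)")
```

In the real experiments the first level runs in custom hardware and the higher levels in processor farms; only the cascaded-filter structure carries over to this sketch.
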
The LHC Detectors

• CMS, ATLAS, LHCb
• ~6-8 PetaBytes / year
• ~10⁸ events / year

Funding

• Requirements are growing faster than Moore's law
• CERN's overall budget is fixed

[Chart: estimated cost (KCHF) per year, 2001-2010, of the CERN Tier 0 + Tier 1 facility for all experiments, broken down into processors, disks, mass storage, systems administration, physics WAN and R&D testbed, compared with the budget level in 2000 for all physics data handling. Physics in 2005: ~30% of offline requirements*.]

*assumes physics in July 2005, rapid ramp-up of luminosity

World Wide Collaboration
(distributed computing & storage capacity)

• CMS: 1,800 physicists, 150 institutes, 32 countries

LHC Computing Model

[Diagram (les.robertson@cern.ch): tiered model with CERN at the centre; Tier 1 centres in France, Italy, the UK, Germany, the Netherlands and the USA (FermiLab, Brookhaven); Tier 2 centres; university and laboratory sites; down to physics-department and desktop resources.]

Five Emerging Models of Networked Computing (from "The Grid")

• Distributed Computing: synchronous processing
• High-Throughput Computing: asynchronous processing
• On-Demand Computing: dynamic resources
• Data-Intensive Computing: databases
• Collaborative Computing: scientists

Ian Foster and Carl Kesselman, editors, "The Grid: Blueprint for a New Computing Infrastructure", Morgan Kaufmann, 1999, http://www.mkp.com/grids

EU DataGrid background

• Motivated by the challenge of LHC computing
  – large amount of data (~10 PBytes/year starting in 2006)
  – distributed computing resources and skills
  – geographically distributed worldwide community (virtual organisation, VO)
• Excellent match between the Grid computing model and HEP requirements (Foster's quote: HEP is Grid computing "par excellence")
  – transition from supercomputers to commodity computing already done
  – distributed job-level parallelism (no strong need for MPI)
  – high-throughput computing rather than supercomputing
  – VO tradition already long established
  – prototype Grid activity in some CERN member states

Main project goals and characteristics

• To build a significant prototype of the LHC computing model
• To collaborate with and complement other European and US projects
• To develop a sustainable computing model applicable to other sciences and industry: biology, Earth observation, etc.
• Specific project objectives:
  – middleware for fabric & Grid management (mostly funded by the EU): evaluation, testing and integration of existing middleware, plus research and development of new software as appropriate
  – large-scale testbed (mostly funded by the partners)
  – production-quality demonstrations (partially funded by the EU)
• Open source and communication:
  – Global Grid Forum
  – Industry and Research Forum

Main Partners

• CERN – International (Switzerland/France)
• CNRS – France
• ESA/ESRIN – International (Italy)
• INFN – Italy
• NIKHEF – The Netherlands
• PPARC – UK

Associated Partners

Research and academic institutes:
• CESNET (Czech Republic)
• Commissariat à l'énergie atomique (CEA) – France
• Computer and Automation Research Institute, Hungarian Academy of Sciences (MTA SZTAKI)
• Consiglio Nazionale delle Ricerche (Italy)
• Helsinki Institute of Physics – Finland
• Institut de Fisica d'Altes Energies (IFAE) – Spain
• Istituto Trentino di Cultura (IRST) – Italy
• Konrad-Zuse-Zentrum für Informationstechnik Berlin – Germany
• Royal Netherlands Meteorological Institute (KNMI)
• Ruprecht-Karls-Universität Heidelberg – Germany
• Stichting Academisch Rekencentrum Amsterdam (SARA) – Netherlands
• Swedish Natural Science Research Council (NFR) – Sweden

Industrial partners:
• Datamat (Italy)
• IBM (UK)
• CS-SI (France)

Project scope

• 9.8 M euros of EU funding over three years
• 90% for middleware and applications (HEP, Earth observation and biology)
• Three-year phased developments & demos (2001-2003)
• Possible extensions (time and funds) on the basis of first successful results:
  – DataTAG (2002-2003)
  – CrossGrid (2002-2004)
  – GridStart (2002-2004)
  – …

DataGrid status

• Preliminary architecture defined
  – enough to deploy the first testbed (Testbed 1)
• First middleware delivery
  – GDMP, first workload management system, fabric management tools, Globus installation (including certification and authorization), Condor tools
• First application test cases ready, long-term cases defined
• Integration team actively deploying Testbed 1

EU-DataGrid Architecture

[Layered architecture diagram:]
• Local: local application, local database, local computing
• Grid application layer: data management, job management, metadata management, object-to-file mapping
• Collective services: information & monitoring, replica manager, grid scheduler
• Underlying Grid services: SQL database services, Computing Element services, Storage Element services, replica catalog, authorization/authentication and accounting, service index
• Grid fabric (fabric services): resource management, configuration management, monitoring and fault tolerance, node installation & management, fabric storage management

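As a rough illustration of the layering only (the class and method names below are invented and are not the EDG APIs), a collective-layer service such as a replica manager can be sketched as code that works purely against the interfaces of the underlying Grid services:

```python
# Invented sketch of the layering idea: a collective service (replica manager)
# is written against the interfaces of underlying Grid services (storage
# elements, replica catalog), never against fabric-level details directly.
from abc import ABC, abstractmethod

class StorageElement(ABC):
    @abstractmethod
    def copy_in(self, logical_name: str, data: bytes) -> None: ...

class ReplicaCatalog(ABC):
    @abstractmethod
    def register(self, logical_name: str, site: str) -> None: ...
    @abstractmethod
    def locations(self, logical_name: str) -> list: ...

class InMemoryCatalog(ReplicaCatalog):
    def __init__(self):
        self._entries = {}
    def register(self, logical_name, site):
        self._entries.setdefault(logical_name, []).append(site)
    def locations(self, logical_name):
        return self._entries.get(logical_name, [])

class PrintingSE(StorageElement):
    def __init__(self, site: str):
        self.site = site
    def copy_in(self, logical_name, data):
        print(f"[{self.site}] stored {logical_name} ({len(data)} bytes)")

class ReplicaManager:
    """Collective-layer service: replicate a data set and record where it is."""
    def __init__(self, catalog: ReplicaCatalog):
        self.catalog = catalog
    def replicate(self, logical_name, data, targets):
        for se in targets:
            se.copy_in(logical_name, data)
            self.catalog.register(logical_name, se.site)

if __name__ == "__main__":
    rm = ReplicaManager(InMemoryCatalog())
    rm.replicate("run42.events", b"...", [PrintingSE("CERN"), PrintingSE("Lyon")])
    print(rm.catalog.locations("run42.events"))
```

The point is only that the application layer and the collective services see the fabric through well-defined service interfaces.
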
Test Bed Schedule

• TestBed 0 (early 2001)
  – international testbed 0 infrastructure deployed
  – Globus 1 only, no EDG middleware
• TestBed 1 (now)
  – first release of EU DataGrid software to defined users within the project: HEP experiments (WP8), Earth observation (WP9), biology applications (WP10)
• TestBed 2 (September 2002)
  – builds on TestBed 1 to extend the facilities of DataGrid
• TestBed 3 (March 2003) & TestBed 4 (September 2003)

DataGrid status

• Preliminary architecture defined
  – enough to deploy Testbed 1
• First middleware delivery
  – GDMP, first workload management system, fabric management tools, Globus installation (including certification and authorization), Condor tools
• First application test cases ready, long-term cases defined
• Integration team: first release of Testbed 1

Team (by work package):
• WP1: Elisabetta Ronchieri
• WP2: Shahzad Muzaffar
• WP3: Alex Martin
• WP4: Maite Barroso Lopez
• WP5: Jean Philippe Baud
• WP6: Brian Coghlan, Flavia Donno, Eric Fede, Fabio Hernandez, Nadia Lajili, Charles Loomis, Pietro Paolo Martucci, Andrew McNab, Sophie Nicoud, Yannik Patois, Anders Waananen
• WP7: Frank Bonnassieux
• WP8: PierGiorgio Cerello, Eric Van Herwijnen
• WP9: Julian Lindford, Andrea Parrini
• WP10: Yannick Legre

Testbed 1 Approach

• Software integration
  – combines software from each middleware work package with the underlying external toolkits (e.g. Globus)
  – performed by the integration team at CERN on a cluster of 10 Linux PCs
• Basic integration tests
  – performed by the integration team to verify basic functionality
• Validation tests
  – application groups use Testbed 1 to exercise their application software, e.g. the LHC experiments run jobs using their offline software suites on Testbed 1 sites

Detailed TestBed 1 Schedule

• October 1: intensive integration starts, based on Globus 2
• November 1: first beta release of DataGrid (CERN & Lyon); depends on the changes needed for Globus 1 -> 2
• November 15: initial limited application testing finished; DataGrid ready for deployment on partner sites (~5 sites)
• November 30: widespread deployment; code/machines split for development; Testbed 1 open to all applications (~40 sites)
• December 29: WE ARE DONE!

TestBed 1 Sites

• First round (15 Nov.)
  – CERN, Lyon, RAL, Bologna
• Second round (30 Nov.)
  – Netherlands: NIKHEF
  – UK (6 sites): Bristol, Edinburgh, Glasgow, Lancaster, Liverpool, Oxford
  – Italy (6-7 sites): Catania, Legnaro/Padova, Milan, Pisa, Rome, Turin, Cagliari?
  – France: Ecole Polytechnique
  – Russia: Moscow
  – Spain: Barcelona?
  – Scandinavia: Lund?
  – WP9 (GOME): ESA, KNMI, IPSL, ENEA

Licenses & Copyrights

• Package repository and web site
  – provides access to the packaged Globus, DataGrid and required external software
  – all software is packaged as source and binary RPMs
• Copyright statement
  – Copyright (c) 2001 EU DataGrid – see http://www.edg.org/license.html
• License
  – will be the same as (or very similar to) the Globus license
  – a BSD-style license which puts few restrictions on use
• Condor-G (used by WP1)
  – not open source or redistributable
  – through a special agreement, can be redistributed within DataGrid
• LCFG (used by WP4)
  – uses the GPL

Security

• The EDG software supports many Certification Authorities from the various partners involved in the project
  – http://marianne.in2p3.fr/datagrid/ca/ca-table-ca.html
  – but not the Globus CA
• For a machine to participate as a Testbed 1 resource, all the CAs must be enabled
  – all CA certificates can be installed without compromising local site security
• Each host running a Grid service needs to be able to authenticate users and other hosts
  – the site manager has full control over security for local nodes
• A Virtual Organisation (VO) represents a community of users
  – 6 VOs for Testbed 1: 4 HEP (ALICE, ATLAS, CMS, LHCb), 1 Earth observation, 1 biology

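For illustration, authentication on Globus-based resources is commonly mapped to local accounts through a grid-mapfile; the minimal sketch below builds one from a hard-coded table. The DNs, VO names and the leading-dot pool-account convention are assumptions made for this example, and a real Testbed 1 site would derive the mapping from the VO membership services rather than from a dictionary.

```python
# Minimal sketch, not a real site tool: the DNs, VO names and pool-account
# scheme below are invented; an actual site generated its grid-mapfile from
# the VO membership servers.
from pathlib import Path

# Hypothetical VO membership: certificate subject DN -> VO name
vo_members = {
    "/O=Grid/O=CERN/OU=cern.ch/CN=Alice User": "alice",
    "/O=Grid/O=CERN/OU=cern.ch/CN=Atlas User": "atlas",
    "/C=FR/O=CNRS/CN=Bio User": "biomed",
}

# Map each VO to a local account (leading dot = pool account, assumed here)
vo_to_account = {"alice": ".alice", "atlas": ".atlas", "biomed": ".biomed"}

def build_gridmap(members, accounts):
    lines = []
    for dn, vo in sorted(members.items()):
        account = accounts.get(vo)
        if account is None:
            continue  # unknown VO: do not grant access
        # grid-mapfile format: "<quoted DN>" <local account>
        lines.append(f'"{dn}" {account}')
    return "\n".join(lines) + "\n"

if __name__ == "__main__":
    text = build_gridmap(vo_members, vo_to_account)
    Path("grid-mapfile.example").write_text(text)
    print(text)
```
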
Node configuration and installation tools

• For the reference platform (Linux RedHat 6.2)
• Initial installation tool using system image cloning
• LCFG (Edinburgh University) for software updates and maintenance

[Diagram, LCFG components: on the server node, mkxprof compiles the LCFG configuration files into one XML profile per client node, published over HTTP by a web server; on each client node, rdxprof/ldxprof fetch the profile into a local DBM file that is read by the generic LCFG components.]

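The sketch below illustrates the general pattern in the diagram (compile one profile per node on a server, let each client fetch and apply its own profile). It is not LCFG itself: the node names, configuration keys and the use of JSON files in place of XML profiles and DBM files are simplifications chosen for brevity.

```python
# Illustrative sketch of the server-compiles / client-fetches pattern only;
# NOT LCFG. Names, keys and file formats are invented.
import json
from pathlib import Path

# "Configuration files" on the server: desired state per node (invented keys).
node_configs = {
    "testbed-node01": {"packages": ["globus", "gdmp"], "ntp_server": "ntp.example.org"},
    "testbed-node02": {"packages": ["globus", "lcfg"], "ntp_server": "ntp.example.org"},
}

def compile_profiles(configs, outdir: Path) -> None:
    """Server side: write one profile file per client node (mkxprof-like step)."""
    outdir.mkdir(exist_ok=True)
    for node, cfg in configs.items():
        (outdir / f"{node}.profile.json").write_text(json.dumps(cfg, indent=2))

def fetch_and_apply(node: str, profile_dir: Path) -> None:
    """Client side: read this node's profile and hand each key to a 'component'."""
    profile = json.loads((profile_dir / f"{node}.profile.json").read_text())
    for key, value in profile.items():
        print(f"[{node}] component '{key}' configured with {value!r}")

if __name__ == "__main__":
    profiles = Path("profiles")
    compile_profiles(node_configs, profiles)
    fetch_and_apply("testbed-node01", profiles)
```
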
Middleware components

• Job Description Language (JDL): script to describe the job parameters
• User Interface (UI): sends the job to the RB and receives the results
• Resource Broker (RB): locates and selects the target Computing Element (CE)
• Job Submission Service (JSS): submits the job to the target CE
• Logging and Book-keeping (L&B): records job status information
• Grid Information Service (GIS): information index about the state of the Grid fabric
• Replica Catalog: list of data sets and their duplicates held on Storage Elements (SE)

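To make the JDL step concrete, the following sketch writes a ClassAd-style job description of the kind used by the workload management system; the executable, sandbox file names and requirement shown are invented examples, and the exact attribute set supported by Testbed 1 may differ.

```python
# Sketch: compose a ClassAd-style JDL job description. The executable,
# sandbox contents and requirement are invented; this only illustrates the
# idea of "a script describing the job parameters".
from pathlib import Path

jdl = """\
Executable    = "simulate_events.sh";
Arguments     = "--events 1000";
StdOutput     = "sim.out";
StdError      = "sim.err";
InputSandbox  = {"simulate_events.sh", "geometry.dat"};
OutputSandbox = {"sim.out", "sim.err"};
Requirements  = other.OpSys == "Linux";
"""

Path("example_job.jdl").write_text(jdl)
print(jdl)
```
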
A Job Submission Example

[Diagram: the user submits a JDL-described job through the UI; the job and its Input Sandbox go to the Resource Broker, which consults the Information Service and the Replica Catalogue and passes the job, with a Brokerinfo file, to the Job Submission Service for execution on the selected Compute Element near a suitable Storage Element; job status events are recorded in Logging & Book-keeping, and the Output Sandbox is returned to the UI.]

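The toy code below walks through the same sequence of interactions as the diagram (UI to Resource Broker, matchmaking against the Information Service and Replica Catalogue, hand-off to the Job Submission Service, status events in Logging & Book-keeping). Every class and data structure in it is a drastic simplification invented for illustration, not the real EDG components or protocols.

```python
# Toy walk-through of the submission flow; invented classes, not EDG code.
class LoggingAndBookkeeping:
    def __init__(self):
        self.events = []
    def log(self, job_id, status):
        self.events.append((job_id, status))
        print(f"L&B: job {job_id} -> {status}")

class ResourceBroker:
    def __init__(self, information_service, replica_catalogue):
        self.info = information_service      # state of the Grid fabric
        self.replicas = replica_catalogue    # where the input data lives
    def match(self, job):
        # Pick a CE that has free CPUs and sits close to a replica of the data.
        sites_with_data = self.replicas.get(job["dataset"], [])
        for ce in self.info:
            if ce["free_cpus"] > 0 and ce["site"] in sites_with_data:
                return ce
        raise RuntimeError("no matching Computing Element")

def submit(job, rb, lb):
    job_id = 1
    lb.log(job_id, "SUBMITTED")
    ce = rb.match(job)                       # RB consults GIS + Replica Catalogue
    lb.log(job_id, f"SCHEDULED on {ce['name']}")
    # The JSS would now ship the input sandbox and run the job on the CE.
    lb.log(job_id, "DONE")
    return job_id

if __name__ == "__main__":
    gis = [{"name": "ce01", "site": "CERN", "free_cpus": 4},
           {"name": "ce02", "site": "Lyon", "free_cpus": 0}]
    replica_catalogue = {"higgs-sample": ["CERN", "Bologna"]}
    lb = LoggingAndBookkeeping()
    submit({"dataset": "higgs-sample"}, ResourceBroker(gis, replica_catalogue), lb)
```
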
Iterative Releases

• Planned intermediate release schedule
  – TestBed 1: October 2001
  – Release 1.1: January 2002
  – Release 1.2: March 2002
  – Release 1.3: May 2002
  – Release 1.4: July 2002
  – TestBed 2: September 2002
• A similar schedule will be organised for 2003
• Each release includes
  – feedback from use of the previous release by the application groups
  – planned improvements/extensions by the middleware WPs
  – use of the software infrastructure
  – and feeds into the architecture group

Software Infrastructure

• Toolset for aiding the development & integration of middleware
  – code repositories (CVS)
  – browsing tools (CVSweb)
  – build tools (autoconf, make, etc.)
  – document builders (doxygen)
  – coding standards and checking tools (e.g. CodeChecker)
  – nightly builds
• Guidelines, examples and documentation
  – show the software developers how to use the toolset
• Development facility
  – test environment for the software (a small set of PCs in a few partner sites)
• Provided and managed by WP6
  – setting up the toolset and organising the development facility

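As an example of what a nightly build in such a toolset might look like, here is a minimal driver script; the CVS repository path and module name are placeholders, and a real setup would add tagging, test runs and result publishing.

```python
# Sketch of a nightly-build driver. The CVSROOT and MODULE values are
# placeholders (hypothetical), and error handling is reduced to pass/fail
# logging for brevity.
import datetime
import pathlib
import subprocess

CVSROOT = ":pserver:anonymous@cvs.example.org:/cvsroot/edg"   # placeholder
MODULE = "example-middleware"                                  # placeholder

def run(cmd, cwd=None):
    """Run a command, return True on success (False if the tool is missing)."""
    try:
        return subprocess.run(cmd, cwd=cwd).returncode == 0
    except FileNotFoundError:
        return False

def nightly_build():
    stamp = datetime.date.today().isoformat()
    workdir = pathlib.Path(f"nightly-{stamp}")
    workdir.mkdir(exist_ok=True)
    ok = (run(["cvs", "-d", CVSROOT, "checkout", MODULE], cwd=workdir)
          and run(["make"], cwd=workdir / MODULE))
    print(f"{stamp}: build {'succeeded' if ok else 'FAILED'}")
    return ok

if __name__ == "__main__":
    nightly_build()
```
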
Future Plans

• Tighter connection to the applications' principal architects
• Closer integration of the software components
• Improve the software infrastructure toolset and test suites
• Evolve the architecture on the basis of TestBed results
• Enhance synergy with the US via DataTAG-iVDGL and InterGrid
• Promote early adoption of standards through participation in GGF working groups
• First EU project review at the end of February 2002
• Final software release by the end of 2003

Conclusions

• EU DataGrid is well on its way, but coordination with other Grid initiatives, both in the EU and elsewhere, is essential
• The EU is very supportive of worldwide collaboration
• It is important to extend the project testbeds to the entire HEP community and to the other application communities (Earth observation, biology, etc.)
• Grid activity is well under way in Poland; we need to make sure it is well coordinated with the rest of the world