Review of the EU DataGrid Project and other EU Grid initiatives
Fabrizio Gagliardi, EU DataGrid Project Head
CERN, Geneva, Switzerland
F.Gagliardi@cern.ch
Krakow, November 2001

Talk summary
- Introduction
- EU DataGrid background
- Project status
- Future plans
- Other EU initiatives
- Conclusions

CERN, the European Organisation for Nuclear Research
- 20 European countries
- 2,500 staff, 6,000 users
- 27 km of tunnel filled with magnets and klystrons

One of the four LHC detectors
- Online system with a multi-level trigger
- Filters out background and reduces the data volume

The LHC detectors (CMS, ATLAS, LHCb)
- ~6-8 PetaBytes/year
- ~10^8 events/year

Funding
- Requirements are growing faster than Moore's law, while CERN's overall budget is fixed
- [Chart: estimated cost (KCHF) of the Tier 0 + Tier 1 facility at CERN for all experiments, 2001-2010, split into processors, disks, mass storage, systems administration, physics WAN and R&D testbed, compared with the 2000 budget level for all physics data handling; physics in 2005 ~30% of offline requirements*]
- *assumes physics in July 2005 and a rapid ramp-up of luminosity

World-wide collaboration: distributed computing and storage capacity
- CMS: 1800 physicists, 150 institutes, 32 countries

LHC computing model
- [Diagram: multi-tier model with CERN at the centre, Tier 1 centres (e.g. Brookhaven and FermiLab in the USA, France, Italy, UK, NL, Germany), Tier 2 centres, university and lab physics departments, down to the desktop] (les.robertson@cern.ch)

Five emerging models of networked computing (from "The Grid")
- Distributed computing: synchronous processing
- High-throughput computing: asynchronous processing
- On-demand computing: dynamic resources
- Data-intensive computing: databases
- Collaborative computing: scientists
- Ian Foster and Carl Kesselman, editors, "The Grid: Blueprint for a New Computing Infrastructure", Morgan Kaufmann, 1999, http://www.mkp.com/grids

EU DataGrid background
- Motivated by the challenge of LHC computing:
  - large amount of data (~10 PBytes/year starting in 2006)
  - distributed computing resources and skills
  - geographically distributed worldwide community (VO)
- Excellent match between the Grid computing model and HEP requirements (Foster's quote: HEP is Grid computing "par excellence"):
  - transition from supercomputers to commodity computing already done
  - distributed job-level parallelism, with no strong need for MPI (illustrated in the sketch below)
  - high-throughput computing rather than supercomputing
  - VO tradition already long established
- Prototype Grid activity in some CERN member states

Main project goals and characteristics
- Build a significant prototype of the LHC computing model
- Collaborate with and complement other European and US projects
- Develop a sustainable computing model applicable to other sciences and industry: biology, Earth Observation, etc.
- Specific project objectives:
  - middleware for fabric and Grid management (mostly funded by the EU): evaluation, test and integration of existing middleware, plus research and development of new software as appropriate
  - large-scale testbed (mostly funded by the partners)
  - production-quality demonstrations (partially funded by the EU)
  - open source and communication: Global Grid Forum, Industry and Research Forum
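The background slide above stresses that HEP processing is naturally job-level parallel: events are independent, so a run can be split into many jobs that never talk to each other, which is why high-throughput computing fits better than MPI-style supercomputing. Below is a minimal, purely illustrative Python sketch of that pattern; the file names, event counts and the reconstruct() function are invented placeholders, not part of the DataGrid software.

```python
from concurrent.futures import ProcessPoolExecutor

def reconstruct(event_file: str) -> str:
    # Stand-in for real physics reconstruction: each file is processed
    # on its own, so no inter-job communication (MPI) is required.
    return f"{event_file}: processed"

# Split the run into independent jobs; on a Grid each file would become
# one batch job, here a local process pool plays that role.
event_files = [f"run001_part{i:03d}.raw" for i in range(8)]

if __name__ == "__main__":
    with ProcessPoolExecutor(max_workers=4) as pool:
        for result in pool.map(reconstruct, event_files):
            print(result)
```

The same independence is what lets the Grid scatter such jobs across sites rather than requiring one large, tightly coupled machine.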
Main partners
- CERN - International (Switzerland/France)
- CNRS - France
- ESA/ESRIN - International (Italy)
- INFN - Italy
- NIKHEF - The Netherlands
- PPARC - UK

Associated partners
- Research and academic institutes:
  - CESNET (Czech Republic)
  - Commissariat à l'énergie atomique (CEA) - France
  - Computer and Automation Research Institute, Hungarian Academy of Sciences (MTA SZTAKI)
  - Consiglio Nazionale delle Ricerche (Italy)
  - Helsinki Institute of Physics - Finland
  - Institut de Fisica d'Altes Energies (IFAE) - Spain
  - Istituto Trentino di Cultura (IRST) - Italy
  - Konrad-Zuse-Zentrum für Informationstechnik Berlin - Germany
  - Royal Netherlands Meteorological Institute (KNMI) - The Netherlands
  - Ruprecht-Karls-Universität Heidelberg - Germany
  - Stichting Academisch Rekencentrum Amsterdam (SARA) - The Netherlands
  - Swedish Natural Science Research Council (NFR) - Sweden
- Industrial partners:
  - Datamat (Italy)
  - IBM (UK)
  - CS-SI (France)

Project scope
- 9.8 M euros of EU funding over three years
- 90% for middleware and applications (HEP, Earth Observation and biology)
- Three-year phased developments and demos (2001-2003)
- Possible extensions (time and funds) on the basis of first successful results: DataTAG (2002-2003), CrossGrid (2002-2004), GridStart (2002-2004), ...

DataGrid status
- Preliminary architecture defined, enough to deploy the first testbed (Testbed 1)
- First middleware delivery: GDMP, first workload management system, fabric management tools, Globus installation including certification and authorization, Condor tools
- First application test cases ready, long-term cases defined
- Integration team actively deploying Testbed 1

EU DataGrid architecture (an illustrative sketch of how these layers interact follows below)
- Local layer: local application, local database, local computing
- Grid application layer: data management, job management, metadata management, object-to-file mapping
- Collective services: information and monitoring, replica manager, grid scheduler
- Underlying Grid services: SQL database services, Computing Element services, Storage Element services, replica catalog, authorization/authentication and accounting, service index
- Grid fabric services: resource management, configuration management, monitoring and fault tolerance, node installation and management, fabric storage management

Testbed schedule
- TestBed 0 (early 2001): international testbed 0 infrastructure deployed; Globus 1 only, no EDG middleware
- TestBed 1 (now): first release of EU DataGrid software to defined users within the project: HEP experiments (WP8), Earth Observation (WP9), biology applications (WP10)
- TestBed 2 (Sept. 2002): builds on TestBed 1 to extend the facilities of DataGrid
- TestBed 3 (March 2003) and TestBed 4 (Sept. 2003)
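To make the layering of the architecture slide above a little more concrete, here is a purely illustrative Python sketch, not EDG code: a collective-layer scheduler takes its decision using only the underlying Grid services (information service and replica catalog), while fabric details stay hidden below them. All class, attribute and site names are invented for illustration.

```python
from dataclasses import dataclass

# Underlying Grid services (illustrative stand-ins, not EDG APIs).
@dataclass
class InformationService:
    free_cpus: dict[str, int]              # computing element -> free CPUs

    def query(self) -> dict[str, int]:
        return self.free_cpus

@dataclass
class ReplicaCatalog:
    replicas: dict[str, list[str]]         # dataset -> storage elements

    def locate(self, dataset: str) -> list[str]:
        return self.replicas.get(dataset, [])

# Collective service: combines the underlying services to take a decision.
class GridScheduler:
    def __init__(self, gis: InformationService, rc: ReplicaCatalog,
                 close_se: dict[str, str]):
        self.gis, self.rc, self.close_se = gis, rc, close_se   # CE -> close SE

    def select_ce(self, dataset: str) -> str:
        ses_with_data = set(self.rc.locate(dataset))
        candidates = {ce: cpus for ce, cpus in self.gis.query().items()
                      if self.close_se.get(ce) in ses_with_data}
        return max(candidates, key=candidates.get)

# Tiny usage example with made-up sites.
gis = InformationService({"ce.cern.ch": 12, "ce.in2p3.fr": 40})
rc = ReplicaCatalog({"higgs-sample": ["se.in2p3.fr"]})
sched = GridScheduler(gis, rc, {"ce.cern.ch": "se.cern.ch",
                                "ce.in2p3.fr": "se.in2p3.fr"})
print(sched.select_ce("higgs-sample"))     # -> ce.in2p3.fr
```

The point is only the direction of the dependencies: collective services build on the underlying Grid services, never the other way round, and applications see only the upper layers.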
DataGrid status
- Integration team: first release of Testbed 1
- Work package contacts:
  - WP1: Elisabetta Ronchieri; WP2: Shahzad Muzaffar; WP3: Alex Martin; WP4: Maite Barroso Lopez; WP5: Jean Philippe Baud; WP7: Frank Bonnassieux
  - WP6: Brian Coghlan, Flavia Donno, Eric Fede, Fabio Hernandez, Nadia Lajili, Charles Loomis, Pietro Paolo Martucci, Andrew McNab, Sophie Nicoud, Yannik Patois, Anders Waananen
  - WP8: PierGiorgio Cerello, Eric Van Herwijnen; WP9: Julian Lindford, Andrea Parrini; WP10: Yannick Legre

Testbed 1 approach
- Software integration:
  - combines software from each middleware work package and underlying external toolkits (e.g. Globus)
  - performed by the integration team at CERN on a cluster of 10 Linux PCs
- Basic integration tests performed by the integration team to verify basic functionality
- Validation tests: application groups use Testbed 1 to exercise their application software, e.g. the LHC experiments run jobs using their offline software suites on Testbed 1 sites

Detailed Testbed 1 schedule
- October 1: intensive integration starts, based on Globus 2
- November 1: first beta release of DataGrid (CERN and Lyon); depends on the changes needed for the Globus 1 -> 2 transition
- November 15: initial limited application testing finished; DataGrid ready for deployment on partner sites (~5 sites)
- November 30: widespread deployment; code and machines split for development; Testbed 1 open to all applications (~40 sites)
- December 29: we are done!

Testbed 1 sites
- First round (15 Nov.): CERN, Lyon, RAL, Bologna
- Second round (30 Nov.):
  - Netherlands: NIKHEF
  - UK, 6 sites: Bristol, Edinburgh, Glasgow, Lancaster, Liverpool, Oxford
  - Italy, 6-7 sites: Catania, Legnaro/Padova, Milan, Pisa, Rome, Turin, Cagliari?
  - France: Ecole Polytechnique
  - Russia: Moscow
  - Spain: Barcelona?
  - Scandinavia: Lund?
  - WP9 (GOME): ESA, KNMI, IPSL, ENEA

Licenses and copyrights
- DataGrid software:
  - license: will be the same as (or very similar to) the Globus license, a BSD-style license which puts few restrictions on use
  - copyright statement: Copyright (c) 2001 EU DataGrid, see http://www.edg.org/license.html
- Package repository and web site: provides access to the packaged Globus, DataGrid and required external software; all software is packaged as source and binary RPMs
- Condor-G (used by WP1): not open source or redistributable; through a special agreement it can be redistributed within DataGrid
- LCFG (used by WP4): uses the GPL

Security (a minimal certificate-verification sketch follows below)
- The EDG software supports many Certification Authorities from the various partners involved in the project (http://marianne.in2p3.fr/datagrid/ca/ca-table-ca.html), but not the Globus CA
- For a machine to participate as a Testbed 1 resource, all the CAs must be enabled; all CA certificates can be installed without compromising local site security
- Each host running a Grid service needs to be able to authenticate users and other hosts; the site manager keeps full control over security for the local nodes
- A Virtual Organisation (VO) represents a community of users; there are 6 VOs for Testbed 1: 4 HEP (ALICE, ATLAS, CMS, LHCb), 1 Earth Observation, 1 biology
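The security slide above relies on one property: installing a CA certificate only declares which issuers the host trusts; a user is accepted only if their certificate chains back to one of those enabled CAs. The snippet below is a minimal sketch of that check in Python using the pyOpenSSL library; the file paths are hypothetical, and real Testbed 1 hosts perform this verification through the Globus GSI tools rather than through a script like this.

```python
from OpenSSL import crypto   # pyOpenSSL; assumed to be installed on the host

def load_cert(path: str) -> crypto.X509:
    with open(path, "rb") as f:
        return crypto.load_certificate(crypto.FILETYPE_PEM, f.read())

# Build a store holding every enabled Certification Authority.
store = crypto.X509Store()
for ca_path in ["ca/cern-ca.pem", "ca/in2p3-ca.pem"]:   # hypothetical files
    store.add_cert(load_cert(ca_path))

# Verify that a user certificate chains back to one of those CAs.
user_cert = load_cert("usercert.pem")                    # hypothetical file
try:
    crypto.X509StoreContext(store, user_cert).verify_certificate()
    print("certificate accepted: issued by an enabled CA")
except crypto.X509StoreContextError as err:
    print("certificate rejected:", err)
```

Because the check is purely "does this chain end at a CA I have enabled", adding further CA certificates widens who can be identified without weakening what the local site manager allows those identities to do.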
Node configuration tools
- Node configuration and installation tools for the reference platform (Linux RedHat 6.2)
- Initial installation tool based on system image cloning
- LCFG (Edinburgh University) for software updates and maintenance
- [Diagram: LCFG configuration files are compiled by mkxprof on the server node into one XML profile per client node and published through a web server; on each client, rdxprof fetches the profile over HTTP and ldxprof stores it in a local DBM file that drives the generic LCFG components]

Middleware components
- Job Description Language (JDL): script describing the job parameters (an illustrative JDL sketch appears after the Conclusions)
- User Interface (UI): sends the job to the RB and receives the results
- Resource Broker (RB): locates and selects the target Computing Element (CE)
- Job Submission Service (JSS): submits the job to the target CE
- Logging and Book-keeping (L&B): records job status information
- Grid Information Service (GIS): information index about the state of the Grid fabric
- Replica Catalog: list of data sets and their duplicates held on Storage Elements (SE)

A job submission example
- [Diagram: the user describes the job in JDL at the UI and submits it, together with its input sandbox, to the Resource Broker; the RB consults the Information Service and the Replica Catalogue, selects a Compute Element (passing it a Brokerinfo file) and hands the job to the Job Submission Service, which runs it on the chosen Compute Element near a suitable Storage Element; job submit events and job status are recorded in Logging & Book-keeping, and the output sandbox is returned to the UI]

Iterative releases
- Planned intermediate release schedule: TestBed 1: October 2001; Release 1.1: January 2002; Release 1.2: March 2002; Release 1.3: May 2002; Release 1.4: July 2002; TestBed 2: September 2002
- A similar schedule will be organised for 2003
- Each release includes:
  - feedback from the use of the previous release by the application groups
  - planned improvements/extensions by the middleware WPs
  - use of the software infrastructure
  - and feeds into the architecture group

Software infrastructure
- Toolset aiding the development and integration of middleware: code repositories (CVS), browsing tools (CVSweb), build tools (autoconf, make, etc.), document builders (doxygen), coding standards and checking tools (e.g. CodeChecker), nightly builds
- Guidelines, examples and documentation show the software developers how to use the toolset
- Development facility: test environment for the software (a small set of PCs in a few partner sites)
- Provided and managed by WP6, which sets up the toolset and organises the development facility

Future plans
- Tighter connection to the applications' principal architects
- Closer integration of the software components
- Improve the software infrastructure toolset and test suites
- Evolve the architecture on the basis of TestBed results
- Enhance synergy with the US via DataTAG-iVDGL and InterGrid
- Promote early standards adoption through participation in GGF working groups
- First EU project review at the end of February 2002
- Final software release by the end of 2003

Conclusions
- EU DataGrid is well on its way, but coordination with other Grid initiatives is essential, both in the EU and elsewhere
- The EU is very supportive of a worldwide collaboration
- It is important to extend the project testbeds to the entire HEP community (and to other application communities: Earth Observation, biology, etc.)
- Grid activity is well under way in Poland; we need to make sure it is well coordinated with the rest of the world
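Finally, as referenced on the middleware components slide, a user describes a job to the User Interface in JDL before it is passed to the Resource Broker. The sketch below shows roughly what a trivial job description could look like; the attribute names follow the ClassAd-style syntax of the workload management system, but the exact schema may differ between releases, and the executable, file names, requirement and rank expression are invented for illustration only.

```
// hypothetical hello-world job description (JDL sketch)
Executable    = "/bin/echo";
Arguments     = "hello from Testbed 1";
StdOutput     = "hello.out";
StdError      = "hello.err";
InputSandbox  = {};
OutputSandbox = {"hello.out", "hello.err"};
// ask the Resource Broker for a Computing Element with free CPUs
Requirements  = other.FreeCPUs > 0;
Rank          = other.FreeCPUs;
```

The UI would ship this description, together with the (here empty) input sandbox, to the Resource Broker, which matches the Requirements and Rank expressions against the information published by the Grid Information Service, as in the job submission example above.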