Workshop on Sustainable National Grid Services
Edinburgh, Feb 22 – 23, 2007

D-Grid Progress Towards Sustainability
Wolfgang Gentzsch
D-Grid

February, 2007 Wolfgang Gentzsch, D-Grid & RENCI

Today’s Topics
A little history of D-Grid
D-Grid: a few details
2 ways towards a sustainability strategy:
1. Learn from others (analysis of major grids)
2. Learn from our own requirements
Analyzing major grid projects
D-Grid Sustainability Workshop
Conclusions

History of D-Grid Initiative
• 01/2003: German scientists started the D-Grid Initiative (‘UK pressure’); report with recommendations for the German Government
• 03/2004: BMBF announced EUR 100 million e-Science Initiative for Germany
• 08/2004: BMBF Call for Proposals for e-Learning, Knowledge Networks, and Grid Computing
• 09/2005: D-Grid-1: EUR 25 million, early adopters, ‘Services for Science’
• 06/2007: D-Grid-2: new communities and service providers
• 06/2008: D-Grid-3 (?): Service Grids for research, industry, society

Sustainability, Ideas in the Beginning
Learn from others, collaborate with others
Sustainability in architecture (standards), technology (robust), users (applications), market, legal, government, …
Start with a plan for sustainability
Users and applications drive sustainability (but not only!)
Develop clear benefits for users
Make everything easy to use
Political and policy landscape has to be right

D-Grid
[Diagram, courtesy Helmut Loewe, BMBF. Labels:]
Im Wissensnetz ..., WIKINGER, ONTOVERSE, WISENT, Textgrid, Knowledge Management, MediGrid, IN-Grid, HEP-Grid, C3-Grid, Astro-Grid
D-Grid-1
Generic Grid Middleware and Grid Services
Integration Project

D-Grid-2
[Diagram labels:]
Service Level Agreements
Im Wissensnetz ..., WIKINGER, ...,
ONTOVERSE, WISENT, Knowledge Management, Textgrid, MediGrid, IN-Grid, HEP-Grid, C3-Grid, Astro-Grid
D-Grid-1 + 2
Generic Grid Middleware and Grid Services
Integration Project

D-Grid-3
[Diagram labels:]
Service Level Agreements
Im Wissensnetz ..., WIKINGER, ..., ONTOVERSE, WISENT, Knowledge Management, Textgrid, MediGrid, IN-Grid, HEP-Grid, C3-Grid, Astro-Grid
D-Grid-1 + 2 + 3
Generic Grid Middleware and Grid Services
Integration Project

D-Grid-3
[Diagram labels:]
Virtual Competence Centers for Middleware, Resources, Support, Knowledge
Service Level Agreements
Im Wissensnetz ..., WIKINGER, ..., ONTOVERSE, WISENT, Knowledge Management, Textgrid, MediGrid, IN-Grid, HEP-Grid, C3-Grid, Astro-Grid
D-Grid-1 + 2 + 3
Generic Grid Middleware and Grid Services

D-Grid Middleware Stack
[Layer diagram, top to bottom:]
User
Application Development and User Access: GAT API, GridSphere, Plug-In
High-level Grid Services: Scheduling, Workflow Management, Monitoring, Data management
Basic Grid Services: UNICORE, LCG/gLite, Globus 4.0.1, Accounting, Billing, User/VO-Mngt, Security
Resources in D-Grid: Distributed Compute Resources, Network Infrastructure, Distributed Data Archive, Data/Software
December, 2006 Wolfgang Gentzsch, D-Grid & RENCI

Learn from Others

e-Science Grid Initiatives Investigated

Initiative        Time          Funding       People *)     Users
UK e-Science-I    2001 - 2004   $180M         900           Res.
UK e-Science-II   2004 - 2006   $220M         1100          Res., Ind.
TeraGrid-I        2001 - 2004   $90M          500           Res.
TeraGrid-II       2005 - 2010   $150M         850           Res.
ChinaGrid-I       2003 - 2006   20M RMB       400           Res.
ChinaGrid-II      2007 - 2010   50M RMB *)    1000          Res.
NAREGI-I          2003 - 2005   $25M          150           Res.
NAREGI-II         2006 - 2010   $40M          250           Res., Ind.
EGEE-I            2004 - 2006   $40M          800           Res.
EGEE-II           2006 - 2008   $45M          1000          Res., Ind.
D-Grid-I          2005 - 2008   $25M          220           Res.
D-Grid-II         2007 - 2009   $25M *)       220 (= 440)   Res., Ind.
*) estimate
Res. = research, Ind. = industry

Main Objectives of e-Science Projects
UK e-Science: To enable the next generation of multi-disciplinary collaborative science and engineering, to enable faster, better or different research.
EGEE: To provide a seamless Grid infrastructure for e-Science that is available for scientists 24 hours a day.
ChinaGrid: To provide a research and education platform, using grid technology, for the faculties and students of the major universities in China.
NAREGI: To do research, development and deployment of science grid middleware.
TeraGrid: To create a unified Cyberinfrastructure supporting a broad array of US science activities using the suite of NSF HPC facilities.
D-Grid: To build and operate a sustainable grid service infrastructure for German research (D-Grid-1) and for research and industry (D-Grid-2).

Components of e-Infrastructures for Science (Tony Hey, 2003)
1. Resources: Networks with computing and data nodes, etc.
2. Development/support of standard middleware & grid services
3. Internationally agreed AAA infrastructure
4. Discovery services and collaborative tools
5. Data provenance, curation and preservation
6. Open access to data and publications via interoperable repositories
7. Remote access to large-scale facilities: Telescopes, LHC, ITER, ...
8. Industrial collaboration
Ideally: having well-defined specific service & support centres
Examples, UK: OMII, DCC, NGS

Grid Middleware Stacks, major modules
UK e-Science: Phase 1: Globus 2.4.3, Condor, SRB. Phase 2: Globus 3.9.5 and 4.0.1, OGSA-DAI, Web services.
EGEE: gLite distribution: elements of Condor, Globus 2.4.3 (via VDT distribution).
ChinaGrid: ChinaGrid Supporting Platform (CGSP) 1.0 is based on Globus 3.9.1, and CGSP 2.0 is implemented based on Globus 4.0.
NAREGI: NAREGI middleware and Globus 4.0.1 GSI and WS-GRAM
TeraGrid: GT 2.4.
and 4.0.1: Globus GRAM, MDS for information, GridFTP & TGCP file transfer, RLS for data replication support, MyProxy for credential management
D-Grid: Globus 2.4.3 (gLite) and 4.0.3, Unicore 5, dCache, SRB/iRODS, OGSA-DAI, GridSphere, GAT, VOMS and Shibboleth

Sustainability
UK e-Science: National Grid Service (NGS), Grid Operations Support Center (GOSC), National e-Science Center (NeSC), Regional e-Science Centers, Open Middleware Infrastructure Institute (OMII), Digital Curation Center (DCC)
EGEE: Plans to establish a European Grid Initiative (EGI), together with NGIs, to provide a persistent grid service federating national grid programmes, starting in 2008
ChinaGrid: Increasing numbers of grid applications using CGSP grid middleware packages
NAREGI: Software will be managed and maintained by the Cyber Science Infrastructure Center of the National Institute of Informatics
TeraGrid: NSF Cyberinfrastructure Office: 5-year Cooperative Agreement. Partnerships with peer grid efforts and commercial web services activities in order to integrate broadly
D-Grid: DGI WP 4: sustainability, services strategies, and business models

e-Science Applications drive Sustainability
UK e-Science: Particle physics, astronomy, chemistry, bioinformatics, healthcare, engineering, environment, pharmaceutical, petro-chemical, media and financial sectors
EGEE: 2 pilot applications (physics, life science) and applications from 7 other disciplines.
ChinaGrid: Bioinformatics, image processing, computational fluid dynamics, remote education, and massive data processing
NAREGI: Nano-science applications
TeraGrid: Physics (Lattice QCD calculations, Turbulence simulations, Stellar models), Molecular Bioscience (molecular dynamics), Chemistry, Atmospheric Sciences
D-Grid-1: Astrophysics, high-energy physics, earth science, medicine, engineering, humanities

D-Grid: Towards a Sustainable Infrastructure for Science and Industry
Govt is changing policies for resource acquisition (HBFG!) to enable a service model
2nd Call: Focus on Service Provisioning for Sciences & Industry
Strong collaboration with: Globus Project, EGEE, DEISA, CrossGrid, CoreGrid, GridCoord, GRIP, UniGrids, NextGrid, …
Application and user-driven, not infrastructure-driven
Focus on implementation and production, not grid research, in a multi-technology environment (Globus, Unicore, gLite, etc.)
D-Grid is the Core of the German e-Science Initiative

Challenges for Research and Industry
• Sensitive data, sensitive applications (medical patient records)
• Different organizations get different benefits
• Accounting: who pays for what (sharing!)
• Security policies: consistent and enforced across the grid!
• Lack of standards prevents interoperability of components
• Current IT culture is not predisposed to sharing resources
• Not all applications are grid-ready or grid-enabled
• Not all open source is equal (read the small print)
• SLAs based on open source (liability?)
• “Static” licensing models don’t embrace the grid
• Protection of intellectual property
• Legal issues (e.g.
FDA, HIPAA, multi-country grids)

Lessons Learned and Recommendations
– During development and operation, the grid infrastructure should be modified and improved in large cycles only: all applications depend on this infrastructure!
– Continuity, especially for the infrastructure part of grid projects, is important. Therefore, funding should be available after the project, to guarantee services, support, and continuous improvement and adjustment to new developments.
– Interoperability: Use software components and standards from open-source and standards initiatives, especially in the infrastructure and application middleware layer.
– Close collaboration between the developers of the grid infrastructure and of the applications is mandatory, to best utilize grid services and to avoid application silos.
– Infrastructure should be user-friendly for easy adoption by new communities. The infrastructure group should offer installation/operation service and support.

Lessons Learned and Recommendations
– For complex projects (infrastructure and application projects), a management board (consisting of the leaders of the different projects) should steer coordination and collaboration among the projects.
– New projects should build on top of the generic grid infrastructure and focus on an application or on a specific service, to avoid complexity, re-inventing wheels, and building grid application silos.
– Centers of Excellence should specialize in specific services, e.g. integration of new communities, grid operation, utility services, training, support, etc.
– Participation of industry has to be industry-driven. Push from outside, even with government funding, is not promising. Success will come only from real needs, e.g. through existing collaborations between research and industry, as a first step.
– Implement utility computing in small steps, enhancing existing service models moderately, testing utility models first as pilots. Often, today’s government funding models are counter-productive for utility services.

Workshop: Sustainability in D-Grid, Oct 9 – 10, 2006
Sustainability in Grids
Sustainability and the funding organization (Govt)
Sustainability and monitoring, accounting, billing
Sustainability of the D-Grid Infrastructure
Sustainability and application communities
Example: DFN German Research Network
Sustainability and Industry
Sustainability and support
The European Grid Initiative (EGI)

Results of the Workshop
Requirements
There is a general need for a sustainable infrastructure
Funding agency demands cost-neutral operation
But: not only monetary considerations, but also research
Benefits for all constituencies
Long-term data preservation
International integration
Acceptance of infrastructure through ease of use
Long-term planning certainty for grid communities
Include learning (GridKa), testing, support, and production

Results of the Workshop
Challenges
Heterogeneous middleware complicates building a sustainable grid
Today: user-unfriendly and complex environments
Integration of new hardware from new partners and communities
Currently, D-Grid is not a ‘legal’ entity
Long-term financing of resources and their usage is not clear
Grid-enabled software licensing model is unclear
Broadening community grids beyond their current core members
Germany: “Laender” (federal state) investments restricted to local usage

Results of the Workshop
Steps towards a sustainable infrastructure
Significantly increase the number of grid users
Govt funding for D-Grid specific resources was key
Support of several middleware stacks is important
Long-term goal: independence of D-Grid from funding
Encourage Govt to change current funding policies for resources
User-friendly
user support of utmost importance (DGI & CGs)
Industry participation as users (SMEs) and providers (IT companies)

Results of the Workshop
Conclusions and Recommendations
D-Grid seems to be on track towards a sustainable infrastructure
A centralized resource infrastructure is important, but the how still has to be discussed (DGI vs CGs)
Implementation of sustainable D-Grid only together with users (CGs)
Sustainable usage (business) models only with users (CGs)
Integration of D-Grid in European infrastructure is important
Central D-Grid institution should encourage broad acceptance of D-Grid, incl. certification of and support for resources
Role of industry unclear, but participation possible today

Lessons Learned and Recommendations
– Continuity: Grid infrastructure should be modified and improved in large cycles only: applications depend on infrastructure!
– Sustainability: Funding should be available after the end of the project, to guarantee services, support and continuous improvement.
– Interoperability: Use open-source software and standards, especially in the infrastructure and application middleware layer.
– Collaboration: between infrastructure developers and the applications, to best utilize grid services and to avoid application silos.
– User-Friendliness: for easy adoption by new communities. Infrastructure group should offer installation, operation and support services.
– Grid Services: Centers of Excellence should specialize in specific services, e.g. integration of new communities, grid operation, utility services, training, support, etc.
– Participation of Industry: has to be industry-driven. Push from outside, even with government funding, is not promising. Success comes only from real needs, e.g. through already existing collaborations between research and industry.
Many Thanks to:
• UK e-Science: Tony Hey, Steven Newhouse, Carole Goble, Malcolm Atkinson, John Darlington, Trevor Cooper Chadwick, Monica Schraefel, Luc Moreau, Paul Watson, Aaron Turner
• TeraGrid: Charlie Catlett, Dane Skow
• ChinaGrid: Hai Jin
• NAREGI: Kazushige Saga, Satoshi Matsuoka, Kenichi Miura
• EGEE: Bob Jones, Dieter Kranzlmueller, Erwin Laure
• D-Grid: Uwe Schwiegelshohn, Wolfgang Guerich, Klaus Ullmann, Klaus Peter Mickel, Matthias Steinmetz, Matthias Kasemann, Wolfgang Hiller, Otto Rienhoff, Michael Resch, Elmar Mittler, Wilhelm Hasselbring
• RENCI: Dan Reed and Alan Blatecky