UK Grid Support Centre
CLRC + EPCC + MC

Rob Allan
CLRC e-Science Centre, Daresbury Laboratory
http://www.grid-support.ac.uk
http://www.e-science.clrc.ac.uk
r.j.allan@dl.ac.uk
22nd October 2001
UK Grid Support Centre - GSC

Outline
• The issues – why we need a Grid Support Centre
• The team – who we are
• The service – what we will provide

HPC and Scientific Opportunity in the Electronic Era
Report of the Trends and Opportunities Panel (P.J. Durham, March 2001)
“Simulation of whole systems, and not just system components”
• From Molecules to Matter – Electronic Structure
  – The Structure of Proteins
  – Materials Design
  – Heterogeneous Catalysis
  – Environmental and Atmospheric Science
  – Rational Drug Design
  – New areas which will require exceptional computer resources:
    • Bridging Length and Time Scales
    • Quantum Computation
    • Thermodynamics
• From Molecules to Cells and beyond – Computational Biology
• From Eddies to Aircraft – Fluid Dynamics
• From Oceans to the Earth – Environmental Modelling
• From the Earth to the Solar System – Solar Plasma Physics

Scientific Software Infrastructure: One of the Major Software Challenges
Peak Performance is skyrocketing (more than Moore’s Law)
  – In past 10 years, peak performance has increased 100x; in next 5+ years, it will increase 1000x
but ...
  – Efficiency has declined from 40-50% on the vector supercomputers of the 1990s to as little as 5-10% on the parallel supercomputers of today, and may decrease further on future machines
Research challenge is software
  – Scientific codes to model and simulate physical processes and systems
  – Computing and mathematics software to enable use of advanced computers for scientific applications
  – Continuing challenge as computer architectures undergo fundamental changes: algorithms that scale to thousands or millions of processors

Improvements in Large-Area Networks
• Network vs. computer performance
  – Computer speed doubles every 18 months
  – Network speed doubles every 9 months
  – Difference = order of magnitude per 5 years
• 1986 to 2000
  – Computers: x 500
  – Networks: x 340,000
• 2001 to 2010
  – Computers: x 60
  – Networks: x 4000
[Figure: Moore’s Law vs. storage improvements vs. optical improvements. Graph from Scientific American (Jan 2001) by Cleo Vilett; source: Vinod Khosla, Kleiner Perkins Caufield & Byers.]

Hardware Infrastructure (e.g. DOE SciDAC): A Future Model for the UK?
• Flagship Computing Facility
  – To provide robust, high-end computing resources for all research programs, e.g. TeraGrid
• Topical Computing Facilities
  – To provide the most effective and efficient computing resources for a selected set of scientific applications
  – To serve as a focal point for a scientific research community as it adapts to new computing technologies
• Experimental Computing Facilities
  – To assess new computing technologies for scientific applications
• All Facilities on “The Grid”

Why Topical Facilities?
Variation in Scientific Application Needs*

Code       Application        Time            Memory     Storage    Node I/O
                              (TFLOP/S-HRS)   (TBYTES)   (TBYTES)   (MBYTES/S)
Cactus     Astrophysics       300             1.8        20         5
ARPS       Weather            25              0.25       16         18
MILC       Particle Physics   10,000          0.2        1          3
PPM        Turbulent Flow     500             0.5        54         6
PUPI       Liquids            150             0.1        0.2        3
ASPCG      Fluid Dynamics     5,000           0.5        50         3
ENZO       Galaxies           1,000           0.9        10         12
Variation                     400x            18x        100x       6x

* From “High-level Application Resource Characterization,” NSF/PACI (National Computational Science Alliance, May 2000)

The 13.6 TF/s TeraGrid: Computing at 40 Gb/s
[Figure: TeraGrid/DTF network diagram linking NCSA/PACI (8 TF, 240 TB), SDSC (4.1 TF, 225 TB), Caltech and Argonne; each site shows its site resources, archival storage (HPSS or UniTree) and external network connections. www.teragrid.org]

Grid/Cluster Computing: Where do distributed Clusters fit?
[Figure: spectrum of systems from distributed to MPP: SETI@home, Internet, Condor, Legion\Globus, Beowulf, Berkeley NOW, Superclusters, ASCI Red Tflops. Figure labels: “1 TF/s delivered” and “Potential 15 TF/s delivered”.]

Distributed systems
• Gather remote resources (unused)
• System S/W manages resources
• System S/W adds value
• 10% - 20% overhead is OK
• Resources drive applications
• Time to completion is not critical
• Time-shared
• Commercial: Entropia, PopularPower, United Devices, Centrata, ProcessTree, Applied Meta

MPP systems
• Bounded set of resources
• Apps grow to consume all cycles
• Application manages resources
• System S/W gets in the way
• 5% overhead is maximum
• Apps drive purchase of equipment
• Real-time constraints
• Space-shared

Why we need a GSC
• Software quality – much Grid software is relatively new and immature
• Local support – most Grid software is from US research groups
• UK strategy – focus on a set of software and provide good local support
• User skills – many users will be
scientists and not software experts
• Coordination – large user base needs coordinated support and development

EPSRC Projects

Name: DAME
  Partners: York, Oxford, Sheffield, Leeds, Rolls Royce, Data Systems and Solutions
  Theme: Distributed Aircraft Maintenance Environment
Name: CombEchem
  Partners: Southampton, Bristol, Roche Discovery, Pfizer, IBM, Cambridge Crystallographic
  Theme: Structure-property Mapping: Combinatorial Chemistry and the Grid
Name: Reality Grid
  Partners: QMW, Edinburgh, Manchester, Loughborough, CfS, Schlumberger Cambridge, Edward Jenner, SGI, AVS, FECIT
  Theme: A tool for investigating condensed matter and materials
Name: GEODISE
  Partners: Southampton, Oxford, Manchester, Rolls Royce, BAe Systems, Fluent, Intel, Microsoft, Epistemics, Compusys
  Theme: Grid enabled Optimisation and Design Search for Engineering
Name: Discovery Net
  Partners: ICST, Inforsense, Deltadot, Rvco
  Theme: E-Science Testbed for high-throughput Informatics
Name: MyGrid
  Partners: Manchester, Nottingham, Sheffield, EBML, IBM, GlaxoSmithKline, Sun, AstraZeneca, Merck, Epistemics, Network Inference, geneticXchange
  Theme: Directly supporting the e-Scientist

PPARC Projects

Name: GridPP
  URL + Sponsor: www.GridPP.ac.uk
  Theme: UK LHC analysis programme linked to EU DataGrid project
Name: AstroGrid
  Theme: Astrophysical Images and analysis

Other Research Councils are currently reviewing proposals

Core Projects

Name: Regional Centres
  URL: www.nesc.ac.uk
  Theme: Support for e-Science projects and outreach to UK industry
Name: Grid Support Centre
  URL: www.grid-support.ac.uk
  Theme: Helpdesk and in-depth support for funded e-Science projects
Name: AccessGrid
  URL: www.mcs.anl.gov/FL/accessgrid
  Theme: Create & deploy group collaboration systems using commodity technologies
Name: Grid Starter Kit
  URL: www.grid-support.ac.uk
  Theme: Tutorial material, software and documentation: Globus, Condor, SRB
Name: Evaluation Reports
  URL: Consolidated report to be available soon...
  Theme: Evaluate Globus and related middleware based on experience of UKHEC sites

UK Grid Support Centre key challenges
Develop a support mechanism for grid applications and distributed job execution. UKHEC expertise available.
Help with distributed user administration
• authentication and certificates (UK CA)
• registration and grid-mapfiles
• e-Science project database
Help with installing Grid middleware
• Globus, Condor, SRB
• accounting and resource management middleware
• new middleware components, e.g. coscheduling and instrumentation
• network monitoring
Help with system administration
• reference system and demonstrator grid
Who do you call when something goes wrong?

Who is the Support Team?
UKHEC Sites:
• CLRC e-Science Centre (RAL + DL)
  – running our own OST-funded e-Science programme
  – major involvement in GridPP and other Grid projects
  – HPCI Centre and extensive HPC applications development
• Edinburgh Parallel Computing Centre / University of Edinburgh
  – National e-Science Centre (Edinburgh + Glasgow)
  – HPCI Centre and extensive computing applications experience
• Manchester Computing / University of Manchester
  – NW Regional e-Science Centre
  – operates CSAR national HPC service
www.ukhec.ac.uk

UKHEC and GSC Background
In January 2000 EPSRC funded a 3-year core activity, UKHEC, to track and disseminate information on international activities in computer architectures, software and programming tools, and to promote good programming practice for the HEC community via workshops, seminars and mentoring.
GSC commenced operation with distribution of the Grid Starter Kit and announcement of the helpdesk at the e-Science Core Programme meeting on 27th July 2001.

Staff at sites:

Site         UKHEC   GSC
CLRC         2       4
Edinburgh    2       1
Manchester   1       1

also HPCI and CCL support teams
A strategic collaboration between the main UK centres offering nation-wide academic computing support.

UKHEC Key Topics
• Hardware: SMP/DMM (ASCI) architectures + clusters (Beowulf)
• Software Development and QA Tools: faster development, maintenance and exploitation
• Languages (Java, C++, Fortran90): ease of use and performance
• Optimisation: need highest performance - can tools help?
• Visualisation and VR: demonstrate VR capabilities for scientists
• Data Management: demonstrate fast storage and access for science apps.
• Grid computing environment: evaluation and use of Globus, coordination of e-Science activities between centres
• Standards: portability and longer code lifetime
• OpenMP/MPI Programming: optimise for heterogeneous architectures

Background: Reports, Meetings and Workshops
Survey: “Survey of Computational Steering, Meta-computing and Network Information Tools” (R.J. Allan, DL-TR-99-002) and updated edition on-line
  www.dl.ac.uk/TCSC/HPCI/reports.html
EPSRC GRID awareness meetings, e.g. RI, 27 March 2000; Polaris House, 1 June 2000
UKHEC Grid Seminar and Workshop, 21-22 June 2000
  www.dl.ac.uk/TCSC/UKHEC/GridWorkshop
Technical Report and Report for EPSRC:
  “Grid-based High Performance Computing”
  www.dl.ac.uk/TCSC/UKHEC/metacomputing/metacomputing.pdf
  “A Review of UK HEC Grid Infrastructure: State-of-the-art and Next Steps”
  www.ukhec.ac.uk/publications/reports/ukhec-grid.pdf
UKHEC-funded early activities meant that the partners were in a good position to lead in the UK’s e-Science programme. Set up of e-Science Centres and UK Grid Support Centre has been achieved.

What will GSC provide?
a) Information
b) Technical support
c) Development
d) Technical liaison

e-Science and Grid Support Centres: What have we done so far?
Experience of Globus and GSI installations: SUN Solaris, IBM AIX, Linux (RedHat and SUSE), Compaq Tru64, SGI Irix
Globus working with local RMS: PBS, LoadLeveler, NQE
Evaluations of Condor and SRB on Sun and Linux
Grid software distribution portal: linked to DisCo archive, now part of Grid Support Centre
Grid Starter Kit, help desk and Web sites
Re-launch of HPCProfile as HPCGrid Magazine
Membership of GGF working groups, mailing lists and NPI
UKHEC Globus Course under development
Centres established:
  National e-Science Centre www.nesc.ac.uk
  CLRC e-Science Centre www.e-science.clrc.ac.uk
  NorthWest Regional e-Science Centre
  UK Grid Support Centre www.grid-support.ac.uk
  7 other Regional Centres

Information
• Support Web site www.grid-support.ac.uk
  – downloadable software
  – installation guides
  – documentation
  – introductory material
  – evaluation reports (soon)
• Grid Starter Kit via main site
  – access via the Web site (kept up to date) or on CD-ROM
  – supports installation of Globus, Condor, SRB
  – additional software as required, e.g.
    GridEngine, LSF
• National e-Science Centre www.nesc.ac.uk
  – links to other Regional Centres

Grid Support Centre Web Site
www.grid-support.ac.uk

Technical support (1)
• Help desk support@grid-support.ac.uk
  – staffed during normal office hours (9h-17h)
  – contactable via Web, email (or phone)
  – normally first point of contact with the Support Centre
  – located at RAL but provides links to experts at all centres
  – uses ARS Remedy help desk management system
• Expert staff at 3 sites
  – provide help with installations
  – work closely with Regional Centres
  – visit user sites where necessary
  – provide in-depth technical expertise for complex problems
• Intended to work closely with Regional Centres

Technical support (2)
• Certificate Authority ca@grid-support.ac.uk
  – issues digital certificates to e-Science programme participants
  – will require nominated project, National or Regional Centre contacts to validate applicants
  – policy statement to be defined, agreed and published
• Reference systems
  – running recommended software installations
  – will be available for users to access and study
• Skeleton (Prototype) Grid
• Training courses
  – required for both developers and system administrators
  – e.g. this course!!!

Software Distribution and Support: Grid Starter Kit (CD and Web)
Software Download Portal

Skeleton Grid
• Link Regional Centres
• Prototype production Grid
• Identify issues for UK research and development
• Provide facilities for UK demonstrator and pilot projects
• Take part in International Testbeds, e.g. SC’2001, EcoGrid… watch this space!

Future work
• Evaluate new software
  – leading to recommended additions to the supported programme software base
• Custom developments
  – software integration, enhancement, and filling gaps
  – new capabilities provided through the Support Centre
• User registration and accounting
  – proposal to base on UoM Unix User Registration System
  – ease the problem of registering certified users on multiple Grid resources shared within Virtual Organisations

Technical liaison
• Collaborate with Research Council application programmes and Core Programme Centres
• Develop links with US software development teams
• Collaborate on software developments
• Exchange technical staff with development teams
• Participate in Consortium for Open Grid Software
• Contribute to Global Grid Forum WGs, W3C, IETF, etc.
• Develop links with industry

Working with Industry
Grid Middleware and Services
  IBM - e-Services
  SGI - e-Services
  SUN - GridEngine, DRB, iPlanet, TCP
  Platform Computing - Load Sharing Facility (LSF)
  Entropia - harnessing Windows platforms
  New Productivity Initiative (NPI) - DRM standardisation
  Extreme Networks - QoS and high-performance networking solutions
End Users
  Unilever
  Jaguar
  BAe Systems
  others in future ...
NorthWest Development Agency

Advert: First GSC workshop in association with 12th Daresbury Machine Evaluation Workshop, 28-30 November 2001. See www.cse.clrc.ac.uk/Activity/DisCo

How to contact us
• Via the Web – http://www.grid-support.ac.uk/
• By email – support@grid-support.ac.uk
• By phone – 01235 446822
• Web or email are preferred

Summary
• It’s all happening very quickly!
• Support service is now operational
  – but please bear with us as we get up to speed
  – staff recruitment in progress...
• Constructive comments welcome
• Priority is to get new users up and running
• This is a partnership with the UK e-Science community
  – we’ll do our best to help you
  – but please also try to help us as well!

Publications and URLs
Publications
  R.J. Allan, Survey of Computational Steering, Meta-computing and Network Information Tools, DL-TR-99-002 (Daresbury Laboratory, 1999 and 2000)
  UKHEC, Grid-based High Performance Computing (2000)
  UKHEC, A Review of UK HEC Grid Infrastructure: State-of-the-art and Next Steps (2000)
  R.J. Allan et al., Evaluation of Globus and associated Middleware (CLRC, 2001)
  R.J. Allan et al., A Globus Developer’s Guide (2001)
  R.J. Allan, Developing a Web Portal for the Computational Grid (2001)
URLs
  www.grid-support.ac.uk
  www.e-science.clrc.ac.uk
  www.ukhec.ac.uk
  www.dl.ac.uk/TCSC/UKHEC/GridWorkshop
  www.dl.ac.uk/TCSC/UKHEC/WG
  www.dl.ac.uk/TCSC/HPCI/reports.html
CD-ROM Grid Starter Kit
More stuff available, please call us!
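
Worked check: network vs. computer growth
The "Improvements in Large-Area Networks" slide states that computer speed doubles every 18 months, network speed doubles every 9 months, and that the difference amounts to an order of magnitude per 5 years. The short Python sketch below checks that arithmetic. It is an illustration only and assumes idealised exponential growth at exactly the quoted doubling times; the function name `growth_factor` is made up for this example and does not come from the slides.

```python
def growth_factor(months: float, doubling_months: float) -> float:
    """Speed-up after `months` if speed doubles every `doubling_months`.

    Assumes idealised, uninterrupted exponential growth.
    """
    return 2.0 ** (months / doubling_months)


FIVE_YEARS = 60  # months

# Doubling times quoted on the slide (in months).
computers = growth_factor(FIVE_YEARS, 18)
networks = growth_factor(FIVE_YEARS, 9)

# How much further networks have pulled ahead of computers after 5 years.
gap = networks / computers

print(f"computers: x{computers:.1f}")
print(f"networks:  x{networks:.1f}")
print(f"gap:       x{gap:.1f}")
```

Under these assumptions computers improve by roughly a factor of ten over five years and networks by roughly a factor of a hundred, so the gap between them is about a factor of ten, which matches the "order of magnitude per 5 years" on the slide.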