ETF – the Good, the Bad and the Ugly!
A Personal View of the Grid Engineering Task Force (its successes and failures)
Rob Allan
e-Science Centre, CCLRC Daresbury Laboratory
14/7/04

The Grid
"Grid technology is a response to the needs of some of today's most challenging scientific problems and enables integration and use of experimental, computing, data and visualisation resources on a global scale."
BUT e-Science is more than just the Grid!
AND it needs applications and users!

What if ?
(Examples relevant to quantitative Social Science research.)
• You could automatically access digital markup facilities for qualitative data (video, voice, text, historical documents, paintings, 3D material, geographical data, images);
• You could test theories by applying best interpretation methodology to marked-up data;
• You could do this across multiple datasets;
• You could access numerical solvers with appropriate precision and performance for coupled non-linear equations;
• You could match your research questions to information held in existing digital resources, and search for new explanations;
• You could integrate management and learning techniques with routine research processes using context and semantics;
• Integrating multiple sources could help to fill in missing data and ideas.

Building a Grid Infrastructure
• ETF Coordination: activities are coordinated through regular Access Grid meetings, e-mail and the Web site;
• Resources: the components of this Grid are the computing and data resources contributed by the UK e-Science Centres (plus JCSR-funded systems), linked through the SuperJanet4 backbone to regional networks;
• Middleware: many of the infrastructure services available on this Grid are provided by Globus GT2 software;
• Directory Services: a national Grid directory service using MDS links the information servers operated at each site and enables tasks to call on resources at any of the e-Science Centres;
• Security and User Authentication: the Grid operates a security infrastructure based on X.509 certificates issued by the e-Science Certificate Authority at the UK Grid Support Centre at CCLRC;
• Access Grid: on-line meeting facilities with dedicated virtual venues and multicast network communication.

Regional e-Science Centres
[Map of the UK Regional e-Science Centres – "You are here!"]

Engineering Task Force
• Addresses infrastructure – not user-facing
• Resourced by 0.5 FTE from the regional e-Science Centres
• Participation from e-Science Centres and Centres of Excellence
• Built, and now co-ordinates, the Globus-based "Level 2 Grid" – on resources volunteered by many institutions
• Emphasis now on pre-deployment evaluation of middleware – in conjunction with NGS and OMII
• http://www.grids.ac.uk/ETF/index.shtml

Grid Deployment Phases
[Timeline diagram, 2001–2005, of Grid deployment phases Level 0 to Level 5 – labelled "Evaluation", "Skeleton", "Production", "Core + Development" and "Service" – moving from library-based Globus GT1/GT2 (StarterKit, GSC + Globus GT2…) through GT3/GT3.2 and OGSI testbeds to service-based GT4 beta, GT4 and GT4.2 (OGSA?, WSRF): the "Wonderful World of Web Services…"]

Evaluation and Skeleton
Level 0
• Evaluation phase – the UKHEC Centres completed Globus evaluation projects and prepared a joint report for the DTI
  – University of Edinburgh
  – University of Manchester
  – Daresbury Laboratory
Level 1
• Skeleton Grid used GT1.1.4 at the Regional Centres
• Grid Support Centre set up at Edinburgh, Manchester and CCLRC
• Access Grid nodes installed at Regional Centres
• Town Meeting and launch of GSC CD, July 2001
• 6 EPSRC e-Science Pilot Projects

Level 2 Functionality
Globus GT2.2 pillars include:
• Resource discovery via MDS and OpenLDAP
• File transfer via GASS and GridFTP
• Authentication via GSI and OpenSSL
• Remote login via GSI-SSH
• Job submission via GRAM
Current applications and tools have been built from these pillars using the C API or Java CoG kit, with some additional Web services. Supplementary tools such as Access Grid and GridSite are used. We are investigating tools developed in other projects to compare with our own, e.g. comparing R-GMA (EGEE) with InfoPortal:
– recent NeSC workshop on Grid Information Services 2003
– recent NeSC workshop on Portals and Portlets 2003

Contribution or Collaboration ?
• Large collaborative projects suffer from:
  – Geographically distributed partners, too difficult to manage
  – Too many meetings – "herding cats"
• On the other hand a "community" can:
  – Work to a commonly agreed goal
  – Develop independent tools
  – Contribute and share
• This view is reflected in ETF "management":
  – Vigorous Grid deployment community
  – Useful tools developed
  – Shared expertise and experiences

Adding Value to the Grid Middleware
These tools have been developed by the Regional e-Science Centres and CCLRC:
• Certificate Authority and Grid Support – OpenCA and Remedy/ARS (CCLRC)
• GITS: Grid Integration Test Script (Southampton)
• InfoPortal (CCLRC)
• VOM and RUS (Imperial)
• Secure IP database (Oxford)
• ICENI (Imperial)
• IeSE: HPCPortal and DataPortal (CCLRC)
• Nimrod-G (Cardiff)
• GridMon (CCLRC)
Applications were presented in talks and in the poster and demonstration sessions at AHM2003.

GITS: Grid Integration Test Script
Used to test Globus functionality between sites; the result matrix is fed into monitoring tools via a Web service.

Some Applications used on L2G
• Monte Carlo – simulations of ionic diffusion through radiation-damaged crystal structures. Mark Hayes and Mark Calleja (Cambridge)
• GENIE – integrated Earth system modelling with ICENI. Steven Newhouse, Murtaza Gulamali and John Darlington (Imperial College), Paul Valdes (Reading), Simon Cox (Southampton)
• BLAST – for post-genomics studies. John Watt (Glasgow)
• Nimrod/G – with astrophysical applications. Jon Giddy (Cardiff)
• DL_POLY (CCP5) – via the e-Minerals portal. Rob Allan, Andy Richards and Rik Tyer (Daresbury), Martin Dove and Mark Calleja (Cambridge)
• Grid Enabled Optimisation – vibrations-in-space application to satellite truss design. Hakki Eres, Simon Cox and Andy Keane (Southampton)
• RealityGrid – computational steering for chemistry.
Stephen Pickles, Robin Pinning (Manchester), Jonathan Chin, Peter Coveney (UCL)
• R-Matrix (CCP2) – electron-atom interactions for astrophysics. Terry Harmer (Belfast)
• GITS – David Baker (Southampton)
• ICENI – Steven Newhouse, Nathalie Furmento and William Lee (Imperial College)

Grid Engineering Task Force
http://www.grids.ac.uk/ETF/index.shtml

A brief Pause…
• Following the L2G report at Easter 2003, I proposed to hold a Grid Retreat to identify further tools coming from Centres which could be integrated into the growing Grid infrastructure, and ways to encourage more users.
• This was instead replaced by the Stakeholders' Town Meeting.
• I proposed changes to the organisation, including the urgent need for a Users' Group.
• An OGSI "brainstorming" meeting was also held at Cosener's House.
• Lack of focus meant nothing really happened after that.
• Many groups, including my own, started to "play" with OGSI and Web services.

All Hands 2003 and after
• A large number of papers presented at AHM'03 referred to work done on the L2G
• Stands using the L2G had the now infamous "deflating" balloons
• User Group and other changes proposed in a rapid series of meetings
• Unfortunately the Grid also deflated as OGSI came along, followed by Ian Foster's "Bump in the Road" with the announcement of WSRF
• Proposed developments did not happen
• Projects were uncertain which way to go
• Tony Hey held his Town Meeting on 18/12/03 to address this crisis and announced new organisational structures focussed on the Grid Operations Centre

Levels 3 and 4
Level 3 (2003)
• Papers written on:
  – Grid convergence
  – A production Grid
  – Grid support
• User group and UTF proposed
• JCSR clusters purchased to establish what became known as NGS – ETF roadmap for NGS
Level 4 (2004)
• Now evaluating Web service middleware – see breakout groups
• Issues of scalability and reliability still need to be addressed
• Security model ???
• OGSA Testbeds working with OGSI, OGSA-DAI and WSRF implementations

NGS: National Grid Service
• Currently in pre-production phase
• Core comprises:
  – JISC JCSR-funded nodes
    • Compute clusters at Leeds and Oxford (64 dual-processor systems)
    • Data clusters at RAL and Manchester (20 dual-processor systems, 18 TB)
    • Access is free at point-of-use, subject to light-weight peer review
  – National HPC services HPCx and CSAR
• Volunteer nodes to be added, subject to a minimum SLD
• Middleware basis:
  – Globus Toolkit version 2.4.3 (from the VDT distribution) plus "goodies"
  – Oracle, SRB and OGSA-DAI on data nodes
  – SRB client on compute nodes
• Access through UK e-Science (or other recognised) certificates
• First line of support provided by the Grid Support Centre – until the Grid Operations Support Centre is established

NGS Web site
http://www.ngs.ac.uk

User Requirements
• Grid user base split into groups:
  – Resource managers
  – Developers
  – End users
• NGS must provide tools and support for all these groups
• Why no tools ?
• Need to understand user requirements and responses to Grid computing – User Group ??? ETF told not to do this !

Call for more Grid Applications
CCPs are "real" collaborations of people with long experience and genuine scientific challenges. JCSR Centres are being pro-active in seeking and promoting applications for the Grid (computational and data intensive).
• Current e-Science testbed applications are not production quality
  – Need to make apps available and usable by non-experts
  – Need to map computational requirements to resources
  – Data-intensive applications new to the Grid (except certain cases)
• Many un-tapped application areas, e.g. inter-disciplinary work linking several different CCP areas
  – Data privacy and security may be new issues
• Social Science and Arts and Humanities now coming on board
  – Ethics and usability become new issues
• e-Science is more than the Grid!
Please complete the questionnaire and return to: Survey@leeds.ac.uk

Why Web Services ?
• An example of "contribution" from a community software development process:
  – Commonly agreed specifications
  – Independent service development
  – Register and lookup services
  – In line with e-Business
• OGSI -> WSRF
  – An attempt to specify or "manage" the resource framework
• But compute- and data-intensive research is different to business
  – As the Grid becomes heavily used, WSRF won't work.
  – It needs the equivalent of a batch system!!!
• Where do we go from here ?

The Grid "Client Problem"
Many clients want to access a few Grid-enabled resources.
[Diagram: a Grid Core of middleware, e.g. Globus, accessed by workplace desktop clients; portable clients: phones, laptops, PDAs, data entry…; consumer clients: PC, TV, video, AG]

A Virtual Research Environment
Current JCSR Call for Proposals:
• Be inclusive;
• Enable an open community process for producing and consuming services and tools;
• Fast-track links to existing specifications, standards and technology to avoid new developments;
• Fast-track links into existing tools, services and resources – some of which have been costly to produce and should be re-used;
• Make UK services and resources available in "familiar environments", e.g. via a Web browser;
• However, there should be choice in presentation, delivery, service and resource provision;
• Integrate e-Research, e-Learning and Digital Information to simultaneously add value to all;
• Must have a lightweight installation procedure to overcome the "client problem";
• Must demonstrate added value with respect to existing tools and Portals, and contain relevant training;
• Must have real users from Day 1 and keep them engaged, e.g. by doing new things such as on-line e-Collaboration. Get feedback and ensure that they like it!
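The "equivalent of a batch system" point made under Why Web Services ? amounts to queueing job requests rather than servicing every call synchronously, as a plain WSRF service would. A minimal sketch of that idea follows; all names (BatchQueue, the job names) are invented for illustration and are not part of any Globus or WSRF API:

```python
import heapq
import itertools

class BatchQueue:
    """Toy priority job queue: the kind of buffering a heavily used
    Grid service needs instead of handling each request immediately."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # tie-breaker preserves submission order

    def submit(self, job_name, priority=10):
        # Lower number = higher priority; equal priorities run in submit order.
        heapq.heappush(self._heap, (priority, next(self._seq), job_name))

    def dispatch(self):
        # Hand the next job to a (hypothetical) execution back-end.
        if not self._heap:
            return None
        _, _, job_name = heapq.heappop(self._heap)
        return job_name

q = BatchQueue()
q.submit("dl_poly_run", priority=5)
q.submit("blast_search")              # default priority 10
q.submit("genie_coupled_model", priority=5)
print(q.dispatch())  # dl_poly_run (priority 5, submitted first)
print(q.dispatch())  # genie_coupled_model
print(q.dispatch())  # blast_search
```

A real scheduler would add resource matching, fair-share accounting and persistence; the point here is only the decoupling of request acceptance from execution.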
ARDA: Key Services for Distributed Analysis
[Diagram: the ARDA service decomposition – User Interface and API; Authentication, Authorisation, Auditing and Accounting; Information Service; Job Provenance; Metadata Catalogue; File Catalogue; DB Proxy; Workload Management; Data Management; Package Manager; Computing Element; Storage Element; Grid Monitoring; Job Monitor]

ARDA: API and User Interfaces
[Diagram: an API (OGSI User Interface Factory) exposing Authentication, Data Management, Grid Service Management, Job Control, Metadata Management and POSIX I/O; consumed by Experiment Frameworks (POOL/ROOT/…), Portals and Grid Shells via SOAP and Grid File Access; Grid File System and Storage Element (POSIX I/O service) beneath]
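The GSI security model described earlier (X.509 certificates from the e-Science CA) and the Authentication/Authorisation services in the ARDA picture both come down to mapping a certificate's distinguished name to a local account, classically via a Globus grid-mapfile. A minimal parser sketch, assuming the standard quoted-DN format; the DNs and usernames below are invented examples, and real grid-mapfiles support extras (multiple accounts per DN, etc.) not handled here:

```python
def parse_gridmap(text):
    """Parse grid-mapfile lines of the form:
       "/C=UK/O=eScience/OU=Site/CN=Some User" localaccount
    Returns a dict mapping certificate DN -> local username."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        if not line.startswith('"'):
            continue  # skip malformed lines: the DN must be quoted
        end = line.index('"', 1)
        dn = line[1:end]
        user = line[end + 1:].strip().split()[0]
        mapping[dn] = user
    return mapping

def authorise(mapping, dn):
    # Return the local account for this certificate DN, or None if not authorised.
    return mapping.get(dn)

sample = '''
# example entries (invented DNs)
"/C=UK/O=eScience/OU=CCLRC/CN=alice example" alice
"/C=UK/O=eScience/OU=Leeds/CN=bob example" grid001
'''
gm = parse_gridmap(sample)
print(authorise(gm, "/C=UK/O=eScience/OU=CCLRC/CN=alice example"))  # alice
print(authorise(gm, "/C=UK/O=eScience/CN=mallory"))                  # None
```

An unknown DN yields None, i.e. "not authorised" – the exact-match lookup is why every user's certificate subject must be registered at each site they use.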