GT2 and GT3 experiences on INWA and HPC Europa
UK Globus Week – 7th April 2005
Terry Sloan
Project Manager, EPCC
t.sloan@epcc.ed.ac.uk
+44 131 650 5155

Overview
• INWA - use of GT2 and GT3
  – http://www.epcc.ed.ac.uk/inwa
• HPC Europa – use of GT 3
  – http://www.hpc-europa.org/
• GT 4 wish-list

31-May-16 UK Globus Week 2

INWA

INWA Background
• Funded by the UK Economic & Social Research Council in the Pilot Projects in E-Social Science
  – Small scale projects to explore the potential of Grid technologies within the social sciences
  – Informing Business & Regional Policy: Grid enabled fusion of global data & local knowledge
  – INWA: Innovation Node Western Australia
• Project Aims
  – Evaluate the suitability of existing grid solutions for secure distributed data mining and analysis on commercially sensitive data
  – Investigate the advantages of fusing public and private data enabled by a grid environment
• Two funding phases
  – 1st phase November 2003 to August 2004
    – set up INWA grid between UK and Australia
    – data mining between sites over the grid
  – 2nd phase started November 2004, due to finish May 2005
    – Addition of a node in China to the existing INWA grid
    – Basic data mining over the grid

Grid software employed
• Transfer-queue Over Globus (TOG) v1.1 from the UK e-Science Sun Data and Compute Grids project
  – provides access to remote HPC resource
  – Uses Globus Toolkit 2.4
• Open Grid Services Architecture – Data Access and Integration (OGSA-DAI) Release 3.1
  – provides access control and discovery of distributed heterogeneous data resources
  – Uses Globus Toolkit 3.0
• First Data Investigation on the Grid (FirstDIG)
  – grid data service browser provides SQL access to OGSA-DAI enabled resources
  – now part of OGSA-DAI R4.0/5.0
• Globus Toolkit 2.4 and 3.0
  – Grid middleware

The INWA Grid
[Diagram: two sites. EPCC, UK: TOG, Grid Engine, OGSA-DAI bank data, OGSA-DAI UK property data, Data Browser. Curtin, Australia: TOG, Grid Engine, OGSA-DAI telco data, OGSA-DAI Australian property data, Data Browser. Also shown: Bank, Telco, user@edinburgh, user@perth.]

TOG (Transfer-queue Over Globus)
[Diagram: Site A and Site B, showing User A, User B, Grid Engine at each site, a transfer queue and Globus 2.2.x, with steps labelled a to h.]
  – Integrates Grid Engine and Globus 2.2.x/2.4
  – Globus GSI for security, GRAM for interaction with remote GE
  – GASS for small data transfer, GridFTP for large datasets
  – Written in Java - Globus functionality accessed through Java CoG kit

TOG/GridEngine/Globus set-up
[Diagram]

UK-Australia: lessons learned
• Performing Data Integration:
  – Time zone date problems
    – Dates are stored as a time, so
    – 6:00am Dec 25th in Perth, Australia is converted to
    – 10:00pm Dec 24th in Edinburgh, UK
    – If data is processed in the UK, the wrong date is used.
• Security issues:
  – Bugs in Globus Java CoG in GT3
  – OGSA-DAI could not switch security for Grid data transfers
  – TOG had no security option
  – All of these have been fixed
• Middleware not mature enough for commercial deployment
  – Not out-of-the-box: significant effort to build grid
  – Bug fixes were required
  – Sys admin skills still necessary to maintain the grid
  – Scalability: difficulty with large results in OGSA-DAI V3.1
    – Fixed in OGSA-DAI V4.0

UK-Australia-China: lessons learned
• Reverse Domain Name Service (DNS) lookup

    tms@e3500$ nslookup -sil 129.215.56.231
    Server:   129.215.56.230
    Address:  129.215.56.230#53
    231.56.215.129.in-addr.arpa  name = e3500.epcc.ed.ac.uk

• Required by GT 2: only for sustaining connections, not for establishing them
But
• In China, few IP addresses relative to demand
• Usually not possible to configure reverse DNS lookup at the same DNS server that handles the usual forward DNS lookup
• Had to explicitly configure INWA participating machines

HPC Europa

HPC Europa
• Full title: Pan-European Research infrastructure on High Performance Computing for the Science of the 21st Century
• Goal: to provide advanced computational services in an integrated way to the European Research community
• 14 partners across Europe
• Project activities
  – Transnational Access Programme
  – Networking Activities
  – Joint Research Activities (JRA1, JRA2)

JRA2: Single Point of Access
• Motivation
  – To provide uniform access to the resources of all centres, transparently and regardless of physical location
• To achieve this
  – Building an HPC Europa portal to provide access
  – Developing and adapting the necessary tools
  – Participating JRA2 centres have existing middleware brokers on top of either Globus or Unicore
  – Developing a generic portal to sit on top of these
• For EPCC this means
  – JOSH (JOb Scheduling Hierarchically) with GT3.2 and Grid Engine

JOSH (JOb Scheduling Hierarchically)
• Based on Globus 3 and grid services
• Adds a new 'hierarchical' scheduler above Grid Engine
  – Command line interface
    – hiersched submit_ge
  – Takes GE job script as input (embellished with data requirements)
  – Queries grid services at each compute site to find best match and submits job
  – Job controlled through resulting 'job locator'
[Diagram: User, hiersched user interface, Job Spec, Hierarchical Scheduler, two Grid Service Layers each above a Grid Engine, Input Data Site, Output Data Site.]

GT 3.2 Issues
• MMJFS
  – Rogue UHE processes blocking ports
• Ease of installation
  – Still requires significant skills/time
• Documentation
• NOAA
  – Experimental Grid between FSL and PMEL using JOSH, GT3.2, Grid Engine
  – Did not use it for a few days, came back, it had stopped working, found the error message in the GT pages, no fix for it
  – 'A very frustrated Globus user'
  – Reinstalled GT3.2 and it all worked again, but does not know why

GT 4 wish-list

GT 4 wish-list
• Easier installation and maintenance
• Migration path from GT 3.2 to GT 4
• Documentation
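
Note on the time zone date problem (UK-Australia lessons learned): the arithmetic behind the 6:00am Dec 25th Perth to 10:00pm Dec 24th Edinburgh shift can be reproduced with a short sketch. This is an illustration only, not the project's code. The year 2004, the fixed offsets (UTC+8 for Perth, UTC+0 for the UK in December) and the helper name local_date_at are assumptions made for the example. The last step shows one possible mitigation, reading the calendar date back in the zone where the record was created. The slides do not say how INWA actually handled it.

```python
from datetime import date, datetime, timedelta, timezone

# Assumed fixed offsets for the example: Perth at UTC+8, UK winter time at UTC+0.
PERTH = timezone(timedelta(hours=8), "AWST")
UK_WINTER = timezone(timedelta(hours=0), "GMT")


def local_date_at(instant: datetime, zone: timezone) -> date:
    """Hypothetical helper: calendar date of an instant as seen in a given zone."""
    return instant.astimezone(zone).date()


# A date stored as a time: 6:00am on 25 December in Perth (year is an assumption).
recorded_in_perth = datetime(2004, 12, 25, 6, 0, tzinfo=PERTH)

# The same instant viewed from the UK falls on the previous calendar day.
seen_in_uk = recorded_in_perth.astimezone(UK_WINTER)

# Naive processing in the UK picks up the wrong date.
naive_uk_date = seen_in_uk.date()

# Possible mitigation: interpret the instant in the zone where it was recorded.
intended_date = local_date_at(seen_in_uk, PERTH)

print(seen_in_uk.isoformat(), naive_uk_date, intended_date)
```

The design point is that the stored value is an instant, not a calendar date, so any site that processes it needs to know the originating zone before extracting the date.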
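
Note on reverse DNS lookup (UK-Australia-China lessons learned): the nslookup query shown on that slide maps an IP address to a name under in-addr.arpa. The sketch below, using only the Python standard library, shows how that reverse name is formed and how a machine could be checked for a working reverse lookup before joining a grid. The function names reverse_name and reverse_lookup are hypothetical and chosen for this example. The slides do not describe what tooling, if any, the project used for such checks.

```python
import ipaddress
import socket
from typing import Optional


def reverse_name(ip: str) -> str:
    """Return the in-addr.arpa (or ip6.arpa) name that a reverse lookup queries."""
    return ipaddress.ip_address(ip).reverse_pointer


def reverse_lookup(ip: str) -> Optional[str]:
    """Return the host name the resolver reports for ip, or None if there is none.

    This uses the local resolver, so it also reflects entries that were
    configured explicitly on the machine rather than in DNS.
    """
    try:
        host, _aliases, _addresses = socket.gethostbyaddr(ip)
        return host
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError.
        return None


# The address from the slide; only the name construction is evaluated here,
# no network query is made.
ptr_name = reverse_name("129.215.56.231")
print(ptr_name)
```

Calling reverse_lookup on each participating machine's address would show which hosts lack a reverse entry and therefore need explicit configuration.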