Experiences of the Grid…
Gavin McCance, University of Glasgow
NeSC Meeting, 24 October 2001

Background
  Experimental Particle Physics background
  Analysing the structure of matter
  …Fortran (19)77!
  Working in ‘Grid’-like areas since January this year
  [Figures: ATLAS Barrel Inner Detector; H → b b̄]

GridPP
  20+ institutes…

…GridPP
  £17M 3-year project
  Working in collaboration with EU DataGrid project
  Middleware production
  Integration of middleware technologies into HEP experiments
  Validation of Grid Software

…GridPP
  Initial GridPP testbed underway
  A personal snapshot of activities on the grid…
    Middleware activities we’re involved in
    Some examples
    Technologies we’re using
    Issues with integration of ‘Grid’ with particle physics experiments

Middleware
  What is middleware…???
    Application programs – local gridopen()
    Grid middleware
    Data access specifics – HPSS, Castor
    Job submission specifics – PBS, LSF
    Specific security procedures
  Layered APIs. Transparent security. Transparent data access. Intelligent use of distributed resources.

Middleware Activities
  GridPP ~mirrors EU DataGrid:
  Workload Management
    What jobs go where?
  Data Management (*)
    Where’s the (best) data?
  Information Services (*)
    What’s the state of everything?

…Middleware Activities
  Fabric Management
    Interfaces to underlying systems
  Mass Storage Management
    How to get the data to/from the fabric, e.g.
    implementing ‘file-save()’ APIs for different mass storage systems
  Security
    Crops up everywhere … transparent to applications

Data Management
  Data Replication
  Transparent and Secure Data Access
  Meta Data Storage
  Query Optimisation

Example problem: Data Replication
  Problems if data exist only in one place
    Multiple accesses to the same data overload network! Petabytes!
    Funding constraints! E.g. CERN can’t store all of the data required
  Make replicas!
  But need to keep track of all the files and their various replicas!
  Need replica catalogue!

…Catalogues
  Example solution:
  Have a globally unique Logical File Name (LFN) mapping to multiple physical instances of the file (PFNs).
  [Diagram: LFN File-1 mapped to File-1 replicas in Paris, Glasgow and Chicago]
  Replica selection required
    Choose the ‘best’ / ‘nearest’ / ‘fastest’
    Cost modelling… how expensive, in time, is it to transfer file ‘X’ from A to B

…Data Replication
  Grid Data Mirroring Package (GDMP)
    C, C++, JAVA, command-line APIs
  Replication issues:
    File transfer…
    Synchronisation / consistency models
    Basic middleware doesn’t enforce any policy
    Scalable architectures

…GDMP
  File transfer uses GridFTP
    Existing IETF-approved (?RFC?) ftp additions + the standard grid security (GSI)
  Registers new files in replica catalogue
    E.g. interfaced to the existing Globus Replica Catalogue
  Basic replica manager functionality to maintain consistency of replica sets

…Implementation issues
  Structure not imposed by the middleware software itself…
  But … must think about scalable implementations
  E.g.
  an RC (replica catalogue) may exist on each storage element → responsible for its own files
  [Diagram: CERN Root RC above INFN RC, UK RC and CERN RC]
  Queries will propagate down until replica information is found…

…Longer term problems
  Query / Replica Optimisation
  Grid can make / delete replicas
    E.g. many people in Glasgow & Edinburgh access the ATLAS Higgs dataset ‘A1’…
    Autonomously make new replica in / near Scotland based on historical information
  Grid might re-cluster data
  [Diagram: datasets A1, A2, A3, B1, B2, B3 placed at Paris and Glasgow]

…longer term
  real Grid...
  MONARC simulation tool
  …simulated Grid provides testing arena for more adventurous ideas!

…Integration of middleware
  Many iterations of requirements and use-cases with end-users… meetings…
  Middleware solutions must be scalable and usable by a variety of end users
    HEP, Biological, Earth sciences, Astro
  Always looking for common elements
    E.g. replica / meta-data catalogues… data transport… security…

…examples of common interfaces: generic meta-data catalogue tools
  SQL Database Service:
  Problem: many relational databases, diverse security, diverse wire protocols
  …Solution: Build on existing wire protocols:
    XML transported over HTTP(S)
    Grid standard security framework (GSI)

…examples
  Leverage open-source technology
    JAVA servlet based (Apache Tomcat engine)
    JDBC drivers
    Utilises Oracle’s XSQL servlet (open source)
  Security over HTTPS with Grid-standard GSI mechanism

…examples
  Oracle / PostgreSQL + PKI Security + Standard communication protocols (XML over HTTPS) = SQL Database Service (Spitfire)
  Allows any HTTP compliant system, e.g.
  web browsers / standard C++ HTTP libraries, to access any relational database…

Global Grid Forum
  Global Grid Forum meetings
    GGF1: Amsterdam meeting in April 2001
  Helps define aspects common to all Grid-like projects.
    E.g. architectures, ‘grid’ protocols
  As example… Grid Monitoring Architecture (GMA)

Information Services - GMA
  One implementation of the GMA → Globus MDS, currently based on (Open)LDAP
  Hierarchical, directory-like structure
  Very fast for information retrieval if you already know the query → designed into structure.
  Bad for complex or ranged queries

…complementary implementation
  [Diagram: Producer, Producer API, Producer Servlet, Registry Servlet, Schema Servlet, Consumer, Querying API, Relational Database; arrows labelled register / re-register / publish, stream, subscribe, query]
  Implementation of GMA → relational queries in SQL format

…relational GMA
  Information is transferred in generic SQL format…
  ‘Producers’ of information register themselves…
  ‘Consumers’ construct a (possibly complex) SQL query and are streamed query results directly from Producers.

…implementation
  Again, uses JAVA servlets
    Tomcat servlet engine
  Again, communication with servlet is over standard HTTP.
  All the internal parts communicate via HTTP and XML → modular design, easily replaceable…

Useful Tools…
  JAVA… nicely platform independent
  UML
    Unified Modelling Language
    Architecture and APIs ‘should be’ defined in this…!
  CASE tools
    Together Control Centre

…useful tools
  Globus toolkit
    Both the original and its JAVA implementation (CoG)
  My experience of CoG so far is generally good…!
    Easy GSI authentication, Globus file transfer, Globus job submission, MDS interface

Testbeds
  For GridPP, primary testbeds are the HEP experiment ones
    CERN LHC (EU DataGrid WP8)
    US experiments, e.g. Fermilab, SLAC
  First software release now!!
  Integration team ‘show-and-tell’ at CERN end of this month…

…testbed work
  Grid software packaged for release to experiments!
  Primarily packaged using RPM
  For end of October release, supported platforms are:
    Linux (and Solaris on a best effort basis)

…Globus installation
  Generally found the Globus software installation OK!
  Successfully deployed on a number of batch systems in UK
  Experience fed back into eScience Centres
  Difficulties were setting up and recognising each country’s Certificate Authorities (CAs) → Tricky legal implications to resolve!

Testbed work so far…
  UK Certificate Authority set-up…
  Many institutes already on testbed
  Grid Status and Network monitoring demonstrator available soon
    Networking status information provided by GridPP and DataGrid networking groups!

…testbed work so far
  Successful tests within ATLAS (and others) of some middleware products
    E.g. large file transfers between Glasgow, Italy, US and CERN
  Further tests planned with new release!

…experimental integration
  Work to do…
  Taking the kit and trying to integrate it into the experiments’ software frameworks
  Make Grid Services transparently available to ATLAS and LHCb programs
  [Diagram: ATLAS/LHCb software framework (GAUDI), GANGA framework, Grid middleware]

Grid validation
  Preliminary tests of basic middleware have been successful
  Now we have the opportunity to see how it performs and scales with real datasets and real experimental users

…to do…
  Preliminary grid software architectures have been defined
  Basic middleware has been delivered
  Large scale validation underway
  An excellent base to build on!
  Much still to do!

Overall experience
  Middleware development is fun!
  Several good products have already been delivered
  Re-using industry standard components and protocols where they exist
    LDAP, SQL, HTTP(S), XML, SOAP
    PKI security
  Open Source…!

…overall
  Middleware being built using a variety of languages…
    JAVA, C++, C, Python
  APIs should be available for all
    JAVA, C++, C and command line… web access(?)

…overall
  Coordination very important
  Forums for discussion:
    Vital to ensure middleware is useful to a wide range of applications
    Prevent divergent technology

…finally…
  Experimental testbeds in place
  Testable software in place
  Full integration and validation to begin in earnest
  Now!
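
Sketch: replica catalogue lookup
  The replica catalogue idea from the Data Replication slides (one LFN mapping to several PFNs, then choosing the ‘best’ replica with a cost model) could look roughly like the Python below. This is only an illustrative sketch under stated assumptions, not GDMP or the Globus Replica Catalogue API: the class ReplicaCatalogue, its methods, the example PFN URLs and the transfer-cost figures are all invented for the example.

```python
"""Toy replica catalogue: one logical file name (LFN) maps to many
physical file names (PFNs). All names and numbers are hypothetical."""
from collections import defaultdict


class ReplicaCatalogue:
    def __init__(self):
        # LFN -> {site: PFN}
        self._replicas = defaultdict(dict)

    def register(self, lfn, site, pfn):
        """Record that a physical copy of `lfn` exists at `site`."""
        self._replicas[lfn][site] = pfn

    def lookup(self, lfn):
        """Return a {site: PFN} dict of every known replica of `lfn`."""
        return dict(self._replicas.get(lfn, {}))

    def select(self, lfn, transfer_cost):
        """Pick the replica with the lowest estimated transfer cost.

        `transfer_cost` maps site -> estimated seconds to fetch the file
        from that site; sites without an estimate are skipped.
        """
        candidates = [
            (transfer_cost[site], site, pfn)
            for site, pfn in self._replicas.get(lfn, {}).items()
            if site in transfer_cost
        ]
        if not candidates:
            raise LookupError(f"no reachable replica for {lfn!r}")
        _cost, site, pfn = min(candidates)
        return site, pfn


rc = ReplicaCatalogue()
# Made-up PFNs for the three sites shown on the '…Catalogues' slide.
rc.register("File-1", "Paris", "gsiftp://se.paris.example/data/File-1")
rc.register("File-1", "Glasgow", "gsiftp://se.glasgow.example/data/File-1")
rc.register("File-1", "Chicago", "gsiftp://se.chicago.example/data/File-1")

# Invented cost model: estimated transfer time in seconds from each site.
costs = {"Paris": 120.0, "Glasgow": 15.0, "Chicago": 600.0}
best_site, best_pfn = rc.select("File-1", costs)
print(best_site, best_pfn)
```

  The catalogue itself enforces no policy, in line with the ‘basic middleware doesn’t enforce any policy’ point: the caller supplies whatever cost estimates it trusts, and a real system would also need the consistency and scalability handling discussed on the GDMP and implementation slides.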