Next Generation Back-ends
Thomas Frågåt
The Experimental Particle Physics Group, Department of Physics, University of Oslo
NorduGrid Conference – Copenhagen 2007

Outline
- Objectives
- Requirements
- Architecture proposal
Based on the design document produced by Adrian Taga (UiO) and Thomas Frågåt, and discussions during the KnowARC Oslo Workshop in June 2007

5/29/2016 www.knowarc.eu

Back-ends as of ARC v0.6
- Back-end components
  - The Grid Manager,
  - ARC-GridFTP server, and
  - The LDAP-server based information engine
- Reliable and robust
- In several tests proven superior to other resource managers in terms of
  - Performance,
  - Manageability, and
  - Stability

Objectives of Next Generation ARC Back-ends
The KnowARC project aims at:
- Further improving the performance, manageability, and scalability
- Adding support for additional batch systems
  - ARC v0.6.1 supports LSF, Condor, the PBS family, SGE, LoadLeveler, and Fork
- Improving and standardizing the batch system interfaces
  - The back-ends interface was cleaned up and improved in ARC v0.6.1
- Offering better control of a Grid resource for the resource owner

Requirements for the Next Generation Back-ends
- Information collection should be done in one script (i.e. combining cluster.pl, qju.pl, and scan_job.pl)
  - To avoid running almost the same batch commands multiple times
  - This will increase the performance
- Clean up the job control scripts (i.e. separate the batch system specific parts from the common parts; the job script creation should be shared code)
  - Will make it easier to integrate with RTEs
  - Will make it easier to add new batch systems
  - The code will be easier to maintain

Requirements cont.
- Have support for submit, hold/release, and cancel in the job control scripts
- Create a clear and generalized interface between GM and scripts
- Add support for GLUE v2.0 for exchange of information between the HED information system and the ‘outside world’
  - Easier to interoperate with other Grid systems

Requirements cont.
- Improve error handling
  - E.g. the same error message should be provided for the same error, regardless of which batch system failed
- Leave the parsing of xRSL (extended Resource Specification Language) to the common framework
- The information system framework should only load the required back-end modules

Architecture of the Next Generation ARC Back-ends

External back-end scripts
- One for each LRMS type
- Job Control scripts, with support for submit, hold/release, resume/retry, and cancel
- External Information Collectors
  - One invocation will provide all the information possible from the LRMS
[Diagram: External Script Interface; External Information Collectors (LRMS specific); Job Control Scripts (LRMS specific); LRMS]

External back-end scripts cont.
- The scripts will be called from the A-REX resident components through the External Script Interface
- The input to all scripts should contain the necessary information for calling the LRMS executables
- Standard output from the scripts will be used to pass back results

External Script Interface
- The generalized back-ends interface acts as an execution layer for communication between the external scripts and the common code for the LRMSs
- Basically, it will pass input strings to the external scripts, run the scripts, and fetch the returned output string
[Diagram: GM; Internal Information Collector (input/output); External Script Interface; External Information Collectors (LRMS specific); Job Control Scripts (LRMS specific)]

External Script Interface cont.
- The External Script Interface must be able to kill running/hanging scripts depending on the configured timeouts
- It should also be able to kill a whole group of processes

Internal Information Collector
- The Internal Information Collector should prepare data for input, and prepare output data as GLUE-aware output for the information cache/engine
- Will be internally divided into two parts
  - An Information Collector class
  - A GLUE v2.0 converter class
[Diagram: HED; Cache / engine (shared between different services); GM; Internal Information Collector (input/output); External Script Interface; A-REX]

Internal Information Collector cont.
- Scripts should not parse configuration files themselves
  - Configuration will be passed from HED via GM, and will be prepared in the Internal Information Collector
- The functionality of the current (as of ARC v0.6) cluster.pl and qju.pl will be taken over by the Internal Information Collector
- The scan_jobs.pl script will be taken over by the Internal Information Collector
  - The Internal Information Collector will notify GM when a grid job reaches a finished state in LRMS

The generalized interface layer can easily be used by other components
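
Illustration: External Script Interface behaviour
The External Script Interface slides describe passing an input string to an external script, running it, fetching the output string, and killing a hanging script (and its whole process group) once a configured timeout expires. A minimal sketch of that behaviour follows. Python is used only for illustration and is not implied by the design. The names run_external_script and ScriptTimeout, the use of standard input as the input channel, and the use of SIGKILL are all assumptions made for this example. The sketch is POSIX-only.

```python
import os
import signal
import subprocess
import sys


class ScriptTimeout(Exception):
    """Raised when an external script exceeds its configured timeout (hypothetical name)."""


def run_external_script(argv, input_text, timeout_s):
    """Run an external back-end script and return its standard output.

    Assumption: the input string is delivered on the script's standard input.
    The slides only say that input strings are passed to the scripts.
    """
    proc = subprocess.Popen(
        argv,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
        # Start the script in its own session, so it leads a new process
        # group that can be killed as a whole (POSIX only).
        start_new_session=True,
    )
    try:
        out, err = proc.communicate(input=input_text, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        try:
            # The script is the group leader, so its PID is also the group ID.
            os.killpg(proc.pid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # the group exited between the timeout and the kill
        proc.communicate()  # reap the child and close the pipes
        raise ScriptTimeout(f"{argv[0]} exceeded {timeout_s} s")
    if proc.returncode != 0:
        raise RuntimeError(f"{argv[0]} failed ({proc.returncode}): {err.strip()}")
    return out


if __name__ == "__main__":
    # Stand-in for an LRMS-specific script: echoes its input back on stdout.
    echo = [sys.executable, "-c", "import sys; print(sys.stdin.read().strip())"]
    print(run_external_script(echo, "queue=long", timeout_s=5))
```

Starting each script in its own session is one way to meet the requirement of killing a whole group of processes: any LRMS commands the script spawns stay in the same group unless they deliberately leave it.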