BaBarGrid[1]: eScience for a B Factory experiment

T. J. Adye, Rutherford Appleton Laboratory
T. Barrass, The University of Bristol
D. J. Colling, J. Martyniak, Imperial College London
R. J. Barlow, A. Forti, M. A. S. Jones, S. Salih, The University of Manchester

Keywords: Data distribution, data location, metadata, job submission

Abstract

The BaBar experiment involves more than 500 physicists spread across the world, with a requirement to access and analyse hundreds of terabytes of data. Grid-based tools are increasingly used for the manipulation of data and metadata, and for transferring data and analysis tasks between sites, facilitating and speeding up the analyses and making the best use of equipment. We describe many of these developments: in authorisation, job submission, and data location. The successes in introducing Grid technology as a solution to a pre-existing but overloaded computing model are described.

1. Introduction

BaBar[2] is a B Factory particle physics experiment at the Stanford Linear Accelerator Center. Data collected there are processed at four Tier A computing centres around the world: SLAC (USA), CCIN2P3 (France), RAL (UK) and Karlsruhe (Germany). Research takes place at several institutes in the UK, all with designated BaBar computing resources: the Rutherford Appleton Laboratory, the universities of Birmingham, Bristol, Brunel, Edinburgh, Liverpool and Manchester, and Imperial College, Queen Mary and Royal Holloway, University of London.

With this number of sites a Grid is the obvious way to manage data analysis. However, unlike the LHC experiments, this has to be put on top of a large existing software structure: the Grid has to be retrofitted rather than built in from the start.

2. EDG job management for BaBar

BaBar uses the EDG[3] job submission system, with a Resource Broker at Imperial College and Compute Elements and Worker Nodes installed at various European and UK sites. The user prepares their data using skimData, an interface to the databases storing BaBar metadata (see section 4), which is used to query what data are available at which location. The result is placed in a steering .tcl file. A comprehensive set of this and other steering files is then compiled into one file using dump[4]. A Job Description Language (JDL) file is created and the job is submitted with user options (Figure 1):

• The user submits the job via a User Interface (UI) to the Resource Broker (RB) (1).
• The RB checks that the user is authorised to use the services (users must be members of the BaBar Virtual Organisation, which is created at SLAC by a cron job that checks users' certificates and is accessed via LDAP from babar-vo.gridpp.ac.uk[5]) and logs the request with the Logging & Bookkeeping service (2) & (3).
• The RB forwards the request to the intended Tier A's gatekeeper; ideally it first finds where the data are by looking in a Replica Catalogue (4).
• If the user is authenticated and mapped to a local username, the job is submitted to the local batch system. Output is written to a Storage Element and the Logging & Bookkeeping service is updated with the status of the job (5), (6) & (7).
• The user polls the RB (8) to get the job status. When the job has finished, the user retrieves the output through the RB.

Figure 1. EDG job submission
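As an illustration of this step, the short sketch below (not BaBar production code) writes a minimal JDL file for a prepared job and hands it to the Resource Broker with the standard EDG client command. The wrapper script name, .tcl file name and JDL path are invented placeholders; only the usual EDG JDL attributes and the edg-job-submit command are assumed.

import subprocess

# Hypothetical job: a user wrapper script plus the skimData-produced .tcl file.
JDL = """\
Executable    = "runMyAnalysis.sh";
StdOutput     = "job.out";
StdError      = "job.err";
InputSandbox  = {"runMyAnalysis.sh", "myAnalysis.tcl"};
OutputSandbox = {"job.out", "job.err"};
"""

def submit_job(jdl_text=JDL, jdl_path="analysis.jdl"):
    """Write the JDL to disk and submit it to the Resource Broker;
    edg-job-submit (assumed to be on the PATH of an EDG User Interface)
    prints a job identifier used later for status polling and output
    retrieval."""
    with open(jdl_path, "w") as f:
        f.write(jdl_text)
    result = subprocess.run(["edg-job-submit", jdl_path],
                            capture_output=True, text=True, check=True)
    return result.stdout.strip()

The job identifier returned here is what the user then feeds to the corresponding EDG status and output-retrieval commands (edg-job-status and edg-job-get-output) when polling the RB and collecting the output sandbox.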
3. The Event Store

The present BaBar event store technology for persistent data is based on the Objectivity database[6]. BaBar is replacing this with the Kanga (Kind ANd Gentle Analysis) data format.

Figure 2. Old Database Model

The move to Kanga must have minimal user impact, be backward compatible, and hide the redesigned schema. No job reconfiguration should be necessary, and the new system must work with old data.

Figure 3. New Database Model

This is achieved with the scheme shown here, based on the ROOT system[7]. The distinction between different levels of data is broken down, and it becomes possible for the user to add extra pieces of information (for example, geometric/kinematic vertex fits) to specific selections of data.

4. Metadata

There are now several million BaBar event files (for real data and for simulated data) containing many millions of events at various stages of processing and reprocessing. The file names incorporate the run number, the processing and selection program versions, and all the other pieces of information; thus the file catalogue is the metadata catalogue. Logical-to-physical file name translation is performed at each server site, and several strategies are employed, from a simple directory tree to dynamic load balancing with MSS integration[8].

The skimData tool is provided to navigate the catalogue, and returns to the user a list of valid files satisfying specified criteria. At each site skimData uses tables in a MySQL/Oracle database to store its information, including whether a file exists at the site in question. This system has been extended with a small extra table which gives the file-existence information for all BaBarGrid sites. The information for each site is collected centrally with a nightly cron job, and this is then copied out to the local sites. This means that for N sites the information about all the other sites is maintained using 2N operations, as opposed to N², at the price of being possibly a few hours out of date.

A user with specific criteria then proceeds by first finding which data files exist, and then finding which of these exist at the sites on which they choose to run (or on which they are allowed to run). This is currently done by hand rather than in the fully automated style of the Resource Broker, but a simple-to-use web-based GUI is provided.
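The bookkeeping behind this 2N scheme can be sketched as follows. The fragment is purely illustrative: it uses Python's built-in sqlite3 module and invented database, table and column names rather than the real MySQL/Oracle skimData schema, but it shows the shape of the nightly aggregation, namely one pull per site into a single merged file-existence table, which is then copied back out to each site.

import sqlite3

SITES = ["ral", "manchester", "bristol", "imperial"]   # hypothetical site list

def pull_site_files(site):
    """One 'pull' per site (N operations in total): fetch the logical
    file names that the site's own skimData database marks as present.
    The per-site database file and schema here are invented."""
    con = sqlite3.connect(f"{site}_skimdata.db")
    rows = con.execute("SELECT lfn FROM local_files WHERE present = 1")
    files = {lfn for (lfn,) in rows}
    con.close()
    return files

def build_existence_table(central_db="central_existence.db"):
    """Merge every site's list into one small (lfn, site) table; copying
    this table back out to each site is another N operations, giving
    2N transfers in total rather than N*N site-to-site queries."""
    con = sqlite3.connect(central_db)
    con.execute("CREATE TABLE IF NOT EXISTS file_site (lfn TEXT, site TEXT)")
    con.execute("DELETE FROM file_site")
    for site in SITES:
        con.executemany("INSERT INTO file_site VALUES (?, ?)",
                        [(lfn, site) for lfn in pull_site_files(site)])
    con.commit()
    con.close()

Once each site has a copy of such a table, finding which sites hold a given file becomes a local lookup rather than a round of remote queries.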
5. AFS as a sandbox solution

The ideal view of job submission to a grid is one where the user prepares analysis code and a small control deck and places them in an 'input sandbox'. This is copied to the remote computer or file store, the code is executed reading from the input sandbox and writing its results to the 'output sandbox', and the output sandbox is then copied back to the user.

This simple model poses some problems for the analysis of BaBar data. On input, a BaBar job needs many .tcl files (.tcl files source further .tcl files), and the programs need other files such as efficiency tables. A typical BaBar user prepares their job in a 'working directory' specifically set up so that all these files are available directly, or as pointers within the local file system, and the job expects to run in that directory.

One solution adopted for input is to roll up everything that might be needed in the input sandbox and send it, but this produces a massive file and one can never be sure that all the files the job might require have been included. Another is to require all the 'BaBar environment' files at the remote site, but that restricts the number of sites available to the user.

One solution adopted for output is to email it to the user, but this is only sensible for small files, and BaBar outputs can be huge (for example, ntuples for further analysis). Another is to expect users to retrieve the output themselves, which requires the user or their agent to know where the job actually ran. Another is for the job to push the output back itself, requiring write access (even a password) to the client's machine.

Our solution is to use AFS. The job is executed at a remote site using globus-job-submit. Access to the resource is based on the VO management discussed above, but only at the gatekeeper's end. A simple wrapper round the job executes gsiklog to the home AFS cell; gsiklog is a self-contained binary that performs a klog based on the authorisation of a grid certificate, so an AFS password is not required. The job can then cd to the working directory in AFS, where input files can be read and output files written just as if running locally.

Figure 4. Schematic of an AFS-based sandbox for grid submission of analyses

Figure 4 shows the topology of grid submissions with AFS. The starting point is that all BaBar software and the user's working directory are in AFS. There is no need to provide and ship data or the majority of .tcl files, because these can be accessed through links to the parent releases. The user locates data using skimData, and the data .tcl files are stored in the user's working directory. The user creates a proxy, and jobs are submitted to different sites according to how the data have been split in the skimData process. gsiklog is either found locally, read from AFS or copied across as required. AFS tokens can then be acquired to allow the job access to the working directory and software. The output sandbox is simply written back to the working directory and requires no special treatment.

AFS has a reputation for being 'slow', but this is not a problem here: it can be slow when dealing with file locking for editing, but for direct I/O it is relatively fast. It is also an established file system which works across many platforms and has been used by BaBar and other HEP experiments for some time.

Summary

This paper shows some current uses of eScience within the UK BaBar community. Some of the items discussed are still prototypes, but some are already being employed for real analyses.

References

[1] The BaBarGrid homepage: http://www.slac.stanford.edu/BFROOT/www/Computing/Offline/BaBarGrid
[2] The BaBar homepage: http://www.slac.stanford.edu/BFROOT
[3] The European DataGrid: http://www.eu-datagrid.org
[4] D. Boutigny et al., on behalf of the BaBar Computing Group (2003), "Use of the European Data Grid software in the framework of the BaBar distributed computing model", Proceedings of CHEP03, La Jolla.
[5] GridPP Virtual Organisations: http://www.gridpp.ac.uk/vo/
[6] Objectivity database system: http://www.objectivity.com
[7] The ROOT system home page: http://root.cern.ch/
[8] The xrootd homepage: http://www.slac.stanford.edu/~abh/xrootd/