BaBarGrid

BaBarGrid[1]: eScience for a B Factory experiment
T. J. Adye Rutherford Appleton Laboratory
T. Barrass The University of Bristol
D.J. Colling, J. Martyniak Imperial College London
R. J. Barlow, A. Forti, M.A.S. Jones, S. Salih The University of Manchester
Keywords: Data distribution, data location, metadata, job submission
Abstract
The BaBar experiment involves 500+ physicists spread across the world, with a requirement to access and analyse hundreds of Terabytes of data. Grid-based tools are increasingly being used for the manipulation of data and metadata, and for transferring data and analysis tasks between sites, facilitating and speeding up analyses and making best use of equipment. We describe many of these developments: in authorisation, job submission, and data location. We also describe the successes achieved in introducing Grid technology as a solution to a pre-existing but overloaded computing model.
1. Introduction
BaBar[2] is a B Factory particle physics experiment at the Stanford Linear Accelerator Center (SLAC). Data collected there are processed at four Tier A computing centres around the world: SLAC (USA), CCIN2P3 (France), RAL (UK) and Karlsruhe (Germany).
Research takes place at several institutes in the UK, all with designated BaBar computing resources: the Rutherford Appleton Laboratory and the universities of Birmingham, Bristol, Brunel, Edinburgh, Liverpool and Manchester, together with Imperial College, Queen Mary and Royal Holloway, University of London. With this number of sites a Grid is the obvious way to manage data analysis.
However, unlike the LHC experiments, BaBar has to put the Grid on top of a large existing software structure: the Grid has to be retro-fitted rather than built in from the start.
2. EDG job management for BaBar
BaBar uses the EDG[3] job submission
system, with a resource broker at
Imperial College and compute elements
and worker nodes installed at various
European and UK sites.
The user prepares their data using skimData, an interface to the databases storing BaBar metadata (see section 4), which is used to query what data are available at which location. The result is placed in a steering .tcl file. A comprehensive set of this and other steering files is then compiled into one file using dump[4]. A Job Description Language (JDL) file is created and the job is submitted with user options (Figure 1; a sketch of such a submission follows the list below):
• A user submits a job via a User Interface (UI) to a Resource Broker (RB) (1).
Figure 1. EDG job submission: a User Interface (UI), Resource Broker (RB), Replica Catalogue (RC), VO server and Logging & Bookkeeping (LB) service on the UK side, and a Gatekeeper (GK), batch workers and Storage Elements (SE) at the Tier A site. The numbered arrows correspond to the steps described in the text.
• The RB checks that the user is authorised to use the services, i.e. is a member of the BaBar Virtual Organisation (created at SLAC using a cron job that checks users' certificates, and accessed via LDAP from babar-vo.gridpp.ac.uk[5]), and logs the request with the Logging & Bookkeeping service (2) & (3).
• The RB forwards the request to the intended Tier A's gatekeeper, ideally first finding where the data are by looking in a Replica Catalogue (4).
• If the user is authenticated and mapped to a local username, the job is submitted to the local batch system. Output is written to a Storage Element and the Logging & Bookkeeping service is updated with the status of the job (5), (6) & (7).
• The user polls the RB (8) to get the job status. When the job has finished, the user retrieves the output through the RB.
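For illustration, the sketch below builds a minimal Job Description Language file and hands it to the standard EDG user-interface commands. The executable, sandbox contents and file names are invented for the example; they are not BaBar's actual steering files, and a real submission would add resource requirements derived from the skimData output.

import subprocess
from pathlib import Path

# Minimal, illustrative JDL for an EDG analysis job.  The executable and
# sandbox contents shown here are placeholders, not the real BaBar setup.
jdl = '''
Executable    = "runAnalysis.sh";
Arguments     = "myAnalysis.tcl";
StdOutput     = "job.out";
StdError      = "job.err";
InputSandbox  = {"runAnalysis.sh", "myAnalysis.tcl"};
OutputSandbox = {"job.out", "job.err", "ntuple.root"};
'''
Path("analysis.jdl").write_text(jdl)

# Submit via the EDG UI; the returned job identifier is later used to poll
# the status and, once the job has finished, retrieve the output sandbox.
subprocess.run(["edg-job-submit", "analysis.jdl"], check=True)
# subprocess.run(["edg-job-status", "<job id>"])
# subprocess.run(["edg-job-get-output", "<job id>"])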
3. The Event Store
The present BaBar event store technology for persistent data is based on the Objectivity database[6]. BaBar is replacing this with the Kanga (Kind ANd Gentle Analysis) data format.
Figure 2. Old Database Model
The move to Kanga must have minimal user impact, be backward compatible, and keep the redesigned schema hidden. No job reconfiguration should be necessary, and the new system must work with old data.
Figure 3. New Database Model
This is achieved with the scheme shown here, based on the ROOT system[7]. The distinction between different levels of data is broken down, and it is made possible for the user to add extra pieces of information (for example, geometric/kinematic vertex fits) to specific selections of data.
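As an illustration of the kind of user-level addition that ROOT makes possible (this is not the actual Kanga schema), the sketch below stores an extra quantity, e.g. a vertex-fit chi-squared, for a selection of events in a separate "friend" tree, so the original data file is never modified. The file, tree and branch names are invented.

import ROOT
from array import array

# Open an existing (hypothetical) event file read-only.
events_file = ROOT.TFile.Open("events.root")
events = events_file.Get("events")

# A separate file holds the user's extra information: here a single
# vertex-fit chi2 per event (all names are illustrative).
extra_file = ROOT.TFile("vertex_extra.root", "RECREATE")
extra_tree = ROOT.TTree("vertexFit", "user-added vertex fit results")
chi2 = array("f", [0.0])
extra_tree.Branch("vtxChi2", chi2, "vtxChi2/F")

for event in events:
    chi2[0] = 0.0        # placeholder for a real geometric/kinematic fit
    extra_tree.Fill()

extra_file.Write()
extra_file.Close()

# Later analyses can combine the two transparently:
#   events.AddFriend("vertexFit", "vertex_extra.root")
#   events.Draw("vtxChi2")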
4. Metadata
There are now several million BaBar event files (for real data and for simulated data) containing many millions of events at various stages of processing and reprocessing. The file names incorporate the run number, the processing and selection program versions, and other relevant information; thus the file catalogue is the metadata catalogue. Logical-to-physical file name translation is performed at each server site, and several strategies are employed, from a simple directory tree to dynamic load balancing with MSS integration[8].
The skimData tool is provided to navigate the catalogue, and will return to the user a list of valid files satisfying specified criteria. At each site skimData uses tables in a MySQL/Oracle database to store its information, including whether the file exists at the site in question.
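A minimal sketch of the kind of query skimData answers, assuming a hypothetical table layout (the table and column names, and the skim, release and site values, are invented and are not the real skimData schema):

import MySQLdb  # or any DB-API driver for the MySQL/Oracle back end

conn = MySQLdb.connect(host="localhost", db="skimdata", user="reader")
cur = conn.cursor()
# Find files for a given skim and release that are present at this site.
cur.execute(
    """
    SELECT f.logical_name
    FROM   files f
    JOIN   site_presence p ON p.file_id = f.id
    WHERE  f.skim_name = %s AND f.release = %s
      AND  p.site = %s AND p.exists_here = 1
    """,
    ("BToDpi", "14.5.2", "RAL"),
)
files = [name for (name,) in cur.fetchall()]  # fed into the .tcl steering file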
This system has been extended with a small extra table which gives the file-existence information for all BaBarGrid sites. The specific information for each site is collected centrally with a nightly cron job, and this is then copied out to the local sites. This means that for N sites the information about all the other sites is maintained using 2N operations, as opposed to N², at the price of being possibly a few hours out of date.
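A sketch of that nightly collect-and-distribute pattern, showing why it costs 2N rather than of order N² transfers; the site list and the two helper functions stand in for whatever transfer mechanism (database dumps, scp, etc.) is actually used:

# Hypothetical nightly cron job: N pulls to the central site followed by
# N pushes back out, i.e. 2N transfers, instead of every site querying
# every other site directly (N*(N-1), of order N**2, transfers).

SITES = ["SiteA", "SiteB", "SiteC", "SiteD"]   # illustrative site list

def fetch_existence_table(site):
    """Pull one site's file-existence rows (placeholder for a DB dump)."""
    return {}   # {logical_name: True/False}

def push_merged_table(site, merged):
    """Copy the merged all-sites table back out to one site (placeholder)."""

merged = {}
for site in SITES:                      # N pull operations
    for name, present in fetch_existence_table(site).items():
        merged.setdefault(name, {})[site] = present

for site in SITES:                      # N push operations
    push_merged_table(site, merged)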
The user with specific criteria then proceeds by first finding what data files exist, and then finding which of them exist at the sites on which they choose to run (or on which they are allowed to run). This is currently done 'by hand' rather than in the fully automated style of the Resource Broker, but a simple-to-use web-based GUI is provided.
5. AFS as a sandbox solution
The ideal view of job submission to a grid is one where the user prepares an analysis code and a small control deck and places these in an 'input sandbox'. This is copied to the remote computer or file store. The code is executed, reading from the input and writing its results to the 'output sandbox'. The output sandbox is then copied back to the user.
This simple model poses some problems for the analysis of BaBar data. On input, the BaBar job needs many .tcl files (.tcl files source further .tcl files), and the programs need other files as well (e.g. efficiency tables). A typical BaBar user prepares their job in a 'working directory' specifically set up so that all these files are available, directly or as pointers, within the local file system, and the job expects to run in that directory.
One solution adopted for input is to roll up everything that might be needed into the input sandbox and send it all. This produces a massive file, and one can never be sure of having sent every file that the job might require. Another is to require all the 'BaBar environment' files to be present at the remote site, but that restricts the number of sites available to the user.
One solution adopted for output is to email it to the user, but this is only sensible for small files, and BaBar outputs can be huge (for example, ntuples for further analysis). Another is to expect the user to retrieve the output themselves; this requires the user or their agent to know where the job actually ran. Another is for the job to push the output back itself, which requires write access (even a password) to the client's machine.
Our solution is to use AFS. The job is executed at a remote site using globus-job-submit. Access to the resource is based on the VO management discussed above, but only at the Gatekeeper's end. A simple wrapper round the job executes gsiklog to the home AFS cell. gsiklog is a self-contained binary that performs a klog based on the authorisation of a grid certificate, so an AFS password is not required. The job can then cd to the working directory in AFS; input files can be read and output files can be written just as if running locally.
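A minimal sketch of such a wrapper, written here in Python for illustration; the cell name, working directory and job script are placeholders, and the gsiklog invocation assumes a klog-like -cell option:

import os
import subprocess
import sys

AFS_CELL = "hep.example.ac.uk"                              # placeholder cell
WORK_DIR = "/afs/hep.example.ac.uk/user/analysis/workdir"   # placeholder path

# Obtain an AFS token from the grid proxy; no AFS password is needed.
# gsiklog is assumed to be on the PATH, in AFS, or shipped with the job,
# and to accept a klog-like "-cell" option.
subprocess.run(["gsiklog", "-cell", AFS_CELL], check=True)

# With the token, the job runs in its normal working directory, reading
# its .tcl files and writing its output exactly as it would locally.
os.chdir(WORK_DIR)
subprocess.run(["./runAnalysis.sh", *sys.argv[1:]], check=True)  # illustrative job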
Figure 4. Schematic of an AFS based
sandbox for grid submission of analyses
Figure 4 shows the topology of grid
submissions with AFS. The starting
point is that all BaBar software and the
user working directory are in AFS.
There is no need to provide and ship
data or the majority of .tcl files because
those can be accessed through links to
the parent releases.
The user locates data using skimData, and the data .tcl files are stored in the user's working directory. The user then creates a proxy. Jobs are submitted to different sites according to how the data have been split in the skimData process. gsiklog is either found locally, read from AFS or copied across as required. AFS tokens can then be acquired to allow the job access to the working directory and software. The output sandbox is simply written back to the working directory and does not require any special treatment.
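A sketch of the client side of this, assuming the data have already been split per site by skimData and that a wrapper like the one above is visible at each site (for example via AFS); the gatekeeper contact strings, paths and file names are invented:

import subprocess

# Illustrative mapping from gatekeeper contact string to the per-site
# steering file produced by the skimData split; all names are invented.
site_jobs = {
    "gatekeeper.site1.example.ac.uk/jobmanager-pbs": "skim_site1.tcl",
    "gatekeeper.site2.example.ac.uk/jobmanager-pbs": "skim_site2.tcl",
}

WRAPPER = "/afs/hep.example.ac.uk/user/analysis/wrapper.py"  # placeholder path

# grid-proxy-init would normally be run first to create the user's proxy.
for contact, tcl in site_jobs.items():
    # globus-job-submit prints a job contact that can later be used to
    # query the status and retrieve output; the wrapper acquires the AFS
    # token and runs the analysis in the AFS working directory.
    result = subprocess.run(
        ["globus-job-submit", contact, WRAPPER, tcl],
        capture_output=True, text=True, check=True,
    )
    print(contact, "->", result.stdout.strip())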
AFS has a reputation for being 'slow', but this is not a problem here: AFS can be slow when dealing with file locking for editing, but for direct I/O it is relatively fast. It is also an established file system which works across many platforms and has been used by BaBar and other HEP experiments for some time.
Summary
This paper shows some current uses of eScience within the UK BaBar community. Some of the items discussed are still prototypes, but others are already being employed for real analyses.
References
[1] The BaBarGrid homepage: http://www.slac.stanford.edu/BFROOT/www/Computing/Offline/BaBarGrid
[2] The BaBar homepage: http://www.slac.stanford.edu/BFROOT
[3] The European Data Grid: http://www.eu-datagrid.org
[4] D. Boutigny et al., on behalf of the BaBar Computing Group (2003), "Use of the European Data Grid software in the framework of the BaBar distributed computing model", Proceedings of CHEP03, La Jolla.
[5] GridPP Virtual Organisations: http://www.gridpp.ac.uk/vo/
[6] The Objectivity database system: http://www.objectivity.com
[7] The ROOT system home page: http://root.cern.ch/
[8] The XRootd homepage: http://www.slac.stanford.edu/~abh/xrootd/