CMS Status
Bristol, Brunel and Imperial College
Progress towards GridPP milestones



Data management – the Data Challenge 2004
Batch analysis framework
Monitoring using the EDG R-GMA middleware
The GridPP-funded cast list





Tim Barrass
Barry MacEvoy
Owen Maroney
JJ “Henry” Nebrensky
Hugh Tallini
CMS Report – GridPP Collaboration Meeting IX
Peter Hobson, Brunel University
4/2/2004
CMS and LCG2

CMS Data Challenge DC04 has three components:

Tier-0 challenge: reconstruction at CERN
• Complex enough; doesn't need the Grid per se
• But will publish its catalogue to the CERN RLS service

Distribution challenge: push/pull data to the Tier-1s
• Want to use LCG tools; can use SRB. Questions of MCAT/RLS coherence, SRB pool issues, etc.

Analysis/calibration challenge
• At Tier-1/2 centres (not at CERN during DC04 "proper")
• Encourage use of LCG2 and GRID3 to run these

Aim to complete the first two in "March".

Expect the last one to continue and be repeated over the next six months as LCG matures; it is factorised from the Tier-0 and distribution challenges.

CMS expert manpower is saturated with work for DC04.
DC04 Production

Data Challenge: March 2004 nominal

• An end-to-end test of the CMS offline computing system
• 25% of the full world-wide system to be run flat-out for one month
• A key test of our Grid-enabled software components
• 'Play back' digitised data, emulating CMS DAQ -> storage; reconstruction, calibration, data reduction and analysis at T0 and external T1s
• Some T2 involvement as "clients" of local T1 centres

T0 to T1 data transfer
• New transfer management database (a sketch follows below)
• Refinements to the schema
• CASTOR issues to be resolved (SRM export, 3 TB buffer needed)
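The slides only name the new transfer-management database, so the following is a minimal, hypothetical Python sketch of what such a state-tracking table and agent interface could look like; the table, field and state names are assumptions, not the DC04 schema being refined above.

```python
# Hypothetical transfer-management database sketch (NOT the DC04 schema):
# sqlite3 stands in for the real backend; states are illustrative.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE transfer_queue (
                    guid       TEXT PRIMARY KEY,
                    source_pfn TEXT,
                    dest_site  TEXT,
                    state      TEXT)""")  # e.g. 'queued' -> 'active' -> 'done'

def enqueue(guid, source_pfn, dest_site):
    """Register a newly reconstructed file for export to a Tier-1."""
    conn.execute("INSERT INTO transfer_queue VALUES (?, ?, ?, 'queued')",
                 (guid, source_pfn, dest_site))

def next_for_site(dest_site):
    """A per-site transfer agent would poll here for its queued files."""
    return conn.execute("SELECT guid, source_pfn FROM transfer_queue "
                        "WHERE dest_site = ? AND state = 'queued'",
                        (dest_site,)).fetchall()

def mark_done(guid):
    """Called once the file is safely on the Tier-1 buffer/mass store."""
    conn.execute("UPDATE transfer_queue SET state = 'done' WHERE guid = ?",
                 (guid,))

enqueue("guid-0001", "castor:/cern.ch/cms/dc04/file1.root", "RAL")
print(next_for_site("RAL"))
```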
DC04: Catalogue Deployment
• Synchronised RLS (LRC & RMC) deployment
  – Expect to have the Oracle DB deployed at CERN and CNAF
  – Deployment at RAL and FZK could follow this
    – May not be achieved on the timescale of DC04
    – Cannot afford to plan on this being in place
• Tier 1s without RLS will need a POOL MySQL catalogue
  – FNAL, RAL, Lyon, FZK
• Catalogue should be updated by a Tier 1 agent (a sketch follows below)
  – The FCatalog tool copies POOL data from the CERN RLS to the local MySQL catalogue
  – The catalogue is updated as files are transferred from CERN
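As an illustration of the agent loop described above, here is a hedged Python sketch; the three helper functions are hypothetical placeholders standing in for the transfer system, the CERN RLS lookup and the local POOL MySQL catalogue, not the real FCatalog interfaces.

```python
# Illustrative Tier-1 catalogue agent loop; all helpers below are hypothetical.
import time

def newly_arrived_files():
    """Placeholder: would return GUIDs of files just transferred from CERN."""
    return []

def fetch_pool_entry_from_cern_rls(guid):
    """Placeholder: would look up the POOL data for this GUID in the CERN RLS."""
    return {"guid": guid, "pfn": "srm://t1.example.org/dc04/" + guid}

def insert_into_local_mysql_catalogue(entry):
    """Placeholder: would register the entry in the local POOL MySQL catalogue."""
    print("registered", entry["guid"])

def run_agent(poll_seconds=60):
    # The local catalogue is updated as files are transferred from CERN.
    while True:
        for guid in newly_arrived_files():
            insert_into_local_mysql_catalogue(fetch_pool_entry_from_cern_rls(guid))
        time.sleep(poll_seconds)
```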
Catalogue Use
• CERN RLS
  – Initial registration of POOL data by reconstruction jobs
  – Registration of files and replicas in SRB by GMCat
  – Registration of files produced by analysis jobs 'outside' LCG-2
    – RAL, Lyon
    – Only files which are made "globally" accessible in an SRB or SRM server
• CNAF RLS
  – Replication of files to LCG-2 SEs
  – Queries by LCG-2 analysis jobs
  – Registration of files produced by LCG-2 analysis jobs
Distribution of Data from T1 to T2
• LCG-2 sites
  – Distribution through the EDG replica manager
  – Registration in the CNAF RLS
  – Jobs access data via the CNAF RLS
• Non-LCG-2 sites
  – A Tier 2 can access POOL data from the Tier 1 MySQL catalogue using the FCatalog tools
    – This creates a local catalogue – XML or local MySQL (a sketch follows below)
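For the non-LCG-2 path, here is a rough Python sketch of turning a subset of the Tier 1 MySQL catalogue into a local XML catalogue for a Tier 2; the query helper is a hypothetical stand-in and the XML layout is only an approximation of a POOL file catalogue, not the exact schema.

```python
# Approximate sketch: build a local XML catalogue for a Tier 2 from entries
# held in the Tier 1 MySQL catalogue.  Helper and XML layout are illustrative.
import xml.etree.ElementTree as ET

def tier1_catalogue_entries(dataset):
    """Hypothetical query of the Tier 1 MySQL catalogue for one dataset."""
    return [{"guid": "guid-0001", "pfn": "/t2/store/dc04/file1.root",
             "lfn": "dc04/sample/file1.root"}]

def write_local_xml_catalogue(dataset, path):
    # POOL-style layout: one <File> per GUID with its PFN and LFN.
    root = ET.Element("POOLFILECATALOG")
    for e in tier1_catalogue_entries(dataset):
        f = ET.SubElement(root, "File", ID=e["guid"])
        ET.SubElement(ET.SubElement(f, "physical"), "pfn", name=e["pfn"])
        ET.SubElement(ET.SubElement(f, "logical"), "lfn", name=e["lfn"])
    ET.ElementTree(root).write(path)

write_local_xml_catalogue("sample", "PoolFileCatalog.xml")
```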
SRB MCat Failover
• Backup MCat server at Daresbury
• Oracle failover solution to be installed soon
  – Maintains a mirror copy of the Oracle backend between RAL and Daresbury
  – If the RAL MCat has problems, it will switch to Daresbury
    – No need to change SRB server or client
    – Minimal downtime of the MCat
    – Change in DNS registration
• GMCat service in deployment
  – Optimisation testing on local MySQL LRC & RLS
Catalogue Summary
• RLS Catalogue deployment at CERN and CNAF still expected
• MCat server operational, backup service improved
• Likely absence of RLS Catalogues at RAL and FZK
  – Tier 1s to install local POOL MySQL Catalogues
  – Agents to populate them
  – Onward distribution to Tier 2s
• Detailed configuration requires deciding:
  – how the data is streamed
  – to which Tier 2s
Batch Analysis Framework
Gridified ORCA Submission System
"GROSS"

• Simple UI suitable for the non-expert end user
• Extensible architecture (as requirements change or become better defined for DC04 and beyond)
• No modification required to ORCA (transparent running on the Grid)
• No additional s/w required remotely
• Integrates directly with BOSS
“GROSS” System Design
Schematic architecture: the USER interacts with the UI, which comprises a job submission module, a monitoring module and a data interface; the data interface queries the physics meta-catalogue, jobs go through BOSS and the RB to worker nodes (WN) on the Grid, and state is kept in a common BOSS/AF database.
How it works
A user submits a single analysis TASK to the AF, which comprises:
• the ORCA executable
• the ORCA user libraries
• a metadata catalogue query

The user additionally specifies:
• which BOSS DB to use
• any additional DB to write output details to
• which metadata catalogue to query
• what to do with the output data and logs (keep in the sandbox, register somewhere, etc.)
• a suffix for output filenames

The submission module (a sketch follows below):
• makes the data query on the catalogue
• splits the TASK into multiple JOBS (one job per run)
• creates a JDL for each JOB
• creates a wrapper script and steering file for each JOB
• submits each job (through BOSS)
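A minimal sketch of that splitting step, assuming a catalogue query that returns files grouped by run; query_metadata_catalogue() and boss_submit() are hypothetical placeholders (the real submission goes through BOSS), and the JDL attributes shown are the standard EDG/LCG ones.

```python
def query_metadata_catalogue(query):
    """Placeholder: would return a mapping run -> list of file GUIDs."""
    return {20034: ["guid-a", "guid-b"], 20035: ["guid-c"]}

def make_jdl(steering_file):
    # Standard EDG/LCG JDL attributes; one JDL per JOB.
    return "\n".join([
        'Executable    = "orca_wrapper.sh";',
        'Arguments     = "%s";' % steering_file,
        'InputSandbox  = {"orca_wrapper.sh", "%s", "userlibs.tar.gz"};' % steering_file,
        'OutputSandbox = {"stdout.log", "stderr.log"};',
    ])

def boss_submit(jdl):
    """Placeholder for submission through BOSS."""
    print("submitting job:\n" + jdl + "\n")

def submit_task(task_name, query):
    # One TASK is split into one JOB per run returned by the catalogue query.
    for run in sorted(query_metadata_catalogue(query)):
        steering_file = "%s_run%d.steer" % (task_name, run)
        boss_submit(make_jdl(steering_file))

submit_task("ttbar_analysis", "dataset = sample")
```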
Wrapping the ORCA job
• The ORCA executable wrapper running on the WN will:
  – set up the appropriate ORCA environment
  – copy the input sandbox/input data to the working area
  – link to the correct user libraries
  – run the executable
  – deal with the output files
• The wrapper is a shell script plus a steering file (a sketch follows below)
  – One (or many) standard shell scripts registered in the DB (but easy to modify and re-register)
  – A unique steering file is created for each job by the submission system
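A small sketch of the per-job steering file that the submission system would write and the registered wrapper script would read on the WN; the key names and values are illustrative assumptions, not the actual GROSS format.

```python
# Illustrative per-job steering file (key names are assumptions only); the
# registered wrapper shell script would parse this on the WN to set up the
# ORCA environment, link the user libraries, run the job and name the outputs.
steering = {
    "ORCA_VERSION":  "ORCA_7_5_0",
    "USER_LIBS":     "userlibs.tar.gz",
    "EXECUTABLE":    "myAnalysis",
    "RUN_NUMBER":    "20034",
    "OUTPUT_SUFFIX": "_ttbar_v1",
    "OUTPUT_ACTION": "sandbox",   # or: register the output somewhere
}

with open("ttbar_analysis_run20034.steer", "w") as f:
    for key, value in steering.items():
        f.write("%s=%s\n" % (key, value))
```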
Data Handling
• The data handling part is missing right now
• What we need:
  – a definition of the physics meta-catalogue
  – the ability to query this meta-catalogue to give:
    • the list of GUIDs per run of data, to be included in the Grid submission JDL; this will direct where the job runs, given that no movement of input data will take place – i.e. the job will always run where the data is (a sketch follows below)
    • where to catalogue output data for group analysis (the AF can handle writing to multiple DBs, e.g. writing to a private local BOSS DB and to the metadata catalogue)
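To make the first point concrete, a sketch of how a run's GUID list could be expressed as the JDL InputData attribute so the resource broker sends the job to a site that already holds the data; the GUIDs and protocol list are illustrative.

```python
def input_data_clause(guids):
    # The RB matches InputData against site catalogues, so the job lands
    # where the data is; no input data movement takes place.
    entries = ", ".join('"guid:%s"' % g for g in guids)
    return ('InputData           = {%s};\n'
            'DataAccessProtocol  = {"rfio", "gsiftp"};') % entries

print(input_data_clause(["e1f2-run20034-001", "e1f2-run20034-002"]))
```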
“GROSS” Summary
• Tested extensively with ORCA 7.5.0 + LCG-1
• Now installed on the LCG-2 UI at Imperial College
• Building extra functionality:
  – multiple DB support
  – …
• Most important missing piece:
  – the data interface to the meta-catalogue (ready to be plugged into the rest of the framework)
RGMA+Boss: Overview
• CMS jobs are submitted via a UI node for data analysis.
• Individual jobs are wrapped in a BOSS executable.
• Jobs are then delegated to a local batch farm managed by (for example) PBS.
• When a job is executed, the BOSS wrapper spawns a separate process to catch the input, output and error streams (a sketch follows below).
• Job status is then redirected to a local DB.
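A toy sketch of the wrapper behaviour just described: run the real executable, catch its output stream, and push status updates into a local DB. update_local_db() and the status-line pattern are illustrative stand-ins for what BOSS actually does.

```python
import re
import subprocess

def update_local_db(job_id, key, value):
    """Placeholder: BOSS would issue an SQL UPDATE on its local DB here."""
    print("job %s: %s = %s" % (job_id, key, value))

def run_wrapped(job_id, command):
    # Run the real job and read back its combined stdout/stderr stream.
    proc = subprocess.Popen(command, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, text=True)
    for line in proc.stdout:
        m = re.match(r"(\w+)=(.*)", line.strip())
        if m:  # only lines matching the filter become status updates
            update_local_db(job_id, m.group(1), m.group(2))
    update_local_db(job_id, "exit_code", proc.wait())

run_wrapped("42", ["echo", "events_processed=1000"])
```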
Boss
Schematic: on the UI, IMPALA/BOSS takes parameters from the CMS RefDB and submits JDL to the EDG Workload Management System, which dispatches jobs to CEs running the CMS software; on the WN, job output filtering and runtime monitoring feed the BOSS DB, while the Replica Manager handles data registration on the SEs (arrows in the original distinguish pushing data or info from pulling info).
Where R-GMA Fits In
• BOSS was designed for use within a local batch farm.
  – If a job is scheduled on a remote compute element, status info needs to be sent back to the submitter's site.
  – Within a Grid environment we want to collate job info from potentially many different farms.
  – Job status info should be filtered depending on where the user is located – the use of predicates would therefore be ideal.
• R-GMA fits in nicely (a sketch follows below).
  – The BOSS wrapper makes use of the R-GMA API.
  – Job status updates are then published using a stream producer.
  – An archiver positioned at each UI node can then scoop up the relevant job info and dump it into the locally running BOSS DB.
  – Users can then access job status info from the UI node.
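The following is a toy model of that flow, not the real R-GMA API: a stream producer publishes job-status tuples, and an archiver at a UI keeps only the tuples matching its predicate and dumps them into the local BOSS DB.

```python
class StreamProducer:
    """Toy stand-in for an R-GMA stream producer on the worker node."""
    def __init__(self, consumers):
        self.consumers = consumers          # located via the registry in R-GMA
    def publish(self, row):
        for consumer in self.consumers:
            consumer.receive(row)

class Archiver:
    """Toy stand-in for the archiver running at each UI node."""
    def __init__(self, predicate, boss_db):
        self.predicate = predicate          # e.g. "only jobs from my UI"
        self.boss_db = boss_db
    def receive(self, row):
        if self.predicate(row):
            self.boss_db.append(row)        # stand-in for an INSERT into BOSS DB

boss_db = []
archiver = Archiver(lambda r: r["submitter"] == "ui.brunel.ac.uk", boss_db)
producer = StreamProducer([archiver])
producer.publish({"job": 1, "submitter": "ui.brunel.ac.uk", "status": "RUNNING"})
print(boss_db)
```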
Use of R-GMA in BOSS
Schematic: the sandbox carries the job from the UI (IMPALA/BOSS) to the WN, where the BOSS wrapper runs the job, tees its output to a file and publishes it through the R-GMA API to servlets on the CE/GK; via the registry, a receiver at the UI collects the tuples and writes them into the BOSS DB.
Test Motivation
• Want to ensure R-GMA can cope with the volume of expected traffic and is scalable.
• CMS production load estimated at around 5000 jobs.
• Initial tests* with v3-3-28 only managed about 400 – must do better. (Note: the first tests at Imperial College a year ago fell over at around 10 jobs!)

*Reported at the IEEE NSS Conference, Oregon, USA, 21-24 October 2003
Test Design
• A simulation of the CMS production system was created (a sketch follows below).
  – An MC simulation was designed to represent a typical job.
  – Each job creates a stream producer.
  – Each job publishes a number of tuples depending on the job phase.
  – Each job contains three phases with varying time delays.
• An Archiver collects the published tuples.
  – The Archiver DB used is a representation of the BOSS DB.
  – Archived tuples are compared with published tuples to verify the test outcome.
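A hedged sketch of that test logic, with a local list standing in for the Archiver DB and a plain function call standing in for publication through R-GMA; the per-phase tuple counts and delays are illustrative, and the verification step simply compares published tuples with archived ones.

```python
import time

archived = []                       # stands in for the Archiver (BOSS-like) DB

def publish(row):
    archived.append(row)            # the real test publishes via R-GMA

def simulated_job(job_id, phases=((2, 0.0), (3, 0.0), (2, 0.0))):
    # Three phases; each publishes some tuples then waits a phase-dependent delay.
    published = 0
    for phase, (n_tuples, delay) in enumerate(phases):
        for seq in range(n_tuples):
            publish({"job": job_id, "phase": phase, "seq": seq})
            published += 1
        time.sleep(delay)
    return published

published_total = sum(simulated_job(j) for j in range(100))
assert published_total == len(archived), "some tuples were lost"
print("published", published_total, "archived", len(archived))
```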
Topology
Schematic: MC sims feed stream-producer (SP) MON boxes at IC and Brunel, which publish to the Archiver MON box; the Archiver client fills the BOSS DB, and the test output is used for test verification.
Test Setup
• Archiver & SP MON box set up at Imperial College.
• SP MON box & IC set up at Brunel.
• Archiver and MC sim clients positioned at various nodes within both sites.
• Tried one MC sim and one Archiver with a variable number of job submissions.
• Also set up a similar test on the WP3 test bed using two MC sims and one Archiver.
Results
• One MC sim creating 2000 jobs and publishing 7600 tuples proven to work without a glitch.
• Bi-directional 3000+3000 jobs from Imperial College to Brunel (and vice versa) worked without problems.
• Demonstrated two MC sims each running 4000 jobs (with 15200 published tuples) on the WP3 test bed. Peak loading was ~1000 jobs producing data simultaneously.
Pitfalls Encountered
• Lots of integration problems:
  – a limitation on the number of open streaming sockets (1K);
  – lots of OutOfMemoryErrors;
  – various configuration problems at both the Imperial College and Brunel sites;
  – the usual firewall "challenges".
• These probably explain some of the poor initial performance.
• Scalability of the test is largely dependent on the specs of the Stream Producer/Archiver MON boxes, i.e. > 1 GB memory and a fast processor.
Overall Summary
• Preparations for a full-scale test of CMS production over a Grid (T0 + T1 + some T2) are well underway. Still on target for a 1 March start-up.
• New batch analysis framework "GROSS" being deployed (with BOSS and R-GMA) via RPM for DC04.
• Scalability of R-GMA is now approaching what is needed for the full production load of CMS.