ATLAS, the Grid and the UK

Roger Jones
Lancaster University

Edinburgh - 29 January 2009
Event Data Model

- RAW:
  - "ByteStream" format, ~1.6 MB/event
- ESD (Event Summary Data):
  - Full output of reconstruction in object (POOL/ROOT) format:
    - Tracks (and their hits), Calo Clusters, Calo Cells, combined reconstruction objects, etc.
  - Nominal size 1 MB/event initially, to decrease as the understanding of the detector improves
    - A compromise between "being able to do everything on the ESD" and not storing oversize events
- AOD (Analysis Object Data):
  - Summary of event reconstruction with "physics" (POOL/ROOT) objects:
    - electrons, muons, jets, etc.
  - Nominal size 100 kB/event (currently roughly double that)
- DPD (Derived Physics Data):
  - Skimmed/slimmed/thinned events + other useful "user" data derived from AODs and conditions data
  - DPerfD is mainly skimmed ESD
  - Nominally 10 kB/event on average
    - Large variations depending on physics channels
- TAG:
  - Database (or ROOT files) used to quickly select events in AOD and/or ESD files

(The nominal per-event sizes are tallied in the sketch below.)
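As a back-of-the-envelope check of the sizes quoted above, this short Python sketch tallies the nominal per-event footprint across the formats; the individual numbers are the nominal values from this slide, and the summed figure is simple arithmetic rather than an official ATLAS number.

```python
# Nominal per-event sizes from the "Event Data Model" slide (MB/event).
# The total is a back-of-the-envelope sum, not an official ATLAS figure.
NOMINAL_SIZE_MB = {
    "RAW": 1.6,    # ByteStream format
    "ESD": 1.0,    # expected to shrink as detector understanding improves
    "AOD": 0.1,    # currently roughly double this
    "DPD": 0.01,   # average; large variations between channels
}

per_event_total = sum(NOMINAL_SIZE_MB.values())
print(f"Nominal footprint per event across formats: {per_event_total:.2f} MB")
# -> roughly 2.7 MB/event, before any replication
```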
Computing Model: main operations

- Tier-0:
  - Copy RAW data to CERN Castor for archival, and to Tier-1s for storage and reprocessing
  - Run first-pass calibration/alignment
  - Run first-pass reconstruction (within 48 hrs)
  - Distribute reconstruction output (ESDs, AODs, DPDs & TAGs) to Tier-1s
- Tier-1s (x10):
  - Store and take care of a fraction of RAW data (forever)
  - Run "slow" calibration/alignment procedures
  - Rerun reconstruction with better calibration/alignment and/or algorithms
  - Distribute reconstruction output to Tier-2s
  - Keep current versions of ESDs and AODs on disk for analysis
  - Run large-scale event selection and analysis jobs for physics and detector groups
  - Looks like some user access will be granted, but limited and with NO ACCESS TO TAPE or LONG-TERM STORAGE
- Tier-2s (x~35):
  - Run simulation (and calibration/alignment when/where appropriate)
  - Run analysis jobs (mainly AOD and DPD)
  - Keep current versions of AODs and samples of other data types on disk for analysis
- Tier-3s:
  - Provide access to Grid resources and local storage for end-user data
  - Contribute CPU cycles for simulation and analysis if/when possible
Necessity of Distributed Computing

- We are going to collect raw data at 320 MB/s for 50k seconds/day and ~100 days/year
  - RAW dataset: 1.6 PB/year
  - Processing (and re-processing) these events will require ~10k CPUs full time in the first year of data-taking, and a lot more in the future as data accumulate
  - Reconstructed events will also be large, as people want to study detector performance as well as do physics analysis using the output data
  - ESD dataset: 1.0 PB/year; AOD and DPD datasets: hundreds of TB/year
  - At least 10k CPUs are also needed for continuous simulation production of at least 20-30% of the real data rate, and for analysis
- There is no way to concentrate all the needed computing power and storage capacity in one place
  - The LEP/Tevatron model will not scale to this level

(The volume figures follow from the quoted rates; see the sketch below.)
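The RAW and ESD volumes above follow directly from the quoted rates; this sketch redoes that arithmetic. The 200 Hz trigger rate and the per-event sizes are taken from the neighbouring slides; everything else is straightforward multiplication.

```python
# Reproduce the yearly data volumes from the rates quoted in the talk.
RAW_RATE_MB_S = 320        # MB/s into the Tier-0 input buffer
LIVE_S_PER_DAY = 50_000    # seconds of data-taking per day
DAYS_PER_YEAR = 100        # days of running per year
TRIGGER_RATE_HZ = 200      # nominal trigger rate
ESD_MB, AOD_MB = 1.0, 0.1  # nominal per-event sizes

seconds_per_year = LIVE_S_PER_DAY * DAYS_PER_YEAR
events_per_year = TRIGGER_RATE_HZ * seconds_per_year   # ~1e9 events/year

raw_pb = RAW_RATE_MB_S * seconds_per_year / 1e9        # MB -> PB
esd_pb = events_per_year * ESD_MB / 1e9
aod_tb = events_per_year * AOD_MB / 1e6                # MB -> TB

print(f"RAW: {raw_pb:.1f} PB/year, ESD: {esd_pb:.1f} PB/year, AOD: {aod_tb:.0f} TB/year")
# -> RAW ~1.6 PB/year, ESD ~1.0 PB/year, AOD ~100 TB/year, matching the slide
```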
Event data flow from online to offline

- Events are written in "ByteStream" format by the Event Filter farm in <=2 GB files
  - Event rate 40 MHz; interesting events 100-1000 Hz
  - 200 Hz trigger rate (independent of luminosity, except for Heavy Ions)
  - Events will be grouped by "luminosity block" (1-2 minute intervals)
    - One luminosity block can be approximated as having constant luminosity
    - There should be enough information for each lumi block to be able to calculate the luminosity
  - Nominal RAW event size is 1.6 MB/event
  - Several streams (an event can be in more than one stream; see the sketch below):
    - ~5 physics event streams, separated by main trigger signature
      - e.g. muons, electromagnetic, hadronic jets, taus, minimum bias
    - Express stream with monitoring and calibration (physics) events to be processed immediately
    - Calibration data streams
    - "Trouble maker" events (for debugging)
- Data will be transferred to the Tier-0 input buffer at 320 MB/s (average)
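To make the overlapping-streams point concrete, here is a minimal Python sketch that assigns an event to every physics stream whose trigger signature it fires. The trigger-bit names and the stream mapping are illustrative assumptions, not the actual ATLAS trigger menu.

```python
# Illustrative only: hypothetical trigger bits and a toy stream map,
# showing how one event can land in more than one stream.
STREAM_MAP = {
    "muons":           {"mu20"},
    "electromagnetic": {"e25i", "g60"},
    "jets":            {"j160"},
    "taus":            {"tau35i"},
    "minimum_bias":    {"mbts"},
}

def streams_for(fired_triggers):
    """Return every stream whose trigger signature overlaps the fired triggers."""
    return [s for s, bits in STREAM_MAP.items() if bits & fired_triggers]

# An event firing both a muon and a jet trigger is written to both streams.
print(streams_for({"mu20", "j160"}))   # -> ['muons', 'jets']
```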
Tier-2 Data on Disk

~35 Tier-2 sites of very, very different size contain:

- Some fraction of ESD and RAW
  - In 2008/early-2009 data: 30% of RAW and 150% of ESD in the Tier-2 cloud
  - In late 2009 and after: 10% of RAW and 30% of ESD in the Tier-2 cloud
  - This will largely be 'pre-placed' in early running
  - Recall of small samples through the group production at T1
  - Additional access to ESD and RAW in the CAF
    - 1/18 of RAW and 10% of ESD
- 10 copies of the full AOD on disk
- A full set of official group DPD (in the production area)
- Lots of small group DPD (in the production area)
- User data
  - Access is 'on demand'

(The rough disk volume implied by these fractions is estimated in the sketch below.)
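For a rough sense of scale, this sketch combines the fractions above with the yearly volumes quoted earlier in the talk, treating them as applying to the combined Tier-2 disk. The result is a back-of-the-envelope estimate for one year of data, not a pledge or an official figure.

```python
# Back-of-the-envelope Tier-2 disk estimate for one year of data,
# using the yearly volumes and sharing fractions quoted in this talk.
RAW_PB, ESD_PB, AOD_PB = 1.6, 1.0, 0.1   # per year, from earlier slides

def tier2_disk(raw_frac, esd_frac, aod_copies):
    """Disk (PB) implied by the stated RAW/ESD fractions and AOD copy count."""
    return raw_frac * RAW_PB + esd_frac * ESD_PB + aod_copies * AOD_PB

early = tier2_disk(raw_frac=0.30, esd_frac=1.50, aod_copies=10)
later = tier2_disk(raw_frac=0.10, esd_frac=0.30, aod_copies=10)
print(f"Early running: ~{early:.1f} PB; late 2009 onwards: ~{later:.1f} PB")
# Group and user DPD come on top of these numbers.
```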
Tier 2 Nominal Disk Share 2009

[Pie chart: nominal 2009 Tier-2 disk share, broken down into RAW, ESD/DPerfD, AOD, TAG, simulated RAW/ESD/AOD/TAG, group DPD and user data]
Tier-3s

- These have many forms
- Basically represent resources not for general ATLAS usage
  - Some fraction of T1/T2 resources
  - Local university clusters
  - Desktop/laptop machines
  - The Tier-3 task force provides recommended solutions (plural!):
    - http://indico.cern.ch/getFile.py/access?contribId=30&sessionId=14&resId=0&materialId=slides&confId=22132
- Concern over the apparent belief that Tier-3s can host or pull down large samples
  - Required storage and effort, network and server loads at Tier-2s
Minimal Tier-3 requirements

- The ATLAS software environment, together with the ATLAS and Grid middleware tools, allows a working model to be built for collaborators located at sites with low network bandwidth to Europe or North America.
- The minimal requirement is on local installations, which should be configured with Tier-3 functionality:
  - A Computing Element known to the Grid, in order to benefit from the automatic distribution of ATLAS software releases
  - An SRM-based Storage Element, in order to be able to transfer data automatically from the Grid to the local storage, and vice versa
- The local cluster should have an installation of:
  - A Grid User Interface suite, to allow job submission to the Grid
  - The ATLAS DDM client tools, to permit access to the DDM data catalogues and data transfer utilities
  - The Ganga/pAthena client, to allow the submission of analysis jobs to all ATLAS computing resources

(A quick way to check for these client tools is sketched below.)
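A trivial way to sanity-check a Tier-3 login node against the client-side list above is to look for the expected commands on the PATH. The command names used here (voms-proxy-init and glite-wms-job-submit for the Grid UI, dq2-ls/dq2-get for the DDM client tools, ganga for the analysis client) are assumptions about the typical tooling of the time, not an official checklist.

```python
# Minimal sketch: check that a Tier-3 login node has the client tools above.
# The command names are assumptions about a typical 2009-era installation.
import shutil

EXPECTED = {
    "Grid User Interface": ["voms-proxy-init", "glite-wms-job-submit"],
    "ATLAS DDM client tools": ["dq2-ls", "dq2-get"],
    "Distributed analysis client": ["ganga"],
}

for component, commands in EXPECTED.items():
    missing = [c for c in commands if shutil.which(c) is None]
    status = "OK" if not missing else f"missing: {', '.join(missing)}"
    print(f"{component:30s} {status}")
```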
A Few Statements of Policy

- The ATLAS data volumes WILL be very large
  - Even after hard selections, you will have large sets to work with
  - We have a distributed computing model
  - Every ATLAS physicist should have managed access to the data & some of the CPU at any ATLAS Tier-2 worldwide
    - The same comment does not apply to Tier-1s - the Tier-1s are for the production role only, although some Tier-1s have attached Tier-2s
- But not too distributed
  - Attempt to have all analysis sets within a cloud
    - The UK is a cloud, roughly 10% of the ATLAS total
  - The cloud has some degree of autonomy
    - We set the policy for the placement of sets within the cloud
    - We have also set aside some Tier-2 disk storage and some Tier-2 CPU for UK-only use, beyond the disk & CPU pledged to the whole of ATLAS
User Data Movement Policy

- Jobs go to the data, not data to the job
- Users need to access the files they produce
  - This means they need (ATLAS) data tools on Tier-3s
- There is a risk: some users may attempt to move large data volumes
  - SE overload
  - Network congestion
  - DDM meltdown
- ATLAS policy in outline (see the sketch below):
  - O(10 GB/day/user): who cares?
  - O(50 GB/day/user): rate throttled
  - O(10 TB/day/user): user throttled!
  - Planned large movements possible if negotiated
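The outline policy above is essentially a tiered threshold check; this sketch expresses it as a small Python function. The thresholds are the orders of magnitude from the slide; since only orders of magnitude are given, the exact boundaries chosen here are assumptions.

```python
# Toy encoding of the outlined data-movement policy. Thresholds follow the
# orders of magnitude on the slide; the exact boundaries are assumptions.
def movement_policy(daily_volume_gb, negotiated=False):
    """Classify a user's daily data movement against the outline policy."""
    if negotiated:
        return "planned large movement - allowed by prior agreement"
    if daily_volume_gb < 10:
        return "who cares?"
    if daily_volume_gb < 10_000:          # up to O(10 TB/day)
        return "rate throttled"
    return "user throttled!"

for vol in (5, 50, 20_000):
    print(f"{vol:>6} GB/day -> {movement_policy(vol)}")
```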
The UK and Data Placement

- The movement and placement of data must be managed
  - Overload of the data management system slows the system down for everyone
  - Unnecessary multiple copies waste disk space and will prevent a full set being available
  - Some multiple copies will be a good idea, to balance loads
- We have a group for deciding the data placement:
  - UK Physics Co-ordinator, UK deputy spokesman, Tony Doyle (UK data ops), Roger Jones (UK ops) + Stewart, Love & Brochu
  - The UK Physics Co-ordinator consults the institute physics reps
  - The initial data plan follows the matching of trigger type to site from previous exercises
  - We will make second copies until we run short of space, then the second copies will be removed *at little notice*
Grid Storage for the UK

- The general user area on ATLAS Tier-2s is scratch space
  - There is no guaranteed lifetime, as we cannot control the writing of new files - users need to be responsible
  - The aim is to have files on disk for many days (a month?), allowing them to be either discarded, used and discarded, or used and moved to secure storage
- So where can I keep files?
  - On your local non-Grid storage
  - In ATLASLOCALUSERDISK space
    - This means local to the UK - your certificate must be in the ATLAS-UK group
    - We must manage this space!
    - Be responsible! We would like to avoid heavy policing of this space, but can and will if people go crazy

(A toy scratch-cleanup sweep is sketched below.)
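As an illustration of the "files live for many days, then go" idea, here is a minimal sketch of a scratch-space sweep that reports files older than a configurable cutoff. The 30-day default and the directory path are hypothetical, and a real cleanup would be run by the site, not by individual users.

```python
# Illustrative sketch only: list scratch files older than a cutoff.
# The path and the 30-day lifetime are assumptions, not ATLAS policy.
import time
from pathlib import Path

def stale_files(scratch_dir, max_age_days=30):
    """Yield (path, age_in_days) for regular files older than max_age_days."""
    now = time.time()
    cutoff = now - max_age_days * 86400
    for path in Path(scratch_dir).rglob("*"):
        if path.is_file() and path.stat().st_mtime < cutoff:
            yield path, (now - path.stat().st_mtime) / 86400

if __name__ == "__main__":
    for path, age in stale_files("/scratch/atlas-user"):   # hypothetical mount
        print(f"{path}  ({age:.0f} days old)")
```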
CPU in the UK

- 20% of the Tier-2 Grid capacity in the UK is allocated for UK-specific usage
  - At present we have not implemented this, as we are not heavily loaded
  - When we do, this will also be based on VO membership, with some queues only open to ATLAS-UK
ATLAS Software & Computing Project

- The ATLAS Collaboration has developed a set of software and middleware tools that enable access to data for physics analysis for all members of the collaboration, independently of their geographical location.
- The main building blocks of this infrastructure are:
  - The Athena software framework, with its associated modular structure of the event data model, including the software for:
    - Event simulation;
    - Event trigger;
    - Event reconstruction;
    - Physics analysis tools.
  - The Distributed Computing tools built on top of Grid middleware:
    - The Distributed Data Management system;
    - The Distributed Production System;
    - The Ganga/pAthena frameworks for distributed analysis on the Grid;
    - Monitoring and accounting.
- DDM is the central link between all components
  - As data access is needed for any processing and analysis step!
Disillusionment?

[Figure: the Gartner Group "hype cycle", with the HEP Grid's position marked along the LHC timeline for the years 2002-2008]
ATLAS Distributed Data Management

- The DDM design is based on:
  - A hierarchical definition of datasets
  - Central dataset catalogues
  - Data blocks as units of file storage and replication
  - Distributed file catalogues
  - Automatic data transfer mechanisms using distributed services (dataset subscription system)
- There are also local tools to allow you to access data from the Grid at a grid-enabled local site
- How do I find the data?
  - You can use tools like AMI and ELSSI
  - See James' talk, or attend a tutorial at CERN or in Edinburgh at the end of this month!

(A toy model of the catalogue hierarchy is sketched below.)
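To make the design points concrete, this sketch models the hierarchy with plain Python dataclasses: datasets made of data blocks, blocks made of files, a central dataset catalogue, per-site replica information, and a subscription request. It is an illustrative toy under those assumptions, not the DQ2 schema or API.

```python
# Toy model of the DDM concepts listed above (not the real DQ2 schema).
from dataclasses import dataclass, field

@dataclass
class FileEntry:
    lfn: str                 # logical file name, resolved by a local file catalogue
    size_mb: float

@dataclass
class DataBlock:             # unit of file storage and replication
    name: str
    files: list[FileEntry] = field(default_factory=list)
    replicas: set[str] = field(default_factory=set)   # sites holding this block

@dataclass
class Dataset:               # hierarchical: a dataset is a list of data blocks
    name: str
    blocks: list[DataBlock] = field(default_factory=list)

# Central dataset catalogue: dataset name -> dataset definition
CENTRAL_CATALOGUE: dict[str, Dataset] = {}

def subscribe(dataset_name: str, site: str) -> list[str]:
    """Record that `site` wants `dataset_name`; return blocks still to be transferred."""
    ds = CENTRAL_CATALOGUE[dataset_name]
    return [b.name for b in ds.blocks if site not in b.replicas]
```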
Central vs Local Services

- The DDM system now has a central role with respect to the ATLAS Grid tools
- One fundamental feature is the presence of distributed file catalogues and (above all) auxiliary services
  - Clearly we cannot ask every single Grid centre to install ATLAS services
  - We decided to install "local" catalogues and services at the Tier-1 centres:
    - VO Box
    - FTS channel server (both directions)
    - Local file catalogue (part of DDM/DQ2)
- We believe that this architecture scales to our needs:
  - Moving several 10,000s of files/day
  - Supporting up to 100,000 organized production jobs/day
  - Supporting the analysis work of >1000 active ATLAS physicists

[Diagram: the Tier-0 and each Tier-1 host a VO Box, an LFC and an FTS server, with their Tier-2s attached; the LFC is local within a 'cloud', and all SEs have an SRM interface]
Distributed Analysis

- The ATLAS tool for distributed analysis is GANGA
  - Supports all three Grid flavours, local jobs and batch systems
  - It is quite generic, and can support all sorts of jobs, not just Athena
  - Can be command line, scripted & also has a GUI
- But what's this pAthena then?
  - Developed for running Athena jobs on OSG
  - Uses the same back-end as the production system
  - Has the equivalent of the GANGA command line interface
  - Not yet compliant with EGEE and NDGF requirements for general users
- GANGA has a back-end to the same system; the two projects are being integrated
- DA tests have been invaluable, especially in the UK
  - Feedback from UK users was very good, and encouraging
  - Tests of the pilot-based system in the UK are a credit to the UK

(An approximate GANGA job sketch follows this slide.)
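For flavour, here is a minimal Ganga-style job definition of the kind used for ATLAS distributed analysis at the time. The class names (Job, Athena, DQ2Dataset, DQ2JobSplitter, DQ2OutputDataset, LCG) follow the Ganga/Athena plugin API as remembered and should be treated as an approximate sketch rather than verified syntax; the job options file and dataset name are placeholders.

```python
# Approximate Ganga sketch (run inside a `ganga` session, where these classes
# are pre-loaded). Attribute names are from memory; the job options file and
# dataset name below are purely illustrative placeholders.
j = Job()
j.name = 'aod-analysis-example'
j.application = Athena()
j.application.option_file = ['MyAnalysis_jobOptions.py']   # hypothetical options file
j.inputdata = DQ2Dataset()
j.inputdata.dataset = 'mc08.ExampleSample.recon.AOD'        # placeholder dataset name
j.splitter = DQ2JobSplitter()
j.splitter.numsubjobs = 10
j.outputdata = DQ2OutputDataset()
j.backend = LCG()        # other backends cover the other Grid flavours and local batch
j.submit()
```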
Shifts, Operations & User Support

- The computing operations need effort both at CERN & in the UK
- There is a lot of information available, but it is not always clear where to look
  - We have a UK computing operations wiki that should inform you of general problems in the UK cloud: http://www.atlas.ac.uk/ops.html
  - Also an ATLAS UK Grid operations page: http://www.atlas.ac.uk/grid.html
  - UK mailing lists:
    - atlas-uk-comp-users@cern.ch (UK ATLAS user support, discussion and announcements)
    - gridpp-users@jiscmail.ac.uk (support for all GridPP experiments)
  - There is a weekly UK operations meeting that addresses problems in the UK cloud
  - A Savannah bug-tracking system for UK issues, best used by the expert team after a site problem has been identified
The ATLAS UK Grid page

[Screenshot: the ATLAS UK Grid operations page]
Is everything ready then?

- Unfortunately not yet: a lot of work remains
  - Thorough testing of existing software and tools
  - Optimisation of CPU usage, memory consumption, I/O rates and event size on disk
  - Completion of the data management tools with disk space management
  - Completion of the accounting, priority and quota tools (both for CPU and storage)
- Just one example (but there are many!):
  - In the computing model we foresee distributing a full copy of AOD data to each Tier-1, and an additional full copy distributed amongst all Tier-2s of a given Tier-1 "cloud"
    - In total, >20 copies around the world, as some large Tier-2s want a full set
    - This model is based on general principles, to make AOD data easily accessible to everyone for analysis
  - In reality, we don't know how many concurrent analysis jobs a data server can support
    - Tests could be made submitting large numbers of grid jobs to read from the same data server
    - Results will be functions of the server type (hardware, connectivity to the CPU farm, local file system, Grid data interface) but also of the access pattern (all events vs sparse data in a file)
  - If we can reduce the number of AOD copies, we can increase the number of other data samples (RAW, ESD, simulation) on disk (see the sketch below)
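To put a rough number on that trade-off, this sketch uses the ~100 TB/year AOD volume quoted earlier in the talk to estimate the disk freed by trimming the worldwide AOD copy count; the reduced copy counts are hypothetical scenarios, not proposals.

```python
# Back-of-the-envelope: disk freed per year by reducing worldwide AOD copies.
# The 100 TB/year AOD volume is from earlier in the talk; scenarios are hypothetical.
AOD_TB_PER_YEAR = 100
CURRENT_COPIES = 20          # ">20 copies around the world"

for reduced in (15, 10):
    freed = (CURRENT_COPIES - reduced) * AOD_TB_PER_YEAR
    print(f"{CURRENT_COPIES} -> {reduced} copies frees ~{freed} TB/year "
          f"for RAW, ESD and simulation samples")
```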