ATLAS software workshop at Brookhaven National Laboratory

ATLAS software workshop at BNL, 23-28 May 2004
Summary of notes taken during talks and some slides taken from interesting talks – Ricardo, 29 May 2004
Outline
Overall notes and plans
Core software, distribution kit, etc
Simulation and detector description
Reconstruction and event selection
Analysis tools and Event Data Model
Combined Test Beam
Overall notes and plans
Many different sessions and working group
meetings…will try to summarize and will focus on some
subjects more than others.
Also, most of this talk is a transcription of my notes; not
all of it makes sense or is very explicit.
• Software plenary sessions I & II
• Event selection, reconstruction and analysis tools
• Software infrastructure
• Simulation and detector description
• Data challenge
• Framework working group
• Grid and distributed analysis
• Event Data Model (EDM) working group
• SUSY working group
• Reconstruction working group
• Physics coordination
• Physics Event Selection Algorithms (PESA) working group
• Grid working group
• Physics validation
• Detector description working group
• Distributed analysis working group
• Database and calibration/alignment
• Combined performance
• International computing board
• Analysis tools working group
• Database working group
• Software distribution and deployment working group
• Calibration/alignment working group
Software plenary session III
• Major production activities soon: DC2, Combined Test
Beam
• Also: HLT testbed and physics studies leading up to
Physics Workshop 2005
• Next software week 20-24 September: after DC2 and at
end of beam test
• Documentation is only in the minds of a few people –
some plans to document parts of software during the next
year
• Several buzzwords in this meeting:
– Geant4
– Pileup
– Digitization
– Event mixing
– EDM: ESD/AOD (what information should be in AOD? 100kB max)
– Use of Python job options (no more athena.exe in > 8.2.0, use jobOptions_XXX.py)
– Grid
– Calibration etc.
ATLAS Computing Timeline
2003
• POOL/SEAL release (done)
• ATLAS release 7 (with POOL persistency) (done)
• LCG-1 deployment (done)
2004
• ATLAS complete Geant4 validation (done)
NOW
• ATLAS release 8 (done)
• DC2 Phase 1: simulation production
2005
• DC2 Phase 2: intensive reconstruction (the real challenge!)
• Combined test beams (barrel wedge)
• Computing Model paper
2006
• Computing Memorandum of Understanding
• ATLAS Computing TDR and LCG TDR
• DC3: produce data for PRR and test LCG-n
2007
• Physics Readiness Report
• Start commissioning run
• GO!
Planning (T.LeCompte)
• For the first time delay in schedule is consistent with zero
• Several project reviews expected soon
• Inner Detector and Core software seem to need lots of
work and people
Core software, distribution, validation, etc
Core software
Release plans:
8.2.0 – 22 May
8.3.0 – 9 June
9.0.0 – 30 June – DC2 production
Core software (C.Leggett)
• Gaudi:
• Current version is v14r5+ATLAS-specific patches (ATLAS version
0.14.6.1)
• Changes for v14:
– Uses SEAL, POOL, PI (?)
– AIDA histogram Svc replaced with ROOT
– GaudiPython
– Events merging: can now control exactly which events from 2 files are merged
– Pileup: see Davide's talk Tuesday
– Interval of Validity (IoV) Svc improved
– v8.2.0: new version of CLHEP causes wrong version of HepMC to load
when using athena.exe (not athena.py) -> segfault
• Athena and Gaudi: heading towards rel. 9, Gaudi v15 (try to stay in step
with LHCb); extend installArea to gaudi
Distribution kit
• Development platform vs. deployment platform
• Kit philosophy:
– Address multiple clients: running binaries/doing development
– Possibility of downloading partial releases, binaries only
(Gregory working on this one), source code…
– Multiple platforms (not yet)
• Until recently only RedHat7.3 + gcc 3.2 supported – in
future, CERN Enterprise Linux (CEL) 3 + gcc 3.2.3 /
icc8 (64-bit compiler) / Windows / MacOSX
• Complication in development platforms from external
software – need to cleanup packages and think about
dependencies
• Runtime environment has some problems – not easy to
setup for partial distributions
• Pacman/Kit tutorial to be organized soon
Validation (D.Costanzo)
• 3 activities:
- reconstruction of DC events with old zebra events
- validation of generators for DC2
- reconstruction of G4 events for DC2
• Plan for next year: generate more events and prepare for
physics workshop 2005
• v8.2.0 should be next usable release (moore/muid?),
exercise reconstruction
• Various details:
– Some degradation of reconstruction was found wrt 7.0.2
– Atlfast needed to validate generated events
– Several problems found, e.g. xKalman in 8.0.3
– A lot of activity in electron reconstruction
– Jet/ETmiss: JetRec under restructuring, H1-style weights need to be recalculated for Geant4
– DC2 going into production mode
Grid infrastructure (Luc Goosens, et al.)
• Production system architecture:
– executors
• one for each facility flavour
– LCG (lexor), NG (dulcinea), GRID3 (Capone), PBS, LSF, BQS, Condor?, …
• translates facility neutral job definition into facility specific language
– XRSL, JDL, wrapper scripts, …
• implements facility neutral interface
– usual methods: submit, getStatus, kill, …
– data management system
• allows global cataloguing of files
– we have opted to interface to existing replica catalog flavours
• allows global file movement
– an atlas job can get/put a file anywhere
• presents a uniform interface on top of all the facility native data management tools
• implementation -> Don Quijote
– see separate talk
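To make the facility-neutral interface above concrete, here is a minimal Python sketch of what such an executor base class could look like; only the method names (submit, getStatus, kill) and the idea of translating a neutral job definition come from the talk, while the class names and signatures are hypothetical.

```python
# Illustrative sketch only, not the real production system code.
# Method names (submit, getStatus, kill) are from the talk; the rest is hypothetical.

class Executor:
    """Facility-neutral interface, implemented once per facility flavour."""

    def submit(self, job_definition):
        """Translate the neutral job definition into the facility-specific
        language (XRSL, JDL, wrapper scripts, ...) and submit it; return a job id."""
        raise NotImplementedError

    def getStatus(self, job_id):
        """Return the current state of the job on this facility."""
        raise NotImplementedError

    def kill(self, job_id):
        """Abort the job on this facility."""
        raise NotImplementedError


class LCGExecutor(Executor):
    """Stands in for 'lexor': would translate the job definition into JDL and
    submit it through the LCG middleware (details deliberately omitted here)."""
    pass
```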
Data Management (database): Don Quijote
Supervisor: Windmill
Executors: NG (dulcinea), LCG (lexor), GRID3 (Capone), batch
[Architecture diagram: the Windmill supervisors are connected to the production database (prodDB) and communicate over Jabber with the executors (LCG/lexor, Grid3/Capone, NG/dulcinea, LSF); the executors submit to their grids or batch system and register files in the RLS replica catalogues, with the data management system (Don Quijote, dms) on top.]
DC2 running (S.Albrand)
• DC2 will have a much more sophisticated production system
than DC1 and uses POOL instead of ZEBRA
• Only one database for all data types; data type is stored in
database
• Users add properties to their datasets (collection of logical
files) from a predefined list of possible properties
• Allows database searches for datasets
• Dataset names proposed to user (mods. possible)
• User interface to submit tasks
• Datasets not owned by one physics group alone anymore;
notion of principal physics group
• Possibility to extract parent and children datasets: parent
may be generator level dataset and children may be
reconstructed dataset
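As an illustration of the dataset bookkeeping described above, here is a purely hypothetical sketch (not the actual DC2 production database interface) of dataset records carrying a predefined set of properties, a simple search over them, and a parent/child link.

```python
# Hypothetical illustration of the DC2 dataset bookkeeping idea; this is NOT
# the real production database API, just the concept. All names are invented.

ALLOWED_PROPERTIES = {"generator", "data_type", "principal_physics_group"}

datasets = [
    {"name": "dc2.example.evgen.higgs",          # hypothetical dataset names
     "generator": "Pythia", "data_type": "evgen",
     "principal_physics_group": "higgs", "parent": None},
    {"name": "dc2.example.recon.higgs",
     "generator": "Pythia", "data_type": "recon",
     "principal_physics_group": "higgs", "parent": "dc2.example.evgen.higgs"},
]

def find_datasets(**criteria):
    """Return datasets whose properties match all of the given criteria;
    only properties from the predefined list are searchable."""
    assert set(criteria) <= ALLOWED_PROPERTIES, "property not in predefined list"
    return [d for d in datasets
            if all(d.get(k) == v for k, v in criteria.items())]

print(find_datasets(data_type="recon", principal_physics_group="higgs"))
```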
Simulation and Detector Description
Detector description (J.Boudreau)
• The detector description in
“GeoModel” will be used in both
simulation and reconstruction
• LAr is last detector missing in
GeoModel
• TileCal has finished GeoModel
description but will not be
available for DC2
• Volumes know their subvolumes: calculate dead material
etc in transparent way
• Relational database (Oracle) will
contain versions of whole
detector: no hardwired detector
description numbers in
reconstruction code
• Changing things such as
alignment will be dealt with
through different versions of
detector description in database
• Great live demo of GeoModel!
Simulation (A.Rimoldi)
• Generators in advanced stage in both Atlfast and
Atlsim: added “Cascade” and “Jimmy”; some
differences between G3 and G4 Herwig events
• G4atlas being debugged: use versions > 6.0; new
Geant4 release 6.1 (25/3/2004)
• Digitization: most infrastructure software now in
place, but work to do for each subdetector
• Pileup: under development and test; different
subdetectors can be affected by different bunch
crossings
• Combined test beam: general infrastructure is
ready
G4ATLAS status (A.Dell'Acqua)
• Concentrating on DC2 production
• G4 v6.1 gave some problems, went back to v6.0
• 6.0 has some physics problems (which don't seem to be
serious, from his tone), but aiming for robustness at the
cost of some accuracy
• MC truth: fully deployed with rel.8.0.2, full support in
v8.0.3
• Need guinea pigs to test MC truth and for simulation and
reconstruction
• G4 truth uses same HepMC format as generated event
• Migration to python job options
Digitization and Pileup (D.Costanzo)
• Sub-detector breakdown: ID getting there, calorimetry
very mature since DC1, muons not so
• Pileup: each detector reads pileup from different bunch
crossings
• Number of files in POOL should be kept low: many files
lead to memory leaks!
• Disk usage explodes with the use of pileup; most of it is
MC truth for the background events, which does not have to
be written and will be eliminated
• Memory leak problems
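A hypothetical sketch of the point that each subdetector reads pileup from a different range of bunch crossings; the window widths below are invented for illustration and are not the real detector values.

```python
# Illustrative only: which bunch crossings contribute pileup hits depends on the
# sensitive time window of each subdetector. Window values here are invented.

# Hypothetical sensitive windows, in bunch crossings before/after the trigger BC.
PILEUP_WINDOWS = {
    "pixel": (-2, 1),
    "lar":   (-10, 5),    # slow shaping -> wide window (illustrative numbers)
    "mdt":   (-20, 10),
}

def crossings_to_read(detector):
    """Return the bunch-crossing offsets whose minimum-bias events must be
    overlaid on the signal event for this subdetector."""
    first, last = PILEUP_WINDOWS[detector]
    return list(range(first, last + 1))

for det in PILEUP_WINDOWS:
    print(det, crossings_to_read(det))
```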
Detector Simulation Conclusions
• G4atlas (A.Rimoldi):
– Work on digitization and pileup
– Full deployment of MC truth
– Migration to Python job options
– Major changes in AtlasG4Sim to converge with atlas
– Human resources problem (Armin)
• Digitization and Pileup (D.Costanzo):
– Detector folders → each detector reads hits from different bunch crossings
– Emphasis moved to detector description in GeoModel
– Inner Detector: noise was implemented with no hits from pixel/SCT
– LAr calorimeter: improvements wrt G3; use of GeoModel
– Still bugs and calibration issues in G3/G4
– Muon spectrometer: migration to GeoModel done in the last few days
– Effort put into validation of hit positions
– Detailed simulation of MDT digitization and response
– Realistic pileup procedure still needs work
– Setup for combined test beam in place (or almost)
Analysis Tools
Analysis tools (K.Assamagan)
• RTF recommendations: looked at modularity, granularity & design of reconstruction software
• AnalysisTools: span the gap between reconstruction and ntuple analysis
• Tools: Artemis analysis framework prototype – seems to diverge from EDM, may not be supported for long
• PID – prototype to handle particle identification
• Workshop in April at UCL: http://www.usatlas.bnl.gov/PAT/ucl_workshop_sumary.pdf
• PyRoot, PyLCGDict
• Physicists Interfaces (PI) project: extends AIDA; provides services for batch analysis in C++, fitting and minimization, storage in HBOOK/ROOT/xml, plotting in ROOT/Hippodraw/OpenScientist, etc.
[Diagram: data flow between data and algorithms]
Python scripting language (W.Lavrijsen)
• GaudiPython: provides binding to Athena core objects;
basis for job options after release 8.2.0
• PyROOT: bridge between Python and ROOT; distributed
with ROOT 4; http://root.cern.ch/root/HowtoPyROOT.html
• PyBus: software bus (modules can be “plugged in” to bus)
implemented in Python; http://cern.ch/wlav/pybus
• ASK (Athena Startup Kit): DC2-inspired tutorial online; full
chain of algorithms (generators => ... => simulation => ... =>
analysis); http://cern.ch/wlav/athena/athask/tutorials.html
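For orientation, a minimal sketch of what a Python job options file of that era might look like, assuming the theApp/include style bootstrap that GaudiPython provides for Athena; the package, library, algorithm and property names below are placeholders, not a tested configuration.

```python
# jobOptions_MyExample.py -- illustrative sketch only, run through athena.py.
# The bootstrap style (include, theApp, Algorithm) is assumed from that era;
# all package, library and property names below are placeholders.

# Pull in a common job options fragment (hypothetical file name).
include("MyPackage/MyCommonOptions.py")

# Load component libraries and schedule a top algorithm (hypothetical names).
theApp.Dlls   += ["MyAnalysisTools"]
theApp.TopAlg += ["MyAnalysisAlg"]

# Configure an algorithm property and the number of events to process.
MyAnalysisAlg = Algorithm("MyAnalysisAlg")
MyAnalysisAlg.ElectronPtCut = 20000.0   # MeV, hypothetical property
theApp.EvtMax = 10
```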
Analysis with PyROOT (S.Snyder)
• Why? Because ROOT C++ (CINT) is not reliable enough;
Python is much better and just as capable
• No speed problems found
• PyROOT now part of the ROOT release
• Can use PyROOT to interface your own code (or any
code that CINT would handle)
• PyLCGDict does the same thing with a different
data dictionary (data definition). When to use which?
• If you have external code that already has a
ROOT/LCG dictionary, that helps to decide
• PyROOT has fewer dependencies
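A minimal PyROOT sketch of the kind of interactive analysis meant here; the file, tree and branch names are made up for illustration.

```python
# Minimal PyROOT example; assumes ROOT with PyROOT enabled.
# File, tree and branch names below are hypothetical.
import ROOT

f = ROOT.TFile.Open("ntuple.root")                  # hypothetical ntuple file
tree = f.Get("CollectionTree")                      # hypothetical tree name

h = ROOT.TH1F("h_pt", "Electron pT;pT [MeV];entries", 100, 0.0, 100000.0)
tree.Draw("ElectronPt >> h_pt")                     # hypothetical branch name

c = ROOT.TCanvas("c", "PyROOT example")
h.Draw()
c.SaveAs("electron_pt.png")
```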
Reconstruction, Trigger and Event Data Model
Reconstruction (D.Rousseau)
• RTF recommendation: Detectors
(e.g.TileCal and LAr) should share
code to facilitate downstream
algorithms
• Combined beam test: analyse CBT
data with offline code, as little
dedicated code as possible; big effort
e.g. to integrate conditions database
for various detectors
• DC2 reconstruction (release 9.x.x )
• Run on Geant4 data with new
detector description; validate G4
• Persistence:
– ESD (event summary data), EDM
output: issues with ~200k cal cells;
target size 100kB/ev
– Ongoing discussions on AOD
(analysis object data) definition: aim
for 10kB/ev
Reconstruction (D.Rousseau)
• Work model:
– Was: Reconstruction → Combined ntuple → ROOT/PAW analysis
– Changes to: reconstruction → ESD/AOD → analysis in Athena → small ntuple → ROOT
– CBNT remains as a debugging tool but will not be
produced in large scale for DC2
• Status:
– Python job options (no more jobOptions_xxx.txt)
– People needed for transverse tasks: documentation,
offline/CBNT reconstruction integration, AOD/ESD
definition
Calorimeter reconstruction (P.Loch)
• CALO EDM has navigable classes CaloCell, CaloTower and
CaloCluster using consistent 4-momentum representations
(INavigable4Momentum, v8.1.0); can now be used directly by
JetRec
• Container class CaloCellContainer holds both LAr and TileCal
CaloCells and persistifies them in StoreGate (8.2.0, key “AllCalo”);
used by LArCellRec, LArClusterRec, TileRecAlgs explicitly
• Clusters produced by CaloClusterMaker (Sven Menke, v8.1.0,
topological and sliding window clusters) have full 3D neighbours
option, crossing boundaries between calorimeters (8.2.0)
• Cluster splitter with 3D clusters spanning different calorimeters
under test (aim for 8.3.0) – finds individual showers (peaks) in large
connected cluster
Calorimeter reconstruction (P.Loch)
• New structure for algorithm class CaloTowerAlgorithm – calls different
builders for towers according to calorimeter – makes the older FCAL minicells
obsolete
• CALO algorithm structure slightly behind EDM: new CaloCellMaker (David
Rousseau) to be tested – makes cells and also calls cell-type corrections (>10
corrections for LAr) – aim for 8.3.0, needed in 9.0.0
• No hardwired numbers in code anymore, detector description/job options
(Database? Job options?)
• Implement relations between cells and clusters for 9.0.0 (using STL maps?
see the sketch after this list) – a proposal to have classes such as
"particleWithTrack" to implement relations was rejected
• Asked for volunteers for design, implementation and testing of both the
calorimeter EDM and the reconstruction algorithms
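A purely illustrative sketch of the cell-to-cluster relation idea mentioned above; the real proposal is an STL map in C++, so this is just a Python stand-in for the concept, with invented identifiers and structures.

```python
# Conceptual sketch of relating calorimeter cells to clusters, standing in for
# the proposed C++ STL map. Identifiers and structures are hypothetical.

# Suppose each cluster is a list of cell identifiers.
clusters = {
    0: [101, 102, 103],
    1: [103, 250, 251],   # cell 103 shared after cluster splitting (illustrative)
}

# Build the inverse relation: cell identifier -> list of cluster indices.
cell_to_clusters = {}
for cluster_index, cells in clusters.items():
    for cell_id in cells:
        cell_to_clusters.setdefault(cell_id, []).append(cluster_index)

print(cell_to_clusters[103])   # -> [0, 1]
```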
Tracking (E.Moyse)
Many recent developments:
• new tracking class; converters from old formats
• very good Doxygen documentation available from the ID software
homepage
• A lot of reorganization and new packages recently
• track extrapolation for ID - Dmitri Emeliyakov
• DC2 will be based on iPatRec and xKalman
• there will be manual for EDM and utility packages for v9.0.0
Track: Overview
• New interface for Track (shown on the original slide).
• TrackStateOnSurface – provides a way of iterating through hits and
scatterers on the track. It contains pointers to:
– RIO_OnTrack
– TrackParameter
– FitQualityOnSurface
– ScatteringangleOnSurface
• Summary
– “old” summary object still in Track in 8.2.0. Could not be removed
without changing interface (see later slide)
TrackParticle: Overview
• Why do we need TrackParticle?
• Need lightweight object for analysis, providing
momentum
• Need to transform parameters from detector to
physics frame
• Provides navigation
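As an illustration of "transforming parameters from the detector to the physics frame", here is a small sketch converting perigee-style track parameters into the kinematic quantities an analysis wants; the parameter convention is assumed for the sketch and is not taken from the ATLAS EDM.

```python
# Illustrative conversion from track-fit parameters to physics quantities.
# The (d0, z0, phi0, theta, q_over_p) convention is assumed here.
import math

def track_kinematics(d0, z0, phi0, theta, q_over_p):
    """Return (pt, eta, phi, charge) from perigee-style parameters.
    Momenta in MeV, angles in radians."""
    p      = 1.0 / abs(q_over_p)
    charge = +1 if q_over_p > 0 else -1
    pt     = p * math.sin(theta)
    eta    = -math.log(math.tan(theta / 2.0))
    return pt, eta, phi0, charge

# Example: a ~28 GeV track at theta = 45 degrees has pt ~ 20 GeV.
print(track_kinematics(0.02, 1.5, 0.7, math.pi / 4, 1.0 / 28284.0))
```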
Muon Reconstruction (S.Goldfarb)
• Packages Moore and Muonboy (became MuonBox)
• Moore: moved to GeoModel, reconstructs G4 data, DC2
development
• MuonBoy: G4 reco expected shortly, not using
GeoModel, development for testbeam
• Common features: unit migration now validated
• Efficiency in eta is now perfect (features reported in
SUSY full.sim. paper are gone)
• Combined reconstruction: Staco (now ported to Athena,
being prepared for the new EDM) will accept all types of tracks
• MuID: low-pT muon development using TileCal, etc
• Track task force
Discussions on EDM/ESD/AOD
• Data flow is:
Reconstruction → ESD (100kB) → AOD (10kB) → User code → ntuples
• Meeting at UCL in April: document with conclusions at:
http://www.usatlas.bnl.gov/PAT/ucl_workshop_sumary.pdf
• Discussion:
– Proposal for a class of "IdentifiedParticle" which could be a lepton, tagged jet, etc.
The proposal was rejected: it seemed to need either a very complicated or a
redundant implementation to be sufficiently general
– Discussion on e.g. e/gamma ID:
  – egammaBuilder – high-pT electron ID
  – softeGammaBuilder: better for low-pT/non-isolated electrons, but much overlap
– Both collections must be kept, but a balance must be found in similar matters due
to the AOD size restrictions (aim for 10kB/ev.)
– CaloCells (200,000! Cannot all be kept!): can keep cells above noise plus the sum
of all cells (critical for ETmiss) – see the sketch after this list
– Similar issues for tracking, muons, etc
• Conclusions:
– Keep more than one collection of, e.g., electrons; introduce an "event view" to help
choose between candidates
– CaloCellContainer: a technical solution in sight but worries about cell removal;
needs study
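A hypothetical sketch of the cell-thinning strategy mentioned in the discussion (keep only cells above noise, but also keep the sum over all cells so ETmiss is not biased); the thresholds and data structures are invented for illustration.

```python
# Illustrative sketch of "keep cells above noise plus the overall sum".
# Cell format and noise cut are hypothetical.
import math

def thin_cells(cells, n_sigma=2.0):
    """cells: list of dicts with 'et' (MeV), 'noise' (MeV), 'phi'.
    Returns (selected_cells, summary): most cells are dropped, but the
    summed Ex/Ey of all cells is kept for the ETmiss computation."""
    selected = [c for c in cells if c["et"] > n_sigma * c["noise"]]
    sum_ex = sum(c["et"] * math.cos(c["phi"]) for c in cells)
    sum_ey = sum(c["et"] * math.sin(c["phi"]) for c in cells)
    summary = {"sum_ex": sum_ex, "sum_ey": sum_ey, "n_cells": len(cells)}
    return selected, summary

cells = [{"et": 450.0, "noise": 50.0, "phi": 0.3},
         {"et": 30.0,  "noise": 50.0, "phi": 1.2}]
print(thin_cells(cells))
```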
Reference frames (Richard Hawkins)
Is there a need for more than one frame?
• The global frame is defined in ATL-GE-QA-2041
• Easier to do reconstruction if we have several subdetector-specific
frames?
• Boosted frames? For example, such that the summed beam pT is zero, to correct
for the beam tilt (10^-4 rad, but p_beam = 7 TeV, so a transverse kick of order
700 MeV per beam)
• Q&A
• Markus Elsing – one global frame must be used for reconstruction. Also, the
global frame should be determined by the inner detector frame, if possible;
otherwise there is a problem when using lookup tables for subdetector positions
• Various – the beam tilt should be corrected for if possible/necessary and the size
of the effect estimated; may be done in the Monte Carlo
• Conclusions:
• Strive to use only one global frame
• The beam tilt should be taken care of in the simulation, same as the vertex
smearing
PESA (Simon)
• Several technical matters, software automatic testing;
move from development to production phase (stable,
user-friendly code etc)
• Discussion on forced acceptance: a fraction of the
events must be kept regardless of trigger acceptance ->
studies of noise, background, trigger efficiency etc
• Discussion on how this should be implemented: fine-grained
(according to LVL1 flags), global (a global percentage of
bunch crossings), etc. – see the sketch below
• Rather technical reports on status of e/gamma, LAr and
muon slices
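A hypothetical sketch contrasting the two forced-acceptance implementations under discussion (a global fraction of bunch crossings vs. per-LVL1-flag fractions); the fractions and flag names are invented.

```python
# Illustrative only: two ways of forcing a fraction of events to be accepted
# regardless of the trigger decision. Fractions and flag names are invented.
import random

GLOBAL_FORCED_FRACTION = 0.001                                # 0.1% of bunch crossings
PER_LVL1_FORCED_FRACTION = {"EM25i": 0.01, "MU20": 0.05}      # hypothetical LVL1 flags

def forced_accept_global():
    """Global scheme: accept a fixed fraction of all events."""
    return random.random() < GLOBAL_FORCED_FRACTION

def forced_accept_fine_grained(lvl1_flags):
    """Fine-grained scheme: accept with a flag-dependent probability."""
    return any(random.random() < PER_LVL1_FORCED_FRACTION.get(flag, 0.0)
               for flag in lvl1_flags)

def keep_event(trigger_passed, lvl1_flags):
    return trigger_passed or forced_accept_global() or forced_accept_fine_grained(lvl1_flags)

print(keep_event(False, ["EM25i"]))
```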
Example: e/gamma slice
[Efficiency vs. eta plots for TrigIdScan and SiTrack: release 7.0.2 had an efficiency problem in both (mean efficiencies 94% and 91%), now solved in 8.0.0 (mean efficiencies 95% and 96%). Samples: single electrons with pile-up at 2x10^33 and 10^34.]
egamma Workshop
Several topics have come up which need more discussion:
• Combined testbeam
• How to integrate what we will learn there?
• Reconstruction
• what is done in clustering, what in egamma & electron/photon id?
• Calibration/Corrections
• what is specific to e and gamma and how to?
• G4/G3
• Geometry use
• Physics Studies
• Zee etc
• Validation of Releases
Difficult to find time to discuss all of this in the usual weeks
Date : Mon-Tue, June 28-29 (tbc)
Place: LPNHE Paris (Fred Derue)
Several combined performance studies…
[Plot: electron energy distribution at E = 20 GeV, energy in MeV]
• Discrepancy found between G3 and G4 – see K.Benslama,
Electron Study
• Apparently already explained by differences between G3 and G4
("E-field effect" not simulated in G3) – see G.Unal, LAr calibration
• Below: 50 GeV electrons vs. eta
[Plot: 50 GeV electrons vs. eta – transition at eta = 0.8, barrel/end-cap crack]
JetRec (P.Loch)
• First look at new jets
• Kt for towers seems to be slower than it used to be: found some
long-standing bugs
• Most jets in the forward direction are extremely low in transverse energy
→ apply cuts on the jet finder input (towers, so far); a sketch of such an
input cut follows after this list
• Number of jets in calorimeter and MC truth is very comparable if no
cut on Et of input cells: Et > 100/200/500 MeV (very low Et cuts!!!)
• As soon as cuts are applied, basically all distributions (multiplicity, jet
shapes, radius, etc.) become different from truth (see next slide)
• Verify hadronic calibration
• Invited contributions of new jet algorithms if needed
• Physics groups should use jet finders and give feedback
• Preparing extensive documentation for 8.3.0
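A hypothetical illustration of the input cut mentioned above: filter the calorimeter towers fed to the jet finder by a minimum transverse energy. The tower format and threshold are invented and this is not the JetRec interface.

```python
# Illustrative pre-selection of jet-finder input; not the actual JetRec code.
# Towers are (et_MeV, eta, phi) tuples; the threshold is just an example value.

def select_towers(towers, et_min=500.0):
    """Keep only towers with Et above et_min (MeV) as input to the jet finder."""
    return [t for t in towers if t[0] > et_min]

towers = [(1200.0, 0.4, 1.0), (80.0, 4.2, -2.0), (650.0, 3.8, 0.5)]
print(select_towers(towers))   # drops the 80 MeV forward tower
```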
Number of Kt jets in calorimeter and MC truth:
[Plots: N jets per event and N jets per 0.1 in eta, comparing tower jets with MC truth, for input-tower Et cuts of: no selection, Et ≥ 100, 300, 500, 700 MeV and 1 GeV (one working point marked "ok!").]
Update on vertexing II
• Vertexing tools work quite stably by now
– Billoir FastFit method heavily in use
– SlowFit method: first use case by B-physics
people in Artemis (J. Catmore)
• InDetPriVxFinder package is the standard primary
vertex finder in Athena reconstruction
• Several clients for VxBilloirTools already (B-physics group,
several people from Bonn University, b-tagging, …)
Andreas Wildauer
Results of InDetPriVxFinder (Andreas Wildauer)
[Plots: H→uu and H→bb with vertex constraint, and Geant4 H→bb; generated vertex (0., 0., 0.) ± (0.015, 0.015, 56) mm, 4800 events in total; quoted vertex resolutions of 12, 32, 13 and 36 µm.]
Results on QCD di-jet events [from D. Cavalli]
• Old H1 calibration – Athens results
• New H1 calibration from MissingET: improves the proportionality curve
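For reference, the H1-style calibration referred to here (and in the validation summary earlier) is, schematically, a cell-by-cell reweighting in which the weight depends on the cell energy density; this is only the generic form of the scheme, not the specific ATLAS parametrization:

```latex
% Schematic H1-style weighting: calibrated energy as a weighted sum over cells,
% with weights depending on the cell energy density E_i / V_i.
E_{\mathrm{calib}} \;=\; \sum_{\text{cells } i} w\!\left(\frac{E_i}{V_i}\right) E_i
```

The weights are fitted on a reference sample, which is presumably why they "need to be recalculated for Geant4" as noted in the validation section.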
Combined Test Beam
Combined Test Beam (A.Farilla)
• Immediate plans:
• Finalize and validate first
version of simulation package
• Finalize first version of
reconstruction with ESD and
CBNT output
• Start plugging Combined
Reconstruction algorithms into
RecExTB
• Finalize code access to
Conditions Database
• Develop basic analysis
framework for reconstructed
data
• Looking for new people from the CTB community to add
algorithms from combined reconstruction into RecExTB
• Work in progress for a combined event display in Atlantis
That’s it
Acronyms
• PESA – Physics event selection algorithms
• ESRAT – Event Selection, Reconstruction and Analysis Tools
• EDM – Event Data Model
• ID – Inner Detector
• CBT (CTB) – Combined Beam Test
• STL – Standard Template Library
• SEAL – Shared Environment for Applications at LHC (http://cern.ch/seal/)
• AIDA – Abstract Interface for Data Analysis
• LCG – LHC Computing Grid (http://lcg.web.cern.ch)
• PI – Physicist Interface (extends AIDA, part of LCG)
• POOL – Pool Of persistent Objects for Lhc (part of LCG)