ATLAS software workshop at BNL, 23-28 May 2004
Summary of notes taken during talks and some slides taken from interesting talks – Ricardo, 29 May 2004

Outline
• Overall notes and plans
• Core software, distribution kit, etc.
• Simulation and detector description
• Reconstruction and event selection
• Analysis tools and Event Data Model
• Combined Test Beam

Overall notes and plans
Many different sessions and working group meetings… I will try to summarize, and will focus on some subjects more than others. Also, most of this talk is a transcription of my notes, so not all of it makes sense or is very explicit.

Sessions and working groups covered:
• Software plenary sessions I & II
• Event selection, reconstruction and analysis tools
• Software infrastructure
• Simulation and detector description
• Data challenge
• Framework working group
• Grid and distributed analysis
• Event Data Model (EDM) working group
• SUSY working group
• Reconstruction working group
• Physics coordination
• Physics Event Selection Algorithms (PESA) working group
• Grid working group
• Physics validation
• Detector description working group
• Distributed analysis working group
• Database and calibration/alignment
• Combined performance
• International computing board
• Analysis tools working group
• Database working group
• Software distribution and deployment working group
• Calibration/alignment working group

Software plenary session III
• Major production activities soon: DC2, Combined Test Beam
• Also: HLT testbed and physics studies leading up to the Physics Workshop 2005
• Next software week 20-24 September: after DC2 and at the end of the beam test
• Documentation is only in the minds of a few people – some plans to document parts of the software during the next year
• Several buzzwords in this meeting:
– Geant4
– Pileup
– Digitization, event mixing
– EDM: ESD/AOD (what information should be in the AOD? 100 kB max)
– Use of Python job options (no more athena.exe in > 8.2.0; use jobOptions_XXX.py)
– Grid
– Calibration
– etc.

ATLAS Computing Timeline
2003
• POOL/SEAL release (done)
• ATLAS release 7 (with POOL persistency) (done)
• LCG-1 deployment (done)
2004
• ATLAS completes Geant4 validation (done) <- NOW
• ATLAS release 8 (done)
• DC2 Phase 1: simulation production
2005
• DC2 Phase 2: intensive reconstruction (the real challenge!)
• Combined test beams (barrel wedge)
• Computing Model paper
2006
• Computing Memorandum of Understanding
• ATLAS Computing TDR and LCG TDR
• DC3: produce data for PRR and test LCG-n
2007
• Physics Readiness Report
• Start commissioning run
• GO!

Planning (T. LeCompte)
• For the first time the delay in the schedule is consistent with zero
• Several project reviews expected soon
• Inner Detector and core software seem to need lots of work and people

Core software, distribution, validation, etc.

Core software (C. Leggett)
• Release plans: 8.2.0 – 22 May; 8.3.0 – 9 June; 9.0.0 – 30 June – DC2 production
• Gaudi:
– Current version is v14r5 + ATLAS-specific patches (ATLAS version 0.14.6.1)
– Changes for v14: uses SEAL, POOL, PI (?); AIDA histogram service replaced with ROOT; GaudiPython
– Event merging: can now control exactly which events from two files are merged
– Pileup: see Davide's talk on Tuesday
– Interval of Validity (IoV) service improved
– v8.2.0: a new version of CLHEP causes the wrong version of HepMC to load when using athena.exe (not athena.py) -> segfault
• Athena and Gaudi: heading towards release 9, Gaudi v15 (try to stay in step with LHCb); extend the installArea to Gaudi
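The notes above repeatedly flag the move from athena.exe with text job options to athena.py with Python job options (jobOptions_XXX.py, GaudiPython). As a reminder of what that looks like in practice, here is a minimal sketch of a Python job-options file in the pre-configurables style of that era; the library, algorithm and property names (MyAnalysisTools, MyAnalysisAlg, ContainerKey, EtCutMeV) are invented for illustration and do not refer to a real ATLAS package.

```python
# jobOptions_MyAnalysis.py -- illustrative sketch only.
# "MyAnalysisTools", "MyAnalysisAlg" and the properties below are hypothetical;
# only the general shape (theApp, TopAlg, property assignment) follows the
# old-style Python job options discussed in the talks.

theApp.Dlls   += [ "MyAnalysisTools" ]           # component library to load
theApp.TopAlg += [ "MyAnalysisAlg/MyAnalysis" ]  # schedule the algorithm as "Type/Name"

MyAnalysis = Algorithm( "MyAnalysis" )   # handle used to set properties
MyAnalysis.ContainerKey = "AllCalo"      # e.g. a StoreGate key to read (hypothetical property)
MyAnalysis.EtCutMeV     = 1000.0         # hypothetical selection cut

theApp.EvtMax = 100                      # number of events to process
# run with:  athena.py jobOptions_MyAnalysis.py
```

The same configuration used to be written as .txt job options; the Python form additionally allows normal scripting (loops, conditionals) and interactive access through GaudiPython.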
Distribution kit
• Development platform vs. deployment platform
• Kit philosophy:
– Address multiple clients: running binaries / doing development
– Possibility of downloading partial releases, binaries only (Gregory working on this one), source code…
– Multiple platforms (not yet)
• Until recently only RedHat 7.3 + gcc 3.2 was supported – in future, CERN Enterprise Linux (CEL) 3 + gcc 3.2.3 / icc8 (64-bit compiler) / Windows / MacOSX
• Complication in development platforms from external software – need to clean up packages and think about dependencies
• The runtime environment has some problems – not easy to set up for partial distributions
• A Pacman/kit tutorial is to be organized soon

Validation (D. Costanzo)
• Three activities: reconstruction of DC1 events (old ZEBRA data); validation of generators for DC2; reconstruction of G4 events for DC2
• Plan for next year: generate more events and prepare for the physics workshop in 2005
• v8.2.0 should be the next usable release (Moore/MuID?); exercise reconstruction
• Various details:
– Some degradation of the reconstruction was found with respect to 7.0.2
– Atlfast is needed to validate the generated events
– Several problems found, e.g. xKalman in 8.0.3
– A lot of activity in electron reconstruction
– Jet/ETmiss: JetRec under restructuring; H1-style weights need to be recalculated for Geant4
– DC2 going into production mode

Grid infrastructure (Luc Goossens et al.)
• Production system architecture – executors:
– one for each facility flavour: LCG (lexor), NG (dulcinea), GRID3 (Capone), PBS, LSF, BQS, Condor?, …
– translates the facility-neutral job definition into the facility-specific language: XRSL, JDL, wrapper scripts, …
– implements a facility-neutral interface with the usual methods: submit, getStatus, kill, … (see the sketch below)
• Data management system:
– allows global cataloguing of files – we have opted to interface to existing replica catalogue flavours
– allows global file movement – an ATLAS job can get/put a file anywhere
– presents a uniform interface on top of all the facility-native data management tools
– implementation -> Don Quijote – see separate talk
• Data management system: Don Quijote; supervisor: Windmill; executors: NG (dulcinea), LCG (lexor), GRID3 (Capone), batch
[Diagram: DC2 production system – production database (prodDB), Windmill supervisors communicating with the executors via Jabber, executors for LCG, NorduGrid, Grid3 and LSF, RLS replica catalogues, and Don Quijote for data management.]
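To make the executor idea above concrete, here is a hedged sketch of what a facility-neutral executor interface with submit/getStatus/kill could look like, with an LCG-flavoured subclass translating a neutral job definition into JDL. Everything beyond those three method names (class names, the job-dictionary layout, the JDL rendering, the example transformation) is invented for illustration; this is not the actual Windmill/lexor code.

```python
# Illustrative sketch of the facility-neutral executor interface described above
# (submit / getStatus / kill).  Not the real Windmill/lexor/dulcinea/Capone code.

class Executor:
    """Base class: one subclass per facility flavour (LCG, NG, Grid3, LSF, ...)."""

    def submit(self, job):
        """Translate the neutral job definition and hand it to the facility."""
        raise NotImplementedError

    def getStatus(self, job_id):
        """Return the facility-neutral status of a previously submitted job."""
        raise NotImplementedError

    def kill(self, job_id):
        """Abort a previously submitted job."""
        raise NotImplementedError


class LCGExecutor(Executor):
    """Hypothetical LCG-flavoured executor: neutral job definition -> JDL."""

    def submit(self, job):
        jdl = self._to_jdl(job)
        # ... here the JDL would be handed to the LCG workload management system ...
        return "lcg-0001"                      # placeholder job identifier

    def _to_jdl(self, job):
        # Minimal JDL-style rendering of a neutral job definition (a plain dict here).
        return ('Executable = "%s";\nArguments  = "%s";'
                % (job["transformation"], " ".join(job["arguments"])))


# Example neutral job definition (purely illustrative names):
job = {"transformation": "dc2.simul.trf", "arguments": ["-n", "50"]}
print(LCGExecutor().submit(job))
```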
DC2 running (S. Albrand)
• DC2 will have a much more sophisticated production system than DC1 and uses POOL instead of ZEBRA
• Only one database for all data types; the data type is stored in the database
• Users add properties to their datasets (collections of logical files) from a predefined list of possible properties
• This allows database searches for datasets
• Dataset names are proposed to the user (modifications possible)
• User interface to submit tasks
• Datasets are no longer owned by one physics group alone; notion of a principal physics group
• Possibility to extract parent and child datasets: the parent may be a generator-level dataset and the children may be reconstructed datasets

Simulation and Detector Description

Detector description (J. Boudreau)
• The detector description in "GeoModel" will be used in both simulation and reconstruction
• LAr is the last detector missing in GeoModel
• TileCal has finished its GeoModel description but it will not be available for DC2
• Volumes know their subvolumes: calculate dead material etc. in a transparent way
• A relational database (Oracle) will contain versions of the whole detector: no hardwired detector-description numbers in reconstruction code
• Changes such as alignment will be dealt with through different versions of the detector description in the database
• Great live demo of GeoModel!

Simulation (A. Rimoldi)
• Generators are in an advanced stage in both Atlfast and Atlsim: added "Cascade" and "Jimmy"; some differences between G3 and G4 Herwig events
• G4atlas is being debugged: use versions > 6.0; new Geant4 release 6.1 (25/3/2004)
• Digitization: most infrastructure software is now in place, but there is work to do for each subdetector
• Pileup: under development and test; different subdetectors can be affected by different bunch crossings
• Combined test beam: the general infrastructure is ready

G4ATLAS status (A. Dell'Acqua)
• Concentrating on DC2 production
• G4 v6.1 gave some problems, went back to v6.0
• 6.0 has some physics problems (which don't seem to be serious from his tone), but aiming for robustness at the cost of some accuracy
• MC truth: fully deployed with release 8.0.2, full support in 8.0.3
• Need guinea pigs to test MC truth and for simulation and reconstruction
• G4 truth uses the same HepMC format as the generated event
• Migration to Python job options

Digitization and Pileup (D. Costanzo)
• Sub-detector breakdown: ID getting there, calorimetry very mature since DC1, muons not so much
• Pileup: each detector reads pileup from different bunch crossings (a toy sketch follows the conclusions below)
• The number of files in POOL should be kept low: many files lead to memory leaks!
• Disk usage explodes with the use of pileup; most of this is MC truth for the background events, which does not need to be written and will be eliminated
• Memory leak problems

Detector Simulation Conclusions
• G4atlas (A. Rimoldi):
– Work on digitization and pileup
– Full deployment of MC truth
– Migration to Python job options
– Major changes in AtlasG4Sim to converge with atlas
– Human resources problem (Armin)
• Digitization and Pileup (D. Costanzo):
– Detector folders: each detector reads hits from different bunch crossings
– Emphasis moved to the detector description in GeoModel
– Inner Detector: noise was implemented with no hits from pixel/SCT
– LAr calorimeter: improvements with respect to G3; use of GeoModel; still bugs and calibration issues in G3/G4
– Muon spectrometer: migration to GeoModel done in the last few days; effort put into validation of hit positions; detailed simulation of MDT digitization and response
– Realistic pileup procedure still needs work
– Setup for the combined test beam in place (or almost)
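As an aside on the pileup model sketched in the digitization talks (signal hits overlaid with minimum-bias hits from the bunch crossings each detector is sensitive to), here is a toy illustration of the bookkeeping involved. The sensitivity windows and the data layout are invented numbers; this is not the Athena pileup machinery.

```python
# Toy illustration of the pileup idea from the digitization talks: each detector
# merges the signal-event hits with minimum-bias hits from the bunch crossings
# it is sensitive to.  The windows below are made-up values, not ATLAS numbers.

SENSITIVITY_WINDOW = {        # (earliest, latest) bunch crossing seen, relative to the trigger
    "pixel": (0, 0),
    "lar":   (-5, 3),
    "mdt":   (-10, 10),
}

def merge_hits(detector, signal_hits, minbias_hits_by_bc):
    """signal_hits: list of hits; minbias_hits_by_bc: {bc_offset: list of hits}."""
    lo, hi = SENSITIVITY_WINDOW[detector]
    merged = list(signal_hits)
    for bc_offset, hits in minbias_hits_by_bc.items():
        if lo <= bc_offset <= hi:    # only the crossings this detector integrates over
            merged.extend(hits)
    return merged
```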
Analysis Tools

Analysis tools (K. Assamagan)
• RTF recommendations: looked at modularity, granularity and design of the reconstruction software
• AnalysisTools: span the gap between reconstruction and ntuple analysis
• Tools:
– Artemis analysis framework prototype – seems to diverge from the EDM, may not be supported for long
– PID: prototype to handle particle identification
• Workshop in April at UCL: http://www.usatlas.bnl.gov/PAT/ucl_workshop_sumary.pdf
• PyRoot, PyLCGDict
• Physicist Interface (PI) project: extends AIDA; provides services for batch analysis in C++, fitting and minimization, storage in HBOOK/ROOT/XML, plotting in ROOT/Hippodraw/OpenScientist, etc.
[Diagram: data flow between data objects and algorithms]

Python scripting language (W. Lavrijsen)
• GaudiPython: provides bindings to the Athena core objects; the basis for job options after release 8.2.0
• PyROOT: bridge between Python and ROOT; distributed with ROOT 4; http://root.cern.ch/root/HowtoPyROOT.html
• PyBus: a software bus (modules can be "plugged in" to the bus) implemented in Python; http://cern.ch/wlav/pybus
• ASK (Athena Startup Kit): DC2-inspired tutorial online; full chain of algorithms (generators => ... => simulation => ... => analysis); http://cern.ch/wlav/athena/athask/tutorials.html

Analysis with PyROOT (S. Snyder)
• Why? Because the ROOT C++ interpreter is not reliable enough; Python is much better and at least as good for this purpose
• No speed problems found
• PyROOT is now part of the ROOT release
• Can use PyROOT to interface your own code (or any code that CINT could handle)
• PyLCGDict does the same thing with a different data dictionary (data definition). When to use which?
– If you have external code that already has a ROOT/LCG dictionary, that helps to decide
– PyROOT has fewer dependencies
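To illustrate the PyROOT workflow advocated in Snyder's talk, here is a minimal example of reading a combined-ntuple-style TTree from Python. The file, tree and branch names (ntuple.root, CollectionTree, ElectronPt) are placeholders, not taken from an actual ATLAS ntuple.

```python
# Minimal PyROOT sketch; file/tree/branch names are placeholders.
import ROOT

f    = ROOT.TFile.Open("ntuple.root")
tree = f.Get("CollectionTree")

h = ROOT.TH1F("h_pt", "Electron pT;pT [GeV];entries", 50, 0.0, 100.0)
for i in range(tree.GetEntries()):
    tree.GetEntry(i)
    h.Fill(tree.ElectronPt / 1000.0)   # assuming the branch is stored in MeV

c = ROOT.TCanvas()
h.Draw()
c.Print("electron_pt.png")             # batch-friendly output
```

The same session also works interactively, which is the main selling point over CINT: standard Python introspection, modules and error handling are available alongside the full ROOT object model.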
Reconstruction, Trigger and Event Data Model

Reconstruction (D. Rousseau)
• RTF recommendation: detectors (e.g. TileCal and LAr) should share code to facilitate downstream algorithms
• Combined test beam: analyse CTB data with the offline code, with as little dedicated code as possible; big effort e.g. to integrate the conditions database for the various detectors
• DC2 reconstruction (release 9.x.x): run on Geant4 data with the new detector description; validate G4
• Persistence:
– ESD (Event Summary Data), EDM output: issues with ~200k calorimeter cells; target size 100 kB/event
– Ongoing discussions on the AOD (Analysis Object Data) definition: aim for 10 kB/event

Reconstruction (D. Rousseau), continued
• Work model changes from: Reconstruction -> combined ntuple -> ROOT/PAW analysis
  to: Reconstruction -> ESD/AOD -> analysis in Athena -> small ntuple -> ROOT
• CBNT remains as a debugging tool but will not be produced on a large scale for DC2
• Status:
– Python job options (no more jobOptions_xxx.txt)
– People needed for transverse tasks: documentation, offline/CBNT reconstruction integration, AOD/ESD definition

Calorimeter reconstruction (P. Loch)
• The CALO EDM has navigable classes CaloCell, CaloTower and CaloCluster using a consistent 4-momentum representation (INavigable4Momentum, v8.1.0); they can now be used directly by JetRec
• The container class CaloCellContainer holds both LAr and TileCal CaloCells and is recorded in StoreGate (8.2.0, key "AllCalo"); used explicitly by LArCellRec, LArClusterRec, TileRecAlgs
• Clusters produced by CaloClusterMaker (Sven Menke, v8.1.0; topological and sliding-window clusters) have a full 3D-neighbours option, crossing the boundaries between calorimeters (8.2.0)
• A cluster splitter with 3D clusters spanning different calorimeters is under test (aim for 8.3.0) – finds individual showers (peaks) in a large connected cluster

Calorimeter reconstruction (P. Loch), continued
• New structure for the algorithm class CaloTowerAlgorithm – calls different tower builders according to the calorimeter; makes the older FCAL minicells obsolete
• The CALO algorithm structure is slightly behind the EDM: a new CaloCellMaker (David Rousseau) is to be tested – it makes cells and also calls the cell-type corrections (>10 corrections for LAr); aim for 8.3.0, needed in 9.0.0
• No hardwired numbers in the code anymore; they move to the detector description / job options (database? job options?)
• Implement relations between cells and clusters for 9.0.0 (using STL maps?)
– A proposal to have classes such as "particleWithTrack" to implement relations was rejected
• Asked for volunteers for design, implementation and testing of both the calorimeter EDM and the reconstruction algorithms

Tracking (E. Moyse)
Many recent developments:
• new tracking class; converters from the old formats
• very good Doxygen documentation available from the ID software homepage
• a lot of reorganization and new packages recently
• track extrapolation for the ID – Dmitri Emeliyakov
• DC2 will be based on iPatRec and xKalman
• there will be a manual for the EDM and utility packages for v9.0.0

Track: Overview
• New interface for Track (shown in the talk)
• TrackStateOnSurface provides a way of iterating through the hits and scatterers on the track; it contains pointers to:
– RIO_OnTrack
– TrackParameter
– FitQualityOnSurface
– ScatteringAngleOnSurface
• Summary: the "old" summary object is still in Track in 8.2.0; it could not be removed without changing the interface (see later slide)

TrackParticle: Overview
• Why do we need TrackParticle?
• Need a lightweight object for analysis, providing momentum
• Need to transform parameters from the detector frame to the physics frame
• Provides navigation

Muon Reconstruction (S. Goldfarb)
• Packages: Moore and Muonboy (formerly MuonBox)
• Moore: moved to GeoModel, reconstructs G4 data, DC2 development
• Muonboy: G4 reconstruction expected shortly, not using GeoModel, development for the test beam
• Common features: unit migration now validated
• Efficiency in eta is now perfect (the features reported in the SUSY full-simulation paper are gone)
• Combined reconstruction: Staco (now ported to Athena; will accept all types of tracks; being prepared for the new EDM)
• MuID: low-pT muon development using TileCal, etc.
• Track task force

Discussions on EDM/ESD/AOD
• Data flow is: Reconstruction -> ESD (100 kB) -> AOD (10 kB) -> user code -> ntuples
• Meeting at UCL in April: document with conclusions at http://www.usatlas.bnl.gov/PAT/ucl_workshop_sumary.pdf
• Discussion: a proposal for a class "IdentifiedParticle", which could be a lepton, tagged jet etc., was rejected; it seemed to need either a very complicated or a redundant implementation to be sufficiently general
• Discussion on e.g. e/gamma identification:
– egammaBuilder: high-pT electron ID
– softeGammaBuilder: better for low-pT / non-isolated electrons, but much overlap
– Both collections must be kept, but a balance must be found in similar matters due to the AOD size restrictions (aim for 10 kB/event)
• CaloCells (200,000! cannot all be kept!): can keep the cells above noise plus the sum of all cells (critical for ETmiss)
• Similar issues for tracking, muons, etc.
• Conclusions:
– Keep more than one collection of, e.g., electrons; introduce an "event view" to help choose between candidates
– CaloCellContainer: a technical solution is in sight, but there are worries about cell removal; needs study

Reference frames (Richard Hawkins)
Need for more than one frame?
• The global frame is defined in ATL-GE-QA-2041
• Easier to do reconstruction if we have several subdetector-specific frames?
• Boosted frames? For example, such that the sum of pT_beam = 0, to correct for the beam tilt (10^-4 rad, but p_beam = 7 TeV; a rough size estimate follows below)
• Q&A:
– Markus Elsing: one global frame must be used for reconstruction; also, the global frame should be determined by the inner detector frame, if possible – otherwise there is a problem when using lookup tables for the subdetector positions
– Various: the beam tilt should be corrected for if possible/necessary, and the size of the effect estimated; this may be done in the Monte Carlo
• Conclusions:
– Strive to use only one global frame
– The beam tilt should be taken care of in the simulation, in the same way as the vertex smearing
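For scale, the transverse kick implied by the numbers quoted in the reference-frames discussion is easy to estimate (a back-of-the-envelope figure using only the tilt angle and beam momentum given above):

\[
p_T^{\mathrm{tilt}} \;\approx\; \theta_{\mathrm{tilt}} \, p_{\mathrm{beam}} \;=\; 10^{-4}\,\mathrm{rad} \times 7\,\mathrm{TeV} \;\approx\; 0.7\,\mathrm{GeV}
\]

i.e. a transverse momentum offset of order 700 MeV per beam, which is not negligible for precision quantities even though the angle itself is tiny – hence the conclusion to handle the tilt in the simulation.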
PESA (Simon)
• Several technical matters, automatic software testing; move from the development to the production phase (stable, user-friendly code etc.)
• Discussion on forced acceptance: a fraction of the events must be kept regardless of the trigger acceptance -> studies of noise, background, trigger efficiency etc.
• Discussion on how this should be implemented: fine-grained (according to LVL1 flags), global (a global percentage of bunch crossings), etc.
• Rather technical reports on the status of the e/gamma, LAr and muon slices

Example: e/gamma slice
• 7.0.2 had an efficiency problem in both TrigIdScan and SiTrack, now solved
[Plots: tracking efficiency vs. eta for TrigIdScan and SiTrack, single electrons + pile-up at 2x10^33 and 10^34; mean efficiencies of 94% and 91% in 7.0.2 improve to 95% and 96% in 8.0.0.]

egamma Workshop
Several topics have come up which need more discussion:
• Combined test beam: how to integrate what we will learn there?
• Reconstruction: what is done in clustering, what in egamma and electron/photon identification?
• Calibration/corrections: what is specific to e and gamma, and how to do it?
• G4/G3
• Geometry use
• Physics studies: Zee etc.
• Validation of releases
It is difficult to find time to discuss all of this during the usual software weeks.
Date: Mon-Tue, June 28-29 (tbc); place: LPNHE Paris (Fred Derue)

Several combined performance studies…
[Plot: reconstructed energy (MeV) for E = 20 GeV electrons]
• Discrepancy found between G3 and G4 – see K. Benslama, electron study
• Apparently already explained by a difference between G3 and G4 (the "E-field effect" is not simulated in G3) – see G. Unal, LAr calibration
• Below: 50 GeV electrons vs. eta
[Plot: energy vs. eta for 50 GeV electrons; transition at eta = 0.8; crack between barrel and end-cap visible]

JetRec (P. Loch)
• First look at the new jets
• Kt for towers seems to be slower than it used to be: found some long-standing bugs
• Most jets in the forward direction are extremely low in transverse energy -> apply cuts on the jet-finder input (towers, so far); see the sketch below
• The number of jets in the calorimeter and in MC truth is very comparable if no cut is made on the Et of the input cells; Et > 100/200/500 MeV are very low Et cuts!
• As soon as cuts are applied, basically all distributions (multiplicity, jet shapes, radius, etc.) become different from truth (see the figure summary below)
• Verify the hadronic calibration
• Contributions of new jet algorithms invited, if needed
• Physics groups should use the jet finders and give feedback
• Preparing extensive documentation for 8.3.0
[Plots: number of Kt jets per event and jet multiplicity vs. eta, for calorimeter tower jets and MC truth, with input-tower cuts of none / 100 / 300 / 500 / 700 MeV / 1 GeV; agreement with truth is good ("ok!") with no selection.]
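As a trivial illustration of the input cut mentioned in the JetRec slide (suppressing very low-Et towers before the jet finder runs), here is a one-function sketch; the tower representation and the default threshold are invented, not the JetRec configuration.

```python
# Illustrative pre-selection of jet-finder input towers by transverse energy.
# Towers are represented here as (et_mev, eta, phi) tuples; both the layout and
# the 500 MeV default are just examples.

def select_towers(towers, et_min_mev=500.0):
    """Keep only the towers above the Et threshold before running the jet finder."""
    return [t for t in towers if t[0] > et_min_mev]
```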
Update on vertexing II
• Vertexing tools work quite stably by now:
– the Billoir FastFit method is heavily in use
– SlowFit method: first use case by the B-physics people in Artemis (J. Catmore)
• The InDetPriVxFinder package is the standard primary vertex finder in the Athena reconstruction
• Several clients for VxBilloirTools already (B-physics group, several people from Bonn University, b-tagging, …)
(Andreas Wildauer)

Results of InDetPriVxFinder (Andreas Wildauer)
• H->uu and H->bb with vertex constraint, Geant4; beam spot (0., 0., 0.) ± (0.015, 0.015, 56) mm; 4800 events in total
[Plots: primary-vertex resolutions; values of 12, 32, 13 and 36 µm appear on the plots.]

Results on QCD di-jet events [from D. Cavalli]
• The new H1 calibration derived from MissingET improves the proportionality curve with respect to the old H1 calibration (Athens results)
[Plots: old vs. new H1 calibration]

Combined Test Beam

Combined Test Beam (A. Farilla)
Immediate plans:
• Finalize and validate the first version of the simulation package
• Finalize the first version of the reconstruction with ESD and CBNT output
• Start plugging Combined Reconstruction algorithms into RecExTB
• Finalize code access to the Conditions Database
• Develop a basic analysis framework for the reconstructed data
• Looking for new people from the CTB community to add algorithms from combined reconstruction into RecExTB
• Work in progress for a combined event display in Atlantis

That's it

Acronyms
• PESA – Physics Event Selection Algorithms
• ESRAT – Event Selection, Reconstruction and Analysis Tools
• EDM – Event Data Model
• ID – Inner Detector
• CBT (CTB) – Combined Beam Test
• STL – Standard Template Library
• SEAL – Shared Environment for Applications at LHC (http://cern.ch/seal/)
• AIDA – Abstract Interface for Data Analysis
• LCG – LHC Computing Grid (http://lcg.web.cern.ch)
• PI – Physicist Interface (extends AIDA, part of LCG)
• POOL – Pool Of persistent Objects for LHC (part of LCG)