ATLAS Data Challenges
The Physics point of view
UCL, September 5th 2001
Fabiola Gianotti (CERN)
 Three data challenges are foreseen:
-- DC0 : end 2001
-- DC1 : first half 2002
-- DC2 : first half 2003 → Computing TDR
 Goals : validate our computing model and our software
Important physics content : provide data samples for physics
studies and hopefully many physics results
 How ?
Start with data which look like real data
→ need MC generators, G3/G4 simulation, event model, detailed
detector response (e.g. noise, cross-talk, etc.), pile-up
Run the filtering/trigger and reconstruction chain
Store the output data into the database
Run the analysis
Produce physics results
DC0: November - December 2001
 In principle should be a test of the WHOLE software chain : a kind
of “rehearsal” for DC1 (check that everything works for DC1)
 Issue is therefore not massive production of huge data samples
but few 100k events able to test the whole software chain
 Chosen physics sample : few 100 k Z+jet events, with Z  .
-- allows tests of ALL sub-detectors (including b-tagging
since 6% of jets are b-jets)
-- idea is to produce several samples with the 3
general-purpose generators (PYTHIA, ISAJET, HERWIG)
 If you want to participate in DC1, you are (strongly) encouraged
to participate in DC0 as well.
DC1: February - July 2002
Scope : stress-test the system with large-scale production,
reconstruction and analysis
Several samples of up to 10^7 events (~ 10% of the data collected
at LHC in one year).
Crucial issues :
-- simulation will be done mainly with G3 but it is important
to perform smaller-scale production with G4
-- comparison G3/G4 (with same geometry, to be meaningful …)
-- learn about event model and detector description
-- I/O performances : N events with different technologies
-- pile-up treatment
-- understand bottlenecks
-- understand distributed computing model / GRID (not discussed
DC1: Physics samples
10^7 jets for e/jet separation studies in view of
Trigger/DAQ TDR (due end of 2002).
~ 10 times more statistics than “old jet production”.
Study performance of ATHENA and HLT algorithms.
Useful also for other physics studies (e.g. optimisation of jet energy
reconstruction algorithm)
Any other CPU-consuming physics sample considered useful for
physics studies. Mainly SM “background processes” : examples:
-- inclusive muon sample (for B-physics and muon performance
studies), Zbb and Wbb samples (backgrounds to many searches),
WW/ZZ samples
-- Z → ττ for tau-lifetime studies
-- several samples with different generators to understand the
physics of various MC
Physics groups and Combined Performance groups asked to prepare a
list of wishes → first discussions at Physics Coordination in Lund and
at October ATLAS week. Everybody is encouraged to make
DC2 : January - September 2003
 Scope/precise goals: depend on the outcomes of DC0/1
 Present goals:
-- 10^8 events (~ data collected in 1 LHC year)
-- Geant4 should play a major role
-- full test of calibration/alignment procedures and condition database
-- question : do we want to add part or all of DAQ, LVL1, LVL2, Event
Filter ?
 Physics content:
-- demonstrate capability of extracting and interpreting a signal
from New Physics
-- generate various SM samples and “hide” in each one a different
New Physics process (e.g. SUSY for one mSUGRA point, excited
leptons, etc.).
-- people will be asked to understand the nature and all possible
features of the signal (without knowing a priori what it is)
DC production : CPU and data size
         Number of events    Total size
DC0      ~ 10^5              ~ 0.2 TB
DC1      ~ 10^7              ~ 20 TB
DC2      ~ 10^8              ~ 200 TB
(Time column, in hours: values not recoverable)
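As a quick cross-check of these estimates, the total sizes can be divided by the event counts quoted for each challenge elsewhere in these slides (a few 100k for DC0, 10^7 for DC1, 10^8 for DC2). This is only a sketch: the event counts are orders of magnitude, 1 TB is assumed to be 10^12 bytes, and the names below are illustrative.

```python
# Rough per-event size implied by the DC production estimates.
# Assumptions: event counts are order-of-magnitude figures from the
# DC0/DC1/DC2 slides; 1 TB = 1e12 bytes, 1 MB = 1e6 bytes.
DC_ESTIMATES = {
    "DC0": {"events": 1e5, "size_tb": 0.2},
    "DC1": {"events": 1e7, "size_tb": 20.0},
    "DC2": {"events": 1e8, "size_tb": 200.0},
}


def mb_per_event(events, size_tb):
    """Average event size in MB for a sample of `events` events."""
    return size_tb * 1e12 / events / 1e6


for name, est in DC_ESTIMATES.items():
    print(f"{name}: ~{mb_per_event(**est):.1f} MB/event")
```

Under these assumptions all three challenges come out at the same average event size of about 2 MB, so the total sizes scale simply with the number of events.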
“Physics readiness document”
(kind of Physics TDR prime … ) :
LHC t0 - 1 year
Content (examples):
 Work done with MC generators, the ATLAS MC library,
status/strategy for MC production
 Strategy for using different levels of simulation (full,
parametrisations, fast) for different processes
 Comparisons G4/test-beam data, FLUKA/test-beam data →
systematics from full simulation
 Main figures of Physics TDR redone with new/final software
 Specialised packages needed for various physics studies (e.g.
MSSM scan packages for Higgs and SUSY with up-to-date
theoretical calculations, etc.)
 etc.
Status of the non-core software
(my view, emphasis on “physics
 Main generators (PYTHIA, ISAJET, HERWIG) interfaced to HepMC
(HERWIG being finalised …).
Next : specialised generators (e.g. VECBOS, QQ)
 Simulation :
-- G4 : physics validation not completed (lot of work done with EM
physics, hadronic physics being tested now); full ATLAS geometry
not yet in.
-- DC0, DC1: use G3 plus smaller/restricted (e.g. to some detector parts)
productions with G4
-- FLUKA : I am 100% sure we need it. I initiated a pilot project
with Tilecal : G4 test-beam geometry input to FLUKA (first results
in Lund). Then extend to other sub-detectors
 Intermediate simulation (e.g. shower/track parametrisation): I am 100%
sure we need it. Tried to find people over the last two years → failed.
Recently a couple of groups have shown some interest.
 ATLFAST OO (UK product):
-- runs in ATHENA
-- reads HepMC from Objectivity, writes output into Objectivity
(and ntuples)
-- first validation made. Further results in Lund (from “users-nondevelopers”)
-- next steps: improve functionality (beyond ATLFAST fortran).
E.g. : shower shapes ? Trigger simulation ? Parametrisation for
B-physics ?
 C++/OO reconstruction:
-- runs in ATHENA
-- reads G3 hits/digis (Physics TDR data)
-- validation results in Lund
 Less clear situation (to me ...) for : e.g.
-- event data model
-- detector description
-- database, condition database, technology choice
-- simulation framework vs ATHENA
-- analysis tools (maybe premature today but one of the aims of DC’s
should be validation of analysis tools)
Where could you contribute ?
Lot of work to be done everywhere, of course …
 Improve understanding of ATLAS potential for physics
(e.g. SUSY, Extra-dimensions, backgrounds) and detector performance
(e.g. can we tag charm-jets ?) by analysing data produced by DC’s.
 Improve reconstruction, algorithms, etc. (e.g. HLT, E-flow algorithm
for jet reconstruction using ID+CALOs)
 Validation of MC generators: e.g. Which MC for which process ? For which
processes do we need more calculations and/or additional/specialised MC ?
 Validation of G4/FLUKA physics : comparisons with test-beam data (in
particular nuclear interactions)
 Validation of ATLFAST OO and new reconstruction against old/fortran
 Intermediate simulation (shower and track parametrisations)
 Detector response : hits → digi (including noise, pile-up with correct time
structure, efficiency, etc.)
HLT-DC1 scenario
 This has to be discussed with the HLT community but
the basis could be similar to what has been done
 Generation:
Pt hard scattering > 17 GeV
|η| < 2.7
2 samples
1) e-candidate
 Σ Et > 17 GeV, no μ, no ν
• Grid 0.12 x 0.12
2) Jet-candidate
 Σ Et > 40 GeV
• Grid 1.0 x 1.0
A first selection is made at the level of the event generation
One keeps 14.5% of generated events
• 14.4% for (1) and 2% for (2)
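The generator-level selection described on this slide can be sketched as follows. This is an illustration under stated assumptions, not the actual HLT filter code: the particle tuple format, the function names and the use of a fixed (non-sliding) eta-phi grid are all hypothetical, and the slide does not say whether muons and neutrinos are also excluded for the jet candidate, so they are kept there. The PDG codes used are the standard ones (13 for muons; 12, 14, 16 for neutrinos).

```python
import math
from collections import defaultdict

# Standard PDG codes for the particles excluded from the e-candidate sum.
MUON_IDS = {13, -13}
NEUTRINO_IDS = {12, -12, 14, -14, 16, -16}


def passes_grid_filter(particles, cell, et_min, eta_max=2.7, exclude=frozenset()):
    """Return True if any cell of a fixed eta-phi grid has summed Et > et_min.

    particles: iterable of (eta, phi, et_gev, pdg_id) tuples (hypothetical format).
    cell:      cell width in both eta and phi.
    """
    sums = defaultdict(float)
    for eta, phi, et, pdg_id in particles:
        if abs(eta) >= eta_max or pdg_id in exclude:
            continue  # outside acceptance or explicitly excluded
        # Fixed binning; phi is wrapped into [0, 2*pi) first.
        key = (math.floor(eta / cell), math.floor((phi % (2 * math.pi)) / cell))
        sums[key] += et
    return any(total > et_min for total in sums.values())


def is_e_candidate(particles):
    # Sum Et > 17 GeV in a 0.12 x 0.12 cell, ignoring muons and neutrinos.
    return passes_grid_filter(particles, cell=0.12, et_min=17.0,
                              exclude=MUON_IDS | NEUTRINO_IDS)


def is_jet_candidate(particles):
    # Sum Et > 40 GeV in a 1.0 x 1.0 cell (no exclusions assumed).
    return passes_grid_filter(particles, cell=1.0, et_min=40.0)
```

For example, `is_e_candidate([(0.5, 1.0, 20.0, 11)])` is true for a single 20 GeV electron, while the same event fails `is_jet_candidate`. A real filter would more likely slide the window over the grid so that energy shared between neighbouring cells is not missed; fixed binning is used here only to keep the sketch short.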
HLT-DC1 scenario
The remaining events are run through the full
The Lvl1 trigger is applied at that level
One keeps 13.7% of the events
• 97% for (1) and 10% for (2)
The pile-up is run for the remaining events, i.e.
~2% of the ‘generated’ sample
Then the events are run through Lvl2, Event Filter
and offline reconstruction
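The ~2% figure follows from multiplying the two overall retention fractions quoted in this scenario (14.5% after the generator-level selection, 13.7% after Lvl1). A minimal check, with illustrative names:

```python
# Overall retention fractions quoted in the HLT-DC1 scenario.
GEN_FILTER_EFF = 0.145  # kept after the generator-level selection
LVL1_EFF = 0.137        # kept after the Lvl1 trigger


def surviving_fraction(*efficiencies):
    """Fraction of generated events left after successive selections."""
    fraction = 1.0
    for eff in efficiencies:
        fraction *= eff
    return fraction


pileup_fraction = surviving_fraction(GEN_FILTER_EFF, LVL1_EFF)
print(f"Events needing pile-up: {pileup_fraction:.1%} of the generated sample")
```

The product is 0.145 x 0.137 = 0.0199, i.e. about one generated event in fifty reaches the pile-up stage.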
What next
Prepare a first list of goals & requirements
HLT, Physics community
simulation, reconstruction, database communities
people working on ‘infrastructure’ activities (bookkeeping)
to be discussed
with A-team
with CSG (July 24th meeting)
In order to
prepare a list of tasks
• Some physics-oriented
• But also tasks like testing code, running production, …
define the priorities
Start the validation of the various components in
the chain (setting deadlines for readiness)
Simulation, pile-up, …
Database, bookkeeping, …
Estimate what it will be realistic (!) to do
For DC0
For DC1
 “And turn the key”
The ATLAS Data Challenges: project structure and organisation
[Organisation chart; recoverable boxes: ATLAS Data Challenges, DC Overview Board, Work Plan Definition]
Expressions of interest
So far, after the NCB meeting of July 10th:
Canada, France, Italy, Japan, Nordic Grid, Russia,
Taiwan, UK
Proposals to help in DC0
Proposals to participate in DC1
Contact with HLT community
Contact with EU-Data-GRID
Kit of ATLAS software