Lecture 8


LHC computing

HEP 101 Lecture #8, Ayana Arce

Outline

• Major computing systems for LHC experiments:

– (ATLAS) Data Reduction

– (ATLAS) Data Production

– (ATLAS) Data Analysis

• End-user tools:

– Exercise: plotting and fitting data with ROOT

– Homework: writing a toy Monte Carlo

managing the data volume

DATA REDUCTION

Overview: the data reduction chain

[Diagram: raw detector data → Hardware Trigger (prefilter) → Event Filter (software event selection) → data reconstruction and distribution]

The TDAQ system

• Trigger:

– (almost) real-time filtering of collision events

– Events read every ~25 ns:

• how long does the trigger take to decide?

• DAQ:

– Sends event data through the trigger and readout systems

– Merges trigger and detector conditions data with event data

ATLAS trigger system

• L1: hardware-based, 256 trigger items; reads local data (event fragments); selects 1/10,000 in 2.5 µs

• L2: reads local data (event fragments); selects 1/15 in 40 ms

• L3: reads global detector data (full ATLAS events); selects 1/15 in 4 seconds

• Storage: similar triggers are grouped into data streams for analysis; trigger data is kept to account for selection bias

• Hardware: ~1700 nodes (8/12 cores, 16/24 GB), some dedicated to L3 and some flexible L2/L3 processors, connected by ~10 Gb links
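To get a feel for the throughput these factors imply, here is a rough back-of-the-envelope sketch in Python, using the reduction factors quoted above and a ~25 ns bunch spacing (rounded, illustrative numbers, not official ATLAS rates):

# Rough event rates through the trigger chain, using the reduction
# factors quoted on this slide (illustrative, not official numbers).
bunch_spacing = 25e-9                      # ~25 ns between collisions
rate = 1.0 / bunch_spacing                 # ~40 MHz of events into L1
for level, keep in [("L1", 1.0 / 10000), ("L2", 1.0 / 15), ("L3", 1.0 / 15)]:
    rate *= keep
    print("after %s: about %.0f Hz" % (level, rate))
# roughly 4 kHz after L1, ~270 Hz after L2, ~18 Hz written to storage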

Example: electron trigger

• L1: are any EM calorimeter regions hot?

• L2: clustering, cluster selection, tracking, cluster/track matching (is there a cluster of hot cells with straight tracks nearby?)

• L3: clustering, tracking, electron selection (is it an electron?)

managing the data volume

DATA PRODUCTION

Global data processing and storage

• LHC data output estimate:

15 PB/year

(and we prefer multiple copies)

– Stored and processed on WLCG: shared by all CERN experiments

– Your “local” Tier-1: BNL

– Your local Tier-3: in your backpack!

• Every stored physics event is modeled by many simulated events

– thus most resources are spent on Monte Carlo simulation

• Note: the ATLAS computing systems alone must handle MILLIONS of production/analysis jobs daily

ATLAS Tier computing: roles

• Tier 0: stores RAW data, calibrates, reconstructs (6k cores)

• Tier 1 centers: back up RAW data, reprocess (re-reconstruct)

• Tier 2 centers (38 in total): analyze data, create MC; this is where physicists connect

• ~120k cores in total, running in the cernVM environment

Production: data

bytestream (from the trigger) → convert → RDO (raw) → RECO (pattern recognition) → ESD, AOD, TAG → MERGE & derive (sorting) → AOD, D3PD

Production: Monte Carlo

Monte Carlo production chain: simulated events join the same chain at the RDO (raw) stage: RDO → RECO → ESD, AOD, TAG → MERGE & derive → AOD, D3PD

What is Monte Carlo, really?

Monte Carlo calculation of π:

• HEP predictions require a lot of convolution integrals

– one reason: QM!

– pick random x, random y; if y² < 1 - x², increment the area count (see the sketch below)

• The Monte Carlo Method:

– use random numbers as an integration tool
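A minimal, runnable version of that recipe (assuming PyROOT is available; the plain Python random module would work just as well):

# Monte Carlo estimate of pi: throw random points in the unit square and
# count how many fall under the quarter circle.
import ROOT

r = ROOT.TRandom3(42)              # random number generator, fixed seed
nTotal, nInside = 1000000, 0
for _ in range(nTotal):
    x = r.Rndm()                   # uniform random number in (0,1)
    y = r.Rndm()
    if y * y < 1.0 - x * x:        # the point falls under the quarter circle
        nInside += 1
print("pi is approximately %.4f" % (4.0 * nInside / nTotal))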

• Very intuitive picture of convolution integrals:

– a series of choices from probability distributions: the Z picks a mass and decay angles → electron E_T → calorimeter (mis)measurement → observed electron E_T (sketched below)
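A toy sketch of such a chain of choices; the Z mass and width, the flat decay angle, and the calorimeter resolution used here are illustrative assumptions, not ATLAS values:

# Toy "chain of choices": the Z picks a mass and a decay angle, giving an
# electron E_T, which a toy calorimeter then (mis)measures.
import math
import ROOT

r = ROOT.TRandom3(1)
histo = ROOT.TH1F("obsEt", "observed electron E_{T}; E_{T} [GeV]; events", 50, 0, 100)

for _ in range(100000):
    mZ = r.BreitWigner(91.2, 2.5)                   # the Z picks a mass (toy values, GeV)
    if mZ <= 0:
        continue                                    # guard against the far tail
    cosTh = r.Uniform(-1.0, 1.0)                    # the Z picks a decay angle (toy: flat)
    eT = 0.5 * mZ * math.sqrt(1.0 - cosTh * cosTh)  # electron E_T for a Z decaying at rest
    obsEt = r.Gaus(eT, 0.10 * math.sqrt(eT))        # toy calorimeter (mis)measurement
    histo.Fill(obsEt)

histo.Draw()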

Meet your (3-part) Monte Carlo

(slides: Sjöstrand)

Meet your MC: PYTHIA, HERWIG, MadGraph, MCFM, MC@NLO, BaurMC, POWHEG, &c.

Meet your MC: PYTHIA, HERWIG/JIMMY, Sherpa…

What’s the third part?

• Detector simulation: up to 5 minutes per high-mass event (lots of particles, each individually tracked through hundreds of detector elements). Why is this essential?

measurements and discoveries!

DATA ANALYSIS

ATLAS computing for users

Programming languages

• Main programming languages:

– FORTRAN (some generators)

– C++ (main reconstruction algorithms, analysis)

– python (steering, analysis)

Interactive interfaces

• Main interface: athena

– reads all data formats

– C++ ; steered by python

– this runs all simulation and reconstruction

– can run your analysis too…but the executable is typically 4 GB

• Light interface: ROOT

Data representation

• always organized by event

• global quantities:

– metadata

– missing energy…

• physics object lists:

– muons

– jets

– tracks

– “truth” particles …

• object properties:

– hits on tracks

– jet constituents

[Diagram: an event contains physics objects (a muon, tracks, …); each track contains hits. This nested structure is stored as an "n-tuple", i.e. a "tree".]

[Example tree contents: per-event scalar branches (Event number, nTracks) and per-track array branches (track pT, track eta, track phi, track layers, …) holding one value per track; the slide shows a few rows of illustrative numbers.]
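For example, a minimal PyROOT loop over such a tree might look like this; the file name, tree name, and branch names are hypothetical placeholders, not the real ATLAS ones:

# Minimal sketch of reading an n-tuple/tree event by event
# (hypothetical file and branch names).
import ROOT

f = ROOT.TFile.Open("example.root")       # hypothetical input file
tree = f.Get("events")                    # hypothetical tree name

for event in tree:                        # PyROOT iterates over entries (events)
    print("event with %d tracks" % event.nTracks)   # scalar branch
    for pt in event.track_pT:                        # per-track array branch
        print("  track pT: %.1f" % pt)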

User’s interface to nature: histograms

"Hello World" for HEP computing: making a histogram

TH1F("name", "title; x title; y title", nBins, firstBinValue, lastBinValue)

histo = makeHisto(nbins=50, firstbin=0*GeV, lastbin=200*GeV)
for thisEvent in allEvents:
    if HasZBoson( thisEvent ):
        m = reconstructZBosonMass( thisEvent )
        histo.FillWith( m )

TH1F::Fill(value,weight)

note: in code examples, your input is given in green
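A runnable toy version of that pseudocode in PyROOT, with generated stand-in values (a Breit-Wigner around 91 GeV) in place of reconstructed Z masses:

# Toy histogram "Hello World": fill a 50-bin mass histogram with
# Breit-Wigner-distributed values standing in for reconstructed Z masses.
import ROOT

histo = ROOT.TH1F("zmass", "Z candidates; m [GeV]; events", 50, 0.0, 200.0)
r = ROOT.TRandom3(0)

for _ in range(10000):
    m = r.BreitWigner(91.2, 2.5)   # stand-in for reconstructZBosonMass(thisEvent)
    histo.Fill(m)                  # TH1F::Fill(value); the weight argument defaults to 1

histo.Draw()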

EXAMPLE!

Let’s measure the kaon lifetime

(again)!

• open the ROOT file:

– you% root Hep101Data_2013.root

• How to see everything in the file:

– root [1] new TBrowser();

Some ROOT features:

root [0] double x(3.0),y(4.0); sqrt(x*x+y*y)
(const double)5.00000000000000000e+00
root [1] TLorentzVector pion(1500,0,0,1506.482);
root [2] printf("The mass is %3.4g\n", pion.M());
The mass is 139.6
root [3] TMath::C( <TAB>
Double_t C()  // m s^-1
root [4] TMath::C()
(Double_t)2.99792458000000000e+08

Mathematical functions in ROOT

• Simple: FitPanel (under Tools)

• Also easy:
root [9] KaonDecays->Fit("expo")

• More explicit:
root [10] TF1 f("f","[0]*exp(-x/(100*[1]*TMath::C()))",0,60); // free parameters specified in brackets
root [11] KaonDecays->Fit(&f);

• Complete program (from Dave)
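If you don't have the data file at hand, the same fitting machinery can be tried on a self-generated sample; a minimal PyROOT sketch (the 10 m mean decay length here is made up):

# Fill a histogram with exponentially distributed toy "decay lengths"
# and fit it with ROOT's built-in "expo" function.
import ROOT

h = ROOT.TH1F("decays", "toy decays; decay length [m]; events", 60, 0.0, 60.0)
r = ROOT.TRandom3(0)
for _ in range(5000):
    h.Fill(r.Exp(10.0))            # exponentially distributed, 10 m mean (toy value)

h.Fit("expo")                      # ROOT's built-in exponential: exp([0] + [1]*x)
print("fitted mean decay length: %.2f m" % (-1.0 / h.GetFunction("expo").GetParameter(1)))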

Next steps

• You can download ROOT:

– root.cern.ch

• Homework: write your own Monte Carlo generator to solve Problem 2 from Lecture 5: a neutral pion beam with energy E decays to two photons. What is the photon energy distribution in the laboratory frame?

• Feel free to contact atarce@phy.duke.edu

with solutions, questions, etc!!

homework hint: random numbers

• Use the ROOT class TRandom3 for good performance.

• Example

– root [1] TRandom3 r;

– root [2] float random1 = r.Gaus(0,35);

//generate a gaussian-distributed random number with mean 0 and width 35;

– root [3] float random2 = r.Uniform(0,2*TMath::Pi());

//generate a scalar meson decay angle

Postscript: if you don’t like C++

>>> import ROOT #from ROOT import * also works

>>> pion = ROOT.TLorentzVector(1500,0,0,1506.482);

>>> print "The mass is", pion.M(), "MeV"

The mass is 139.5994854 MeV
