Twelve Ways to Build CMS Crossings from Root Files Benefits and deficiencies of Root trees and clones when : - NOT dealing with TObjects, - reading the trees entries NOT sequentially, - processing them NOT one by one. CHEP ' 2003 David Chamont (CMS - LLR) 1 Outline • Goal & scope. • Main use-case. • 4 kinds of containers for the crossing data model. • 3 kinds of persistency managers. • Results. • Conclusions. CHEP ' 2003 David Chamont (CMS - LLR) 2 Goal & Scope • Evaluate the benefits of TTree and TClonesArray for the persistency of CMS event data (whose classes heavily rely on templates and external packages). • Focus on the generation of crossings (pile-up of about 160 simulated events chosen pseudo randomly). • Not covered yet : meta-data, associations between persistent objects, schema evolution . CHEP ' 2003 David Chamont (CMS - LLR) 3 Main Use-Case Digitizer signal event (hits) Persistency Manager minbias event (hits) Persistency Manager digis Persistency Manager minbias events file signal events file CHEP ' 2003 minbias events file David Chamont (CMS - LLR) digis file 4 Crossing Data Model • The folders //root/crossing/* represent the events composing the current crossing. • Each event folder contains a container for each kind of event objects : TrackHit, CaloHit,… • The kind of container is chosen among four : – std::vector<> (by value). – dynamic C array (each event object is wrapped inside a class instrumented with classdef). – TObjArray (each event object is wrapped inside a class derived from TObject). – TClonesArray (each event object is wrapped inside a class derived from TObject). CHEP ' 2003 David Chamont (CMS - LLR) 5 Persistency Managers • The task of a persistency manager is to transfer an event from memory (TFolder) to disk (TFile) and vice-versa. • Three flavors have been implemented : – RtbPomKeys : directly write the TFolder in the TFile, each time with a different meaningful name. – RtbPomTreeMatrix : for each container in the folder, creates a TMatrixD and attach it to a branch of a TTree. – RtbPomTreeDirect : attach directly each container to a branch of a TTree. CHEP ' 2003 David Chamont (CMS - LLR) 6 Implementation issues • Recent progress : – can now use -ansi -pedantic. – nice support of foreign classes. – better and better support of templates and std containers. • Recurrent problems with Root I/O : – must explicitly ask to parse namespaces, components types and template instances. – multiple containers sizes and misleading operator[], – tuning of chain branchs, – it is unclear which subset of C++ is supported by Root I/O, and which in TTree, and which in TClonesArray. CHEP ' 2003 David Chamont (CMS - LLR) 7 Configuration • Pentium 4, 1.8 GHz. • ~ 512 Mo of RAM. • IDE disk. • RedHat Linux 7.3 • Gcc 3.2 • Root 3.05/03 CHEP ' 2003 David Chamont (CMS - LLR) 8 Parameters of the Testbed • compression level. • size of buffers. • split level. • randomness : burst and jump. • size of the containers within the events. • number of crossings. • direct inheritance or not from TObject, direct instrumentation or not with ClassDef. • resetting or not the values in the empty constructors. CHEP ' 2003 David Chamont (CMS - LLR) 9 Best Results CHEP ' 2003 File size (Kb/event) Cpu time (s/crossing) Stl C ObjArray Clones Keys 152 3.16 175 4.82 155 9.65 155 4.43 Matrix 149 2.44 149 2.85 149 3.15 149 2.72 Tree 153 2.63 176 4.05 156 7.27 54 1.87 David Chamont (CMS - LLR) 10 Remove compression CHEP ' 2003 File size (Kb/event) Cpu time (s/crossing) Stl C ObjArray Clones Keys 341 1.76 568 3.00 427 8.27 384 2.95 Matrix 400 1.01 400 1.45 400 1.71 400 1.23 Tree 343 1.53 570 2.70 429 6.17 214 1.16 David Chamont (CMS - LLR) 11 Then increase random CHEP ' 2003 File size (Kb/event) Cpu time (s/crossing) Stl C Keys 341 2.20 568 3.42 427 8.63 384 3.40 Matrix 400 1.31 400 1.78 400 2.03 400 1.59 Tree 343 2.45 570 3.71 429 6.98 214 2.71 David Chamont (CMS - LLR) ObjArray Clones 12 Then reduce containers /10 CHEP ' 2003 File size (Kb/event) Cpu time (s/crossing) Stl C ObjArray Clones Keys 34.9 0.82 54.2 0.96 43.6 1.51 39.5 1.05 Matrix 40.6 0.48 40.6 0.51 40.6 0.59 40.6 0.52 Tree 35.4 1.09 55.0 1.40 44.2 1.59 22.7 1.36 David Chamont (CMS - LLR) 13 Then remove random CHEP ' 2003 File size (Kb/event) Cpu time (s/crossing) Stl C ObjArray Clones Keys 34.9 0.32 54.2 0.44 43.6 0.95 39.5 0.52 Matrix 40.6 0.15 40.6 0.18 40.6 0.24 40.6 0.18 Tree 35.4 0.22 55.0 0.32 44.2 0.66 22.7 0.14 David Chamont (CMS - LLR) 14 Conclusions • We succeeded to read pseudo-random entries from a chain and dispatch them to few hundred folders (despite tuning of TChain branches has not been straightforward). • Support for foreign classes, templates and C++ standard library has greatly improved. • The magic couple TTree/TClonesArray has proved very efficient, yet it requires top level TObjects and the benefits can become losses with less data or random access pattern. • One can simply use std vectors and store them directly into root files. Their integration in a TTree is not as worth as a TClonesArray. This could change in the next release of ROOT. • It would be interesting to retry with direct associations between objects, and to apply the testbed to POOL. CHEP ' 2003 David Chamont (CMS - LLR) 16