Architecture Nuts & Bolts
Vincenzo Innocente, CMS
BluePrint RTAG

No Flames
- It is very difficult to use as a (good or bad) example any of those marvelous frameworks and toolkits that never made it into a popular product.
- All my respect goes to those who developed products that have the misfortune of being used daily by thousands of people and are therefore an easy target for my (positive or negative) criticism:
  - AHisto.fill
  - TObject.draw
  - ~G4RunManager
- Please accept my apologies.

CMS Data Analysis Model
[Data-flow diagram. Components: Quasi-online Reconstruction; Environmental data; Detector Control; Online Monitoring; Event Filter; Object Formatter; Persistent Object Store Manager; Database Management System; Simulation; Data Quality; Calibrations; Group Analysis; User Analysis on demand; Physics Paper. Arrow labels: "Request part of event"; "store"; "Store rec-Obj"; "Store rec-Obj and calibrations".]

Architecture Overview
[Layer diagram. Labels: Data Browser; Generic analysis Tools; Analysis job wizards; Detector/Event Display; Federation wizards; GRID; Objy; ORCA; COBRA; tools; OSCAR; FAMOS; CMS tools; Software development and installation; Consistent User Interface; Distributed Data Store & Computing Infrastructure; Coherent set of basic tools and mechanisms.]

Simulation, Reconstruction & Analysis Software System
[Layer diagram. Labels: Uploadable on the Grid; Physics modules; Specific Framework; Event Filter; Grid-enabled Generic Application Framework; Reconstruction Algorithms; Calibration Objects; Physics Analysis; Configuration Objects; Data Monitoring; Event Objects; Grid-Aware Data-Products; adapters and extensions; Basic Services; ODBMS; Geant3/4; CLHEP; C++ standard library; Paw Replacement; Extension toolkit.]

Framework Dynamics
- Framework:
  - Controls the flow of execution
  - Defines object interaction (implementing design patterns)
  - Calls client (plug-in) functions
  - May offer a traditional "client API" for integration in more specialized frameworks
- Clients specialize framework behavior by:
  - Inheriting from framework classes
  - Overriding their methods
  - Instantiating other framework classes
  - Interacting directly with other, more general, frameworks
[Diagram labels: Client API; Flow of control; Framework API; Call backs; Customized Extension (client plug-in).]

Devil is in the Details
- Build independent components. Avoid:
  - Dependencies among components at the same level
  - Gratuitous and exaggerated re-use (one hammer does not fit all screws)
  - Global states (even cout)
  - Exposure of internal relationships (a->b()->c(i)->d("b"))
  - Assumptions on higher-level behavior (lent pointers)
  - Interfaces that force your environment on user code
- Balance inheritance (white box) vs composition (black box)
- Distinguish Framework API, Client API and User API
- These are architectural issues, NOT coding guidelines: I do not mind "#define int float" in your .cc; I do mind it in a .h

Examples
- Exceptions:
  - Throw an internal exception (avoid inheriting from std::exception?)
  - Catch it in the framework adapter and throw the appropriate framework exception
  - Algorithms do not throw a CARFSkipEventException deep inside
  - No one even thinks of inheriting from Python exceptions
- Do not hardcode cout, CobraOut, G4out
  - If really critical, implement a proper messenger:
    - Every package implements one based on some "pattern"
    - An adapter takes care of the communication with the framework
- Use envelopes (not proxies) and facades toward the user
- Stick to the standard and the language (avoid trying to be smarter)
  - In CMS we could add Architecture.h (config.h) on the fly to each .cc just before compiling
  - Do not use Cint or Python where native C++ suffices

Package Metrics

Project     Release   Packages  Avg. direct deps  Cycles (packages involved)  Levels  ACD*  CCD*   NCCD*  Size
Anaphe      3.6.1     31        2.6               -                           8       5.4   167    1.3    630/170k
ATLAS       1.3.2     230       6.3               2 (92)                      96      70    16211  10     1350k
ATLAS       1.3.7     236       7.0               2 (92)                      97      77    18263  11     1350k
CMS/ORCA    4.6.0     199       7.4               7 (22)                      35      24    4815   3.6    420k
CMS/COBRA   5.2.0     87        6.7               4 (10)                      19      15    1312   2.7    180k
CMS/IGUANA  2.4.2     35        3.9               -                           6       5.0   174    1.2    150/38k
Geant4      4.3.2     108       7.0               3 (12)                      21      16    1765   2.8    680k
ROOT        2.25/05   30        6.4               1 (19)                      22      19    580    4.7    660k

*) John Lakos, Large-Scale C++ Software Design
- Size = total amount of source code (rough figure, not normalised across projects!)
- ACD = average component dependency (~ libraries linked in)
- CCD = sum of single-package component dependencies over the whole release
  - Indicates testing/integration cost
- NCCD = measure of CCD compared to a balanced binary tree
  - A good toolkit's NCCD will be close to 1.0
  - < 1.0: structure is flatter than a binary tree (= independent packages)
  - > 1.0: structure is more strongly coupled (vertical or cyclic)
  - Aim: minimise NCCD for given software/functionality

Metrics: NCCD vs Cycles
[Scatter plot, "Toolkits & Frameworks": NCCD (vertical axis, 0 to 12) against fraction of packages in cycles (horizontal axis, 0% to 70%). Labelled points: ATLAS, ROOT, ORCA, G4, COBRA, Anaphe, IGUANA.]

Toward a Project Praxis
- Define the global software model
  - Granularity, role and nature of "Modules"
    - Physical vs logical modules (yesterday at the CMS plenary M. Livny concluded by asking for statically linked, check-pointable executables…)
  - Reuse model of sub-components
  - Which "glues" have to be used, where and how
- Define THE set of basic components
- Agree on metrics to measure modularity
  - Not only frameworks, also applications based on them
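
As a rough illustration of the modularity metrics defined on the Package Metrics slide, the sketch below computes ACD and NCCD from a package count and a CCD value. It assumes Lakos's reference structure, a balanced binary dependency tree whose CCD for N components is (N + 1) * log2(N + 1) - N. The function names are hypothetical, and the slides do not say which tool produced the published numbers.

```cpp
#include <cassert>
#include <cmath>

// CCD of a balanced binary dependency tree with `packages` components:
// (n + 1) * log2(n + 1) - n. This reference formula is an assumption
// taken from Lakos, not something stated on the slides.
double balancedTreeCcd(int packages) {
    assert(packages > 0);
    const double n = static_cast<double>(packages);
    return (n + 1.0) * std::log2(n + 1.0) - n;
}

// ACD: average number of components each component depends on.
double acd(int packages, double ccd) {
    assert(packages > 0);
    return ccd / packages;
}

// NCCD: CCD relative to a balanced binary tree of the same size.
// Values near 1.0 suggest a tree-like structure; larger values
// suggest stronger (vertical or cyclic) coupling.
double nccd(int packages, double ccd) {
    return ccd / balancedTreeCcd(packages);
}
```

Fed with the Anaphe row (31 packages, CCD 167), these functions give an ACD of about 5.4 and an NCCD of about 1.3; the ROOT row (30 packages, CCD 580) gives an NCCD of about 4.7. Both agree with the table, which suggests the published figures follow the same definition.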