Nuts and bolts - LCG Applications Area

Architecture
Nuts & Bolts
Vincenzo Innocente
CMS
Vincenzo Innocente,
BluePrint RTAG
Nuts & Bolts
No Flames
It is very difficult to use as a (good/bad) example any of those marvelous frameworks and toolkits that never made it into a popular product
All my respect goes to those who developed products that have the misfortune to be used daily by thousands of people and are an easy target for my (positive/negative) criticisms…
AHisto.fill
TObject.draw
~G4RunManager
Please accept my apologies
CMS Data Analysis Model
[Diagram. Boxes: Detector Control, Online Monitoring, Event Filter, Object Formatter, Quasi-online Reconstruction, Simulation, Data Quality, Calibrations, Group Analysis, User Analysis on demand, Physics Paper, all around a Persistent Object Store Manager on top of a Database Management System. Arrow labels: Environmental data; Request part of event; store; Store rec-Obj; Store rec-Obj and calibrations.]
Architecture Overview
[Diagram. Boxes: Data Browser; Generic analysis Tools; Analysis job wizards; Detector/Event Display; Federation wizards; GRID; Objy; ORCA; COBRA tools; OSCAR; FAMOS; CMS tools; Software development and installation. Captions: Consistent User Interface; Distributed Data Store & Computing Infrastructure; Coherent set of basic tools and mechanisms.]
Simulation, Reconstruction & Analysis Software System
[Layered diagram. Physics modules (uploadable on the Grid): Reconstruction Algorithms, Data Monitoring, Event Filter, Physics Analysis. Specific Framework: Calibration Objects, Configuration Objects, Event Objects. Grid-enabled Generic Application Framework. Grid-Aware Data-Products. Adapters and extensions. Basic Services: ODBMS, Geant3/4, CLHEP, Paw Replacement, C++ standard library, Extension toolkit.]
Framework Dynamics
Framework:
  Controls flow of execution
  Defines object interaction (implementing design patterns)
  Calls client (plug-in) functions
  May offer a traditional “client API” for integration in more specialized frameworks
Clients specialize framework behavior:
  Inheriting from framework classes
  Overriding their methods
  Instantiating other framework classes
  Interacting directly with other, more general, frameworks
[Diagram labels: Client API; Flow of control; Framework API; Call backs; Customized Extension (client plug-in)]
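The dynamics listed on this slide can be sketched in a few lines of C++. This is only an illustration under stated assumptions: the names EventProcessor, Framework, EventCounter and countEvents are hypothetical and do not come from COBRA or any other CMS package. The framework owns the event loop and calls the plug-in back; the client only inherits and overrides.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical "Framework API": the interface that client plug-ins implement.
class EventProcessor {
public:
    virtual ~EventProcessor() = default;
    virtual void beginJob() {}                    // optional hook, does nothing by default
    virtual void processEvent(int eventId) = 0;   // mandatory call back
};

// Hypothetical framework: it keeps the flow of control and calls the plug-ins.
class Framework {
public:
    void registerPlugin(std::unique_ptr<EventProcessor> plugin) {
        plugins_.push_back(std::move(plugin));
    }
    // The event loop lives here; client code never drives it.
    void run(int nEvents) {
        for (auto& p : plugins_) p->beginJob();
        for (int id = 0; id < nEvents; ++id)
            for (auto& p : plugins_) p->processEvent(id);
    }
private:
    std::vector<std::unique_ptr<EventProcessor>> plugins_;
};

// Client plug-in: specializes behavior by inheriting and overriding.
class EventCounter : public EventProcessor {
public:
    explicit EventCounter(int& counter) : counter_(counter) {}
    void processEvent(int /*eventId*/) override { ++counter_; }
private:
    int& counter_;   // refers to a counter owned by the caller of run()
};

// Usage: register one plug-in, let the framework run, report what it saw.
int countEvents(int nEvents) {
    int seen = 0;
    Framework fw;
    fw.registerPlugin(std::make_unique<EventCounter>(seen));
    fw.run(nEvents);
    return seen;
}
```

Calling countEvents(5) lets the framework invoke processEvent five times; the plug-in never contains a loop of its own.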
Devil is in the Details
Build independent components: Avoid
  Dependencies among components at the same level
  Gratuitous and exaggerated re-use
    One hammer does not fit all screws
  Global states (even cout)
  Exposure of internal relationships (a->b()->c(i)->d(“b”))
  Assumptions on higher level behavior (lent pointers)
  Interfaces that force your environment on user code
Balance inheritance (white box) vs composition (black box)
Distinguish Framework API, Client API and User API
These are Architectural issues NOT coding guidelines
  I do not mind “#define int float” in your .cc; I do mind it in a .h
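One possible way to avoid the global state named above (even cout) is to let the caller hand in the output stream for each call, so the component neither touches a global nor keeps a lent pointer. A minimal sketch, assuming hypothetical functions fitMean and fitAndCapture that are not part of any real package:

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical algorithm: reports through the stream it is given for this call
// instead of writing to the global std::cout. Nothing is stored between calls.
double fitMean(double a, double b, std::ostream& log) {
    const double result = 0.5 * (a + b);   // placeholder "fit": the mean of two values
    log << "fit result " << result << '\n';
    return result;
}

// The caller decides where messages go: here an in-memory buffer,
// but std::cout or a framework logger stream would work the same way.
std::string fitAndCapture(double a, double b) {
    std::ostringstream captured;
    fitMean(a, b, captured);
    return captured.str();
}
```

The same function can then be used unchanged in a test, a batch job or an interactive session, because the choice of destination stays with the caller.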
Examples
Exceptions
  Throw an internal exception
    (avoid inheriting from std::exception?)
  Catch it in the framework adapter and throw the appropriate framework exception
    Algorithms do not throw a CARFSkipEventException deep inside
    No one would even think of inheriting from Python exceptions
Do not hardcode cout, CobraOut, G4out
  If really critical, implement a proper messenger:
    Every package implements one based on some “pattern”
    An adapter takes care of the communication with the framework
Use envelopes (not Proxies) and facades toward the user
Stick to the standard and the language (avoid being smarter)
  In CMS we could add Architecture.h (config.h) on the fly at each .cc just before compiling
  Do not use Cint or Python where native C++ suffices
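The exception advice above can be sketched as follows. All names here (tracker::FitFailure, framework::SkipEvent, fitTrackForFramework, eventIsSkipped) are hypothetical and chosen only for illustration; in particular SkipEvent merely stands in for a framework exception such as the CARFSkipEventException mentioned on the slide. The package throws its own plain exception type, and only the adapter knows both sides.

```cpp
#include <cassert>
#include <string>

// Package side: knows nothing about the framework.
namespace tracker {
struct FitFailure {            // internal exception, deliberately a plain type
    std::string reason;
};

double fitTrack(int nHits) {
    if (nHits < 3) throw FitFailure{"too few hits"};
    return 1.0 * nHits;        // placeholder result
}
}  // namespace tracker

// Framework side (hypothetical stand-in for a real framework exception).
namespace framework {
struct SkipEvent {
    std::string message;
};
}  // namespace framework

// Adapter: the only code that sees both the package and the framework.
// It translates the internal exception into the framework's one.
double fitTrackForFramework(int nHits) {
    try {
        return tracker::fitTrack(nHits);
    } catch (const tracker::FitFailure& e) {
        throw framework::SkipEvent{"tracker: " + e.reason};
    }
}

// What a framework event loop might do with the translated exception.
bool eventIsSkipped(int nHits) {
    try {
        fitTrackForFramework(nHits);
        return false;
    } catch (const framework::SkipEvent&) {
        return true;
    }
}
```

With this split the tracker package can be built and tested without the framework headers; replacing the framework means rewriting only the adapter.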
Package Metrics
Project      Release   Packages   Avg. direct deps   Cycles (pkgs involved)   Levels   ACD*   CCD*    NCCD*   Size
Anaphe       3.6.1     31         2.6                -                        8        5.4    167     1.3     630/170k
ATLAS        1.3.2     230        6.3                2 (92)                   96       70     16211   10      1350k
ATLAS        1.3.7     236        7.0                2 (92)                   97       77     18263   11      1350k
CMS/ORCA     4.6.0     199        7.4                7 (22)                   35       24     4815    3.6     420k
CMS/COBRA    5.2.0     87         6.7                4 (10)                   19       15     1312    2.7     180k
CMS/IGUANA   2.4.2     35         3.9                -                        6        5.0    174     1.2     150/38k
Geant4       4.3.2     108        7.0                3 (12)                   21       16     1765    2.8     680k
ROOT         2.25/05   30         6.4                1 (19)                   22       19     580     4.7     660k
*) John Lakos, Large-Scale C++ Software Design
Size = total amount of source code (rough figure, not normalised across projects!)
ACD = average component dependency (~ libraries linked in)
CCD = sum of single-package component dependencies over whole release
  Indicates testing/integration cost
NCCD = measure of CCD compared to a balanced binary tree
  A good toolkit’s NCCD will be close to 1.0
  < 1.0: structure is flatter than a binary tree (= independent packages)
  > 1.0: structure is more strongly coupled (vertical or cyclic)
Aim: Minimise NCCD for given software/functionality
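The three metrics can be computed from a package dependency graph as sketched below. This is an illustration, not the tool used to produce the numbers on the slide: the type DependencyGraph and the function names are hypothetical, and the sketch assumes Lakos's definitions, where a package's component dependency counts every package reachable from it including itself, and the CCD of a balanced binary tree of n packages is (n + 1) * log2(n + 1) - n.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// deps[i] lists the packages that package i depends on directly.
using DependencyGraph = std::vector<std::vector<std::size_t>>;

// Number of packages reachable from `start`, counting `start` itself.
// The `seen` flags make this safe for cyclic graphs as well.
std::size_t componentDependency(const DependencyGraph& deps, std::size_t start) {
    std::vector<bool> seen(deps.size(), false);
    std::vector<std::size_t> stack{start};
    seen[start] = true;
    std::size_t count = 0;
    while (!stack.empty()) {
        const std::size_t node = stack.back();
        stack.pop_back();
        ++count;
        for (std::size_t next : deps[node]) {
            if (!seen[next]) {
                seen[next] = true;
                stack.push_back(next);
            }
        }
    }
    return count;
}

// CCD: sum of the component dependencies of all packages.
std::size_t ccd(const DependencyGraph& deps) {
    std::size_t total = 0;
    for (std::size_t i = 0; i < deps.size(); ++i)
        total += componentDependency(deps, i);
    return total;
}

// ACD: CCD divided by the number of packages.
double acd(const DependencyGraph& deps) {
    return static_cast<double>(ccd(deps)) / static_cast<double>(deps.size());
}

// CCD of a balanced binary tree with n packages.
double balancedTreeCcd(std::size_t n) {
    const double m = static_cast<double>(n) + 1.0;
    return m * std::log2(m) - static_cast<double>(n);
}

// NCCD: CCD normalised to the balanced binary tree of the same size.
double nccd(const DependencyGraph& deps) {
    return static_cast<double>(ccd(deps)) / balancedTreeCcd(deps.size());
}
```

As a cross-check against the table: Anaphe has 31 packages, so the balanced-tree CCD is 32 * 5 - 31 = 129, and 167 / 129 is about 1.3, the NCCD quoted for it.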
Metrics: NCCD vs Cycles
[Scatter plot: NCCD (vertical axis, 0 to 12) against Fraction of Packages in Cycles (horizontal axis, 0% to 70%), with points for ATLAS, ROOT, ORCA, G4, COBRA, Anaphe and IGUANA, and a region labelled “Toolkits & Frameworks”.]
Toward a Project Praxis
Define the global software model
  Granularity, role and nature of “Modules”
  Physical vs logical modules (yesterday at the CMS plenary, M. Livny concluded by asking for statically linked, check-pointable executables…)
  Reuse model of sub-components
    Which “glues” have to be used, where and how
  Define THE set of basic components
Agree on Metrics to measure modularity
  Not only Frameworks, also applications based on them