Framework Technologies and Progress
Huang Xingtao, Zou Jiaheng, Li Weidong, Zhang Xueyao
2013.07.05
What’s a Framework?
• Framework Definition
– A skeleton of an application into which developers plug their code, and which provides most of the common functionality.
– Provides and defines standard interfaces between components.
• Framework Benefits
– Better specification of what needs to be done
– Better understanding of the system
– Low coupling between concurrent developments
– Smooth integration and organization of the development
– Robustness, resilience to change (change-tolerant)
– Fostering code re-use
Offline Tutorial II (X.T. Huang, SDU)
Software Organization
• Applications using the framework: Trigger, Simulation, Reconstruction, Analysis
• Framework: provides basic services, common components (Algorithms, Services, etc.), interfaces, data exchange and persistency mechanisms, interactivity
• Foundation Libraries: basic libraries (such as Geant4, ROOT, Python, OpenScientist, CLHEP, etc.)
Composition of several offline software systems
Gaudi Object Diagram
[Figure: Gaudi object diagram. Components shown: Application Manager; Algorithms; Message Service, JobOptions Service, Particle Property Service and other services; Event Data Service with the Transient Event Store; Detector Data Service with the Transient Detector Store; Histogram Service with the Transient Histogram Store. Each transient store is connected through a Persistency Service and Converters to data files.]
Advantages of Gaudi
• Clear separation between data and algorithms
• Clear separation between persistent data and transient data
• Data Store-centered architectural style
• User code is encapsulated and localized in a few specific places
– Algorithms and Converters
• Run-time loading of components (dynamic libraries)
– adding new components requires minimal recompilation
• All components have well-defined interfaces and are as generic as possible
Disadvantages of Gaudi
• Multi-layer structure
• Too many third-party software packages and tools
• Relatively slow
• Not suitable for non-accelerator experiments, especially for events with time or space correlation
• A light-weight framework (LAF) was designed and used for analysis.
13-1-6, Beijing, CHINA for Review
New Experiments and New Framework
• JUNO and LHAASO experiments are under R&D in China
– JUNO performs very high precision measurements
– LHAASO is a very large scale experiment
– Both are non-accelerator experiments
– NuWa users’ experience shows that Gaudi is not suitable
– The situation is “similar” to that of BESIII in 2001, but with huge differences
  • Fortran, C, C++
  • Belle (BASF), LHCb (Gaudi), BaBar or a new one
  • Finally Gaudi was chosen!!
– Lots of work has been done!
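General Design Principle
[Figure: layered design. The JUNO offline software, LODESTAR and further applications (…) each comprise an event generator, detector simulation, data calibration, event reconstruction and physics analysis. They are drawn on top of SNiPER, which in turn sits on AIDA, CLHEP, GCCXML, ROOT and Geant4.]
SNiPER: Software for Non-collider Physics ExpeRiments
LODESTAR: LHAASO Offline Data Processing Software Framework
……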
Requirements for SNiPER
• Learning from Gaudi
– Data-store-centered architectural style
  • algorithms as data producers and consumers
– Separation between data and algorithms
  • decreases coupling and is better for team development
– User code encapsulated in a few specific places
  • Algorithm, Service, DataObject
– Modular structure
  • Run-time loading of components (dynamic libraries)
  • Flexible execution control of algorithms
– Object I/O
  • Capability to read/write C++ objects
  • Independent of specific data models
– Programmable script parser for the control (Lin Tao, Xia Xin)
– Separation between “transient” and “persistent” representations of data???
  • based on the difference in different processing steps
• New requirements
– Interface to distributed computing (Zou Jiaheng)
  • Parallel processing
  • Data file access over WAN (i.e. GRID)
Kernel Structure of SNiPER
OptionParser/PropertyMgr: run-time parameter configuration
Algorithms: data calculations
Services: other useful functionalities
[Class diagram. Classes and members shown:]
• OptionParser
• PropertyMgr: setOption(name, value)
• SniperMgr: initialize(), run(), finalize()
• AlgMgr: algs; initialize(), execute(), finalize()
• IAlgorithm: initialize(), execute(), finalize()
• SvcMgr: svcs; initialize(), finalize()
• IService: initialize(), finalize()
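The kernel structure can be illustrated with a minimal C++ sketch. It is not the actual SNiPER code: the class and method names follow the diagram, while add(), the nEvents argument of run() and the example CountAlg are assumptions made only for illustration.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical interfaces mirroring the diagram; not the real SNiPER headers.
class IService {
public:
    virtual ~IService() = default;
    virtual bool initialize() = 0;
    virtual bool finalize() = 0;
};

class IAlgorithm {
public:
    virtual ~IAlgorithm() = default;
    virtual bool initialize() = 0;
    virtual bool execute() = 0;
    virtual bool finalize() = 0;
};

// Holds the algorithms and calls them in the order they were added.
class AlgMgr {
public:
    void add(std::unique_ptr<IAlgorithm> alg) {
        assert(alg != nullptr);
        algs_.push_back(std::move(alg));
    }
    bool initialize() { for (auto& a : algs_) if (!a->initialize()) return false; return true; }
    bool execute()    { for (auto& a : algs_) if (!a->execute())    return false; return true; }
    bool finalize()   { for (auto& a : algs_) if (!a->finalize())   return false; return true; }
private:
    std::vector<std::unique_ptr<IAlgorithm>> algs_;
};

// Holds the services; services have no per-event execute() step.
class SvcMgr {
public:
    void add(std::unique_ptr<IService> svc) {
        assert(svc != nullptr);
        svcs_.push_back(std::move(svc));
    }
    bool initialize() { for (auto& s : svcs_) if (!s->initialize()) return false; return true; }
    bool finalize()   { for (auto& s : svcs_) if (!s->finalize())   return false; return true; }
private:
    std::vector<std::unique_ptr<IService>> svcs_;
};

// Top-level manager: services are initialized before algorithms,
// and the algorithm sequence is executed once per event.
class SniperMgr {
public:
    AlgMgr algs;
    SvcMgr svcs;
    bool initialize() { return svcs.initialize() && algs.initialize(); }
    bool run(int nEvents) {  // nEvents is an assumption; the slide shows run()
        for (int i = 0; i < nEvents; ++i) if (!algs.execute()) return false;
        return true;
    }
    bool finalize() {
        const bool a = algs.finalize();
        const bool s = svcs.finalize();
        return a && s;
    }
};

// Example algorithm (hypothetical): counts how often it is executed.
class CountAlg : public IAlgorithm {
public:
    explicit CountAlg(int& counter) : counter_(counter) {}
    bool initialize() override { return true; }
    bool execute() override { ++counter_; return true; }
    bool finalize() override { return true; }
private:
    int& counter_;
};
```

A job would add its algorithms and services, then call initialize(), run() and finalize() on the SniperMgr in that order.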
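Algorithm and Service Management
[Class diagram: ConcreteAlg and IAlgorithm, ConcreteSvc and IService; name() is shown for both.]
Concrete Algorithm:
1. Modular and dynamically loadable
2. Communicates with the framework only through interfaces
3. Each algorithm type can have one or more instances, distinguished by name
4. Algorithm sequences can be built in the configured order, with support for control flows such as nesting and branching
Concrete Service:
• Has the same properties 1-3 as algorithms
• The required service instance can be obtained by name from anywhere within SNiPER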
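Instances distinguished by name, and lookup of a service by name, might look like the following sketch. SvcRegistry, RandomSeedSvc and the method names add(), find() and get() are hypothetical; only name() and the lookup-by-name idea come from the slide.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical base class: every service instance carries its own name.
class IService {
public:
    explicit IService(std::string name) : name_(std::move(name)) {}
    virtual ~IService() = default;
    const std::string& name() const { return name_; }
private:
    std::string name_;
};

// Hypothetical concrete service used only for the illustration.
class RandomSeedSvc : public IService {
public:
    RandomSeedSvc(const std::string& name, long seed) : IService(name), seed_(seed) {}
    long seed() const { return seed_; }
private:
    long seed_;
};

// Registry keyed by instance name, so one type can have several instances.
class SvcRegistry {
public:
    // Returns false (and discards the service) if the name is already taken.
    bool add(std::unique_ptr<IService> svc) {
        assert(svc != nullptr);
        const std::string key = svc->name();
        return svcs_.emplace(key, std::move(svc)).second;
    }
    // Lookup by name; returns nullptr if no such instance exists.
    IService* find(const std::string& name) const {
        auto it = svcs_.find(name);
        return it == svcs_.end() ? nullptr : it->second.get();
    }
    // Lookup by name and concrete type; nullptr if missing or of another type.
    template <typename T>
    T* get(const std::string& name) const {
        return dynamic_cast<T*>(find(name));
    }
private:
    std::map<std::string, std::unique_ptr<IService>> svcs_;
};
```

The same pattern would apply to algorithms, since they share properties 1-3.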
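Algorithm and Data
Separation of data and algorithms
[Figure: DISK, Input Service, DATA in MEMORY, Output Service, DISK. Algorithm 1, Algorithm 2 and Algorithm 3 work on the data in memory.]
• Makes it easy to share data among different algorithms
• Algorithms focus on computing with the data and are decoupled from I/O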
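DataModel
[Class diagram. Classes and members shown: DataObj; Header (context, readout_map); Readout (header, setHeader()); RawReadout; McReadout; RecReadout.]
Header: overall event information (id, detector, time, etc.)
Readout: information of a specific branch
The readout_map in Header is hidden from ordinary users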
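DataBuffer
DataBuffer: current(), begin(), end(), size(), seek(int i)
• Only a read-only interface is provided to ordinary users
• Serves only as a data container, decoupled from I/O
[Figure: a row of headers head1 to head8. Attached to them are rec1, rec2, rec5, rec6, rec8 and MC2, MC5, MC7, MC8.]
• Headers: elements are added and removed only at the head and the tail; a deque is used internally
• Readouts: users access them indirectly via the header, in random order (lazy load); a list is used
• Types such as GenEvent may have a one-to-many relationship with Header; to make the in-memory data easier to maintain, the buffer can hold “smart pointers”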
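A deque-based buffer with a read-only user interface could be sketched as follows. The Header struct, pushBack() and popFront() are hypothetical, and treating the argument of seek(int i) as an absolute position in the buffer is an assumption; the slide does not define its semantics.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <memory>
#include <utility>

// Hypothetical stand-in for the event header.
struct Header {
    int eventId;
};

class DataBuffer {
public:
    // Smart pointers to const: users can read the headers but not modify them.
    using Ptr = std::shared_ptr<const Header>;
    using const_iterator = std::deque<Ptr>::const_iterator;

    // --- interface for ordinary users ---
    const_iterator begin() const { return buf_.begin(); }
    const_iterator end() const { return buf_.end(); }
    std::size_t size() const { return buf_.size(); }
    // Header at the cursor, or a null pointer if the buffer is empty.
    Ptr current() const { return buf_.empty() ? Ptr() : buf_[cur_]; }
    // Assumption: i is an absolute position in the buffer.
    bool seek(int i) {
        if (i < 0 || static_cast<std::size_t>(i) >= buf_.size()) return false;
        cur_ = static_cast<std::size_t>(i);
        return true;
    }

    // --- framework side: elements change only at the two ends ---
    void pushBack(Ptr h) { buf_.push_back(std::move(h)); }
    void popFront() {
        assert(!buf_.empty());
        buf_.pop_front();
        if (cur_ > 0) --cur_;  // keep the cursor on the same event
    }

private:
    std::deque<Ptr> buf_;
    std::size_t cur_ = 0;
};
```

A std::deque fits here because insertion and removal happen only at the head and tail, while indexed access for seek() stays cheap.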
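Event Buffer
Designed specifically for the analysis of “time-correlated events”
[Figure: the event buffer as a function of execution number (Exe Num) and event number (EvtNum, 0 to 7), distinguishing the current event from the other events held in the buffer.]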
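Run-time Sequence Diagram
[Sequence diagram. Participants: SniperMgr, BufferMgr, DataReader, DataWriter, Algorithms. Messages: syncData, read, write, loop, execute, get, put.]
• Memory and disk data are synchronized (I/O) at the start of each event loop iteration
• Algorithms read and store in-memory data through the BufferMgr interface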
Short Summary
• Brief introduction to frameworks
• Decided to design a new, non-Gaudi-based framework
• One working version of SNiPER exists
– several main functionalities have been implemented
  • SniperMgr
  • Algorithm interface and management
  • Service interface and management
  • configuration interface
  • one example is provided
– some need to be further optimized and discussed today
  • In-memory data management
  • Event data model
  • persistency mechanism
  • parallel processing
  • python configuration
  • Interface to database, geometry and so on
Gaudi heavily relies on ROOT
FairRoot: developed by GSI-IT
[Figure: FairRoot timeline with the years 2004, 2006, 2010 and 2011 marked. Milestones shown:]
• Start testing the VMC concept for CBM
• First release of CbmRoot
• Panda decided to join -> FairRoot: same base package for different experiments
• R3B joined
• MPD (NICA) also starts using FairRoot
• ASYEOS joined (ASYEOSRoot)
• GEM-TPC separated from the PANDA branch (FOPIRoot)
• EIC (Electron Ion Collider, BNL): EICRoot
(Florian Uhlig, CHEP 2012, New York, 22.05.12)
FairRoot for
• Simulation, reconstruction, data analysis
• Fully based on ROOT
Planned start in 2016!
typedef std::map< std::string, TObject * > StoreObjMap;
typedef std::map< std::string, TClonesArray * > StoreArrayMap;
ROME
Root based Object oriented Midas Extension
• Tool for Event based Data Analysis
• Fully Object Oriented
• Root based
• Full connection to the Midas Environment
• Online and Offline
• Based on Tasks, Containers and Folders for a good Data and Program Structure
• Experiment independent Base Classes
• Experiment dependent Classes are generated out of simple XML-Files
• The Users write only experiment specific code (physics)
• Administrative code is implemented in the generated code
• Self Documenting Code
• Self Linking Project
Shuei YAMADA @ MEG review meeting, 2
ROME Objects
Folders
• Objects in which data is stored
• Store the data of one detector (or sub-detector) component
• Hierarchically arranged
• Data inside Folders is structured
Tasks
• Tasks are objects which provide actions
• They make calculations
• Store and read data in Folders
• Fill Trees and Histograms
• Hierarchically arranged
• Tasks also own Histograms
Trees
• Data Objects: only written, never read
• Used to write data to files
Histograms
• Graphical Data Objects: only written
• Belong to one Task
Steering Parameters
• Task steering
• Framework steering
Interconnections
[Figure: interconnections among Disk (Input), Folders, Tasks, Histograms, Trees and Disk (Output). Arrow labels: Read (any format), Read, Fill, Flag, Write (ROOT).]
Facts show us
• ROOT is becoming more and more popular and powerful
• From generator to analysis, the data of most processing steps can be saved in ROOT
• It is better to keep the transient and persistent event data definitions as similar as possible
– easy implementation of I/O
• Design of the Event Data Model based on ROOT
– Data objects inherit from TObject
Event Data Model based on ROOT
Navigation functions rely on Header objects.
Pros: straightforward
Cons: not general enough
Dyb uses RegistrationSequence, which is separate from the headers
Tags for fast review
The purpose of a tag is to support fast review of events and to decide whether to read more data into the Store.
Layout of Store
Front Store
std::deque<TObject* > frontStore;
Back Store
std::map< std::string, std::list<TObject* > > backStore;
Object Access and I/O
• The Data Store is used to manage the layout of TObjects
– supports access to the Data Store by path or by type
• Data objects can be written to and read from the Data Store with the streamers of their classes, which are automatically generated and included in the dictionary library.
• Manipulation of this Data Store:
– Filling
– Reading
– Trimming
– Writing
Needs further design and implementation
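Access by path and by type can be sketched without ROOT as follows. This is only an illustration of the idea, not the planned design: Object is a hypothetical stand-in for TObject, and DataStore, put(), get(), erase() and the paths are invented names.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for ROOT's TObject, so the sketch compiles without ROOT.
struct Object {
    virtual ~Object() = default;
};

// Hypothetical data objects.
struct RecHeader : Object {
    int nHits = 0;
};
struct McHeader : Object {
    double energy = 0.0;
};

class DataStore {
public:
    // Filling: register an object under a path, replacing any previous one.
    void put(const std::string& path, std::shared_ptr<Object> obj) {
        assert(obj != nullptr);
        objs_[path] = std::move(obj);
    }
    // Reading by path and type: null if the path is unknown
    // or the stored object is not of type T.
    template <typename T>
    std::shared_ptr<T> get(const std::string& path) const {
        auto it = objs_.find(path);
        if (it == objs_.end()) return nullptr;
        return std::dynamic_pointer_cast<T>(it->second);
    }
    // Trimming: drop an object; returns false if nothing was stored there.
    bool erase(const std::string& path) { return objs_.erase(path) > 0; }

private:
    std::map<std::string, std::shared_ptr<Object>> objs_;
};
```

Writing is left out here, since in the ROOT-based design it would go through the generated class streamers.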
Discussion
• Further collect requirements of the framework
– in the form of Use Case studies
• Discuss and finalize the Data Model design
– data structure
– data navigation
– data storage
– implementation
  • similar to GOD, or hand-coded
• Optimize the design of In-Memory Management
– Buffer mechanism
– Adopting ROOT
– I/O
• Discussion of other functionalities
– Job configuration (python)
– User interfaces (Algorithm, Services, Tools……)
– structure of the whole offline software system
– ……
More inputs
More thinking
More discussion
Thanks a lot!
Art
• art is a generic C++-based modular analysis framework, for use from generator-level or DAQ event building through simulation, production and user analysis.
– g-2, Mu2e, NOνA, LArSoft (μBooNE, ArgoNeuT, LBNE)
• art grew out of, and was forked from, the CMS framework in 2010
• The developers have been involved with the frameworks of:
– DØ, BTeV, CMS and MiniBooNE.
• art plans to support parallel processing of independent events as well as to permit parallel processing within events
Art architecture