Chapter 4 The BESIII Detector: Physics Goals and Design

7. Offline Analysis System and Computing Environment
7.1 Overview
The BES detector has operated for about 12 years, and the BES offline data
analysis environment has been developed and upgraded along with the BES hardware
and software. At present, BES data are processed on HP-UNIX workstations, and
physics analysis of DST data is carried out on a PC farm. A 1000 Mbps optical
fiber network, distributed through a 100 Mbps fast Ethernet system, together
with a 100 Mbps FDDI local area network, serves as the highway for communication
and data transfer between the BES HP/UX workstations and the PCs.
Given the existing BES computing environment, the following points should be
taken into account for the future BESIII offline analysis software:

To benefit the collaboration and ease exchanges with international HEP
experiments, the system should adopt, or be modeled on, the newest hardware
and software technology used by advanced experiments around the world.

The system should support the hundreds of existing BES software packages and
serve both experienced members who are familiar with the BESII software and
computing environment and new BES members who will work under the new
system.

Most of the existing BESII software packages will be modified or redesigned to follow the hardware upgrade.
The BESIII computing facility and software system will operate for many years.
They should therefore keep pace with faster hardware and new software
technology, and should be highly flexible, powerful, stable and easy to
maintain.
7.2 Requirements
7.2-1 BESIII Data Yields
The design luminosity of BEPCII is $10^{33}\ \mathrm{cm^{-2}\,s^{-1}}$ at the J/ψ energy. The main
trigger rate of BESIII is roughly estimated to be 2000 Hz.
The event size is about 6 Kbytes for raw data (in the BESII case, 100 Mbytes
per 30K events), 15 Kbytes for reconstructed data, and 3 Kbytes for DST data,
i.e. about one fifth of the reconstructed size.
Assuming 5 months of data taking per year with an effective factor of 0.3, the
total number of events per year is
$0.3 \times 2000 \times 3600 \times 24 \times 30 \times 5 \approx 8 \times 10^{9}$.
The total amount of data in raw data format each year is
$6 \times 10^{3} \times 8 \times 10^{9} \approx 50 \times 10^{12}$ bytes. Details
are listed in Table 7-1:
Table 7-1. Estimation of the BESIII Data Yields per year

Data type        Event size   Total number of   Total events size
                 (Kbytes)     events (10^9)     (Tbytes)
Online Data      6            8                 50
Raw Data         6            8                 50
Rec. Data        15           8                 120
DST Data         3            8                 24
M.C. Rec. Data   15           8                 120
Total                                           364
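The arithmetic behind these yields can be checked with a short script. This is only a sketch built from the figures quoted in this section; the function and constant names are illustrative and not part of any BES software. The exact product gives 7.776 x 10^9 events, which the text rounds to 8 x 10^9, and 6 Kbytes times 8 x 10^9 events gives 48 Tbytes, which the table rounds to 50.

```python
# Yearly data-yield estimate from the figures quoted in Section 7.2-1.
# All names below are illustrative, not taken from BES software.
TRIGGER_RATE_HZ = 2000               # main trigger rate
EFFECTIVE_FACTOR = 0.3               # effective factor quoted in the text
MONTHS_PER_YEAR = 5                  # months of data taking per year
SECONDS_PER_MONTH = 3600 * 24 * 30   # a 30-day month, as in the text

# Event sizes in Kbytes per event, as listed in Table 7-1.
EVENT_SIZE_KB = {
    "Online Data": 6,
    "Raw Data": 6,
    "Rec. Data": 15,
    "DST Data": 3,
    "M.C. Rec. Data": 15,
}


def events_per_year():
    """Number of recorded events per year (before rounding)."""
    return EFFECTIVE_FACTOR * TRIGGER_RATE_HZ * SECONDS_PER_MONTH * MONTHS_PER_YEAR


def volume_tbytes(event_size_kb, n_events):
    """Data volume in Tbytes, with 1 Kbyte = 1e3 bytes and 1 Tbyte = 1e12 bytes."""
    return event_size_kb * 1e3 * n_events / 1e12


if __name__ == "__main__":
    print(f"events per year: {events_per_year():.3e}")
    rounded_events = 8e9  # value used in the text and in Table 7-1
    for name, size_kb in EVENT_SIZE_KB.items():
        print(f"{name:15s} {volume_tbytes(size_kb, rounded_events):6.0f} Tbytes")
```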
7.2-2 CPU Power
Based on data processing at BESII, the CPU required for data reconstruction and
DST production is 20 and 0.2 MIPS·s per event, respectively. On average, real
events are processed three times because of improvements to the reconstruction
or calibration software. The total CPU needed per year for event reconstruction
and DST production, together with Monte Carlo simulation and its event
reconstruction, is therefore about 36280 MIPS. Details of the CPU requirements
are listed in Table 7-2.
Table 7-2. The CPU power required for handling the BES3 data.

Job type                 Speed/Event   Total number of   Processing time   Total CPU
                         (MIPS·s)      events (10^9)     (10^7 sec)        (MIPS)
Data Preprocessing       0.1           8                 2                 40
Data Reconstruction      20            24                2                 24000
DST Production           0.2           24                2                 240
MC Data Reconstruction   30            8                 2                 12000
Total                                                                      36280
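Each row of Table 7-2 follows from dividing the total work (MIPS·s per event times the number of events) by the available processing time. The sketch below reproduces the table under that reading; the names are illustrative. The 24 x 10^9 events for reconstruction and DST production are consistent with 8 x 10^9 real events processed three times, as stated in the text.

```python
# Reproduce the CPU estimate of Table 7-2.
# job name -> (MIPS*s per event, number of events in units of 1e9)
JOBS = {
    "Data Preprocessing": (0.1, 8),
    "Data Reconstruction": (20, 24),
    "DST Production": (0.2, 24),
    "MC Data Reconstruction": (30, 8),
}
PROCESSING_TIME_S = 2e7  # processing time per year, from the table


def required_mips(mips_s_per_event, n_events_1e9, time_s=PROCESSING_TIME_S):
    """Sustained CPU power (MIPS) needed to finish the job within time_s."""
    return mips_s_per_event * n_events_1e9 * 1e9 / time_s


def total_mips(jobs=JOBS):
    """Sum of the CPU power over all job types."""
    return sum(required_mips(cost, n) for cost, n in jobs.values())


if __name__ == "__main__":
    for name, (cost, n) in JOBS.items():
        print(f"{name:24s} {required_mips(cost, n):8.0f} MIPS")
    print(f"{'Total':24s} {total_mips():8.0f} MIPS")
```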
7.2-3 Data Storage and Management
The DAQ system writes various types of data records to its output data files.
The original DAQ data files, the raw data of the BES main trigger events and
the corresponding reconstructed events are stored on tapes mounted in the Robot
library in the computer center. The total amount of data of all types produced
each year is estimated to be 484 Tbytes, of which 120 Tbytes are MC data and
364 Tbytes are reconstructed data (two copies are required to guard against
data loss from tape damage), DAQ recorded data, raw data of the BES main
trigger and DST data. The DST data are stored on the RAID disk via a high-speed
network, at about 24 Tbytes per year. The disk space installed on each physics
analysis platform is determined by the requirements of the individual physics
project.
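One reading of the 484 Tbytes figure that reproduces the quoted numbers is sketched below. It assumes that the 364 Tbytes are made up of two copies of the reconstructed data plus the DAQ, raw and DST volumes from Table 7-1, with the MC data counted separately. This breakdown is an interpretation of the text, not something it states explicitly, and the names are illustrative.

```python
# Yearly tape/disk budget, using the per-type volumes (Tbytes) of Table 7-1.
VOLUME_TB = {"online": 50, "raw": 50, "rec": 120, "dst": 24, "mc_rec": 120}
REC_COPIES = 2  # two copies of reconstructed data, as required in the text


def yearly_storage_tb(volumes=VOLUME_TB, rec_copies=REC_COPIES):
    """Return (real-data volume, MC volume, total) in Tbytes per year."""
    real = (volumes["online"] + volumes["raw"]
            + rec_copies * volumes["rec"] + volumes["dst"])
    mc = volumes["mc_rec"]
    return real, mc, real + mc


if __name__ == "__main__":
    real, mc, total = yearly_storage_tb()
    print(f"real data: {real} Tbytes, MC: {mc} Tbytes, total: {total} Tbytes")
```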
7.2-4 Bandwidth for Data Transportation
The bandwidth required for data transfer from the DAQ system to the offline data
server should be more than 100 Mbps. It is determined by the product of the event
trigger rate and the event length, i.e. $2000 \times 3\ \mathrm{Kbytes} \times 8 = 48$ Mbps.
The network must also be highly stable and secure to avoid losing events.
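The DAQ estimate is a simple unit conversion, sketched here with the values quoted above (2000 Hz and 3 Kbytes per event). The function name is illustrative.

```python
def daq_bandwidth_mbps(trigger_rate_hz, event_size_kbytes):
    """Sustained bandwidth in Mbps for a given trigger rate and event size."""
    bits_per_second = trigger_rate_hz * event_size_kbytes * 1e3 * 8
    return bits_per_second / 1e6


if __name__ == "__main__":
    needed = daq_bandwidth_mbps(2000, 3)
    print(f"required: {needed:.0f} Mbps, headroom on a 100 Mbps link: {100 / needed:.1f}x")
```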
The bandwidth required for data transfer from the data server (i.e. the RAID
disk) to the reconstruction platform of the PC farm depends mainly on the
processor speed of the selected machines: the faster the processors, the wider
the bandwidth required. Because huge amounts of data will be loaded and unloaded
over the local network, it is necessary to create an isolated BES computing
environment, separated from the rest of the IHEP network, to ensure reasonable
data transfer efficiency.
A high transfer rate is critical for a PC farm, where many processors are
connected to each other via a high-speed network. The bandwidth required for
data transfer from the RAID disk to such a physics analysis PC platform is very
high. Suppose that the PC processor is 10 times faster than the HP/J5000 and
that physics analysis consumes 25-30 times less CPU than event reconstruction;
the required data transfer speed is then estimated to be higher than
12 Mbytes/s = 96 Mbps. (At present, event reconstruction of real data takes
about 0.4 sec per event on the HPWS9/HPJ5000, which is equivalent to a transfer
speed of 40 Kbytes/s; the required transfer speed is then
$40 \times 10 \times 30$ Kbytes/s = 12 Mbytes/s = 96 Mbps.) A high-speed
network-based architecture (such as Fiber Channel, FDDI etc.) is the best way to
transfer DST data from the RAID disk to the local disk of the physics platform.
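The scaling argument for the analysis farm can be written out as follows. This is a sketch using the numbers in the paragraph above (a 40 Kbytes/s baseline, a 10 times faster processor, and a factor of 25 to 30 less CPU per event for analysis); the function names are illustrative.

```python
def analysis_rate_kbytes_s(baseline_kbytes_s=40, cpu_speedup=10, analysis_factor=30):
    """Data rate one analysis job can consume, scaled from the reconstruction baseline."""
    return baseline_kbytes_s * cpu_speedup * analysis_factor


def kbytes_s_to_mbps(rate_kbytes_s):
    """Convert Kbytes/s to Mbps (1 Kbyte = 1e3 bytes, 8 bits per byte)."""
    return rate_kbytes_s * 1e3 * 8 / 1e6


if __name__ == "__main__":
    for factor in (25, 30):
        rate = analysis_rate_kbytes_s(analysis_factor=factor)
        print(f"factor {factor}: {rate} Kbytes/s = {kbytes_s_to_mbps(rate):.0f} Mbps")
```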
7.3 Computing System
7.3.1 Overall Description
The BESIII computing system is designed to handle offline data production,
physics analysis, data storage and management, and domestic and overseas
communication of data and information. In the existing BES computing system,
the HPWS8-9 machines (HP-J2400 and -5000) handle raw data production, data
calibration, event reconstruction and DST data compression, while communication
runs over the 1000 Mbps fast Ethernet and the 100 Mbps FDDI network. The
BESIII computing system is sketched in Fig. 7-1.
Figure 7-1. The Scheme of BES3 and IHEP Computing System
The BESIII computing system consists of a PC farm and HP/UX machines
located in the computer centre, and a PC farm and PCs distributed among the
physics user sites. A typical PC farm, BESPCFARM, has been provided to the J/ψ
physics analysis users and is shown in Fig. 7-2. Data transfer from the DAQ
system to the offline data server, data reconstruction and calibration, DST data
production, MC event generation and detector simulation are carried out on the
HP/UX machines and the PC farm in the computer center. Physics analysis jobs and
data communication with the computer center are handled by the individual PCs
and the PC farm at the physics user sites.
A high-speed network is also essential for the mass storage system, such as the
Robot library and the RAID disk.
High performance of the computing system, including stability, reliability and
flexibility, should be achieved at a reasonable and acceptable commercial price.
The fast development of computer hardware and software technology should be
followed closely, and the computing system can be optimized through a couple of
separate upgrades.
7.3-2 Networking System
A fast Ethernet with 1000 Mbps for the main ring and 100 Mbps for the local
area network has been set up at IHEP, as shown in Fig. 7-3. The essential
requirements are efficient data transfer from the DAQ system to the IHEP
Computing Center; fast communication between the data server, the reconstruction
PC farm, the HP machines and the physics analysis PC farm; and synchronized
management of the code library among domestic and international collaborating
institutions. A dedicated design with special fast Ethernet networks and wide
area and local area networks is clearly required. To establish the network
environment, we must investigate advanced technologies such as Gigabit Ethernet,
fast Ethernet and Fiber Channel, together with their market prices.
Figure 7-3. The network architecture of IHEP
7.3-2 Data Production System
The total CPU power required for raw data production, offline data calibration
and event reconstruction, and for MC event generation and detector simulation,
all of which are carried out on the PC farm and HP machines at the IHEP
Computing Center, is about 36280 MIPS.
7.3-3 Physics Analysis System
The BESIII physics analysis PC farm is located at the physics user sites. DST
data can be transferred over the network or on data tapes. The analysis farm has
a large amount of disk space, sufficient to store the DST data and the analysis
job output produced by physics users.
7.3-4 Tape Library System
All the BESIII data, including DAQ data, offline raw data and reconstructed
data, are assumed to be written onto Robot tapes. The Robot library should be
large enough that the storage of, and access to, 484 Tbytes of data each year
can be managed smoothly. High stability and reliability, advanced management
technology and the related software system are required.
7.3-5 RAID Disk System
The BES DST data are stored on the RAID disk, whose volume grows by 24 Tbytes
per year. The device must sustain a high data transfer rate so that large,
occasional network loads between the RAID disk and the physics analysis farm
can be handled efficiently. It should be equipped with advanced software and
management technology to provide convenient and rapid response for data reading
and writing.
7.4 BESIII Offline Data Analysis System
7.4.1 Overall Description
The BESIII software system comprises various software packages, with which data
production, data transfer between tapes and disks, MC data production, physics
analysis, and management of data files and source files are carried out. The
scheme of the BESIII offline data analysis system is shown in Fig. 7-4.
Figure 7-4. The scheme of BES3 offline data analysis system. (Diagram blocks: Main Framework; User Interface; Utility Packages; Access Database; Access Source Files; Raw Data Events Prod.; Data Cal.; Eve Rec.; MC Prod.; Phy Ana.)
This system should be built so that it not only takes advantage of the power of
the OO approach and the C++ language, but also incorporates the existing BES
Fortran software packages that were developed in past years and remain
acceptable after the necessary modifications. Practical aspects such as
usability, stability and flexibility should also be taken into account, to
accommodate users at different levels, from experts to novices.
7.4.2 Common packages and utility software
7.4.2-1 Framework of BESIII Offline Analysis System
The framework of the BES offline analysis system, which includes the two main
packages DRUNK and SOBER, was written in Fortran 15 years ago and migrated
from the MARKIII experiment. It contains an I/O package for accessing all kinds
of BES data files, a calibration package for offline data calibration, a user
interface package, utility packages for physics analysis, and simulation
packages for MC data production. The framework of the BESIII offline analysis
system will be similar to that of BESII, but will take the following main
features into account:

It must be based on Object-Oriented methodology and the OO programming
language C++, and support the existing BES packages written in Fortran.

It must provide convenient, standard access to the database management
system and the user interface.

It will link in the various utilities and special tools that are available from
frontier HEP laboratories and experiments.
7.4.2-2 User Interface
The user interface packages of BES are written in Fortran. In recent years,
however, the C++ language and the OOP (Object-Oriented Programming) approach
have been accepted in the high energy physics community and have become one of
its leading software technologies. A user interface in the C++ environment
should be set up within the BESIII data analysis system. Much work needs to be
done: investigating OOP technology and the C++ language, migrating the C++
common library, and establishing the user interface on the different platforms
(HP-Unix and PC-Linux).
7.4.2-3 C++ and Physics Common Library
The common libraries and the mathematics and graphics tools used in BESII
physics analysis support Fortran and are taken from CERN. A new version of the
CERN library based on OOP and the C++ language is now available. Sending experts
to frontier experiments and inviting foreign experts to visit our institute are
highly recommended ways to learn the technology. Some other C++ common
libraries, such as the GNU library, also need to be set up.
7.4.2-4 Database Management System
The BESIII database should satisfy the experiment's need to manage a huge
amount of information. Various data need to be managed and accessed through the
database, such as detector geometry constants, offline data calibration
constants, detector monitoring data, detector and machine operating conditions,
and physics event data. At present, BES data are managed by hand. This is unsafe
and makes access inefficient, especially when searching for specific information
in a large set of constants files. The BESIII system needs a standard commercial
software package to manage the database and the physics event data. The ORACLE
package has been used in some other HEP experiments, such as BaBar/PEP-II at
SLAC. Investigating the use of ORACLE to manage the BESIII database is
recommended. One benefit of doing so is that some application packages can be
migrated from other HEP experiments.
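To illustrate why a database helps with the search problem described above, the sketch below stores calibration constants keyed by subsystem, name and run range, and looks one up for a given run. It is purely hypothetical: it uses Python's built-in sqlite3 module as a stand-in for a commercial package such as ORACLE, and the table layout, the constant name `t0_offset`, the run numbers and the values are all invented for illustration.

```python
import sqlite3


def open_constants_db(path=":memory:"):
    """Open (or create) a hypothetical constants database."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS calib_constants (
               subsystem TEXT    NOT NULL,  -- e.g. "TOF" (illustrative)
               name      TEXT    NOT NULL,  -- constant name (illustrative)
               first_run INTEGER NOT NULL,  -- first run of validity
               last_run  INTEGER NOT NULL,  -- last run of validity
               value     REAL    NOT NULL
           )"""
    )
    return conn


def store_constant(conn, subsystem, name, first_run, last_run, value):
    """Insert one constant valid for runs first_run..last_run (inclusive)."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO calib_constants VALUES (?, ?, ?, ?, ?)",
            (subsystem, name, first_run, last_run, value),
        )


def lookup_constant(conn, subsystem, name, run):
    """Return the constant valid for the given run, or None if there is none."""
    row = conn.execute(
        "SELECT value FROM calib_constants "
        "WHERE subsystem = ? AND name = ? AND ? BETWEEN first_run AND last_run "
        "ORDER BY first_run DESC LIMIT 1",
        (subsystem, name, run),
    ).fetchone()
    return None if row is None else row[0]


if __name__ == "__main__":
    db = open_constants_db()
    store_constant(db, "TOF", "t0_offset", 100, 199, 1.5)  # placeholder values
    store_constant(db, "TOF", "t0_offset", 200, 299, 2.5)
    print(lookup_constant(db, "TOF", "t0_offset", 250))
```

A single indexed query of this kind replaces a manual search through constants files, which is the inefficiency the section describes.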
7.4.2-5 Management of BESIII Source codes
BES uses a software package, Codeman, to manage the source files on the
different machine platforms at the individual sites of the BES collaboration
institutions. Codeman is one candidate for managing the BESIII source code. We
are also considering other, newer software packages for code management, such
as CVS, RCVS or an OODBMS, to overcome the disadvantages of Codeman.
7.4.2-6 Utility packages and Tools for BES Physics Analysis
The BES software system contains many special packages and tools for physics
analysis, such as the event display package and the kinematic fitting package
(Telesis). All these packages should be modified to match the BESIII offline
analysis environment, or rewritten using OOP and the C++ language.
7.4.3 Offline Calibration system and Event reconstruction codes
7.4.3-1 Offline Data Calibration
The offline data calibration packages, such as those for MDC_t0, VC_t0, TOF,
BEMC/EEMC, the Mu counter and dE/dx, need to be newly developed using OOP
methodology and the C++ language, following the new design of the BESIII
detector and matching the BESIII offline data analysis environment.
7.4.3-2 Event Reconstruction codes for the BEMC
The event reconstruction package for the BESIII BEMC should be designed and
written using OOP methodology and the C++ language so that it fits the main
framework of the BESIII data analysis environment. The reconstruction
packages of the L3 EMC would be a good reference.
7.4.3-3 Event Reconstruction Codes of the BESIII MDC
DCJULIE, the BESII MDC event reconstruction package, may not be usable, because
the structure of the BESIII drift chamber may be redesigned to match the BEPCII
beam pipe geometry. A new MDC event reconstruction package should be designed
on the basis of OOP and the C++ language. The tracking method will draw on the
newest packages used in other HEP experiments and incorporate the experience of
BESI and BESII. The package should be built so that the constants are more
convenient to access and the code is easy to maintain and develop further.
7.4.3-4 Event Reconstruction Codes of the VC, μ and TOF Counters
The event reconstruction packages for the vertex chamber (VC), μ counter and TOF
counters of BESII need to be redesigned and modified to match the new geometry
of the VC, μ and TOF counters. These reconstruction packages should also be
modified to access the physics data records and the database management system
of the BESIII analysis system.
7.4.4 MC Simulation Packages
The SOBER and SIMBES packages of BESII were developed in Fortran for MC event
generation and detector simulation. All the event generators and detector
simulation codes need to be redesigned and rewritten using OOP technology and
the C++ language. At the very least, the MC simulation packages need to be
modified to fit the new BESIII geometry and its database access.
The development of a new detector simulation package based on GEANT4, which
supports the C++ language, needs to be investigated.