The Run Control and Monitor System (RCMS) of the CMS experiment

RCMS Introduction
The CMS Run Control and Monitor System (RCMS) configures, controls and monitors all the
distributed objects composing the DAQ system (RUs, BUs, EVM, FUs, etc.) and the trigger system. It
also instructs the Detector Control System (DCS) to operate the different sub-detectors as
needed. Users gain access to the experiment through the RCMS portal, which assigns them the
rights to operate the apparatus according to their privileges. It also hides from users the
complexity of the individual control and monitor actions by providing the middleware needed to
perform them.
The architecture, as depicted in the figure above, lets users access and control the
experiment from anywhere in the world. It thereby defines a virtual counting room where
physicists and operators can perform experiment shifts remotely.
In more detail, RCMS is defined as all the elements, hardware and software, that are required in
order to:
- Completely configure and set up the whole CMS apparatus or partitions of it. Such
  partitions may function independently and concurrently.
- Control and synchronize operation of the separate components of the CMS apparatus
- Monitor the separate components of the CMS apparatus
- Handle errors and information messages
- Log continuously the current state of the experiment
- Provide a user interface for both control and monitoring
RCMS Description
The figure below shows the basic logical layout of the Run Control and Monitor System
(RCMS). It consists of three types of elements: the Session Manager, the Sub-System Controllers and
a set of services that support specific RCMS functions such as security and logging.
RCMS work is organized around “run sessions”. A run session is the hardware and software
needed to operate a physics or test run with the entire CMS apparatus or a partition of it.
A run session is composed of all or some sub-systems (according to the session type) and,
inside a given sub-system, of a selected partition of its resources (e.g. RUs, BUs, etc.). Thus multiple
run sessions may coexist and run concurrently. Every active run session has its own Session
Manager, which is in charge of coordinating that run session. It mainly handles
communication with the users who have joined the session and with the sub-systems participating in
the run session. The Session Manager can see the individual resources composing the sub-systems, and
can command them or get information from them through the related sub-system resource broker,
called the Sub-System Controller. The RCMS services provide support for security, logging, job control, etc.
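The relation between run sessions and resource partitions can be illustrated with a minimal Python sketch. It is not RCMS code: the class `RunSession`, the helper `can_coexist` and the resource names are hypothetical, and the rule that two sessions may run concurrently only if they share no resource is an assumption made for the illustration.

```python
from dataclasses import dataclass, field


@dataclass
class RunSession:
    """Hypothetical model of a run session: selected resources per sub-system."""
    name: str
    # Maps a sub-system name (e.g. "EB") to the partition of its resources
    # used by this session. All names are illustrative only.
    partitions: dict[str, frozenset[str]] = field(default_factory=dict)

    def resources(self) -> frozenset[str]:
        """Return every resource of the session, qualified by its sub-system."""
        return frozenset(
            f"{subsystem}/{resource}"
            for subsystem, members in self.partitions.items()
            for resource in members
        )


def can_coexist(a: RunSession, b: RunSession) -> bool:
    """Assumption: sessions may run concurrently if they share no resource."""
    return a.resources().isdisjoint(b.resources())


physics = RunSession("physics", {"EB": frozenset({"RU1", "RU2", "BU1"}),
                                 "TRG": frozenset({"GlobalTrigger"})})
ru_test = RunSession("ru-test", {"EB": frozenset({"RU3", "BU2"})})
clash = RunSession("clash", {"EB": frozenset({"RU2"})})

print(can_coexist(physics, ru_test))  # True: the partitions do not overlap
print(can_coexist(physics, clash))    # False: both sessions claim EB/RU2
```

In this sketch each session would be coordinated by its own Session Manager, which only ever addresses the resources listed in its partitions.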
The figure below displays the block diagram of RCMS, showing the services that have been
identified, and thus defined, so far.
These are the services RCMS provides to support the interaction with users and to manage the sub-system resources. Some services have a backend data base. At present the defined services are:
SECURITY SERVICE (SS). It provides login procedures to RCMS and all the needed functions to
manage user accounts. It handles the UserDB data base.
RESOURCE SERVICE (RS). It provides access to the backend configuration data base (ConfigDB),
which stores information about the defined sessions, the sub-system partitions, the DAQ
resource descriptions and the related hardware configuration. This service also handles the
global “still alive” heartbeat of the entire CMS DAQ.
INFORMATION AND MONITOR SERVICE (IMS). It collects messages and monitor data coming
from any DAQ resource or RCMS internal component and stores them in the logDB data
base. IMS messages are classified by type: errors, warnings and generic information. IMS can
distribute such messages to any external subscriber, with the possibility to filter them by
type, level of severity, source of the message, etc. A similar mechanism is also
used to distribute monitor data. IMS also handles the run bookkeeping of the experiment.
JOB CONTROL (JC). It starts, monitors and kills the software infrastructure of RCMS, including the
software agents running on the individual DAQ resources.
PROBLEM SOLVER (PS). It uses the information provided by the RS and by IMS to catch severe
malfunctions of the whole apparatus and tries to fix them.
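The filtered distribution of messages described for the IMS can be sketched as a small publish/subscribe mechanism. The following Python is only an illustration: the names `Message`, `InformationService`, `subscribe` and `publish`, the numeric severity scale and the in-memory list standing in for logDB are all assumptions, not part of RCMS.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Message:
    kind: str      # "error", "warning" or "info"
    severity: int  # hypothetical scale: a higher value is more severe
    source: str    # name of the DAQ resource or RCMS component
    text: str


class InformationService:
    """Hypothetical stand-in for the IMS: stores and forwards messages."""

    def __init__(self) -> None:
        self.log: list[Message] = []  # stands in for the logDB data base
        self._subscribers: list[tuple[Callable[[Message], bool],
                                      Callable[[Message], None]]] = []

    def subscribe(self, callback: Callable[[Message], None],
                  kind: Optional[str] = None, min_severity: int = 0,
                  source: Optional[str] = None) -> None:
        """Register a subscriber with optional type/severity/source filters."""
        def matches(msg: Message) -> bool:
            return ((kind is None or msg.kind == kind)
                    and msg.severity >= min_severity
                    and (source is None or msg.source == source))
        self._subscribers.append((matches, callback))

    def publish(self, msg: Message) -> None:
        """Store every message, then forward it to the matching subscribers."""
        self.log.append(msg)
        for matches, callback in self._subscribers:
            if matches(msg):
                callback(msg)


ims = InformationService()
received: list[Message] = []
ims.subscribe(received.append, kind="error", min_severity=2)

ims.publish(Message("info", 0, "RU1", "configured"))
ims.publish(Message("error", 3, "BU2", "link down"))
ims.publish(Message("error", 1, "RU1", "retry"))

print(len(ims.log))                 # 3: every message is logged
print([m.text for m in received])   # ['link down']: only the severe error
```

A component such as the Problem Solver could, under these assumptions, subscribe only to high-severity errors while an operator display subscribes to everything.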
As the name implies, the Sub-System Controllers (SSC) are the controllers of the sub-systems. Every
controller receives commands from the session manager and dispatches them to its own DAQ
resources or to a subset (corresponding to a partition) of them. Every SSC includes a Sub-System
Management Tool to provide a central point of operations for basic functions such as OS installations,
software installation, update and distribution, system level monitoring, to name a few.
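How a controller might dispatch a command either to all of its resources or only to a partition can be sketched as follows. This is a hedged illustration in Python: `SubSystemController`, `dispatch`, `make_resource` and the reply strings are hypothetical names, and each resource is modelled as a plain function.

```python
from typing import Callable, Iterable, Optional


class SubSystemController:
    """Hypothetical SSC: forwards a command to all or part of its resources."""

    def __init__(self, name: str,
                 resources: dict[str, Callable[[str], str]]) -> None:
        self.name = name
        # Maps a resource name to a callable that executes a command on it.
        self.resources = dict(resources)

    def dispatch(self, command: str,
                 partition: Optional[Iterable[str]] = None) -> dict[str, str]:
        """Send the command to the given partition, or to every resource."""
        targets = set(self.resources) if partition is None else set(partition)
        unknown = targets - set(self.resources)
        if unknown:
            raise KeyError(f"not in sub-system {self.name}: {sorted(unknown)}")
        # Collect the reply of each addressed resource, in name order.
        return {name: self.resources[name](command) for name in sorted(targets)}


def make_resource(name: str) -> Callable[[str], str]:
    """Illustrative resource that simply acknowledges any command."""
    return lambda command: f"{name}:{command}:ok"


eb = SubSystemController("EB", {n: make_resource(n) for n in ("RU1", "RU2", "BU1")})

print(eb.dispatch("configure", partition={"RU1", "BU1"}))
# {'BU1': 'BU1:configure:ok', 'RU1': 'RU1:configure:ok'}
print(len(eb.dispatch("halt")))  # 3: without a partition, all resources are addressed
```

Rejecting resource names that do not belong to the sub-system is a design choice of the sketch; it keeps one session from addressing resources outside the controller's scope.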
A sub-system is composed of DAQ resources (referred to as “resources” from here onwards). A
resource can be defined as a piece of hardware with some intelligence and network access, or as the
related software that characterizes the resource type (Readout Unit, Builder Unit, etc.). A piece of
hardware with network access is defined as a network node (or simply “node”). At the present time
we can define the following resources for the event builder, event filter and trigger sub-systems:
- Event Builder (EB)
  o FED Builder
  o RU Builder
- Event Filter (EF)
  o Filter Units (FU)
- Trigger (TRG)
  o Calorimeter trigger
  o Muon trigger
  o Global trigger
- Detector Control System (DCS)
  o To be defined
- Computer Service (CS)
  o To be defined
The next figure shows schematically the interaction between a given session manager and the defined
sub-systems.
Finally, the DAQ sub-system resources (RUs, BUs, etc.) will provide software agents to communicate
with their own Sub-System Controller. Each agent receives commands, executes them by
interacting with the proper application software, and communicates the status of the operation to the
RS and IMS.
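The agent behaviour described above (receive a command, execute it through the application software, report the status to the RS and IMS) can be sketched in Python. Everything named here is hypothetical: `ResourceAgent`, `handle`, the status strings and the `record` callback that stands in for the connections to the RS and IMS.

```python
from typing import Callable


class ResourceAgent:
    """Hypothetical software agent running on one DAQ resource."""

    def __init__(self, name: str, application: dict[str, Callable[[], None]],
                 report: Callable[[str, str, str, str], None]) -> None:
        self.name = name
        self.application = application  # command name -> application action
        self.report = report            # report(service, resource, command, status)

    def handle(self, command: str) -> str:
        """Execute one command and report its status to the RS and the IMS."""
        action = self.application.get(command)
        if action is None:
            status = "unknown command"
        else:
            try:
                action()
                status = "done"
            except Exception as exc:  # any application failure becomes a status
                status = f"failed: {exc}"
        for service in ("RS", "IMS"):
            self.report(service, self.name, command, status)
        return status


reports: list[tuple[str, str, str, str]] = []


def record(service: str, resource: str, command: str, status: str) -> None:
    """Stand-in for the real connections to the RS and IMS."""
    reports.append((service, resource, command, status))


def start() -> None:
    pass  # the application starts without error


def stop() -> None:
    raise RuntimeError("busy")  # simulated application failure


agent = ResourceAgent("RU1", {"start": start, "stop": stop}, record)
print(agent.handle("start"))  # done
print(agent.handle("stop"))   # failed: busy
print(len(reports))           # 4: two services notified for each command
```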