RCMS Introduction

The CMS Run Control and Monitor System (RCMS) configures, controls and monitors all the distributed objects composing the DAQ system (RUs, BUs, EVM, FUs, etc.) and the trigger system. It also instructs the Detector Control System (DCS) to operate the different sub-detectors as needed. Users gain access to the experiment through the RCMS portal, which assigns them the rights to operate the apparatus according to their privileges. The portal also hides from users the complexity of the individual control and monitor actions by providing the middleware needed to perform them.

The architecture, as depicted in the figure above, lets users access and control the experiment from anywhere in the world. It thereby defines a virtual counting room in which physicists and operators can perform experiment shifts remotely.

In more detail, RCMS is defined as all the elements, hardware and software, that are required in order to:

- Completely configure and set up the whole CMS apparatus or partitions of it. Such partitions may function independently and concurrently.
- Control and synchronize the operation of the separate components of the CMS apparatus
- Monitor the separate components of the CMS apparatus
- Handle errors and information messages
- Continuously log the current state of the experiment
- Provide a user interface for both control and monitoring

RCMS Description

The figure below shows the basic logical layout of the Run Control and Monitor System (RCMS). It consists of three types of elements: the session manager, the sub-system controller and a set of services needed to support specific RCMS functions such as security and logging.

RCMS work is organized around “run sessions”. A run session can be defined as the hardware and software needed to operate a physics or test run with the entire CMS apparatus or a partition of it.
A run session is composed of all or some of the sub-systems (according to the session type) and, inside a given sub-system, of a selected partition of its resources (e.g. RUs, BUs, etc.). Multiple run sessions may therefore coexist and run concurrently.

Every active run session has its own Session Manager, which is in charge of coordinating that session. It mainly handles communication with the users who have joined the session and with the sub-systems participating in it. The Session Manager can see the individual resources composing the sub-systems, and can command them or get information from them, through the resource broker of the related sub-system, called the Sub-System Controller.

The RCMS services provide support for security, logging, job control, etc. The figure below displays the block diagram of RCMS, showing the services that have been identified and defined so far. These are the services RCMS provides to support the interaction with users and to manage the sub-system resources. Some services have a backend database. At present the defined services are:

SECURITY SERVICE (SS). It provides the login procedures for RCMS and all the functions needed to manage user accounts. It handles the UserDB database.

RESOURCE SERVICE (RS). It provides access to the backend configuration database (ConfigDB), which stores information about the defined sessions, the sub-system partitions, the DAQ resource descriptions and the related hardware configuration. This service also handles the global “still alive” heartbeat of the entire CMS DAQ.

INFORMATION AND MONITOR SERVICE (IMS). It collects messages and monitor data coming from any DAQ resource or internal RCMS component and stores them in the logDB database. IMS messages are classified by type: errors, warnings and generic information. IMS can distribute such messages to any external subscriber, optionally filtering them by type, severity level, message source, etc.
A similar mechanism is also used to distribute monitor data. IMS also handles the run bookkeeping of the experiment.

JOB CONTROL (JC). It starts, monitors and kills the software infrastructure of RCMS, including the software agents running in the individual DAQ resources.

PROBLEM SOLVER (PS). It uses the information provided by the RS and by IMS to catch severe malfunctions of the whole apparatus and tries to fix them.

As the name implies, the Sub-System Controllers (SSC) are the controllers of the sub-systems. Every controller receives commands from the session manager and dispatches them to its own DAQ resources or to a subset of them (corresponding to a partition). Every SSC includes a Sub-System Management Tool that provides a central point of operations for basic functions such as OS installation, software installation, update and distribution, and system-level monitoring, to name a few.

A sub-system is composed of DAQ resources (referred to as “resources” from here onwards). A resource can be defined as a piece of hardware with some intelligence and network access, or as the related software that characterizes the resource type (Readout Unit, Builder Unit, etc.). A piece of hardware with network access is defined as a network node (or simply “node”). At present we can define the following resources for the sub-systems:

- Event Builder (EB)
  o FED Builder
  o RU Builder
- Event Filter (EF)
  o Filter Units (FU)
- Trigger (TRG)
  o Calorimeter trigger
  o Muon trigger
  o Global trigger
- Detector Control System (DCS)
  o To be defined
- Computer Service (CS)
  o To be defined

The next figure shows schematically the interaction between a given session manager and the defined sub-systems.

Finally, the DAQ sub-system resources (RUs, BUs, etc.) will provide software agents to deal with their own sub-system controller.
Each such agent receives commands, carries them out by interacting with the appropriate application software, and communicates the status of the operation to the RS and IMS.
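
The command path described above (session manager, sub-system controller, resource agent, with status messages flowing to IMS subscribers) can be illustrated with a small Python sketch. This is not RCMS code: every class name, the command set ("configure", "start", "stop", "halt"), the state names and the resource names are assumptions chosen only for illustration. Reporting to the RS is left out, and logDB is replaced by an in-memory list.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Set, Tuple


@dataclass
class Message:
    source: str  # name of the resource that emitted the message
    kind: str    # "error", "warning" or "info" (the IMS message types)
    text: str


class InformationAndMonitorService:
    """Toy IMS: logs every message and forwards it to matching subscribers."""

    def __init__(self) -> None:
        self.log: List[Message] = []  # in-memory stand-in for logDB
        self._subscribers: List[
            Tuple[Callable[[Message], None], Optional[Set[str]]]
        ] = []

    def subscribe(self, callback: Callable[[Message], None],
                  kinds: Optional[Set[str]] = None) -> None:
        # kinds=None means "deliver every message type".
        self._subscribers.append((callback, kinds))

    def publish(self, message: Message) -> None:
        self.log.append(message)
        for callback, kinds in self._subscribers:
            if kinds is None or message.kind in kinds:
                callback(message)


class ResourceAgent:
    """Toy agent living in one DAQ resource (hypothetical command set)."""

    TRANSITIONS = {"configure": "configured", "start": "running",
                   "stop": "configured", "halt": "halted"}

    def __init__(self, name: str, ims: InformationAndMonitorService) -> None:
        self.name = name
        self.ims = ims
        self.state = "halted"

    def execute(self, command: str) -> str:
        # A real agent would drive the application software at this point.
        if command not in self.TRANSITIONS:
            self.ims.publish(
                Message(self.name, "error", f"unknown command {command!r}"))
            return self.state
        self.state = self.TRANSITIONS[command]
        self.ims.publish(
            Message(self.name, "info", f"{command} -> {self.state}"))
        return self.state


class SubSystemController:
    """Dispatches a command to all of its agents or to a named partition."""

    def __init__(self, name: str, agents: List[ResourceAgent]) -> None:
        self.name = name
        self.agents = {agent.name: agent for agent in agents}

    def dispatch(self, command: str,
                 partition: Optional[List[str]] = None) -> Dict[str, str]:
        names = list(self.agents) if partition is None else partition
        return {name: self.agents[name].execute(command) for name in names}


class SessionManager:
    """Coordinates one run session: a partition of each participating sub-system."""

    def __init__(self, members: Dict[
            str, Tuple[SubSystemController, Optional[List[str]]]]) -> None:
        self.members = members

    def command(self, command: str) -> Dict[str, Dict[str, str]]:
        return {name: ssc.dispatch(command, partition)
                for name, (ssc, partition) in self.members.items()}


# Usage: one session that uses only part of a hypothetical Event Builder.
ims = InformationAndMonitorService()
errors: List[Message] = []
ims.subscribe(errors.append, kinds={"error"})  # subscriber filtered by type

event_builder = SubSystemController(
    "EB", [ResourceAgent(n, ims) for n in ("RU0", "RU1", "BU0")])
session = SessionManager({"EB": (event_builder, ["RU0", "BU0"])})

states = session.command("configure")
session.command("reset")  # not in the assumed command set, so agents report errors
```

In this sketch the resource RU1 is never touched, because it lies outside the session's partition, and the error subscriber sees only the two messages produced by the unknown command, while the log keeps all messages. A second session manager holding a different partition of the same controller could run alongside the first, which is the property that allows run sessions to coexist.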