Overview of the Computing Fabric at CERN

Bernd Panzer-Steindel, Computing Fabrics Area Manager, CERN/IT

The following overview describes in some detail the setup and organization of the Computing Fabric at CERN. The LCG project has by nature a large international aspect, while the Fabric Area of the project has a very strong CERN focus: its goal is to provide the necessary local infrastructure, the T0 center, to safely store and process the large amount of raw data produced by the four LHC experiments (ALICE, ATLAS, CMS, LHCb). The second goal is the installation and deployment of a more general T1 center for the reprocessing, analysis and import/export of the data (Tier center structure) (dataflow diagram). The internal structure and the choice of technology are handled separately by each center, but some coordination is done through the GDB, regular conferences like HEPIX, and bilateral projects between HEP centers. The Fabric activities are integrated into the management structure of the CERN IT division.

The Computing Fabric consists mainly of services and of the activities to improve or evolve them; its basic principle is the coupling of computing elements through software and hardware to reach different complexity levels (complexity diagram). In the functional view, the Computing Fabric consists of several layers (fabric diagram).

Infrastructure

The basic infrastructure provides space, electricity and cooling for the computing equipment. The CERN computer center is currently undergoing major changes to prepare for the large amount of equipment needed in 2007 when the LHC starts. The vault of building 513 has been completely refurbished (pictures) to nearly double the available space. The major project during the next two years is the upgrade of the power stations to provide the center with 2 MW of electrical power and to ensure the corresponding cooling (overview electricity and cooling). The amount of available power determines how much compute power can be provided; unfortunately there are not yet enough technological developments to improve the ratio of compute power per consumed Watt (power status).

Another important activity is the management of the material flow in the computer center, which covers a wide range of issues: the organization of purchases, the deployment and installation of new equipment, the regular replacement of ‘old’ systems, the daily repair/maintenance of broken equipment and the detailed bookkeeping of the material flow and its status. One focus here is the preparation of the purchase procedures for very large quantities of computing items, as during Phase 2 of the LCG project (2006 – 2008) we will have to invest about 60 million SFr to cope with the computing requirements of the LHC (cost evaluation).

System management and operation

The CERN IT approach to the general handling of system administration has evolved during the last two years. We are moving from an outsourced solution to insourcing. This is based on the detailed experience of running a large center during the last years with a mixture of local service teams and an IT company providing system administration services. This period also greatly improved the understanding of our total cost of ownership (description). In parallel, the software tools have been developed and adapted to achieve a high level of automation.
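To illustrate the kind of automation meant here, the sketch below shows a minimal, hypothetical desired-state check: each node is compared against a centrally defined target configuration and deviations are reported for corrective action. This is not the actual CERN tool set; all node names, configuration keys and version numbers are invented for illustration.

    # Minimal sketch of desired-state configuration checking for fabric nodes.
    # Hypothetical example: node names, configuration keys and versions are
    # invented and do not describe the real CERN tools.

    DESIRED_STATE = {
        "cpu-node":  {"kernel": "2.4.21", "batch_client": "installed", "afs_client": "installed"},
        "disk-node": {"kernel": "2.4.21", "castor_disk_server": "running"},
    }

    def actual_state(node: str) -> dict:
        """Return the observed configuration of a node.

        Placeholder: a real fabric tool would query a monitoring agent on the
        node or a central configuration database instead of returning a stub.
        """
        return {"kernel": "2.4.21", "batch_client": "installed", "afs_client": "missing"}

    def check_node(node: str, node_type: str) -> list[str]:
        """Return the configuration keys on which a node deviates from the desired state."""
        desired = DESIRED_STATE[node_type]
        observed = actual_state(node)
        return [key for key, wanted in desired.items() if observed.get(key) != wanted]

    if __name__ == "__main__":
        for node in ("lxb0001", "lxb0002"):          # invented node names
            deviations = check_node(node, "cpu-node")
            if deviations:
                print(f"{node}: deviates in {deviations} -> schedule corrective action")
            else:
                print(f"{node}: configuration OK")

The actual tool sets for installation, configuration, monitoring and fault tolerance (see the overviews below) work on a much larger scale and are of course far more complete; the sketch only illustrates one way such automation can be structured.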
The current architecture distinguishes between three different functional clusters in the center: CPU nodes for computation, disk servers for storage, and tape servers for access to the mass storage (tape silos). All of these functions are provided by dual Intel processor ‘deskside’ boxes running the Linux operating system. The system management team uses tool sets to install the nodes (operating system and environment), configure the environment, monitor their status and correct problems.

Overview installation and configuration
Overview monitoring and fault tolerance
Overview basic control

A team of Linux experts provides the necessary support for the system management team. The evolution and deployment of Linux versions on the production cluster (mainly the CPU nodes are concerned) is done in close collaboration between the different service teams and the experiments (Linux certification). In addition, some HEP-wide coordination is done regularly in the framework of the HEPIX conference and via HEP mailing lists.

Services for the Data and Task flow

In this layer the different computing elements are coupled physically through the network structure and logically through a batch scheduling system and through storage systems. Our network strategy is based on a hierarchical tree structure using different Ethernet implementations (Fast Ethernet, Gigabit Ethernet, 10 Gigabit Ethernet; network diagram), which will expand in size and performance from 2005 onwards to cope with the expected data rates in 2008 and later (data rate note). This is organized by the CERN IT network team.

Overview cluster network

The large production cluster is separated into two parts, called Lxplus and Lxbatch. The Lxplus service provides the interactive user environments for the different experiments. The major resources are provided by Lxbatch, which uses the batch scheduling system LSF for the coordination and load balancing of the user jobs.

Overview batch scheduler

There are two services with different mechanisms to enable access to data and software repositories:

1. The access to small files (software, user environment, etc.) is managed by the Andrew File System (AFS). The shared filesystem approach is also becoming a key element of the distribution mechanism for experiment software environments in the GRID.

Overview shared storage

2. The bulk data management is done through the Hierarchical Storage Management system CASTOR. It is not a full-fledged shared filesystem, but rather provides a minimal, filesystem-like set of functionality in order to keep the complexity low. Today CASTOR manages about 300 disk servers with about 200 TB of disk space as a disk cache layer, and about 2 PB of tape space in large tape libraries as the backend storage.

Hierarchical Storage Manager, CASTOR

Couplings to the ‘inside/outside’

The production part of the Fabric is ‘coupled’ to the inside and outside via the online farms of the experiments, via feedback loops to experimental test clusters, via the WAN and GRID middleware to other centers, and via projects with Industry and Institutes. The deployed nodes in the center are not a monolithic installation used exclusively for production, but rather a collection of sub-clusters (cluster diagram) for a variety of activities. Besides the main production nodes, an important system is the so-called prototype farm. It is used for technology evaluations focused on future production use (relations with Industry, openlab) and for high performance computing data challenges like the ALICE mass storage test or online activities (ALICE event building, ATLAS control) (data challenges).
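To give a flavour of what such a data challenge measures, the sketch below shows a minimal, hypothetical aggregate-throughput test: several writer streams send data to different disk server mount points in parallel and the combined rate is reported. The mount points, file sizes and number of streams are invented for illustration and do not correspond to the actual challenge setup.

    # Minimal sketch of an aggregate write-throughput measurement of the kind a
    # mass storage data challenge relies on. Paths, sizes and stream counts are
    # invented for illustration.

    import os
    import time
    from concurrent.futures import ThreadPoolExecutor

    CHUNK = b"\0" * (1 << 20)          # 1 MiB buffer of zeroes
    FILE_SIZE_MB = 1024                # write ~1 GB per stream (illustrative)
    TARGETS = ["/data01", "/data02", "/data03", "/data04"]   # hypothetical mount points

    def write_stream(directory: str) -> int:
        """Write FILE_SIZE_MB chunks to one target directory and return the bytes written."""
        path = os.path.join(directory, "dc_testfile")
        with open(path, "wb") as f:
            for _ in range(FILE_SIZE_MB):
                f.write(CHUNK)
            f.flush()
            os.fsync(f.fileno())       # make sure the data really reached the disks
        return FILE_SIZE_MB * len(CHUNK)

    if __name__ == "__main__":
        start = time.time()
        with ThreadPoolExecutor(max_workers=len(TARGETS)) as pool:
            written = sum(pool.map(write_stream, TARGETS))
        elapsed = time.time() - start
        print(f"wrote {written / 1e6:.0f} MB in {elapsed:.1f} s "
              f"-> aggregate {written / 1e6 / elapsed:.1f} MB/s")

A real challenge applies the same idea across many disk servers, tape drives and network links simultaneously and over much longer periods; the prototype farm provides the hardware for such tests.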
These data challenges are related to the computing models for the four main computing tasks:

1. Central Data Recording (CDR) and reprocessing
2. Analysis
3. Monte Carlo production
4. Data import and export

A considerable number of nodes is used in the different GRID testbeds and certification activities. The boundary between online computing at the experiments and offline computing in the center is a rather arbitrary one. Both sides now use commodity computing equipment as much as possible, but still have a few differences in requirements. Some discussions about synergies in these areas have started (workshop).

CERN is the lead partner in the DataTag EU project to test high speed (10 Gbit) Wide Area Network connections. The equipment will be used to provide a 10 Gbit WAN connection to the clusters in the center, starting with a link to the prototype farm. A project between US-CMS and IT in the LCG framework will use this to test the data export via WAN on a large scale, in both duration and volume (proposal).

The interaction with the experiments takes place via the different services; the resource planning was done partly through the COCOTIME committee and is now handled by the PEB. The center of course provides services not only for the LHC experiments, but also for the fixed target experiments (which already produce large data rates, with CDR peak values of 150 MB/s) and for the general user community (small experiments, engineers, etc.) (IT home page).

During the last two years we have organized two external audits of the activities in the center (audit1, audit2). Details about the planning and developments in the different areas described above can be found on the milestone status page. The progress in the Fabric developments is reported regularly (per quarter) in the quarterly reports, together with the other areas of the LCG project; there is also a detailed risk analysis paper available.
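As a rough, purely illustrative sense of scale for the rates quoted above, the short calculation below converts a sustained 150 MB/s CDR stream and a fully used 10 Gbit/s WAN link into daily data volumes. This is a back-of-the-envelope exercise only; the 150 MB/s figure is a peak value and real sustained rates are lower.

    # Back-of-the-envelope data volumes for the rates quoted above (illustrative only).
    SECONDS_PER_DAY = 24 * 3600

    cdr_rate_mb_s = 150                # fixed target CDR peak value quoted above
    wan_rate_mb_s = 10e9 / 8 / 1e6     # a fully used 10 Gbit/s link, in MB/s

    print(f"150 MB/s CDR  -> {cdr_rate_mb_s * SECONDS_PER_DAY / 1e6:.1f} TB/day")
    print(f"10 Gbit/s WAN -> {wan_rate_mb_s * SECONDS_PER_DAY / 1e6:.1f} TB/day")

At these assumed rates the CDR stream alone corresponds to roughly 13 TB per day, which illustrates why the disk cache, tape backend and WAN capacity all have to grow together towards 2008.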