LHCb Development
Glenn Patrick
Rutherford Appleton Laboratory
4th February 2004 GRIDPP9

LHCb - Reminder
• 1.2M electronic channels
• Weight ~4,000 tonnes
[Figure: LHCb detector layout with a 20 m scale marker. Labelled components: VELO, RICH1, Magnet, Tracking stations (inner and outer), RICH2, Calorimeters, Muon System. A B meson and an anti-B meson are shown, each labelled "b d".]

LHCb GridPP Development
LHCb development has been taking place on three fronts:
• MC Production Control and Monitoring: Gennady Kuznetsov (RAL)
• Data Management: Carmine Cioffi (Oxford), Karl Harrison (Cambridge)
• GANGA: Alexander Soroko (Oxford), Karl Harrison (Cambridge)
All developed in tandem with LHCb Data Challenges.

Data Challenge DC03
• “Physics” Data Challenge.
• 65M events processed.
• Distributed over 19 different centres.
• Averaged 830,000 events/day.
• Equivalent to 2,300 × 1.5GHz computers.
• 34% processed in UK at 7 different institutes.
• All data written to CERN.
• Used to redesign and optimise detector …
[Figure labels: VELO, TT, RICH1, RICH2]

The LHCb Detector
Changes were made for material reduction and L1 trigger improvement:
• Reduced number of layers for M1 (4 → 2)
• Reduced number of tracking stations behind the magnet (4 → 3)
• No tracking chambers in the magnet
• No B field shielding plate
• Full Si station
• Reoptimized RICH-1 design
• Reduced number of VELO stations (25 → 21)

“Detector” TDRs completed
• Only Computing TDR remains

Data Challenge 2004
• “Computing” Data Challenge.
• April – June 2004
• Produce 10 × more events.
• At least 50% to be done via LCG.
• Store data at nearest Tier-1 (i.e. RAL for UK institutes)
• Try out distributed analysis.
• Test computing model and write computing TDR.
• Require stable LCG2 release with SRM interfaced to RAL DataStore

DC04: UK Tier-2 Centres
• NorthGrid: Daresbury, Lancaster, Liverpool, Manchester, Sheffield
• SouthGrid: Birmingham, Bristol, Cambridge, Oxford, RAL PPD
• ScotGrid: Durham, Edinburgh, Glasgow
• LondonGrid: Brunel, Imperial, QMUL, RHUL, UCL

DIRAC Architecture
[Figure: service diagram. Services shown: Information Service, API, Job Provenance, Auditing, Authentication, Authorisation, User Interface, Accounting, Metadata Catalogue, Grid Monitoring, File Catalogue, Workload Management, Package Manager, Data Management, Storage Element, Computing Element. Legend: DIRAC components; other project components (AliEn, LCG, …); resources (LCG, LHCb production sites).]

MC Control Status
Gennady Kuznetsov
• DIRAC: Distributed Infrastructure with Remote Agent Control
• Control toolkit breaking down production workflow into components – modules, steps.
• To be deployed in DC04.
SUCCESS!

DIRAC v1.0 Original scheme
• “Pull” rather than “Push”
[Figure: Production service, Monitoring service and Bookkeeping service at the centre. Agents at Site A, Site B, Site C and Site D get jobs from the Production service and send back monitoring info and bookkeeping data.]

Components – MC Control
• Module is the basic component of the architecture.
• This structure allows the Production Manager to construct any algorithm as a combination of modules.
• Each step generates a job as a Python program.
Levels of usage:
1. Module – Programmer
2. Step – Production Manager
3. Workflow – User/Production manager
[Figure: Modules are grouped into Steps, Steps into a Workflow, and the Workflow produces a Production made up of many Jobs.]

Module Editor
[Screenshot callouts:]
• Module Name
• Description
• Module variables.
• Python code of single module. Can be many classes.
• Stored as XML file
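The Module, Step and Workflow levels described above can be pictured with a minimal sketch. It assumes that a module is a callable acting on a shared dictionary of variables; every class and function name here is hypothetical and chosen for illustration, not taken from the DIRAC production toolkit.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical names throughout: an illustration of the three levels,
# not the DIRAC production toolkit API.

@dataclass
class Module:
    """Smallest unit (programmer level): code acting on shared variables."""
    name: str
    run: Callable[[Dict[str, object]], None]

@dataclass
class Step:
    """Ordered group of module instances (Production Manager level)."""
    name: str
    modules: List[Module] = field(default_factory=list)

    def execute(self, variables: Dict[str, object]) -> None:
        for module in self.modules:
            module.run(variables)

@dataclass
class Workflow:
    """Ordered group of step instances (user / Production Manager level)."""
    name: str
    steps: List[Step] = field(default_factory=list)

    def execute(self, variables: Dict[str, object]) -> Dict[str, object]:
        for step in self.steps:
            step.execute(variables)
        return variables

def generate(variables):
    # Stand-in for an event generation module.
    variables["events"] = list(range(variables["n_events"]))

def reconstruct(variables):
    # Stand-in for a reconstruction module.
    variables["reconstructed"] = len(variables["events"])

workflow = Workflow("demo", [
    Step("simulation", [Module("generate", generate)]),
    Step("reconstruction", [Module("reconstruct", reconstruct)]),
])
result = workflow.execute({"n_events": 5})
```

Keeping modules as plain callables means a step is only an ordering of modules, which is one way to read the statement that any algorithm can be built as a combination of modules.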
Step Editor
[Screenshot callouts:]
• Step Name
• Description
• Definitions of Modules
• Instances of Modules
• Selected instance
• Variables of currently selected instance
• Step variables.
• Stored as XML file, where all modules are embedded

Workflow Editor
[Screenshot callouts:]
• Workflow Name
• Description
• Step Definitions
• Step Instances
• Selected Step Instance
• Variables of currently selected Step Instance
• Workflow Variables.
• Stored as XML file

Job Splitting
• The input value for the job splitting is a Python list object.
• Every single (top level) element of this list applies to the Workflow Definition and propagates through the code and generates a single element of the production (one or several jobs).
[Figure: a Python List feeds the Workflow Definition (a set of Steps), which generates the Production (a set of Jobs).]

Future: Production Console
• Once an agent has received a workflow, the Production Manager has no control over any function in a remote centre.
• Local Manager must perform all of the configurations and interventions at each individual site.
• Develop “Production Console” which will provide extensive control and monitoring functions for the Production Manager.
• Monitor and configure remote agents.
• Data replication control.
• Intrusive system – need to address Grid security mechanisms and provide robust environment.
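Returning to the Job Splitting slide, the idea of driving a production from a Python list can be sketched as follows. This is an illustration under stated assumptions, not the actual toolkit: the function name, the job identifier format, and the rule that a nested list yields several jobs are all hypothetical.

```python
from typing import Any, Dict, List

def split_production(workflow_name: str,
                     split_list: List[Any]) -> List[List[Dict[str, Any]]]:
    """Turn each top-level list element into one production element.

    Assumption for illustration: a nested list yields several jobs,
    any other element yields exactly one job.
    """
    production = []
    for index, element in enumerate(split_list):
        parameters = element if isinstance(element, list) else [element]
        jobs = [
            {
                # Hypothetical identifier: workflow, element index, job index.
                "job_id": f"{workflow_name}_{index:03d}_{n:02d}",
                "parameters": params,
            }
            for n, params in enumerate(parameters)
        ]
        production.append(jobs)
    return production

# One single-job element followed by one two-job element.
production = split_production(
    "demo_workflow",
    [{"seed": 1}, [{"seed": 2}, {"seed": 3}]],
)
```

Only the top-level elements decide how many production elements exist, so the list itself is the whole splitting specification.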
DIRAC v1.0 Architecture
[Figure: architecture diagram with four regions.
Production preparation: Application packager (create application tar file), Workflow editor (edit), Production editor (instantiate workflow), operated by the Production manager.
Central Services: Production Service and Production DB, Monitoring Service and Monitoring DB, Bookkeeping Service and Bookkeeping DB.
Production resources: Site A (Agent A), Site B (Agent B), …, Site n (Agent n).
Central Storage: Castor MSS at CERN.
Flows shown: Job XML, Job request, Job status, Meta XML, Dataset replica.]

DIRAC v2.0 WMS Architecture
• Based on central queue service
• Also data stored remotely
[Figure: job sources (Production service, GANGA, Command line UI) feed DIRAC Workload Management (Job Receiver, Optimizers, Job queue, Job DB, Match Maker). Computing resources: Agent 1 (LCG CE), Agent 2 (LCG WMS), Agent 3 (DIRAC CE).]

Data Management Status
Carmine Cioffi
• File catalogue browser for POOL
• Integration of POOL persistency framework into GAUDI: new EventSelector interface.
SUCCESS!

Main Panel, LFN Mode Browsing
• POOL file catalogue provides LFN & PFN association.
[Screenshot callouts:]
• Write mode selection
• Import the fragment of a catalog
• Shows the metadata schema, with the possibility to change it
• Reload the catalog
• Read the next and previous bunch of files from the catalog
• List the files selected
• List all the metadata values of the catalog
• Filter text bar. Search text bar
• Tabs for LFN / PFN mode selection
• List of LFNs
• List of PFNs associated to the LFN selected from the list of LFNs on the left sub-panel
• Browser allows user to interact with catalogue via GUI.
• Can save list of LFNs for job sandbox

Main Panel, PFN Mode Browsing
• In PFN mode, the files are browsed in the same way as Windows Explorer.
• The folders are shown on the left sub-panel and the value of the folder on the right sub-panel.
• Sub menu with three operations to be done on the file selected.
• Write mode button opens WrFCBrowser frame allowing user to write to the catalogue…

Write Mode Panel
[Screenshot callouts:]
• Add LFN
• Remove a LFN
• Register a PFN
• Add a PFN replica
• Delete a PFN
• Add metadata value
• Commit
• Rollback
• Show the action performed

[Screenshot callouts:]
• PFN register frame
• Frame to show and change the metadata schema of the catalog
• This frame allows setting of the metadata value

[Screenshot callouts:]
• This frame shows the metadata value of the PFN Myfile
• This frame shows the attribute value of the PFN
• Shows the list of the files selected

GAUDI/POOL Integration
• Benefit from investment in LCG
• Retire parts of Gaudi to reduce maintenance.
• Designed and implemented a new interface for the LHCb EventSelector.
Criteria:
• One or more “datasets” (e.g. list of runs, list of files matching given criteria).
• One or more “EventTagCollections” with extra selection based on Tag values.
• One or more physical files.
Result of an event selection is a virtual list of event pointers.

Physicist’s View of Event Data
[Figure: Gaudi and Bookkeeping views of event data. Datasets of event files (File1, …) containing Event 1, Event 2, Event 3, … Event N, with example names RAW2-1/1/2008, RAW3-22/9/2007, RAW4-2/2/2008, …. A Collection Set with B -> ππ Candidates (Phy), B -> J/Ψ (μ+ μ-) Candidates, …. An event tag collection listing Event 1, Event 2, Event 3, … against Tag 1, Tag 2, … Tag M, with sample tag values 0.3, 1.2, 8, 3.1, 5, 2.]

Future: Data to Metadata
• File catalogue holds only a minimal amount of metadata.
• LHCb deploys a separate “bookkeeping” database service to store the metadata for datasets and event collections.
• Corresponds to ARDA Job Provenance DB and Metadata Catalogue.
• Based on central ORACLE server at CERN with query service through XML-RPC interface.
• Not scalable, particularly for Grid, and completely new metadata solution required.
• ARDA based system will be investigated.
• Vital that this development is optimised for LHCb and synchronised with data challenges.

Metadata: Data Production
[Figure: information flow. Prod.Mgr builds a new configuration or selects defaults; the Configuration produces Job.xml for Data Production; Production Jobs run; when production is done, Bookkeeping and the File Catalogue are updated.]

Metadata: Data Analysis
[Figure: information flow. The User picks up the default configuration, modifies defaults and selects input data using Bookkeeping; the Configuration produces Job.opts; the User Job runs through DIRAC and uses the File Catalogue.]

GANGA Status
Alexander Soroko, Karl Harrison + Alvin Tan, Janusz Martyniak
[Experiment labels on slide: LHCb, ATLAS, BaBar]
• User Grid Interface.
• First prototype released in April 2003.
• To be deployed for LHCb 2004 Data Challenge.
SUCCESS!

GANGA for LHCb
GANGA will allow LHCb user to perform standard analysis tasks:
• Data queries.
• Configuration of jobs, defining the job splitting/merging strategy.
• Submitting jobs to the chosen Grid resources.
• Following the progress of jobs.
• Retrieval of job output.
• Job bookkeeping.
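The analysis tasks listed above (submit, follow progress, retrieve output, keep a record of jobs) can be pictured as a small job lifecycle. The sketch below is hypothetical: the state names, the `Job` and `JobRegistry` classes, the output file name and the polling stand-in are assumptions made for illustration and are not taken from GANGA.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical illustration only; none of these names come from GANGA itself.
STATES = ["new", "submitted", "running", "completed"]

@dataclass
class Job:
    name: str
    options: Dict[str, str]
    status: str = "new"
    output: Optional[str] = None

class JobRegistry:
    """Keeps a record of jobs so their progress can be followed."""

    def __init__(self) -> None:
        self.jobs: List[Job] = []

    def submit(self, job: Job) -> Job:
        job.status = "submitted"
        self.jobs.append(job)
        return job

    def poll(self, job: Job) -> str:
        """Stand-in for querying a Grid or batch system.

        Assumption: each call simply advances the job by one state.
        """
        position = STATES.index(job.status)
        if position < len(STATES) - 1:
            job.status = STATES[position + 1]
        if job.status == "completed":
            # Placeholder output name standing in for retrieved job output.
            job.output = f"{job.name}.root"
        return job.status

    def summary(self) -> Dict[str, str]:
        return {job.name: job.status for job in self.jobs}

registry = JobRegistry()
job = registry.submit(Job("analysis_001", {"EvtMax": "1000"}))
while registry.poll(job) != "completed":
    pass
```

A real interface would replace `poll` with a query to the chosen Grid or batch backend; the registry idea stays the same.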
GANGA User Interface
[Figure: job flow between a Local Client and a Grid/Batch System.
Local Client: Job Factory (Job Registry Class) creating Ganga Job objects; Job Options Editor with Database of Standard Job Options; Data Selection (Input/Output Files); Strategy Selection with Strategy Database (Splitting scripts); Job Requirements (LSF Resources, etc).
Grid/Batch System: Gatekeeper, Worker nodes, Storage Element, File Transfer.
Flows shown: Submit job (JDL file, Job Options file, Job script), Send/Get Monitoring Info, Send job output, Get job output.]

Software Bus
• User has access to functionality of Ganga components through GUI and CLI, layered one over the other above a Software Bus.
• Software Bus itself is a Ganga component implemented in Python.
Components used by Ganga fall into 3 categories:
• Ganga components of general applicability or Core Components (to right in diagram)
• Ganga components providing specialised functionality (to left in diagram)
• External components (at bottom in diagram)
[Figure: GUI and CLI above the Software Bus. Specialised components: Gaudi/Athena Job Definition, Gaudi/Athena Job Options Editor, BaBar Job Definition and Splitting. Core components: Job Definition, Job Registry, Job Handling, File Transfer. External components: Python Native, Gaudi Python, Python Root, PyCMT, PyMagda, PyAMI.]

GUIs Galore
[Screenshots only]

DIRAC WMS Architecture
[Figure: job sources (Production service, GANGA, Command line UI) feed DIRAC Workload Management (Job Receiver, Optimizers, Job queue, Job DB, Match Maker). Computing resources: Agent 1 (LCG CE), Agent 2 (LCG WMS), Agent 3 (DIRAC CE).]

Future Plans
Refactorisation of Ganga, with submission on remote client
[Figure labels, upper part of diagram: Software Cache, Component Cache, Software/Component Server, Remote Client, Execution node, Grid/Batch-System, Agent (Runs/Validates Job), Remote-Client Scheduler, Scheduler, Local Client, JDL, Classads]
Motivation:
• Ease integration of external components
• Facilitate multiperson, distributed development
• Increase customizability/flexibility
• Permit GANGA components to be used externally more simply
[Figure labels, lower part of diagram: Scheduler Proxy, Dispatcher, Scheduler Service; schedulers NorduGrid, LSF, Local, PBS, DIAL, EDG, DIRAC, USG, Other; Job Requirements (LSF Resources, etc), Derived Requirements, User Requirements, Database of Job Requirements; Job Collection (XML Description); Job Factory (Machinery for Generating XML Descriptions of Multiple Jobs); Job-Options Template, Job-Options Editor, Job-Options Knowledge Base, Database of Standard Job Options; Dataset, Dataset Selection, Dataset Catalogue; Strategy Selection, Strategy Database (Splitter Algorithms)]
2nd GANGA prototype ~ April 2004

Future: GANGA
• Develop into generic front-end capable of submitting a range of applications to the Grid.
• Requires central core and modular structure (started with version 2 re-factorisation) to allow new frameworks to be plugged in.
• Enable GANGA to be used in complex analysis environment over many years for many users.
• Hierarchical structure, import/export facility, schema evolution, etc.
• Interact with multiple Grids (e.g. LCG, NorduGrid, EGEE…).
• Needs to keep pace with development of Grid services.
• Synchronise with ARDA developments.
• Interactive analysis? ROOT, PROOF
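One way to picture a central core with pluggable frameworks is a small registry that the core consults without knowing any framework in advance. This is a sketch under assumptions, not GANGA's design: the registry, the decorator, the handler functions and the placeholder command names are all hypothetical.

```python
from typing import Callable, Dict, List

# Hypothetical sketch: the registry and handler names are illustrative only.
_HANDLERS: Dict[str, Callable[[dict], List[str]]] = {}

def register(framework: str):
    """Decorator that plugs a framework-specific handler into the core."""
    def decorator(func: Callable[[dict], List[str]]):
        _HANDLERS[framework] = func
        return func
    return decorator

@register("gaudi")
def gaudi_command(job: dict) -> List[str]:
    # Placeholder command name, not a real executable.
    return ["run-gaudi-placeholder", job["options_file"]]

@register("athena")
def athena_command(job: dict) -> List[str]:
    # Placeholder command name, not a real executable.
    return ["run-athena-placeholder", job["options_file"]]

def build_command(framework: str, job: dict) -> List[str]:
    """Core logic: look up the plugged-in handler and delegate to it."""
    try:
        handler = _HANDLERS[framework]
    except KeyError:
        raise ValueError(f"no handler registered for {framework!r}") from None
    return handler(job)

command = build_command("gaudi", {"options_file": "analysis.opts"})
```

Adding a new framework then means registering one more handler, with no change to `build_command`.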