LCG Distributed Database Deployment for the LHC Experiments

Stefan Stonjek (Oxford), Dirk Düllmann (CERN), Gordon Brown (RAL), on behalf of the LCG 3D project
02-Nov-2005, Edinburgh
Spatiotemporal Databases for Geosciences, Biomedical Sciences and Physical Sciences

Outline
• Introduction
• LHC computing and databases
• The distributed database deployment (3D) project
• Service architecture and implementations
• Summary

Large Hadron Collider Grid Computing
• Four elementary particle physics experiments at CERN in Geneva
 – ATLAS, CMS, ALICE, LHCb
 – LCG = LHC Computing Grid
• ~15 petabytes per year
 – mostly in files
 – slowly changing data in databases, e.g. file catalogue, geometry and conditions
• Decentralized computing (multi-tier model)
 – CERN = Tier-0; 11 large Tier-1 centres; ~100 Tier-2 centres
 – files are copied between sites
 – databases need replication, at least partly

Multi-Tier Computing for LHC
[Diagram: CERN (Tier-0) connected by dedicated 10 Gbit links to the Tier-1 centres (ASCC, Brookhaven, CNAF, Fermilab, GridKa, IN2P3, Nordic, PIC, RAL, SARA, TRIUMF), each serving several Tier-2 sites. Tier-2s and Tier-1s are inter-connected by the general-purpose research networks; any Tier-2 may access data at any Tier-1.]

Client access patterns
• Main applications: reconstruction, simulation
• Access patterns
 – Read from file: experiment data, physics model
 – Read from database: file catalogue, geometry, conditions
 – Write to file: reconstructed data, simulated data
 – Write to database: file catalogue
• Some minor applications (calibration) write to the conditions database
• Geometry, conditions: high volume; file catalogue: low volume
• No need for instantaneous replication
• Model for the conditions database: write only at Tier-0, replicate to the Tier-1s and from there to the Tier-2s
• Geometry changes less often and can be deployed via files
• The file catalogue is a more localized issue, not covered in this talk
 – it is local for two experiments but distributed for the other two
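The write-once-at-Tier-0, read-everywhere model above is easiest to see in the interval-of-validity lookups that conditions data typically uses. The following is a minimal sketch, not any experiment's actual schema: the table and column names (tag, channel, since, until) are invented for illustration, and SQLite stands in for whichever backend a tier actually runs.

```python
import sqlite3

# Invented schema: one payload per (tag, channel), valid for a half-open
# interval of run numbers [since, until).
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE conditions (
    tag     TEXT,
    channel INTEGER,
    since   INTEGER,
    until   INTEGER,
    payload BLOB)""")

# In the model above, inserts happen only at Tier-0; every replica is
# read-only, which is what makes asynchronous replication sufficient.
conn.execute("INSERT INTO conditions VALUES (?, ?, ?, ?, ?)",
             ("calib-v1", 42, 1000, 2000, b"\x01\x02"))
conn.commit()

def lookup(tag, channel, run):
    """The read-only interval-of-validity query a reconstruction job issues."""
    row = conn.execute(
        "SELECT payload FROM conditions"
        " WHERE tag = ? AND channel = ? AND since <= ? AND ? < until",
        (tag, channel, run, run)).fetchone()
    return row[0] if row else None

print(lookup("calib-v1", 42, 1500))  # -> b'\x01\x02'
```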
Why an LCG Database Deployment Project?
• LCG today provides an infrastructure for distributed access to file-based data and for file replication
• Physics applications (and grid services) require similar services for data stored in relational databases
 – Several applications and services already use an RDBMS
 – Several sites already have experience in providing RDBMS services
• Goals for a common project as part of LCG
 – increase the availability and scalability of LCG and experiment components
 – allow applications to access data in a consistent, location-independent way
 – allow existing database services to be connected via data replication mechanisms
 – simplify shared deployment and administration of this infrastructure during 24x7 operation
• Need to bring service providers (site technology experts) closer to database users/developers to define an LCG database service
 – Time frame: first deployment in the 2005 data challenges (autumn '05)

Project Non-Goals
• Store all database data
 – Experiments are free to deploy databases and distribute data under their own responsibility
• Set up a single monolithic distributed database system
 – Given constraints such as WAN connections, one cannot assume that a single synchronously updated database would work and provide sufficient availability
• Set up a single-vendor system
 – Technology independence and a multi-vendor implementation will be required to minimize the long-term risks and to adapt to the different requirements/constraints on the different tiers
• Impose a CERN-centric infrastructure on participating sites
 – CERN is an equal partner of the other LCG sites on each tier
• Decide on an architecture, implementation, new services or policies
 – Instead, produce a technical proposal on all of these to the LCG PEB/GDB

Current Database Services at LCG Sites
• Several sites provide Oracle production services for HEP and non-HEP applications
 – Deployment experience and procedures exist…
 – …but cannot be changed easily without affecting other site activities
• MySQL is very popular in the developer community
 – Expected to be deployable with limited database administration resources
 – Still limited large-scale production experience with MySQL
  • But several applications are bound to MySQL
• Expect a significant role for both database flavors
 – to implement different parts of the LCG infrastructure

Application software stack and Distribution Options
[Diagram: the application (APP) talks to client software through a relational abstraction layer (RAL), which can be backed by an SQLite file, Oracle or MySQL; web caches sit on the network path between the client software and the database and cache servers. A sketch of the RAL idea follows the distribution-technologies slide below.]

Tiers, Resources and Level of Service
• Different requirements and service capabilities for the different tiers
 – Tier-1 database backbone
  • High volume, often complete replication of the RDBMS data
  • Can expect a good network connection to the other Tier-1 sites
  • Asynchronous, possibly multi-master replication
  • Large-scale central database service, local DBA team
 – Tier-2
  • Medium volume, often only a sliced extraction of the data
  • Asymmetric, possibly only uni-directional replication
  • Part-time administration (shared with fabric administration)
 – Tier-3/4 (e.g. laptop extraction)
  • Support fully disconnected operation
  • Low volume, sliced extraction from Tier-1/Tier-2
• Need to deploy several replication/distribution technologies
 – each addressing specific parts of the distribution problem
 – but all together forming a consistent distribution model

Possible distribution technologies
• Vendor-native distribution: Oracle replication and related technologies
 – Table-to-table replication via asynchronous update streams
 – Transportable tablespaces
 – Little (but non-zero) impact on application design
 – Potentially extensible to other back-end databases through an API
 – Evaluations done at FNAL and CERN
• Combination of HTTP-based database access with web proxy caches close to the client (see the sketch below)
 – Performance gains
  • reduced real database access for largely read-only data
  • reduced transfer overhead compared to low-level SOAP RPC based approaches
 – Deployment gains
  • web caches (e.g. Squid) are much simpler to deploy than databases and could remove the need for a local database deployment on some tiers
  • no vendor-specific database libraries on the client side
  • "firewall friendly" tunneling of requests through a single port
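To make the HTTP option concrete, here is a minimal sketch of a client-side lookup through a local proxy cache. The endpoint URL, the q parameter and the Squid port are all assumptions for illustration; the point is only that the complete query lives in the URL, so identical read-only requests can be served from the cache through a single port, with no vendor database libraries on the client.

```python
import urllib.parse
import urllib.request

# Assumed for illustration: a read-only query front-end at SERVER,
# reached through a local Squid proxy on its default port 3128.
SERVER = "http://dbfrontend.example.org:8000/query"
PROXY = {"http": "http://localhost:3128"}

def fetch(query):
    # The whole query is encoded in the URL, so identical requests from
    # many jobs at a site are answered by the proxy cache instead of
    # reaching the real database.
    url = SERVER + "?" + urllib.parse.urlencode({"q": query})
    opener = urllib.request.build_opener(urllib.request.ProxyHandler(PROXY))
    with opener.open(url) as resp:
        return resp.read()

if __name__ == "__main__":
    # Largely read-only data: a second identical call should be a cache hit.
    payload = fetch("SELECT payload FROM conditions WHERE tag='calib-v1'")
```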
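Both routes sit behind the relational abstraction layer (RAL) from the software-stack slide earlier, so application code stays independent of the backend. A minimal sketch of that idea, assuming a made-up connection-string scheme and showing only an SQLite backend (an Oracle or MySQL backend would wrap its vendor library behind the same two methods):

```python
import sqlite3

class SQLiteBackend:
    """One concrete backend behind the abstraction layer."""
    def __init__(self, path):
        self._conn = sqlite3.connect(path)

    def execute(self, sql, params=()):
        self._conn.execute(sql, params)
        self._conn.commit()

    def query(self, sql, params=()):
        return self._conn.execute(sql, params).fetchall()

def connect(url):
    # The application sees only this function and the two methods above;
    # swapping SQLite for Oracle becomes a change of connection string.
    scheme, _, rest = url.partition(":")
    if scheme == "sqlite":
        return SQLiteBackend(rest)
    raise NotImplementedError("backend %r not sketched here" % scheme)

db = connect("sqlite::memory:")
db.execute("CREATE TABLE filecatalog (lfn TEXT, pfn TEXT)")
db.execute("INSERT INTO filecatalog VALUES (?, ?)",
           ("/lhc/run1/evt.root", "srm://ral.ac.uk/lhc/run1/evt.root"))
print(db.query("SELECT pfn FROM filecatalog WHERE lfn = ?",
               ("/lhc/run1/evt.root",)))
```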
Possible Service Architecture
[Diagram: Tier-0 runs an autonomous, reliable Oracle service; the Tier-1 database backbone holds all data, replicated via Oracle Streams, as a reliable service; Tier-2s hold a local database cache (Oracle or MySQL, fed by cross-vendor extraction); Tier-3/4 hold a subset of the data as a purely local service (MySQL, files or proxy cache).]

Oracle Real Application Clusters (RAC)
• Several sites are testing/deploying Oracle clusters
 – CERN, CNAF, FNAL, GridKa, IN2P3, RAL, BNL
• Several experiments foresee Oracle services at their pit
• Propose to organize meetings for more detailed DBA discussions
 – Tier-1 DB teams and online DB teams
 – coordinate test plans
 – RAC setup and existing test results
 – storage configuration and performance tests

Proposed Tier-1 Architecture
• Experiment plans are firming up on the application list
 – DB volume and access predictions are still rather uncertain
• Several sites are looking at database clusters
• Propose to buy modular hardware which can at least be turned into clusters later, e.g.
 – shareable (SAN) storage (FC-based disk arrays)
 – dual-CPU boxes with 2 GB RAM and mirrored disks for the system and database software

Proposed Tier-1 DB Server Setup
• Propose to set up (preferably as a cluster)
 – 2 nodes per supported experiment (ATLAS, LHCb)
 – 2 nodes for the LCG middleware services (LFC, FTS, VOMS)
  • as Oracle server nodes if the site provides an Oracle service
  • as specified by GD for the MySQL-based implementation
  • note: LFC may come with an Oracle constraint for LHCb replication
 – a Squid node (CMS; specs to be clarified)
• Target release: Oracle 10gR2 (first patch set)
 – Sites should consider Red Hat ES 3.0 to ensure Oracle support

Service aspects discussed in 3D
• DB service discovery
 – How does a job find a nearby replica of the database it needs?
 – Need transparent (re)location of services, e.g. via a database replica catalog (see the sketch after this list)
• Connectivity, firewalls and connection constraints
• Access control: authentication and authorization
 – Integration between the DB vendor and LCG security models
• Installation and configuration
 – Database server and client installation kits
  • Which database client bindings are required (C, C++, Java (JDBC), Perl, …)?
 – Server and client version upgrades (e.g. security patches)
  • Are transparent upgrades required for critical services?
 – Server administration procedures and tools
  • Need basic agreements to simplify shared administration
  • Monitoring and statistics gathering
• Backup and recovery
 – Backup policy templates; responsible site(s) for a particular data type
 – Acceptable latency for recovery
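The replica catalog mentioned under service discovery could, in its simplest form, be a mapping from a logical database name to its physical replicas, consulted by a job before it opens a connection. A minimal sketch, with site names and connection strings invented for illustration:

```python
# Invented replica catalog: logical database name -> physical replicas.
REPLICAS = {
    "conditions/ATLAS": [
        ("CERN", "oracle://t0-db.cern.ch/conds"),
        ("RAL",  "oracle://t1-db.rl.ac.uk/conds"),
        ("FNAL", "mysql://t1-db.fnal.gov/conds"),
    ],
}

def find_replica(logical_name, local_site):
    """Return a connection string, preferring the replica at the job's site."""
    candidates = REPLICAS[logical_name]
    # Local replica first, then the others as fallbacks.
    ordered = sorted(candidates, key=lambda rep: rep[0] != local_site)
    for site, contact in ordered:
        # A real implementation would try to open the connection here and
        # fall through to the next replica on failure.
        return contact
    raise LookupError(logical_name)

print(find_replica("conditions/ATLAS", "RAL"))  # the RAL replica
print(find_replica("conditions/ATLAS", "PIC"))  # no local copy: first fallback
```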
Summary
• Together with the LHC experiments, LCG will define and deploy a distributed database service at the Tier-0 to Tier-2 sites
• The 3D project has, together with the experiments, defined a model for distributed database deployment
• The 3D project has started to test the deployment of the databases
• The hardware setup for deployment in 2006 is defined now; production is scheduled for March 2006