Distributed Database Deployment for the LHC Experiments

Stefan Stonjek (Oxford), Dirk Düllmann (CERN), Gordon Brown (RAL)
on behalf of the LCG 3D project
02-Nov-2005
Edinburgh
Spatiotemporal Databases for Geosciences, Biomedical Sciences and Physical Sciences
Outline
• Introduction
• LHC computing and databases
• The distributed database deployment (3D) project
• Service architecture and implementations
• Summary
Large Hadron Collider Grid Computing
• 4 elementary particle physics experiments at CERN in Geneva
– ATLAS, CMS, ALICE, LHCb
– LCG = LHC Computing Grid
• ~15 PetaBytes per year
– mostly in files
– slowly changing data in databases
• e.g. file catalogue, geometry and conditions
• Decentralized computing (multi-tier model)
– CERN = Tier-0; 11 large Tier-1 centres; ~100 Tier-2 centres
– files copied between sites
– databases need replication, at least in part
Multi Tier Computing for LHC
[Diagram: multi-tier computing model for the LHC. The Tier-1 centres (ASCC, Brookhaven, CNAF, Fermilab, GridKa, IN2P3, Nordic, PIC, RAL, SARA, TRIUMF) are connected by dedicated 10 Gbit links; Tier-2 sites and Tier-1s are inter-connected by the general-purpose research networks, and any Tier-2 may access data at any Tier-1.]
Client access patterns
• Main applications: Reconstruction, Simulation
• Access patterns (a sketch follows below)
– Read from file: experiment data, physics model
– Read from database: file catalogue, geometry, conditions
– Write to file: reconstructed data, simulated data
– Write to database: file catalogue
• Some minor applications (calibration) write to the conditions database
• Geometry, conditions: high volume; file catalogue: low volume
• No need for instantaneous replication
→ Model for the conditions database: write only at Tier-0 and replicate to Tier-1s and from there to Tier-2s
• Geometry changes less often and can be distributed as files
• The file catalogue is a more localized issue, not covered in this talk
– the file catalogue is local for two experiments but distributed for the other two
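As a rough illustration of this access pattern, the following sketch mimics a reconstruction job that makes read-only database queries for conditions and does its bulk I/O on files. The table layout, file names and values are invented for the example, and SQLite stands in for the Oracle/MySQL replica a real job would contact:

import sqlite3

# Stand-in conditions database; in the deployment described here this would
# be an Oracle or MySQL replica at a Tier-1/2 site. The schema is invented.
cond = sqlite3.connect("conditions.db")
cond.execute("CREATE TABLE IF NOT EXISTS conditions (run INTEGER, name TEXT, value REAL)")
cond.execute("INSERT INTO conditions VALUES (1234, 'toroid_current', 20500.0)")

def reconstruct(run, raw_file, output_file):
    """Toy reconstruction job: read-only DB access, bulk data kept in files."""
    # Read from database: conditions valid for this run.
    calib = dict(cond.execute(
        "SELECT name, value FROM conditions WHERE run = ?", (run,)).fetchall())
    # Read from file: raw events; write to file: 'reconstructed' events.
    with open(raw_file) as raw, open(output_file, "w") as out:
        for line in raw:
            out.write(f"{line.strip()} corrected_with={calib}\n")

with open("raw.dat", "w") as f:          # invented raw data file
    f.write("event 1\nevent 2\n")
reconstruct(1234, "raw.dat", "reco.dat")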
Why an LCG Database Deployment Project?
• LCG today provides an infrastructure for distributed access to file-based data and for file replication
• Physics applications (and grid services) require a similar service for data stored in relational databases
– Several applications and services already use an RDBMS
– Several sites already have experience in providing RDBMS services
• Goals for a common project as part of LCG
– increase the availability and scalability of LCG and experiment components
– allow applications to access data in a consistent, location-independent way
– allow existing database services to be connected via data replication mechanisms
– simplify shared deployment and administration of this infrastructure during 24x7 operation
• Need to bring the service providers (site technology experts) closer to the database users/developers to define an LCG database service
– Time frame: first deployment in the 2005 data challenges (autumn '05)
Project Non-Goals
• Store all database data
– Experiments are free to deploy databases and distribute data under their own responsibility
• Set up a single monolithic distributed database system
– Given constraints like WAN connections, one cannot assume that a single synchronously updated database would work and provide sufficient availability
• Set up a single-vendor system
– Technology independence and a multi-vendor implementation will be required to minimize the long-term risks and to adapt to the different requirements/constraints on the different tiers
• Impose a CERN-centric infrastructure on participating sites
– CERN is an equal partner among the LCG sites on each tier
• Decide on an architecture, implementation, new services or policies
– Instead, produce a technical proposal for all of these to the LCG PEB/GDB
Current Database Services at LCG Sites
• Several sites provide Oracle production services for HEP and non-HEP applications
– Deployment experience and procedures exist…
– … but cannot be changed easily without affecting other site activities
• MySQL is very popular in the developer community
– Expected to be deployable with limited database administration resources
– Still limited large-scale production experience with MySQL
• But several applications are bound to MySQL
• Expect a significant role for both database flavors
– To implement different parts of the LCG infrastructure
Application Software Stack and Distribution Options
[Diagram: application software stack. The application (APP) is built on client software using RAL (relational abstraction layer), which can read an SQLite file directly or talk over the network to Oracle and MySQL database servers, optionally through web proxy caches placed in front of the database and cache servers.]
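The following is a minimal sketch of the abstraction-layer idea in the diagram above: the application asks for a connection via a backend-neutral connection string and the layer dispatches to the matching driver. RAL itself is a C++ library; this Python illustration only wires up SQLite, and the connection-string format is invented for the example:

import sqlite3
from urllib.parse import urlparse

def connect(connection_string):
    """Open a database connection from a backend-neutral connection string."""
    url = urlparse(connection_string)
    if url.scheme == "sqlite":
        return sqlite3.connect(url.path or url.netloc)
    # Oracle/MySQL would plug in here behind the same interface,
    # using their respective client libraries.
    raise NotImplementedError(f"backend '{url.scheme}' not implemented in this sketch")

# The application only sees the logical connection string; switching between
# an SQLite file, Oracle or MySQL becomes a configuration change.
db = connect("sqlite:geometry.db")
db.execute("CREATE TABLE IF NOT EXISTS geometry (id INTEGER PRIMARY KEY, descr TEXT)")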
Tiers, Resources and Level of Service
• Different requirements and service capabilities for the different tiers
– Tier-1 database backbone
• High volume, often complete replication of RDBMS data
• Can expect good network connections to the other T1 sites
• Asynchronous, possibly multi-master replication
• Large-scale central database service, local DBA team
– Tier-2
• Medium volume, often only sliced extraction of data (see the sketch after this list)
• Asymmetric, possibly only uni-directional replication
• Part-time administration (shared with fabric administration)
– Tier-3/4 (e.g. laptop extraction)
• Support fully disconnected operation
• Low volume, sliced extraction from T1/T2
• Need to deploy several replication/distribution technologies
– Each addressing specific parts of the distribution problem
– But all together forming a consistent distribution model
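To make the "sliced extraction" idea for the lower tiers concrete, here is a small sketch that copies only the rows needed locally (here: a run range) into a file-based database that can then be used fully disconnected. Both ends are SQLite purely to keep the example self-contained; at a real site the source would be the Tier-1 backbone and the slicing criteria would come from the experiment:

import sqlite3

SCHEMA = "CREATE TABLE IF NOT EXISTS conditions (run INTEGER, name TEXT, value REAL)"

def extract_slice(master_db, local_file, run_min, run_max):
    """Copy only the conditions rows for the requested run range."""
    master = sqlite3.connect(master_db)
    local = sqlite3.connect(local_file)
    master.execute(SCHEMA)   # no-op if the master already has the table
    local.execute(SCHEMA)
    rows = master.execute(
        "SELECT run, name, value FROM conditions WHERE run BETWEEN ? AND ?",
        (run_min, run_max))
    local.executemany("INSERT INTO conditions VALUES (?, ?, ?)", rows)
    local.commit()   # the local file can now travel to a laptop / Tier-3

extract_slice("conditions.db", "laptop_slice.db", run_min=1000, run_max=1100)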
Possible distribution technologies
• Vendor-native distribution: Oracle replication and related technologies
– Table-to-table replication via asynchronous update streams
– Transportable tablespaces
– Little (but non-zero) impact on application design
– Potentially extensible to other back-end databases through an API
– Evaluations done at FNAL and CERN
• Combination of http-based database access with web proxy caches close to the client (see the sketch after this list)
– Performance gains
• reduced real database access for largely read-only data
• reduced transfer overhead compared to low-level SOAP RPC based approaches
– Deployment gains
• Web caches (e.g. squid) are much simpler to deploy than databases and could remove the need for a local database deployment on some tiers
• No vendor-specific database libraries on the client side
• “Firewall friendly” tunneling of requests through a single port
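A sketch of the second option from the client side: the job issues plain http requests for read-only data and a site-local squid cache answers repeated queries without contacting the database server. The host names, port and query syntax below are invented placeholders; only the proxy mechanics are standard:

import urllib.request

# Route requests through a site-local squid cache (example host name);
# repeated read-only queries are then served from the cache.
proxy = urllib.request.ProxyHandler({"http": "http://squid.mysite.example:3128"})
opener = urllib.request.build_opener(proxy)

# Hypothetical http front-end to the conditions database at a Tier-1.
url = "http://dbfrontend.tier1.example:8000/conditions?run=1234"
with opener.open(url, timeout=30) as response:
    payload = response.read()
print(len(payload), "bytes received")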
Possible Service Architecture
[Diagram: possible service architecture. The Tier-0 Oracle service (autonomous, reliable) feeds a Tier-1 database backbone of Oracle servers (all data replicated, reliable service) via Oracle Streams; Tier-2 sites hold a local database cache (MySQL, subset of the data, only local service) fed by cross-vendor extraction to MySQL files or by a proxy cache, and Tier-3/4 hold further subsets.]
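As a very small illustration, the same architecture can be written down as plain data, which can be handy for documenting or validating a deployment plan; the mechanism names follow the diagram, everything else is illustrative:

# Which mechanism feeds which tier in the model sketched above.
topology = {
    ("T0", "T1"):   "Oracle Streams",        # Oracle -> Oracle database backbone
    ("T1", "T2"):   "cross-vendor extract",  # Oracle -> MySQL files / proxy cache
    ("T2", "T3/4"): "sliced extraction",     # subset data, only local service
}

for (src, dst), mechanism in topology.items():
    print(f"{src} -> {dst}: {mechanism}")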
Oracle Real Application Clusters (RAC)
• Several sites are testing/deploying Oracle clusters
– CERN, CNAF, FNAL, GridKa, IN2P3, RAL, BNL
• Several experiments foresee Oracle services at their pit
• Propose to organize meetings for more detailed DBA discussions
– Tier-1 DB teams and online DB teams
– Coordinate test plans
– RAC setup and existing test results
– Storage configuration and performance tests
Proposed Tier 1 Architecture
• Experiment plans are firming up on the application list
– DB volume and access predictions are still rather uncertain
• Several sites are looking at database clusters
• Propose to buy modular hardware which can at least be turned into clusters later, e.g.
– Shareable (SAN) storage (FC-based disk arrays)
– Dual-CPU boxes with 2 GB of memory and mirrored disks for the system and database software
Proposed Tier 1 DB Server Setup
• Propose to set up (preferably as a cluster)
– 2 nodes per supported experiment (ATLAS, LHCb)
– 2 nodes for the LCG middleware services (LFC, FTS, VOMS)
• As Oracle server nodes if the site provides an Oracle service
• As specified by GD for the MySQL-based implementation
• Note: LFC may come with an Oracle constraint for LHCb replication
– Squid node (CMS; specs to be clarified)
• Target release: Oracle 10gR2 (first patch set)
– Sites should consider Red Hat ES 3.0 to ensure Oracle support
Service aspects discussed in 3D
• DB service discovery (see the sketch after this list)
– How does a job find a close-by replica of the database it needs?
– Need transparent (re)location of services, e.g. via a database replica catalog
• Connectivity, firewalls and connection constraints
• Access control: authentication and authorization
– Integration between the DB vendor and LCG security models
• Installation and configuration
– Database server and client installation kits
• Which database client bindings are required (C, C++, Java (JDBC), Perl, ...)?
– Server and client version upgrades (e.g. security patches)
• Are transparent upgrades required for critical services?
– Server administration procedures and tools
• Need basic agreements to simplify shared administration
• Monitoring and statistics gathering
• Backup and recovery
– Backup policy templates, responsible site(s) for a particular data type
– Acceptable latency for recovery
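For the service-discovery point above, a replica catalog can be as simple as a mapping from a logical database name to its replicas, with the client picking the closest one. The catalog entries, site names and contact strings below are invented for the sketch:

# Toy replica catalog: logical database name -> candidate replicas.
replica_catalog = {
    "conditions_db": [
        {"site": "RAL",  "contact": "oracle://t1-db.ral.example/conditions"},
        {"site": "CERN", "contact": "oracle://t0-db.cern.example/conditions"},
    ],
}

def find_replica(logical_name, preferred_sites):
    """Return the contact string of the closest replica, by site preference."""
    replicas = replica_catalog[logical_name]
    for site in preferred_sites:
        for replica in replicas:
            if replica["site"] == site:
                return replica["contact"]
    return replicas[0]["contact"]   # fall back to any known replica

# A job running at RAL prefers its local Tier-1 replica.
print(find_replica("conditions_db", preferred_sites=["RAL", "CERN"]))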
Summary
• Together with the LHC experiments, LCG will define and deploy a distributed database service at Tier 0-2 sites
• The 3D project has, together with the experiments, defined a model for distributed database deployment
• The 3D project has started to test the deployment of the databases
• The hardware setup for deployment in 2006 is defined now; production is scheduled for March 2006