INFN computing scenarios for GRID architecture
Padova, 14 February 2000
Document for discussion of GRID tools and services with Carl Kesselman
R. Cucchi, A. Ghiselli, L. Luminari, L. Perini, M. Mazzucato, M. Sgaravatto, C. Vistoli
1 Introduction
2 HEP computing
3 Computing requirements for LHC experiments
4 Approaching computing and data grid
   4.1 Layout Model
   4.2 GRID middleware required
5 Use case 1
6 Use case 2
   6.1 Testbed
1 Introduction
A computational grid is more than just a collection of resources: it is also a set of services for obtaining information about grid components, locating and scheduling resources, communicating, accessing code and data, measuring performance, authenticating users and resources, ensuring the privacy of communications, and so forth (from the GRID book).
The aim of this document is to summarize the most important characteristics of LHC computing (drawn from the MONARC documents) and its requirements, and to describe some use cases in order to plan a test program for GRID services and tools.
2 HEP computing
The LHC experiment collaborations have already investigated many aspects of LHC computing in frameworks such as RD45 and MONARC. Some well-established elements from this work are listed below:
• Data types:
  o RAW (taken at CERN): ~1 MB per event, 10^9 events per year per experiment (~1 PB).
  o ESD (Event Summary Data): refers to physics objects; by construction not larger than 100 KB.
  o AOD (Analysis Object Data): refers to objects which facilitate analysis; by construction not larger than 10 KB. Created by the collaboration analysis groups from the ESD.
  o Tag: very small objects (100 to 500 B) which identify an event by its physics signature.
  (A back-of-the-envelope check of the resulting yearly volumes is sketched at the end of this section.)
• Tasks of the offline software of each experiment:
  o Data reconstruction: from RAW to ESD
  o MC production
  o Offline calibration
  o Successive data reconstruction
  o Analysis
• Technical Services:
  o database maintenance (including backup, recovery, installation of new versions, monitoring and policing)
  o basic and experiment-specific sw maintenance (backup, updating, installation)
  o support for experiment-specific sw development
  o production of tools for data services
  o production and maintenance of documentation (including Web pages)
  o storage management (disks, tapes, distributed file systems if applicable)
  o CPU usage monitoring and policing
  o database access monitoring and policing
  o I/O usage monitoring and policing
  o network maintenance (as appropriate)
  o support of large bandwidth
• Current estimates of the capacity to be installed at CERN for a single LHC experiment by 2006 (see Robertson):
  o 520,000 SI95 of CPU, covering data recording, first-pass reconstruction, some reprocessing, basic analysis of the ESD and support for 4 analysis groups
  o about 1400 boxes to be managed
  o 540 TB of disk capacity
  o 3 PB of automated tape capacity
  o 46 GB/s LAN throughput
• One can assume that 10 to 100 TB of disk space is allocated to AOD/ESD/Tag data at the central site.
• Distributed, Hierarchical Regional Centers Architecture
The envisaged hierarchy of resources may be summarized in terms of tiers of Regional Centers (RC) with five decreasing levels of complexity and capability. A possible scheme is:
  o Tier-0: CERN, acting also as a Tier-1
  o Tier-1: large RC on national scale, expensive, multi-service
  o Tier-2: smaller RC, less expensive, mostly dedicated to analysis
  o Tier-3: institute workgroup servers, satellites of Tier-2 and/or Tier-1
  o Tier-4: individual desktops
• Data model:
  o Distributed ODBMS, presently based on Objectivity/DB.
  o Objectivity/DB characteristics (many of them to be kept in case other database software is adopted):
    - Objects are stored and managed via C++.
    - Distributed database architecture: federated database.
    - Application architecture: client/server, with local server, remote server and lock server (MROW: multiple readers, one writer); a federated database comprises several databases and servers and only one lock server (single partition).
    - FTO (fault tolerant option): allows one federated DB to be split into several partitions, each with a lock server.
    - DRO (database replication option): allows database replication and access to the nearest replica; it also allows parallel access to the data.
    - Data modeling: the SCHEMA is the same for application and database.
    - Locking granularity: federated database, database, container.
    - Server design: page read/write, object clustering, caching in the server.
    - Size of the federated database: 64-bit object references allow up to 10 million TB; the OID (Object Identifier) is unique within a federated database.
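As a back-of-the-envelope check of the figures quoted in this section, the sketch below (plain Python, not part of the original note) recomputes the yearly volume of each data type using only the per-event sizes and the 10^9 events/year rate stated above; the upper bounds given for ESD and AOD are taken as worst cases.

```python
# Rough yearly data volumes per experiment, derived only from the
# per-event sizes and the 10**9 events/year quoted in this section.
EVENTS_PER_YEAR = 10**9

sizes_bytes = {           # upper bounds where the text says "not larger than"
    "RAW": 1_000_000,     # ~1 MB per event
    "ESD": 100_000,       # <= 100 KB per event
    "AOD": 10_000,        # <= 10 KB per event
    "Tag": 500,           # 100-500 B per event (worst case)
}

for name, size in sizes_bytes.items():
    volume_tb = size * EVENTS_PER_YEAR / 1e12   # 1 TB = 10**12 bytes
    print(f"{name}: ~{volume_tb:,.1f} TB/year")

# RAW: ~1,000 TB/year (= 1 PB), consistent with the 1 PB figure above;
# ESD + AOD + Tag: <= ~110 TB/year, consistent with the assumption that
# 10 to 100 TB of disk is allocated to AOD/ESD/Tag data at the central site.
```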
3 Computing requirements for LHC experiments
There are several characteristics of experimental HEP code and applications that are important in designing computing facilities (based on GRID?) for HEP data processing:
• In general the computing problem consists of processing a very large number of independent transactions, which may therefore be processed in parallel; the granularity of the parallelism can be selected freely.
• Modest floating-point requirements: computational requirements are therefore expressed in SPECint (not SPECfp) units.
• Massive data storage: measured in petabytes (10^15 bytes) for each experiment.
• Read-mostly data, rarely modified, usually simply replaced completely when new versions are generated.
• High sustained throughput is more important than peak speed: the performance measure is the time it takes to complete processing for all of the independent transactions.
• Resilience of the overall (grid) system in the presence of sub-system failures is far more important than trying to ensure 100% availability of all sub-systems at all times.
Therefore HEP applications need High Throughput Computing rather than High Performance Computing; a minimal illustration of this point follows.
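The sketch below (not from the original document; the event-processing function and job sizes are placeholders) only illustrates the HTC framing: the transactions are independent, so they can be farmed out at any granularity, and the only figure of merit is the wall-clock time needed to finish all of them.

```python
# Minimal HTC illustration: N independent "events" are processed in
# parallel and the metric of interest is the time to finish them all,
# not the speed of any individual worker.
import time
from concurrent.futures import ProcessPoolExecutor


def process_event(event_id: int) -> int:
    """Placeholder for the reconstruction/analysis of one independent event."""
    # real code would read the event, run the algorithms and write the output
    return event_id % 7          # dummy result


if __name__ == "__main__":
    n_events = 10_000
    start = time.time()
    with ProcessPoolExecutor(max_workers=8) as pool:
        # the granularity is free: chunksize just groups events per dispatch
        results = list(pool.map(process_event, range(n_events), chunksize=100))
    elapsed = time.time() - start
    # high *sustained throughput* is what matters:
    print(f"{n_events} events in {elapsed:.1f} s "
          f"-> {n_events / elapsed:.0f} events/s overall")
```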
4 Approaching computing and data grid
Which aspects to consider first?
• Application programming: based on a wide spectrum of programming paradigms (Java/RMI, CORBA, GRID-oriented, ...), running in a multi-platform, heterogeneous computing environment (application-oriented middleware system).
• Computing resources distributed through the INFN sites:
  o Tier 1: 2 or more regional centers; all kinds of data, batch and interactive computing. They receive data from CERN at 100 Hz and replicate them. The RC could be distributed.
  o Tier 2/3: institute workgroup servers, mostly dedicated to analysis, satellites of Tier 1.
  o Tier 4: individual desktops.

4.1 Layout Model
The logical layout of the multi-tier client-server architecture for one LHC experiment is represented in the
following figure:
[Figure: logical layout of the multi-tier architecture – CERN (Tier 0), a Tier 1 data server and Tier 2/3 data servers, each with their client machines and desktops, interconnected over the WAN together with the INFN WAN Condor pool.]
In the above configuration example there are three RCs with data servers (or data movers) and computing farms (client/server model). Several client machines connect to the data servers through LAN and WAN links. (This will provide a direct comparison between LAN and WAN behavior and allow evaluating the impact of the network on application behavior and efficiency.) Users access the system through desktops or client machines. The INFN WAN Condor pool is connected to the grid system.
Data are distributed across all the RC data servers; users at their desktops run jobs, and the resource managers must locate clients and data in order to process the data in the most efficient way (a naive sketch of this matchmaking follows).
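Everything in the sketch below is hypothetical (the site names, the free_cpu_si95 and replicas structures, the data-movement penalty): it only illustrates the idea that a scheduler should prefer a site that already holds a replica of the requested data and has spare CPU, and otherwise accept the cost of moving the data.

```python
# Naive matchmaking sketch: pick the site where a job on `dataset` should
# run, preferring sites that already hold a replica and have free CPU.
# All names and numbers are illustrative, not part of the document.

SITES = {                        # hypothetical snapshot of the grid state
    "CERN":     {"free_cpu_si95": 5000, "replicas": {"ESD-run1", "AOD-run1"}},
    "Tier1-IT": {"free_cpu_si95": 2000, "replicas": {"AOD-run1"}},
    "Tier2-PD": {"free_cpu_si95":  800, "replicas": set()},
}

DATA_MOVE_PENALTY = 1000         # arbitrary cost (in "SI95 equivalents") of staging data


def choose_site(dataset: str) -> str:
    """Return the site with the best score for running a job on `dataset`."""
    def score(site: str) -> float:
        info = SITES[site]
        penalty = 0 if dataset in info["replicas"] else DATA_MOVE_PENALTY
        return info["free_cpu_si95"] - penalty
    return max(SITES, key=score)


if __name__ == "__main__":
    print(choose_site("AOD-run1"))   # -> "CERN" (replica present, most free CPU)
    print(choose_site("ESD-run2"))   # -> "CERN" (no replica anywhere; most CPU wins)
```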
4.2 GRID middleware required
The INFN GRID working group (with physicists and IT experts) defined the following requirements:
• Wide-area workload management
  o Optimal co-allocation of data, CPU and network for specific grid/network aware jobs
  o Distributed scheduling (data and/or code migration)
  o Unscheduled/scheduled job submission
  o Management of heterogeneous computing systems
  o Uniform interface to various local resource managers and schedulers
  o Priorities and policies on resource (CPU, data, network) usage
  o Bookkeeping and ‘web’ user interface
• Wide-area data management
  o Universal name-space: transparent and location independent
  o Data replication and caching
  o Data mover (scheduled/interactive, at object/file/DB granularity)
  o Loose synchronization between replicas
  o Application metadata, interfaced with the DBMS (e.g. Objectivity, ...)
  o Network services definition for a given application
  o End-system network protocol tuning
• Wide-area application monitoring
  o Performance: “instrumented systems” with timing information and analysis tools
  o Run-time analysis of collected application events
  o Bottleneck analysis
  o Dynamic monitoring of GRID resources to optimize resource allocation
  o Failure management
• Computing fabric and general utilities for a globally managed Grid:
  o Configuration management of computing facilities
  o Automatic software installation and maintenance
  o System, service and network monitoring, global alarm notifications, automatic recovery from failures
  o Resource use accounting
  o Security of GRID resources and infrastructure usage
  o Information service

5 Use case 1
Five Globus machines, geographically distributed and configured to run High Level Trigger simulation programs. These jobs run on single machines with local disk I/O. The aim is to optimize the CPU usage of all 5 machines; a possible scheduling sketch follows.
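A possible form of that scheduling, sketched under assumptions: the five host names and the path of the simulation binary are placeholders, and the Globus job-submission client globus-job-run is assumed to be installed (any equivalent submission command would do). The driver simply keeps each machine busy with one HLT simulation job at a time.

```python
# Sketch of a driver that keeps 5 geographically distributed Globus
# machines busy with High Level Trigger simulation jobs (local disk I/O).
# Host names, binary path and the use of globus-job-run are assumptions.
import queue
import subprocess
import threading

HOSTS = ["grid01.example.infn.it", "grid02.example.infn.it",
         "grid03.example.infn.it", "grid04.example.infn.it",
         "grid05.example.infn.it"]                 # the 5 Globus machines
HLT_BINARY = "/opt/hlt/bin/hlt_sim"                # hypothetical simulation program
N_JOBS = 50                                        # total simulation jobs to run

jobs = queue.Queue()
for job_id in range(N_JOBS):
    jobs.put(job_id)


def worker(host):
    # each machine pulls the next simulation job as soon as it is free,
    # so no host sits idle while work remains (the CPU-usage goal above)
    while True:
        try:
            job_id = jobs.get_nowait()
        except queue.Empty:
            return
        cmd = ["globus-job-run", host, HLT_BINARY, str(job_id)]
        subprocess.run(cmd)


threads = [threading.Thread(target=worker, args=(h,)) for h in HOSTS]
for t in threads:
    t.start()
for t in threads:
    t.join()
```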
6 Use case 2
This use case describes the WAN MONARC testbed with Objectivity 5.2 in a multi-server configuration:
• The Atlfast++ program is used to populate the database, following the Tag/Event data model proposed by the LHC++ project to read data from the database.
• 3 AMS servers.
• A single-federation Objectivity database containing about 50,000 events (~2 GB).
• The application program performs read/write access to the database.
The procedure followed to perform these tests consists of submitting an increasing number of concurrent jobs from each client and then monitoring CPU utilization, network throughput and job execution time (wall-clock time). The idea is to use Globus and a resource manager to optimize client CPU usage ….. A possible driver for this ramp-up procedure is sketched below.
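In the sketch below the client-side command that runs the Objectivity read/write application (atlfast_read) and the list of client hosts are placeholders, and CPU/network monitoring is assumed to be collected separately by the usual system tools; the driver only ramps up the number of concurrent jobs per client and records the wall-clock time of each step.

```python
# Sketch of the use-case-2 test driver: submit an increasing number of
# concurrent jobs from each client and record wall-clock times.
# The application command and the client host list are placeholders;
# CPU and network monitoring are assumed to be gathered independently.
import subprocess
import time

CLIENTS = ["client1.example.infn.it", "client2.example.infn.it"]  # hypothetical
APP_CMD = "/opt/tests/atlfast_read"     # hypothetical Objectivity read/write job

for n_concurrent in (1, 2, 4, 8, 16):   # ramp up the load step by step
    start = time.time()
    procs = []
    for client in CLIENTS:
        for _ in range(n_concurrent):
            # launch one job on `client` via ssh
            procs.append(subprocess.Popen(["ssh", client, APP_CMD]))
    for p in procs:                      # wait for the whole step to finish
        p.wait()
    wall = time.time() - start
    print(f"{n_concurrent} jobs/client on {len(CLIENTS)} clients: "
          f"wall-clock time {wall:.1f} s")
```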
[Figure: QoS flows between the client machines and the three data servers.]
The network layout will also be configured with QoS mechanisms based on Differentiated Services (DS), allowing traffic flows with different priorities. The aim is to perform a careful evaluation of TCP performance and application performance and to draw conclusions about how to configure DS to provide a premium service in this scenario.
Together with the DS mechanisms, GARA should be used to deliver per-flow, advance-reservation, end-to-end Quality of Service.
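How a traffic source could be placed in the premium class can be hinted at with a minimal sketch (not part of the document): an end host marks its packets with the Expedited Forwarding DSCP and relies on the Differentiated Services configuration of the network (or on a GARA reservation) to honour that marking. The DSCP value for EF (46, i.e. TOS byte 0xB8) is standard; the server address and port are placeholders.

```python
# Minimal sketch: mark a TCP flow with the Expedited Forwarding DSCP so
# that a DS-enabled network can treat it as premium traffic.
# The destination address is a placeholder; the marking only has effect
# if the routers' Differentiated Services configuration honours it.
import socket

EF_TOS = 0xB8            # DSCP 46 (Expedited Forwarding) shifted into the TOS byte

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, EF_TOS)   # set the DS field
sock.connect(("dataserver.example.infn.it", 5000))          # hypothetical data port
sock.sendall(b"premium-marked test traffic")
sock.close()
```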
Then data grid:…….
6.1 Testbed
[Figure: start-up testbed layout – data servers in Milano, at CNAF and at CERN; client sites in Genova, Padova, Roma and Bologna connected at 10 Mbps over GARR-B and TEN-155 (LAN or international links towards CERN).]
There will be 3 data servers: one in Milano, one at CNAF and one at CERN. These servers will be interconnected with dedicated links at 10 Mbps. Different client sites will be linked to each of these servers at 10 Mbps: Genova will connect to Milano; Padova, Roma and Bologna will connect to CNAF.