
Globus Data and Replica Management

Ann Chervenak

USC Information Sciences Institute

Talk Outline

Brief Introduction to Globus Toolkit

Globus Tools for Data Management

The Replica Location Service (RLS)

Examples of production use of RLS

Higher-level data management services

The Data Replication Service (DRS)

Summary

The Application-Infrastructure Gap

Dynamic and/or Distributed Applications

Shared Distributed Infrastructure


Bridging the Gap: Service-Oriented Infrastructure

Service-oriented applications

Wrap applications as services

Compose applications into workflows

Service-oriented infrastructure

Provision physical resources to support application workloads

[Figure: users invoke service-oriented applications composed into workflows from application services, running on provisioned service-oriented infrastructure]

Globus is Service-Oriented Infrastructure Technology

Software for service-oriented infrastructure

Service-enable new & existing resources

Uniform abstractions & mechanisms

Tools to build applications that exploit service-oriented infrastructure

Registries, security, data management, …

Open source & open standards

Each empowers the other

Enabler of a rich tool & service ecosystem

Globus Toolkit

Core Web services

Infrastructure for building new services

Security

Apply uniform policy across distinct systems

Execution management

Provision, deploy, & manage services

Data management

Discover, transfer, & access large data

Monitoring

Discover & monitor dynamic services

Globus Tools and Services for Data Management

GridFTP

A secure, robust, efficient data transfer protocol

The Reliable File Transfer Service (RFT)

Web services-based, stores state about transfers

The Data Access and Integration Service (DAIS)

Service providing access to data resources, particularly relational and XML databases

The Replica Location Service (RLS)

Distributed registry that records locations of data copies

The Data Replication Service

Web services-based, combines data replication and registration functionality

Replica Management in Grids

Data-intensive applications produce terabytes or petabytes of data

Hundreds of millions of data objects

Replicate data at multiple locations for reasons of:

Fault tolerance

Avoid single points of failure

Performance

Avoid wide area data transfer latencies

Achieve load balancing

A Replica Location Service

A Replica Location Service (RLS) is a distributed registry that records the locations of data copies and allows replica discovery

RLS maintains mappings between logical identifiers and target names

Must perform and scale well: support hundreds of millions of objects, hundreds of clients

E.g., LIGO (Laser Interferometer Gravitational Wave Observatory) Project

RLS servers at 10 sites

Maintain associations between 6 million logical file names & 40 million physical file locations
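To make the logical-to-target mapping concrete, here is a minimal Python sketch of the catalog semantics. It is illustrative only: the class name, methods, and URLs are invented for this example and are not the real RLS client API.

# Minimal sketch (not the real RLS API) of the mapping an LRC maintains:
# each logical name can map to any number of target (e.g., physical) names.

from collections import defaultdict

class LocalReplicaCatalog:
    def __init__(self):
        self._mappings = defaultdict(set)  # logical name -> set of target names

    def add(self, logical, target):
        """Register a new <logical name, target> mapping."""
        self._mappings[logical].add(target)

    def query(self, logical):
        """Return all registered targets of a logical name."""
        return set(self._mappings.get(logical, set()))

    def delete(self, logical, target):
        """Remove one mapping; drop the logical name when none remain."""
        targets = self._mappings.get(logical)
        if targets:
            targets.discard(target)
            if not targets:
                del self._mappings[logical]

# Example: one logical file with replicas registered at two sites
lrc = LocalReplicaCatalog()
lrc.add("lfn://ligo/frame-0001", "gsiftp://siteA.example.org/data/frame-0001")
lrc.add("lfn://ligo/frame-0001", "gsiftp://siteB.example.org/data/frame-0001")
print(lrc.query("lfn://ligo/frame-0001"))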

RLS Features

• Local Replica Catalogs (LRCs) contain consistent information about logical-to-target mappings

• Replica Location Index (RLI) nodes aggregate information about one or more LRCs

• LRCs use soft state update mechanisms to inform RLIs about their state: relaxed consistency of index

• Optional compression of state updates reduces communication, CPU and storage overheads

[Figure: two-level hierarchy in which each Replica Location Index (RLI) indexes several Local Replica Catalogs (LRCs)]

Components of RLS Implementation

Common server implementation for LRC and RLI

Front-End Server

Multi-threaded

Written in C

Supports GSI Authentication using X.509 certificates

Back-end Server

MySQL, PostgreSQL or Oracle relational database

Client APIs: C, Java, Python

Client command-line tool

[Figure: LRC/RLI server connecting to a MySQL back end through ODBC (libiodbc) and the MyODBC driver]

RLS Implementation Features

Two types of soft state updates from LRCs to RLIs (see the sketch after this list)

Complete list of logical names registered in LRC

Compressed updates: Bloom filter summaries of LRC

Immediate mode: incremental updates sent as mappings change

User-defined attributes

May be associated with logical or target names

Partitioning

Divide LRC soft state updates among RLI index nodes using pattern matching of logical names

Not used much in practice because compressed updates are efficient
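The soft state idea can be pictured in a few lines of Python. This is a toy model, not Globus code; the update period and expiry threshold are invented constants (real deployments make these configurable).

# Illustrative sketch of soft state: an LRC periodically pushes the set of
# logical names it knows about to an RLI; the RLI timestamps each update
# and discards state that is not refreshed (relaxed consistency).

import time

UPDATE_INTERVAL = 30 * 60            # assumed LRC update period, seconds
EXPIRE_AFTER = 2 * UPDATE_INTERVAL   # assumed RLI expiry threshold

class ReplicaLocationIndex:
    def __init__(self):
        self._state = {}  # lrc_url -> (timestamp, set of logical names)

    def soft_state_update(self, lrc_url, logical_names):
        """Replace this LRC's entry wholesale; no incremental bookkeeping."""
        self._state[lrc_url] = (time.time(), set(logical_names))

    def expire_stale(self):
        """Silently drop LRCs that stopped sending updates."""
        now = time.time()
        self._state = {url: (ts, names)
                       for url, (ts, names) in self._state.items()
                       if now - ts < EXPIRE_AFTER}

    def query(self, logical):
        """Return the LRCs that recently claimed to hold this name."""
        return [url for url, (_, names) in self._state.items()
                if logical in names]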

Performance Testing

Extensive performance testing reported in HPDC 2004 paper

Performance of individual LRC (catalog) or RLI (index) servers

Client program submits operation requests to server

Performance of soft state updates

LRC catalogs send updates to index servers

Software Versions:

Replica Location Service Version 2.0.9

Globus Packaging Toolkit Version 2.2.5

libiODBC library Version 3.0.5

MySQL database Version 4.0.14

MyODBC library (with MySQL) Version 3.51.06

Testing Environment

Local Area Network Tests

100 Megabit Ethernet

Clients (either client program or LRCs) on cluster: dual Pentium-III 547 MHz workstations with 1.5 Gigabytes of memory running Red Hat Linux 9

Server: dual Intel Xeon 2.2 GHz processor with 1 Gigabyte of memory running Red Hat Linux 7.3

Wide Area Network Tests (Soft state updates)

LRC clients (Los Angeles): cluster nodes

RLI server (Chicago): dual Intel Xeon 2.2 GHz machine with 2 gigabytes of memory running Red Hat Linux 7.3

LRC Operation Rates (MySQL Backend)

[Figure: operation rates for an LRC with 1 million entries in a MySQL back end, database flush disabled, plotted against number of clients (1-10); series: query, add, and delete rates, each with 10 threads per client; y-axis: 0-2500 operations per second]

• Up to 100 total requesting threads

• Clients and server on LAN

• Query: request the target of a logical name

• Add: register a new <logical name, target> mapping

• Delete: remove a mapping

Bulk Operation Performance

[Figure: bulk vs. non-bulk operation rates with 1000 operations per request and 10 request threads per client, plotted against number of clients (1-10); series: bulk query, bulk add/delete, non-bulk query, non-bulk add, non-bulk delete; y-axis: 0-3000 operations per second]

For user convenience, server supports bulk operations

E.g., 1000 operations per request

Combine adds/deletes to maintain approx. constant DB size

For small number of clients, bulk operations increase rates

E.g., 1 client (10 threads) performs 27% more queries, 7% more adds/deletes (a toy cost model of this effect follows below)
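The gain comes mostly from amortizing fixed per-request overhead across many operations. A toy cost model makes the shape of the effect visible; all constants below are invented and are not measurements.

# Back-of-the-envelope model of why bulk requests help (illustrative only):
# each request pays one fixed round-trip cost, so packing N operations into
# one request amortizes that cost across N operations.

def throughput(ops, ops_per_request, rtt_s, per_op_s):
    """Operations/second given a fixed per-request cost and per-op server cost."""
    requests = ops / ops_per_request
    total_time = requests * rtt_s + ops * per_op_s
    return ops / total_time

# Assumed numbers, chosen only to show the trend, not to match the charts:
print(throughput(10_000, 1,    rtt_s=0.0005, per_op_s=0.001))  # non-bulk
print(throughput(10_000, 1000, rtt_s=0.0005, per_op_s=0.001))  # bulk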

Bloom Filter Compression

Construct a summary of each LRC’s state by hashing logical names, creating a bitmap (see the sketch below)

RLI stores in memory one bitmap per LRC

Advantages:

Updates much smaller, faster

Supports higher query rate

Satisfied from memory rather than database

Disadvantages:

Lose the ability to do wildcard queries, since logical names are not sent to the RLI

Small probability of false positives (configurable)

Relaxed consistency model
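A minimal Bloom filter sketch of the compressed update, assuming SHA-256 as the hash source and roughly 10 bits per entry as in the measurements that follow; the real RLS implementation may choose its hash functions and parameters differently.

# Minimal Bloom filter sketch (illustrative, not the RLS implementation).

import hashlib

class BloomFilter:
    def __init__(self, num_bits, num_hashes):
        self.m = num_bits
        self.k = num_hashes          # must be <= 8 with 4-byte digest chunks
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, name):
        # Derive k bit positions from one hash of the logical name.
        digest = hashlib.sha256(name.encode()).digest()
        for i in range(self.k):
            chunk = digest[4 * i: 4 * i + 4]
            yield int.from_bytes(chunk, "big") % self.m

    def add(self, name):
        for pos in self._positions(name):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, name):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(name))

# An RLI keeps one such bitmap per LRC and answers membership queries from
# memory; false positives are possible, missed entries are not.
bf = BloomFilter(num_bits=10_000_000, num_hashes=7)  # ~10 bits/entry at 1M names
bf.add("lfn://ligo/frame-0001")
print("lfn://ligo/frame-0001" in bf)   # True
print("lfn://ligo/frame-9999" in bf)   # almost certainly False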

Bloom Filter Performance: Single Wide Area Soft State Update (Los Angeles to Chicago)

LRC Database Size  | Avg. time to send soft state update (sec) | Avg. time for initial bloom filter computation (sec) | Size of bloom filter (bits)
100,000 entries    | Less than 1                               | 2                                                    | 1 million
1 million entries  | 1.67                                      | 18.4                                                 | 10 million
5 million entries  | 6.8                                       | 91.6                                                 | 50 million
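The table corresponds to roughly 10 bits of filter per database entry. For reference, the standard false-positive estimate for a Bloom filter with m bits, n entries, and k hash functions is

p \approx \left(1 - e^{-kn/m}\right)^{k}, \qquad k_{\mathrm{opt}} = \frac{m}{n}\ln 2

so at m/n = 10 the optimal k is about 7 and p is about 0.008, consistent with the small, configurable false-positive probability noted on the previous slide.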

RLS in Production Use: LIGO

Laser Interferometer Gravitational Wave Observatory

Currently use RLS servers at 10 sites

Contain mappings from 6 million logical files to over 40 million physical replicas

Used in customized data management system: the LIGO Lightweight Data Replicator System (LDR)

Includes RLS, GridFTP, custom metadata catalog, tools for storage management and data validation

RLS in Production Use: ESG

Earth System Grid: Climate modeling data (CCSM, PCM, IPCC)

RLS at 4 sites

Data management coordinated by ESG portal

Datasets stored at NCAR

64.41 TB in 397,253 total files

1,230 portal users

IPCC Data at LLNL

26.50 TB in 59,300 files

400 registered users

Data downloaded: 56.80 TB in 263,800 files

Avg. 300 GB downloaded/day

200+ research papers being written

RLS in Production Use: Pegasus Workflow Manager

Pegasus: Planning for Execution in Grids

Used by scientific applications to manage complex executions

Pegasus system

Maps from a high-level, abstract definition of a workflow onto a Grid environment

Maps to a concrete or executable workflow in the form of a Directed Acyclic Graph (DAG)

Passes this concrete workflow to the Condor DAGMan execution system

Pegasus uses RLS to

Identify physical replicas of logical files specified in the abstract workflow

Register new files created during workflow execution

Scientific applications that use RLS via Pegasus include:

LIGO

ATLAS high energy physics application

Southern California Earthquake Center (SCEC)

Astronomy: Montage and Galaxy Morphology applications

Bioinformatics

Tomography

Other RLS Users

QCD Grid; US CMS experiment (integrated with POOL); ATLAS via Don Quijote

Motivation for Data Replication Services

Data-intensive applications need higher-level data management services that integrate lower-level Grid functionality

Efficient data transfer (GridFTP, RFT)

Replica registration and discovery (RLS)

Eventually validation of replicas, consistency management, etc.

Goal is to generalize the custom data management systems developed by several application communities

Eventually plan to provide a suite of general, configurable, higher-level data management services

Globus Data Replication Service (DRS) is the first of these services

The Data Replication Service

Included in the Tech Preview of GT4.0 release

Design is based on the publication component of the Lightweight Data Replicator (LDR) system

Developed by Scott Koranda from U. Wisconsin at Milwaukee

Functionality (see the sketch after this list)

Replicate a set of files that exist elsewhere in the Grid onto the local site

Users identify a set of desired files

DRS queries Replica Location Service to discover current locations of these files

Creates local replicas of desired files using the Reliable File Transfer Service

Registers new replicas in Replica Location Service for discovery
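A sketch of this discover-transfer-register control flow in Python; every function, URL, and data structure below is invented for illustration and does not correspond to the GT4 API.

# Sketch of what a DRS request does (illustrative pseudocode, not GT4 code).

def rls_query(rli, lfn):
    """Stand-in for an RLS lookup: return URLs of existing replicas."""
    return rli.get(lfn, [])

def rft_transfer(source, destination):
    """Stand-in for a Reliable File Transfer request (which drives GridFTP)."""
    print(f"transfer {source} -> {destination}")
    return destination

def replicate(lfns, rli, lrc, storage_prefix):
    for lfn in lfns:
        sources = rls_query(rli, lfn)                # 1. discover replicas
        if not sources:
            continue                                 # nothing to copy from
        dest = storage_prefix + lfn                  # choose a local target
        local_url = rft_transfer(sources[0], dest)   # 2. create local replica
        lrc.setdefault(lfn, []).append(local_url)    # 3. register for discovery

# Toy data: one remote replica known to the index
rli = {"frame-0001": ["gsiftp://remote.example.org/data/frame-0001"]}
lrc = {}
replicate(["frame-0001"], rli, lrc, "gsiftp://local.example.org/data/")
print(lrc)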

Relationship to Other Globus Services

At the requesting site, deploy:

WS-RF services in a Web service container: the Data Replication Service (with its Replicator resource), the Delegation Service (holding the delegated credential), and the Reliable File Transfer Service (with its RFT resource); each resource is addressed by an endpoint reference (EPR) and exposes resource properties (RPs)

Pre-WS-RF components: the Replica Location Service (Local Replica Catalog and Replica Location Index) and a GridFTP server

[Figure: local-site deployment diagram showing these services, their resources, and their EPRs]

WSRF in a Nutshell

State Management: Resource, Resource Property

State Identification: Endpoint Reference

State Interfaces: GetRP, GetMultipleRPs, SetRP, QueryRPs

Lifetime Interfaces: SetTerminationTime, ImmediateDestruction

Notification Interfaces: Subscribe, Notify

ServiceGroups
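A toy model of this pattern, with none of the real Web services plumbing; the method names and the EPR format below are invented for illustration.

# Toy WSRF pattern: a stateless service fronts stateful resources; each
# resource is named by an endpoint reference (EPR) and exposes resource
# properties (RPs) that clients read, modify, and eventually destroy.

import itertools

class Service:
    def __init__(self):
        self._counter = itertools.count()
        self._resources = {}  # EPR -> dict of resource properties

    def create_resource(self, properties):
        epr = f"epr-{next(self._counter)}"  # stand-in for a WS-Addressing EPR
        self._resources[epr] = dict(properties)
        return epr

    def get_rp(self, epr, name):           # GetRP
        return self._resources[epr][name]

    def set_rp(self, epr, name, value):    # SetRP
        self._resources[epr][name] = value

    def destroy(self, epr):                # ImmediateDestruction
        del self._resources[epr]

svc = Service()
epr = svc.create_resource({"status": "pending"})  # state created, EPR returned
svc.set_rp(epr, "status", "done")
print(svc.get_rp(epr, "status"))
svc.destroy(epr)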

Performance Measurements: Wide Area Testing

The destination for the pull-based transfers is located in Los Angeles

Dual-processor, 1.1 GHz Pentium III workstation with 1.5 GBytes of memory and 1 Gbit Ethernet

Runs a GT4 container and deploys services including RFT and DRS as well as GridFTP and RLS

The remote site where desired data files are stored is located at Argonne National Laboratory in Illinois

Dual-processor, 3 GHz Intel Xeon workstation with 2 gigabytes of memory and 1.1 terabytes of disk

Runs a GT4 container as well as GridFTP and RLS services

DRS Operations Measured

Create the DRS Replicator resource

Discover source files for replication using the local RLS Replica Location Index and remote RLS Local Replica Catalogs

Initiate a Reliable File Transfer operation by creating an RFT resource

Perform RFT data transfer(s)

Register the new replicas in the RLS Local Replica Catalog

Experiment 1: Replicate 10 Files of Size 1 Gigabyte

Component of Operation     | Time (milliseconds)
Create Replicator Resource | 317.0
Discover Files in RLS      | 449.0
Create RFT Resource        | 808.6
Transfer Using RFT         | 1,186,796.0
Register Replicas in RLS   | 3,720.8

Data transfer time dominates

Wide area data transfer rate of 67.4 Mbits/sec
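As a sanity check on that rate: 10 files × 1 GB = 80,000 Mbits, transferred in roughly 1,187 seconds, gives 80,000 / 1,187 ≈ 67.4 Mbits/sec.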

Experiment 2: Replicate 1000 Files of Size 10 Megabytes

Component of Operation     | Time (milliseconds)
Create Replicator Resource | 1,561.0
Discover Files in RLS      | 9.8
Create RFT Resource        | 1,286.6
Transfer Using RFT         | 963,456.0
Register Replicas in RLS   | 11,278.2

Time to create Replicator and RFT resources is larger

Need to store state for 1000 outstanding transfers

Data transfer time still dominates

Wide area data transfer rate of 85 Mbits/sec

Summary

Globus Tools for Data Management

GridFTP protocol

Reliable File Transfer Service

OGSA Data Access and Integration Service

Replica Location Service

Data Replication Service

RLS used in production at large scale by a variety of scientific applications

Moving toward configurable, general higher-level data services

DRS is first of these

For More Information

RLS

“Performance and Scalability of a Replica Location Service,” High Performance Distributed Computing Conference (HPDC), 2004: http://www.isi.edu/~annc/papers/chervenakhpdc13.pdf

Documentation: http://www.globus.org/toolkit/docs/4.0/data/rls

DRS

“Wide Area Data Replication for Scientific Collaborations,” Workshop on Grid Computing (Grid 2005): http://www.isi.edu/~annc/papers/grid2005final.pdf

Documentation: http://www.globus.org/toolkit/docs/4.0/techpreview/datarep

Download