Experiences of the Grid… Gavin McCance University of Glasgow

NeSC Meeting, 24 October 2001
Background
Experimental Particle Physics background
Analysing the structure of matter
…Fortran (19)77 !
Working in ‘Grid’-like areas since January this year
GridPP
20+ institutes…
…GridPP
£17M 3-year project
Working in collaboration with EU DataGrid project
Middleware production
Integration of middleware technologies into HEP experiments
Validation of Grid Software
…GridPP
Initial GridPP testbed underway
A personal snapshot of activities on the grid…
Middleware activities we’re involved in
 Some examples
Technologies we’re using
Issues with integration of ‘Grid’ with particle physics experiments
Middleware
What is middleware…???
Application programs – local gridopen()
Grid middleware
Data access specifics – HPSS, Castor
Job submission specifics – PBS, LSF
Specific security procedures
Layered APIs.
Transparent security.
Transparent data access.
Intelligent use of distributed resources.
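The layering on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: the class names, the `castor://` and `hpss://` URL schemes and the in-memory backends are all hypothetical, and no real HPSS or Castor client is involved. The point is that the application calls one `gridopen()` and the middleware picks the site-specific access layer.

```python
import io
from abc import ABC, abstractmethod


class StorageBackend(ABC):
    """Site-specific data access layer (a stand-in for HPSS, Castor, ...)."""

    @abstractmethod
    def open(self, path: str) -> io.BytesIO:
        """Return a readable stream for a path in this storage system."""


class InMemoryBackend(StorageBackend):
    """Toy backend so the sketch runs without any real mass storage."""

    def __init__(self, files):
        self._files = dict(files)

    def open(self, path):
        return io.BytesIO(self._files[path])


class GridMiddleware:
    """Hypothetical middleware layer hiding the backend from applications."""

    def __init__(self):
        self._backends = {}

    def register(self, scheme, backend):
        self._backends[scheme] = backend

    def gridopen(self, url):
        # Split "scheme://path" and dispatch to the matching backend.
        scheme, sep, path = url.partition("://")
        if not sep or scheme not in self._backends:
            raise ValueError(f"no backend registered for {url!r}")
        return self._backends[scheme].open(path)


# Assumed example data; the application code below never names a backend.
grid = GridMiddleware()
grid.register("castor", InMemoryBackend({"run1/events.dat": b"castor bytes"}))
grid.register("hpss", InMemoryBackend({"run2/events.dat": b"hpss bytes"}))
data = grid.gridopen("castor://run1/events.dat").read()
```

Security and job submission would sit behind the same kind of interface; they are left out here to keep the sketch short.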
Middleware Activities
GridPP ~mirrors EU DataGrid:
Workload Management

What jobs go where?
Data Management (*)

Where’s the (best) data?
Information Services (*)

What’s the state of everything?
…Middleware Activities
Fabric Management

Interfaces to underlying systems
Mass Storage Management

How to get the data to/from the fabric, e.g. implementing ‘file-save()’ APIs for different mass storage systems
Security

Crops up everywhere … transparent to applications
Data Management
Data Replication
Transparent and Secure Data Access
Meta Data Storage
Query Optimisation
Example problem: Data Replication
Problems if data exist only in one place

Multiple accesses to the same data overload the network! Petabytes!
Funding constraints! e.g. CERN can’t store all of the data required
Make replicas! But need to keep track of all the files and their various replicas!

Need a replica catalogue!
…Catalogues
Example solutions:
Have a globally unique Logical File Name (LFN) mapping to multiple physical instances of the file (PFNs).
[Diagram: one LFN maps to copies of File-1 held in Paris, Glasgow and Chicago]
Replica selection required

Choose the ‘best’ / ‘nearest’ / ‘fastest’
Cost modelling… how expensive in time is it to transfer file ‘X’ from A to B
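The LFN to PFN mapping and the cost-based replica selection described on this slide can be sketched as follows. Everything here is an assumption made for illustration: the class and function names, the `example.org` hostnames, the file size and the link figures are invented, and the cost model is the simplest possible one (latency plus size divided by bandwidth).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    site: str
    pfn: str          # physical file name (hypothetical URLs below)
    size_bytes: int


class ReplicaCatalogue:
    """Maps one logical file name (LFN) to its physical replicas."""

    def __init__(self):
        self._replicas = {}

    def register(self, lfn, replica):
        self._replicas.setdefault(lfn, []).append(replica)

    def lookup(self, lfn):
        return list(self._replicas.get(lfn, []))


def transfer_time_s(size_bytes, bandwidth_bytes_per_s, latency_s):
    """Assumed cost model: fixed latency plus size over bandwidth."""
    return latency_s + size_bytes / bandwidth_bytes_per_s


def select_replica(catalogue, lfn, links):
    """Pick the replica with the lowest estimated transfer time.

    links maps site -> (bandwidth in bytes/s, latency in s) as seen
    from the client; every replica site must have an entry.
    """
    candidates = catalogue.lookup(lfn)
    if not candidates:
        raise KeyError(lfn)
    return min(
        candidates,
        key=lambda r: transfer_time_s(r.size_bytes, *links[r.site]),
    )


catalogue = ReplicaCatalogue()
for site in ("Paris", "Glasgow", "Chicago"):
    pfn = f"gsiftp://{site.lower()}.example.org/data/file-1"
    catalogue.register("lfn:file-1", Replica(site, pfn, 1_000_000_000))

# Invented link figures for a client sitting in Scotland.
links = {"Paris": (10e6, 0.02), "Glasgow": (100e6, 0.001), "Chicago": (5e6, 0.1)}
best = select_replica(catalogue, "lfn:file-1", links)
```

With these made-up numbers the Glasgow copy wins; a real cost model would also need current network load and storage access time.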
…Data Replication
Grid Data Mirroring Package (GDMP)

C, C++, JAVA, command-line APIs
Replication issues:

File transfer…

Synchronisation / consistency models
 Basic middleware doesn’t enforce any policy

Scalable architectures
…GDMP
File transfer uses GridFTP


Existing IETF-approved (?RFC?) ftp additions + the standard grid security (GSI)
Registers new files in replica catalogue

E.g. interfaced to the existing Globus Replica Catalogue
Basic replica manager functionality to maintain consistency of replica sets
…Implementation issues
Structure not imposed by the middleware software itself…

But … must think about scalable implementations
E.g. an RC may exist on each storage element, responsible for its own files
[Diagram: CERN Root RC, with INFN RC, UK RC and CERN RC]
Queries will propagate down until replica information is found…
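A minimal sketch of the query propagation described above, assuming a simple tree of catalogues in which each node answers from its own entries and otherwise asks its children in turn. The class name, the LFN and the PFN string are hypothetical, and the tree shape is only one reading of the diagram; the middleware itself does not impose it.

```python
class ReplicaCatalogueNode:
    """One replica catalogue (RC) in an assumed tree of catalogues."""

    def __init__(self, name, children=None):
        self.name = name
        self.children = list(children or [])
        self.entries = {}  # LFN -> list of PFNs this RC is responsible for

    def add(self, lfn, pfn):
        self.entries.setdefault(lfn, []).append(pfn)

    def find(self, lfn):
        """Answer locally if possible, otherwise propagate the query down."""
        if lfn in self.entries:
            return self.name, list(self.entries[lfn])
        for child in self.children:
            hit = child.find(lfn)
            if hit is not None:
                return hit
        return None


infn = ReplicaCatalogueNode("INFN RC")
uk = ReplicaCatalogueNode("UK RC")
cern = ReplicaCatalogueNode("CERN RC")
root = ReplicaCatalogueNode("CERN Root RC", [infn, uk, cern])

# Hypothetical entry: only the UK catalogue knows about this file.
uk.add("lfn:higgs-a1", "gsiftp://glasgow.example.org/atlas/higgs-a1")
result = root.find("lfn:higgs-a1")
```

The query starts at the root and stops at the first catalogue that holds the LFN, so no single catalogue has to know about every file.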
…Longer term problems
Query / Replica Optimisation

Grid can make / delete replicas
 E.g. many people in Glasgow & Edinburgh access the ATLAS Higgs dataset ‘A1’…
 Autonomously make new replica in / near Scotland based on historical information
Grid might re-cluster data
[Diagram: datasets A1, A2, A3 and B1, B2, B3 held in Paris and Glasgow, before and after re-clustering]
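As a rough illustration of replication driven by historical information, the sketch below counts past accesses per dataset and region and proposes a new replica wherever a region reads a dataset often but holds no copy. The function name, the threshold rule, the region map and the access log are all assumptions made for this example; the slides do not specify a policy.

```python
from collections import Counter


def plan_new_replicas(access_log, region_of, replica_regions, threshold):
    """Propose (dataset, region) pairs that deserve a new replica.

    access_log:      iterable of (dataset, site) read events
    region_of:       site -> region
    replica_regions: dataset -> set of regions already holding a copy
    threshold:       minimum number of reads before a replica is proposed
    """
    counts = Counter((dataset, region_of[site]) for dataset, site in access_log)
    plan = []
    for (dataset, region), reads in sorted(counts.items()):
        already_there = region in replica_regions.get(dataset, set())
        if reads >= threshold and not already_there:
            plan.append((dataset, region))
    return plan


# Invented history: A1 is read from Scotland but only stored in France.
access_log = [
    ("A1", "Glasgow"), ("A1", "Edinburgh"), ("A1", "Glasgow"),
    ("B1", "Paris"), ("B1", "Paris"), ("B1", "Paris"),
]
region_of = {"Glasgow": "Scotland", "Edinburgh": "Scotland", "Paris": "France"}
replica_regions = {"A1": {"France"}, "B1": {"France"}}

plan = plan_new_replicas(access_log, region_of, replica_regions, threshold=3)
```

Deleting little-used replicas would be the mirror image of this rule, and a simulated grid is a natural place to try such policies before deploying them.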
…longer term
real Grid...
MONARC simulation tool
…simulated Grid provides testing arena for more adventurous ideas!
…Integration of middleware
Many iterations of requirements and use-cases with end-users… meetings…
Middleware solutions must be scalable and useable by a variety of end users

HEP, Biological, Earth sciences, Astro
Always looking for common elements

E.g. replica / meta-data catalogues… data transport… security…
…examples of common interfaces: generic meta-data catalogue tools
SQL Database Service:

Problem: many relational databases, diverse security, diverse wire protocols
…Solution:
Build on existing wire protocols: XML transported over HTTP(S)
Grid standard security framework (GSI)
..examples
Leverage open-source technology



JAVA servlet based (Apache Tomcat engine)
JDBC drivers
Utilises Oracle’s XSQL servlet (open source)
Security over HTTPS with Grid-standard GSI mechanism
…examples
Oracle / PostgreSQL + PKI Security + Standard communication protocols (XML over HTTPS) = SQL Database Service (Spitfire)
Allows any HTTP compliant system, e.g. web browsers / standard C++ HTTP libraries, to access any relational database…
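The idea of returning relational query results as XML can be sketched with the Python standard library. This is not Spitfire's actual wire format: the `rowset` and `row` element names, the table and its contents are assumptions, SQLite stands in for Oracle or PostgreSQL, and the HTTPS transport and GSI security layers are left out entirely.

```python
import sqlite3
import xml.etree.ElementTree as ET


def query_to_xml(conn, sql, params=()):
    """Run a SQL query and serialise the result set as an XML string.

    Column names become element names, so they must be valid XML names.
    """
    cursor = conn.execute(sql, params)
    columns = [description[0] for description in cursor.description]
    root = ET.Element("rowset")
    for row in cursor:
        row_element = ET.SubElement(root, "row")
        for name, value in zip(columns, row):
            cell = ET.SubElement(row_element, name)
            cell.text = "" if value is None else str(value)
    return ET.tostring(root, encoding="unicode")


# Hypothetical meta-data table held in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE replicas (lfn TEXT, site TEXT)")
conn.executemany(
    "INSERT INTO replicas VALUES (?, ?)",
    [("higgs-a1", "Glasgow"), ("higgs-a1", "Paris")],
)

xml_reply = query_to_xml(
    conn, "SELECT lfn, site FROM replicas WHERE site = ?", ("Glasgow",)
)
```

Because the reply is plain XML, any client that can speak HTTP and parse XML could consume it without a database-specific driver, which is the point made on the slide.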
Global Grid Forum
Global Grid Forum meetings

GGF1: Amsterdam meeting in April 2001
Helps define aspects common to all Grid-like projects.

E.g. architectures, ‘grid’ protocols
As an example… Grid Monitoring Architecture (GMA)
Information Services - GMA
One implementation of the GMA
 Globus MDS, currently based on (Open)LDAP
Hierarchical directory-like structure

Very fast for information retrieval if you already know the query, because it is designed into the structure.
Bad for complex or ranged queries
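The trade-off on this slide can be shown with a toy directory tree. This is not LDAP or MDS code: the nested dictionary, the host names and the `free_cpus` attribute are invented for illustration. A lookup along a known path follows the structure directly, whereas a ranged question has to walk every entry.

```python
# Hypothetical directory: country -> site -> host -> attributes.
directory = {
    "uk": {
        "glasgow": {"node01": {"free_cpus": 4}, "node02": {"free_cpus": 0}},
        "edinburgh": {"node01": {"free_cpus": 8}},
    },
    "ch": {"cern": {"node01": {"free_cpus": 2}}},
}


def lookup(tree, path):
    """Follow a known path: one step per level of the hierarchy."""
    node = tree
    for part in path:
        node = node[part]
    return node


def walk_hosts(tree, prefix=()):
    """Visit every host entry, yielding (path, attributes)."""
    for key, value in tree.items():
        if any(isinstance(v, dict) for v in value.values()):
            yield from walk_hosts(value, prefix + (key,))
        else:
            yield prefix + (key,), value


def hosts_with_free_cpus(tree, minimum):
    """Ranged query: the hierarchy does not help, so scan everything."""
    return sorted(
        "/".join(path)
        for path, attrs in walk_hosts(tree)
        if attrs["free_cpus"] >= minimum
    )


known = lookup(directory, ("uk", "glasgow", "node01"))
ranged = hosts_with_free_cpus(directory, 4)
```

The first call touches three dictionary levels; the second inspects every host, which is why a relational store is attractive for complex or ranged queries.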
..complementary implementation
[Diagram: Producer, Producer API, Producer Servlet, Registry Servlet, Schema Servlet, Consumer, Querying API and Relational Database, connected by register / re-register / publish, subscribe, query and stream]
Implementation of GMA
Relational queries in SQL format
…relational GMA
Information is transferred in generic SQL format…
‘Producers’ of information register themselves…
‘Consumers’ construct a (possibly complex) SQL query and are streamed query results directly from Producers.
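The producer, registry and consumer roles described above can be sketched as below. This is an in-process toy under stated assumptions: the class names, the `cpu_load` table with its two columns, and the sample rows are hypothetical, each producer keeps its data in an in-memory SQLite database, and the servlets and HTTP transport of the real implementation are not modelled.

```python
import sqlite3


class Producer:
    """Publishes rows of one table and answers SQL queries on them."""

    def __init__(self, name, table, rows):
        self.name = name
        self.table = table
        self._db = sqlite3.connect(":memory:")
        # Table name is trusted here; fixed (host, load) schema is assumed.
        self._db.execute(f"CREATE TABLE {table} (host TEXT, load REAL)")
        self._db.executemany(f"INSERT INTO {table} VALUES (?, ?)", rows)

    def stream(self, sql, params=()):
        """Yield result rows one at a time, directly from this producer."""
        yield from self._db.execute(sql, params)


class Registry:
    """Records which producers publish which table."""

    def __init__(self):
        self._by_table = {}

    def register(self, producer):
        self._by_table.setdefault(producer.table, []).append(producer)

    def producers_for(self, table):
        return list(self._by_table.get(table, []))


def consume(registry, table, sql, params=()):
    """Find producers via the registry, then stream results from each."""
    for producer in registry.producers_for(table):
        for row in producer.stream(sql, params):
            yield producer.name, row


registry = Registry()
registry.register(Producer("glasgow", "cpu_load", [("gla01", 0.9), ("gla02", 0.2)]))
registry.register(Producer("cern", "cpu_load", [("cern01", 0.7)]))

busy = list(consume(
    registry,
    "cpu_load",
    "SELECT host, load FROM cpu_load WHERE load > ? ORDER BY host",
    (0.5,),
))
```

The registry is only consulted to locate producers; the rows themselves flow from each producer to the consumer, matching the streaming arrangement on the slide.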
…implementation
Again, uses JAVA servlets

Tomcat servlet engine
Again, communication with servlet is over standard HTTP.
All the internal parts communicate via HTTP and XML, giving a modular design with easily replaceable parts…
Useful Tools…
JAVA… nicely platform independent
UML: Unified Modeling Language

Architecture and APIs ‘should be’ defined in this…!
CASE tools

Together Control Centre
…useful tools
Globus toolkit

Both the original and its java implementation (CoG)
My experience of CoG so far is generally good…!

Easy GSI authentication, Globus file transfer, Globus job submission, MDS interface
Testbeds
For GridPP, primary testbeds are the HEP experiment ones

CERN LHC (EU DataGrid WP8)
US experiments, e.g. Fermilab, SLAC
First software release now!!

Integration team ‘show-and-tell’ at CERN end of this month…
...testbed work
Grid software packaged for release to experiments!
Primarily packaged using RPM
For end of October release, supported platforms are: Linux (and Solaris on a best effort basis)
..Globus installation
Generally found the Globus software installation OK!

Successfully deployed on a number of batch systems in UK
Experience fed back into eScience Centres
Difficulties were setting up and recognising each country’s Certificate Authorities (CAs)
 Tricky legal implications to resolve!
Testbed work so far…
UK Certificate Authority set-up…

Many institutes already on testbed
Grid Status and Network monitoring demonstrator available soon
Networking status information provided by GridPP and DataGrid networking groups!
…testbed work so far
Successful tests within ATLAS (and others) of some middleware products

E.g. Large file transfers between Glasgow, Italy, US and CERN
Further tests planned with new release!
…experimental integration
Work to do…

Taking the kit and trying to integrate it into the experiments’ software frameworks
Make Grid Services transparently available to ATLAS and LHCb programs
[Diagram: ATLAS/LHCb software framework (GAUDI), GANGA framework, Grid middleware]
Grid validation
Preliminary tests of basic middleware have been successful
Now we have the opportunity to see how it performs and scales with real datasets and real experimental users
…to do…
Preliminary grid software architectures have been defined
Basic middleware has been delivered
Large scale validation underway
An excellent base to build on!
Much still to do!
Overall experience
Middleware development is fun!

Several good products have already been delivered
Re-using industry standard components and protocols where they exist



LDAP, SQL, HTTP(S), XML, SOAP
PKI security
Open Source…!
…overall
Middleware being built using a variety of languages… JAVA, C++, C, Python
APIs should be available for all: JAVA, C++, C and command line… web access(?)
…overall
Coordination very important
Forums for discussion:
Vital to ensure middleware is useful to a wide range of applications
Prevent divergent technology
…finally…
Experimental testbeds in place
Testable software in place
Full integration and validation to begin in earnest now!