The National Grid Service
David Fergusson
dfmac@nesc.ac.uk
http://www.nesc.ac.uk/training
http://www.ngs.ac.uk
http://www.pparc.ac.uk/
http://www.eu-egee.org/
Policy for re-use
• This presentation can be re-used for academic purposes.
• However, if you do so, please let trainingsupport@nesc.ac.uk know. We need to gather statistics on re-use: the number of events and the number of people trained.
Thank you!!
Acknowledgements
• Some NGS slides are taken from talks by Stephen Pickles and Andy Richards
• Also slides from Malcolm Atkinson on the UK e-Science programme
Overview
• e-Infrastructure in the UK
• The National Grid Service
The National Grid Service
• The core UK grid, resulting from the UK's e-Science programme.
  – Grid: virtual computing across admin domains
• Production use of computational and data grid resources
  – For projects and individuals
  – Free at point of use to UK academics
  – Note: scalability demands that universities/VOs contribute resources
• Supported by JISC: “core sites”, operations, support
  – Entered 2nd phase of funding in October 2006: 2½ years
  – Longer-term plans are being laid
e-Science Centres in the UK
[Slide shows a map of UK e-Science Centres, with coordination and leadership by NeSC and the e-Science Directors' Forum. Centres at Edinburgh (NeSC), Belfast, Lancaster, Newcastle, York, Manchester, Leicester, Daresbury, Birmingham, Cambridge, Oxford, Cardiff, Bristol, Reading, RAL, Southampton and London; Centres of Excellence are distinguished from Other Centres. Also marked: Access Grid Support Centre, Digital Curation Centre, National Centre for Text Mining, National Centre for e-Social Science, National Institute for Environmental e-Science, National Grid Service, Open Middleware Infrastructure Institute. Timeline annotations: + ~2 years, +5 years.]
OMII-UK nodes
[Slide shows a map of the three OMII-UK nodes: EPCC & National e-Science Centre (Edinburgh); School of Computer Science, University of Manchester; School of Electronics and Computer Science, University of Southampton. Timeline annotation: +3 years.]
UK e-Infrastructure
[Diagram: users get common access, tools and information across regional and campus grids; nationally supported services through the NGS; HPCx + HECToR; and community grids (LHC, VRE/VLE/IE, ISIS TS2), integrated internationally.]
The National Grid Service
The National Grid Service
• The core UK grid, resulting from the UK's e-Science programme.
  – Grid: virtual computing across admin domains
• Production use of computational and data grid resources.
• Supported by JISC
  – Entering 2nd phase of funding
[Diagram: GOSC at the centre of the NGS, linking the core nodes at Manchester, Leeds, RAL and Oxford with other resources: HPCx, CSAR, a commercial provider, PSREs and universities (U of A, U of B, U of C, U of D).]
NGS Core Nodes: host core services; coordinate integration, deployment and support; free-to-access resources for all VOs; monitored interfaces + services.
NGS Partner Sites: integrated with NGS; some services/resources available for all VOs; monitored interfaces + services.
NGS Affiliated Sites: integrated with NGS; support for some VOs; monitored interfaces (+ security etc.).
National Grid Service and partners
[Map of core and partner sites: Edinburgh, Lancaster, York, Manchester, CCLRC Rutherford Appleton Laboratory (Didcot), Cardiff, Westminster, Bristol. Timeline annotation: +2.5 years.]
NGS Use
[Slide shows usage charts, summarised here: NGS user registrations grew roughly linearly from January 2004 to December 2005, reaching ~400 registered users. Users by institution (certificate OU) included Oxford, Cambridge, Manchester, Leeds, Edinburgh, UCL, Imperial, Southampton, Cardiff, Bristol, Westminster, BBSRC, CLRC and others. Users by discipline spanned biology, particle physics + astronomy, engineering + physical sciences, environmental science, medicine, humanities, sociology and large facilities, mapped to research councils (BBSRC, CCLRC, EPSRC, NERC, PPARC, MRC, ESRC, AHRC). Further charts: files stored, total CPU hours across the 4 core nodes, and CPU time by user.]
Supporting Services
• UK Grid Services
  – National Services
    • Authentication, authorisation, certificate management, VO registration, security, network monitoring, help desk + support centre.
  – NGS Services and interfaces
    • Job submission, simple registry, data transfer, data access and integration, resource brokering, monitoring and accounting, grid management services, workflow, notification, operations centre.
  – NGS core-node Services
    • CPU, (meta-)data storage, key software
  – Services coordinated with others (e.g. OMII, NeSC, EGEE, LCG):
    • Integration testing, compatibility & validation tests, user management, training
• Administration:
  – Policies and acceptable use
  – Service Level Agreements and Definitions
  – Coordination of deployment and operations
  – Operational security
Applications: 2
[Slide shows example applications: systems biology, econometric analysis, climate modelling, and neutron scattering (example: La2-xSrxNiO4; H. Woo et al., Phys. Rev. B 72, 064437 (2005)).]
Membership options
Two levels of membership (for sharing resources):
1. Affiliates
  – run a compatible stack, integrate support arrangements
  – adopt NGS security policies
  – all access to an affiliate's resources is up to the affiliate
    • except allowing NGS to insert probes for monitoring purposes
2. Partners also
  – make “significant resources” available to NGS users
  – enforce NGS acceptable use policies
  – provide accounting information
  – define commitments through formal Service Level Descriptions
  – influence NGS direction through representation on the NGS Technical Board
Membership pipeline
September 2006 (not a complete list)
• Partners
  – GridPP sites, initially Imperial and Glasgow
  – Condor/Windows at Cardiff
  – Belfast e-Science Centre (service hosting, GridSAM, ...)
• Affiliates
  – NW-Grid/Manchester SGI Prism
  – SunGrid
• Data partners (early discussions)
  – MIMAS and EDINA
• Others in discussion
New partners
• Over the last year, several new full partners have joined the NGS:
  – Bristol, Cardiff, Lancaster and Westminster
  – Further details of their resources can be found on the NGS web site: www.ngs.ac.uk
• Resources are committed to the NGS for a period of at least 12 months.
• These new services introduce heterogeneity into the NGS resource pool.
NGS Facilities
• Leeds and Oxford (core compute nodes)
  – 64 dual-CPU Intel 3.06GHz (1MB cache) nodes. Each node: 2GB memory, 2x120GB disk, Red Hat ES 3.0. Gigabit Myrinet connection. 2TB data server.
• Manchester and Rutherford Appleton Laboratory (core data nodes)
  – 20 dual-CPU nodes (as above). 18TB SAN.
• Bristol
  – initially 20 2.3GHz Athlon processors in 10 dual-CPU nodes.
• Cardiff
  – 1000 hrs/week on an SGI Origin system comprising 4 dual-CPU Origin 300 servers with a Myrinet™ interconnect.
• Lancaster
  – 8 Sun Blade 1000 execution nodes, each with dual UltraSPARC IIICu processors, connected via a Dell 1750 head node.
• Westminster
  – 32 Sun V60 compute nodes
• HPCx
  – …
For more details: http://www.ngs.ac.uk/resources.html
NGS software
• Computation services based on GT2
  – Use compute nodes for sequential or parallel jobs, primarily from batch queues
  – Can run multiple jobs concurrently (be reasonable!)
• Data services:
  – Storage Resource Broker:
    • Primarily for file storage and access
    • Virtual filesystem with replicated files
  – “OGSA-DAI”: Data Access and Integration
    • Primarily for grid-enabling databases (relational, XML)
  – NGS Oracle service
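To make the GT2-based computation service concrete, the sketch below shows what a typical command-line session might look like. This is a minimal sketch using standard Globus Toolkit 2 client commands; the host name grid-compute.example.ac.uk is a hypothetical placeholder, and actual NGS head-node names and queue setup may differ.

    # Create a short-lived proxy credential from your X.509 certificate
    grid-proxy-init

    # Quick interactive test against a compute node (hypothetical host name)
    globus-job-run grid-compute.example.ac.uk /bin/hostname

    # Submit to the batch queue; prints a job-contact URL for later reference
    globus-job-submit grid-compute.example.ac.uk /bin/date

    # Poll status and collect output using the returned job contact
    globus-job-status <job-contact>
    globus-job-get-output <job-contact>

For the Storage Resource Broker, day-to-day use is through the SRB “Scommands” client tools; again a minimal sketch, assuming an already configured SRB client environment:

    Sinit                 # start an SRB session
    Sput results.dat .    # store a local file into the virtual filesystem
    Sls                   # list the current SRB collection
    Sget results.dat      # retrieve a (possibly replicated) copy
    Sexit                 # end the session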
NGS Software - 2
• Middleware
  – Resource Broker
  – Applications Repository (“NGS Portal”)
  – GridSAM – alternative for job submission and monitoring
  – GRIMOIRES – registry of services (e.g. GridSAM instances)
• Developed by partners:
  – Application Hosting Environment: AHE
  – P-GRADE portal and GEMLCA
• Also deployed:
  – VOMS support
  – WS-GRAM: GT4 job submission
• Under development:
  – Shibboleth integration
Resource Broker
[Diagram: the user describes the job in a text file using the Job Description Language; a UI (user interface) machine on the local workstation, with preinstalled client software, submits the job to the Resource Broker, which dispatches it to NGS nodes. Pre-production use at present.]
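As an illustration of the Job Description Language mentioned above, a minimal JDL file might look like the sketch below. JDL uses the Condor ClassAd attribute = value; style adopted by EGEE/LCG resource brokers; the exact attribute set accepted by the NGS broker is not shown in these slides, so treat this as illustrative only.

    // hello.jdl -- minimal illustrative JDL sketch
    Executable    = "/bin/hostname";
    StdOutput     = "std.out";
    StdError      = "std.err";
    OutputSandbox = {"std.out", "std.err"};

On EGEE-derived brokers such a file would be submitted with a client command like edg-job-submit hello.jdl; the pre-production NGS broker's client tooling may differ.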
GridSAM
[Diagram: the user describes the job in XML using the Job Submission Description Language; a UI (user interface) machine with preinstalled client software talks over web services interfaces to a chosen GridSAM (Submission And Monitoring) instance, which runs the job on NGS nodes.]
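For comparison, the same trivial job expressed in JSDL, the XML schema GridSAM consumes, might look like the sketch below. This is hand-written against the GGF JSDL 1.0 namespaces, not copied from NGS documentation.

    <?xml version="1.0" encoding="UTF-8"?>
    <!-- hello.jsdl: minimal illustrative JSDL sketch -->
    <jsdl:JobDefinition
        xmlns:jsdl="http://schemas.ggf.org/jsdl/2005/11/jsdl"
        xmlns:jsdl-posix="http://schemas.ggf.org/jsdl/2005/11/jsdl-posix">
      <jsdl:JobDescription>
        <jsdl:Application>
          <jsdl-posix:POSIXApplication>
            <jsdl-posix:Executable>/bin/hostname</jsdl-posix:Executable>
            <jsdl-posix:Output>std.out</jsdl-posix:Output>
            <jsdl-posix:Error>std.err</jsdl-posix:Error>
          </jsdl-posix:POSIXApplication>
        </jsdl:Application>
      </jsdl:JobDescription>
    </jsdl:JobDefinition>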
Managing middleware evolution
• Important to coordinate and integrate this with deployment and operations work in EGEE, LCG and similar projects.
• Focus on deployment and operations, NOT development.
[Diagram: prototypes & specifications from EGEE and other software sources pass through the Engineering Task Force (ETF) for deployment, testing and advice; software with proven capability and realistic deployment experience enters NGS Operations as ‘gold’ services, serving UK, campus and other grids, which return feedback and future requirements.]
Gaining Access
Free (at point of use) access to core and partner NGS nodes:
1. Obtain a digital X.509 certificate
  – from the UK e-Science CA
  – or a recognized peer
2. Apply for access to the NGS
National HPC services:
• HPCx
  – Must apply separately to research councils
  – Digital certificate and conventional (username/password) access supported
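In practice, step 1 ends with a certificate installed under ~/.globus, after which each working session starts by creating a proxy. A minimal sketch with generic Globus client tools; the UK e-Science CA had its own registration procedure, so the request step shown here is only the generic command-line equivalent:

    # Generate a key pair and a certificate signing request for the CA
    grid-cert-request

    # Once the signed certificate is installed under ~/.globus,
    # check it and create a session proxy for NGS access
    grid-cert-info -subject
    grid-proxy-init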
Key facts
• Production: deploying middleware after selection and testing – major developments via the Engineering Task Force.
• Evolving:
  – Middleware
  – Number of sites
  – Organisation:
    • VO management
    • Policy negotiation: sites, VOs
    • International commitment
• Gathering users' requirements – National Grid Service
Web Sites
• NGS
  – http://www.ngs.ac.uk
  – To see what's happening: http://ganglia.ngs.rl.ac.uk/
  – New wiki service: http://wiki.ngs.ac.uk
  – Training events: http://www.nesc.ac.uk/training
• HPCx
  – http://www.hpcx.ac.uk
Summary
• NGS is a production service
  – Therefore it cannot include the latest research prototypes!
  – Formalised commitments – service level agreements
• Core sites provide computation and data services
• NGS is evolving
  – OMII, EGEE, Globus Alliance all have middleware under assessment for the NGS
  – Selected, deployed middleware currently provides “low-level” tools
  – New deployments will follow
  – New sites and resources being added